Abstract

Nucleotide excision repair (NER) plays a critical role in maintaining the integrity of the genome when it is damaged by bulky DNA lesions, since inefficient repair can cause mutations and human diseases, notably cancer. The structural properties of DNA lesions that determine their relative susceptibilities to NER are therefore of great interest. As a model system, we have investigated the major mutagenic lesion derived from the environmental carcinogen benzo[a]pyrene (B[a]P), 10S (+)-trans-anti-B[a]P-N²-dG, in six different sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. We have obtained molecular structural data by NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data from human HeLa cell extracts for all six sequence contexts. This model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that these can act in concert to provide an enhanced signal. Steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups determines the exact nature of the disturbances. Both nearest neighbor and more distant neighbor sequence contexts have an impact. Regardless of the exact distortions, we hypothesize that they provide a local thermodynamic destabilization signal for repair.

1. Introduction

Nucleotide excision repair (NER) plays a central role in preserving the genome of prokaryotes and eukaryotes. This versatile repair system removes structurally and chemically diverse bulky DNA lesions, including those induced by exposure to UV light and environmental chemical carcinogens [1, 2]. The vital importance of this mechanism is demonstrated by several human NER-deficiency syndromes, including xeroderma pigmentosum (XP), Cockayne syndrome (CS), and trichothiodystrophy (TTD) [3]. XP, for example, is characterized by high photosensitivity, hyperpigmentation, premature skin ageing, and proneness to developing skin cancer [4]. Furthermore, the capacity of the NER pathway is important in cancer chemotherapy [5]: NER diminishes the efficacy of chemotherapeutic agents such as cisplatin, which act via the formation of bulky DNA adducts. A better understanding of the mechanisms by which the NER system recognizes DNA lesions may lead to the design of improved chemotherapeutic drugs that can modulate the repair response. Recent findings reveal that polymorphisms in human NER genes have an impact on the repair of DNA lesions and cancer susceptibility [6, 7], as well as on chemotherapeutic efficacy [8].

The eukaryotic NER pathway is a biologically complicated process and consists of two sub-pathways with different substrate specificity: global genome NER (GG-NER) [9, 10] and transcription-coupled repair (TCR) [11–14]. Both sub-pathways consist of ordered multistep processes, which differ in the early steps, when the DNA lesions are recognized, but converge in the later steps. In GG-NER, the focus of our present interest, the whole genome is scanned for bulky lesions to initiate the repair process. Two independent complexes, one involving the XPC/HR23B/Centrin 2 proteins [15–17] and the other involving the DDB1/DDB2 heterodimer [18–21], have been implicated in the early steps of base-damage recognition during NER [9]. By contrast, the TCR sub-pathway is activated by a stalled RNA polymerase during transcription [12]. Once the lesion is detected, the two sub-pathways proceed in an essentially identical manner to excise it: the multisubunit transcription factor TFIIH, containing the helicases XPB and XPD, is recruited to the lesion site, followed by XPA, the single-strand DNA binding protein RPA, and the two nucleases XPG and XPF-ERCC1. Once this complex is assembled, a stretch of 24–32 nucleotides containing the lesion is excised from the damaged strand. This excised 24–32-mer oligonucleotide is the hallmark of a successful NER event. Finally, gap resynthesis by DNA polymerases δ, ε, and κ [22] and ligation by DNA ligase I complete the NER process [23].

One remarkable characteristic of the NER pathway is its ability to excise an astounding variety of chemically and structurally diverse lesions [2], and the rates of repair can vary over several orders of magnitude. However, the differences in the structural and thermodynamic properties of the lesions that control the diverse NER efficiencies have remained elusive. It has been suggested that the NER factors do not recognize the lesion itself, but rather the local distortions and destabilizations in the DNA that are associated with it [24–30]. A number of different properties of damaged DNA that elicit the NER response have been proposed. These include disruption of Watson-Crick hydrogen bonding [24, 31], kinks in the damaged DNA [32], thermodynamic destabilization [24, 29, 33], diminished base stacking [34, 35], local conformational flexibility [36], and flipped-out bases in the unmodified complementary strand [37–40]. A crystal structure of yeast Rad4/Rad23, the homolog of the human NER recognition factor XPC/HR23B, bound to DNA containing a cyclobutane pyrimidine dimer, shows that Rad4/Rad23 inserts a β-hairpin through the DNA duplex and expels two mismatched thymines in the undamaged strand out of the duplex to bind with the protein (PDB ID 2QSG) [41]. This structure suggests that lesions which thermodynamically destabilize the DNA duplex, and thereby facilitate the flipping of base pairs and the intrusion of the β-hairpin, are good substrates for the NER machinery: the more a lesion locally destabilizes the duplex, the better it is repaired.

The modulation of NER susceptibility for the same lesion by neighboring base sequence context is, however, a relatively unexplored area. If a lesion is better repaired in one sequence context than in another, a lesion-induced mutational hotspot could result. In order to elucidate the relationship between NER efficiency and base sequence-governed DNA distortion and destabilization induced by a bulky DNA adduct, we have employed as a model system the major lesion derived from the cancer-causing compound benzo[a]pyrene (B[a]P) [42]. B[a]P is the best-studied member of a family of ubiquitous environmental pollutants known as polycyclic aromatic hydrocarbons. The tumorigenic metabolite of B[a]P [43] is the diol epoxide r7,t8-dihydroxy-t9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene (B[a]PDE). This intermediate reacts with DNA and RNA; the most abundant stable adduct produced in mammalian cells [44–46] is the 10S (+)-trans-anti-B[a]P-N²-dG adduct ([G*]) (Figure 1(a)), the focus of our work. This adduct, unless removed by DNA repair mechanisms [47], is highly mutagenic [48, 49].

We have investigated the identical 10S (+)-trans-anti-B[a]P-N²-dG adduct in the six sequence contexts shown in Figure 1(b), utilizing an array of approaches: NER in human HeLa cell extracts, ligation and polyacrylamide gel electrophoresis techniques to assess bending properties of the modified duplexes, and structural studies utilizing high resolution NMR methods as well as unrestrained molecular dynamics (MD) simulations. The position of the B[a]P ring system in the B-DNA minor groove, directed 5′ along the modified strand, was first determined by NMR in the 5′-C[G*]C-I sequence in 1992 [50], but sequence-governed structural details as well as dynamic properties remained to be elucidated. One important motivation for our work was to explore the role of nearby guanine amino groups in the structural properties and NER susceptibilities of these duplexes. The key difference among these duplexes is the presence and positioning of guanines flanking the [G*], either immediately adjacent to the lesion or beyond: the B[a]P rings compete for space with the bulky amino group of guanine on the minor groove side of B-DNA, which we anticipated would differentially impact the structures of the damaged duplexes in a sequence context-dependent manner. A further motivation was to explore the role of differing sequence contexts beyond the lesion that vary in intrinsic flexibility. We hypothesized that subtle but critical structural effects governed by sequence context would manifest themselves by impacting NER efficiencies. Our results show that sequence context can cause up to a four-fold difference in relative NER susceptibility, with even distant neighbors influencing NER. Locally disturbed Watson-Crick hydrogen bonding and flexible bending are two key sequence-governed structural distortions caused by this lesion that the NER machinery appears to recognize with different efficiencies.
More generally, different lesions in varied sequence contexts will cause different kinds of distortions; thus, the extent of the local thermodynamic destabilization will also vary; we hypothesize that it is the extent and type of destabilization that determines the relative NER efficiency.

2. Nearest Neighbor Base Sequence Context Impacts NER of the 10S (+)-trans-anti-B[a]P-N²-dG Adduct

The 5′-C[G*]G, 5′-G[G*]C, and 5′-I[G*]C Sequences. High resolution NMR solution studies have shown that the bulky aromatic B[a]P residue is positioned in the minor groove on the 5′-side of [G*] [51] in the 5′-C[G*]G and 5′-G[G*]C duplexes (Figure 2). However, there are sequence-governed differences in some of the structural features. Specifically, in the 5′-C[G*]G duplex, NMR studies revealed that the C:G base pair on the 5′-side of [G*] is severely disturbed. In the sequence-isomeric 5′-G[G*]C duplex, this perturbation is not observed. On the other hand, analyses of MD simulations [51, 52] based on the NMR data revealed, for the 5′-G[G*]C duplex, significant unwinding near the lesion site combined with an anomalously enlarged Roll (Figure 3), neither of which is observed in the 5′-C[G*]G duplex. Polyacrylamide gel electrophoresis revealed an unusually slow electrophoretic mobility of the 5′-G[G*]C duplex, which is a manifestation of a kink [53] that is highly flexible [54]. On a molecular level, this flexible bend is caused by the severe untwisting and enlarged Roll determined by MD from the NMR data: DNA bending is largely caused by increased Roll, which is correlated with untwisting [55–57]. The underlying structural reasons for the disturbed Watson-Crick hydrogen bonding in the 5′-C[G*]G case and the flexible bend in the 5′-G[G*]C duplex were revealed by the MD simulations. For 5′-C[G*]G, the bulky amino group on G20 (Figure 3), which is partner to the C on the 5′ side of [G*], is sterically crowded by the B[a]P ring system since both are on the minor groove side, and hence this C5:G20 base pair is episodically denatured (Figure 3(a)). For the 5′-G[G*]C case, the B[a]P rings crowd the G6 amino group, and here the crowding is relieved by the severe untwisting accompanied by the increased Roll, which produces the flexible bend observed by gel electrophoresis.
Investigations with the 5′-I[G*]C sequence context substantiated the critical role of the guanine amino group, since inosine (“I”, Figure 1(b)) lacks this group: the gel electrophoretic manifestation of a flexible bend was abolished. The NMR data showed conformational heterogeneity in minor groove conformations [51], and the MD simulations showed episodic denaturation of one of the two hydrogen bonds at the I:C base pair, explaining the heterogeneity.

The repair efficiencies relative to 5′-C[G*]C-I, the standard sequence utilized in many NMR and NER studies [53, 58], are 4.1 ± 0.2, 1.7 ± 0.2, and 1.3 ± 0.2 for the 5′-C[G*]G, 5′-G[G*]C, and 5′-I[G*]C duplexes, respectively (Figure 4). In the 5′-C[G*]G duplex, dynamic episodic denaturation of Watson-Crick base pairing flanking the lesion on the 5′-side correlates with the greatest NER susceptibility, while the flexible bend in 5′-G[G*]C is a less pronounced NER recognition signal, and the disturbance to one hydrogen bond in the 5′-I[G*]C case provides a still lesser signal [52, 53] in this series.

The 5′-C[G*]C-II and 5′-T[G*]T-II Sequence Contexts. The 5′-C[G*]C-II and 5′-T[G*]T-II sequences (Figure 1(b)) are of unusual interest for several reasons. While a single, well-defined minor groove adduct conformation is observed in 5′-C[G*]C duplexes [50], in the 5′-T[G*]T-II sequence context the minor groove-aligned adduct conformation is heterogeneous [59]. Furthermore, polyacrylamide gel electrophoresis studies showed that the adduct induces a rigid bend in the 5′-C[G*]C-II DNA duplex [60], while in the 5′-T[G*]T-II sequence context the lesion induces a highly flexible bend [59, 60]. Also, the 5′-T[G*]T-II 11-mer duplex has a lower thermal melting point than the 11-mer 5′-C[G*]C-II duplex (the exact difference depends on sequence length) [61], as expected from the thermodynamic properties of T:A and C:G Watson-Crick base pairs [62, 63]. Molecular insights into these experimental observations [64] were provided by MD simulations for the 5′-T[G*]T-II and 5′-C[G*]C-II duplexes. Consistent with the conformational heterogeneity observed in the NMR studies [59], it was found that the 5′-T[G*]T-II duplex is much more dynamic than the 5′-C[G*]C-II duplex. The highly dynamic base pair on the 5′-side of the lesion exhibits episodic denaturation of one of the two Watson-Crick hydrogen bonds, in agreement with the partial rupturing of this base pair observed by the NMR methods [59]. The 5′-T[G*]T-II duplex also shows somewhat increased and more dynamic Roll and untwisting compared to the 5′-C[G*]C-II duplex, consistent with the flexible bend observed only for the 5′-T[G*]T-II case. In addition, the B[a]P ring system exhibits greater mobility and the duplex groove dimensions are more variable.
The differences are accounted for by a coupled series of properties: the intrinsically weaker stacking of T-G compared to C-G steps allows for greater flexibility in the 5′-T[G*]T-II duplex; the weaker T:A pair, with only two hydrogen bonds, compared to the C:G pair, with three bonds, provides enhanced flexibility; moreover, the absence of guanine amino groups adjacent to the [G*] in the 5′-T[G*]T-II case allows for greater mobility of the B[a]P ring system. Overall, the greater flexibility of the 5′-T[G*]T-II sequence reflects these effects acting together: weaker stacking, weaker base pairing, and the absence of adjacent guanine amino groups.

The rates of incision in the human HeLa cell assay relative to 5′-C[G*]C-I are 2.4 ± 0.2 and 1.6 ± 0.2 for the 5′-T[G*]T-II and the 5′-C[G*]C-II duplexes, respectively [53], corresponding to a 1.5 ± 0.2-fold higher repair efficiency for the 5′-T[G*]T-II case relative to 5′-C[G*]C-II. The better repair susceptibility in the 5′-T[G*]T-II case is consistent with the overall enhanced dynamics manifested in various structural properties, notably Watson-Crick hydrogen bonding and bending.
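
The 1.5-fold figure can be checked directly from the two relative incision rates quoted above. The following is a minimal sketch, assuming the two uncertainties of 0.2 are independent and combining them by first-order error propagation for a ratio; the text does not state how its own uncertainty was derived, and the function name `ratio_with_uncertainty` is purely illustrative.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return a/b and its first-order propagated uncertainty.

    Assumes a and b are independent measurements; this is an assumption
    made for illustration, not something stated in the text.
    """
    ratio = a / b
    relative = math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return ratio, ratio * relative


# Relative incision rates (versus 5'-C[G*]C-I) as quoted in the text.
tgt_ii = (2.4, 0.2)  # 5'-T[G*]T-II
cgc_ii = (1.6, 0.2)  # 5'-C[G*]C-II

ratio, sigma = ratio_with_uncertainty(*tgt_ii, *cgc_ii)
print(f"T[G*]T-II / C[G*]C-II = {ratio:.1f} +/- {sigma:.1f}")
```

Under this independence assumption the propagated uncertainty rounds to the same 0.2 that accompanies the 1.5-fold value in the text.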

3. Distant Neighbor Base Sequence Context Affects NER of the 10S (+)-trans-anti-B[a]P-N²-dG Adduct

The 5′-C[G*]C-I and 5′-C[G*]C-II sequences (Figure 1(b)) differ in the sequences beyond the nearest neighbors to [G*].

Since different sequence steps are known to be differentially flexible [57, 65], we hypothesized that the same minor groove lesion [50, 64] with different distant neighbors would be differentially repaired. Polyacrylamide gel electrophoresis and self-ligation circularization experiments revealed that the 5′-C[G*]C-II duplex is more bent and suggested that it has more torsional flexibility than the 5′-C[G*]C-I duplex [66]. Our MD simulations revealed the underlying structural origins of this bending difference. The key role is played by the unique -C3-A4-C5- segment in the 5′-C[G*]C-II duplex. The more torsionally flexible bend observed for the 5′-C[G*]C-II duplex originates from the guanine amino group at the C3:G20 pair (Figure 5). This amino group acts as a wedge to open the minor groove; facilitated by the highly deformable local -C3-A4- base step, the amino group allows the B[a]P ring system to better bury its hydrophobic surface within the groove walls. This produces a yet more enlarged minor groove, which is coupled with more local untwisting and a more enlarged and flexible Roll [67], causing the greater bend in 5′-C[G*]C-II [66] (Figure 5).

The NER efficiencies are 1.6 ± 0.2 times greater in the 5′-C[G*]C-II than in the 5′-C[G*]C-I sequence context [66], showing that distant neighbors to [G*] modulate the NER susceptibility. The greater NER susceptibility of the 5′-C[G*]C-II duplex is explained by its greater bending with enhanced flexibility: the intrinsic minor groove enlargement caused by the guanine amino groups [55, 68] and the great flexibility of pyrimidine-purine steps, including the C-A step [57, 69–72], together allow the B[a]P moiety (Figure 5) to position itself more favorably, but at the expense of the greater bend that makes it more repair-susceptible.

4. Understanding Repairability Differences: the Degree of Local Thermodynamic Destabilization Is a Unifying Hypothesis

We have carried out a series of studies with the same 10S (+)-trans-anti-B[a]P-N²-dG lesion in a number of sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. Additionally, we have considered differences in intrinsic flexibility of sequences flanking the lesion. These are model systems for gaining understanding of NER lesion recognition factors. We have obtained molecular structural data by NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data from human HeLa cell extracts for all of our investigated sequence contexts (Figure 1(b)). Figure 4 summarizes our key findings and enables us to infer a hierarchy of NER recognition signals for the series of sequences and the single lesion we explored. We point out here that a variety of structural disturbances are found in each case, and that these are correlated. Examples include impaired Watson-Crick pairing that is accompanied by diminished base stacking, and DNA bending towards the major groove, which is induced by a minor groove lesion and is accompanied by minor groove enlargement. Our present model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that these can act in concert to provide an enhanced signal: for example, for 5′-T[G*]T-II, one episodically ruptured Watson-Crick hydrogen bond combined with the flexible bend results in better repair than just one disturbed hydrogen bond, as in 5′-I[G*]C, or the flexible bend alone, as in 5′-G[G*]C (Figure 4). For our system, steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups, if present, determines the exact nature of the disturbances, depending on exactly where the guanine amino groups are situated. The intrinsic flexibility of the specific base steps also plays an important role in causing the differential disturbances.
Both the nearest neighbor and the more distant neighbor sequence contexts have an impact.
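
The hierarchy described above can be laid out by collecting the relative NER efficiencies quoted in Sections 2 and 3 alongside the main structural disturbance reported for each duplex. The sketch below is only a bookkeeping aid: the dictionary layout and the short signal labels are illustrative choices rather than part of the original analysis, and the value 1.0 for 5′-C[G*]C-I simply reflects its use as the reference sequence.

```python
# Relative NER efficiencies (reference: 5'-C[G*]C-I = 1.0) and the main
# structural disturbance reported in the text for each sequence context.
# The labels are shorthand introduced for this sketch only.
ner_data = {
    "C[G*]G":    (4.1, "episodic denaturation of the 5' C:G pair"),
    "T[G*]T-II": (2.4, "one ruptured H-bond plus flexible bend"),
    "G[G*]C":    (1.7, "flexible bend"),
    "C[G*]C-II": (1.6, "greater bend with enhanced flexibility"),
    "I[G*]C":    (1.3, "one disturbed H-bond"),
    "C[G*]C-I":  (1.0, "reference duplex"),
}

# Sort from most to least efficiently repaired.
ranked = sorted(ner_data.items(), key=lambda kv: kv[1][0], reverse=True)
for seq, (efficiency, signal) in ranked:
    print(f"{seq:<10} {efficiency:>4.1f}  {signal}")

# Spread between the best and worst repaired contexts.
fold_range = ranked[0][1][0] / ranked[-1][1][0]
```

Sorted this way, the combined signal in 5′-T[G*]T-II sits above either the flexible bend alone or the single disturbed hydrogen bond, and the spread between the extremes corresponds to the roughly four-fold difference noted in the Introduction.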

More globally, different lesions may cause different types of distortions depending on the specific nature of the lesion and its sequence context. However, regardless of exactly what these distortions are, we hypothesize that they must provide a local thermodynamic destabilization signal for repair to ensue, and the greater the extent of destabilization, the better the repair. The destabilization would facilitate the strand separation, base-flipping, and β-hairpin insertion by the XPC/HR23B recognition factor [41, 73] needed to initiate NER. In this way, the NER machinery would excise a large variety of lesions with different efficiencies, by recognizing the thermodynamic impact of the lesions rather than the lesions themselves [24, 29, 41, 73]. Lesions that resist NER present a great hazard, as they survive to the replication step and produce a mutagenic outcome; such NER-resistant lesions provide an important opportunity for gaining further understanding of the mechanism utilized by the NER apparatus to recognize different lesions [74].

Abbreviations

B[a]P: benzo[a]pyrene
B[a]PDE: benzo[a]pyrene diol epoxide
(+)-anti-B[a]PDE: (+)-(7R,8S,9S,10R)-7,8-dihydroxy-9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene
NER: nucleotide excision repair
MD: molecular dynamics

Acknowledgments

The experimental portion of this paper was supported by NIH Grant CA-099194 (Nicholas E. Geacintov), and the computational aspects were supported by Grant CA-28038 (S. Broyde). Partial support for computational infrastructure and systems management was also provided by CA75449 (S. Broyde). Support for this paper to Dinshaw J. Patel was provided by CA-046533. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.