Abstract

DNA adducts, which block replicative DNA polymerases (DNAPs), are often bypassed by lesion-bypass DNAPs, which are mostly in the Y-Family. Y-Family DNAPs can do non-mutagenic or mutagenic dNTP insertion, and understanding this difference is important, because mutations transform normal into tumorigenic cells. Y-Family DNAP architecture that dictates mechanism, as revealed in structural and modeling studies, is considered. Steps from adduct blockage of replicative DNAPs, to bypass by a lesion-bypass DNAP, to resumption of synthesis by a replicative DNAP are described. Catalytic steps and protein conformational changes are considered. One adduct is analyzed in greater detail: the major benzo[a]pyrene adduct, which is bypassed non-mutagenically (dCTP insertion) by Y-Family DNAPs in the IV/κ-class and mutagenically (dATP insertion) by V/η-class Y-Family DNAPs. Important architectural differences between IV/κ-class and V/η-class DNAPs are discussed, including insights gained by analyzing ~400 sequences each for bacterial DNAPs IV and V, along with sequences from eukaryotic DNAPs kappa, eta and iota. The little finger domains of Y-Family DNAPs do not show sequence conservation; however, their structures are remarkably similar due to the presence of a core of hydrophobic amino acids, whose exact identity is less important than the hydrophobic amino acid spacing.

1. Introduction

DNA damaging agents (genotoxins) cause mutations that initiate tumor formation, which makes sense given that tumor cells have mutations in key growth control genes that lead to improperly regulated cell growth [1, 2]. The steps leading to mutagenesis vary depending on the genotoxin, but the paradigm in Figure 1 illustrates many of the typical steps using one particularly well-studied chemical carcinogen, benzo[a]pyrene [3–5]. At the apex of this process are DNA adducts, which, if they are not removed by DNA repair, usually block replicative DNA polymerases (DNAPs). To overcome such potentially lethal blockage, cells have DNAPs that do translesion synthesis (TLS) past these DNA lesions/adducts [6–22].

Cells possess many DNAPs; for example, human cells, yeast (S. cerevisiae) and E. coli have at least fifteen, eight and five, respectively [6–22]. Most TLS-DNAPs are in the Y-Family [6–22], where humans have three template-directed members (hDNAPs κ, η, and ι), yeast has one (scDNAP η), and E. coli has two (ecDNAPs IV and V). Y-Family DNAPs have a conserved 350 aa core, which includes the polymerase active site (representative references [23–40]). As with all DNA polymerases, Y-Family members resemble a right hand with thumb, palm, and fingers domains, although their “stubby” fingers and thumb result in more solvent-accessible surface around the template/dNTP binding pocket [19], undoubtedly to accommodate the bulky and/or deforming DNA adducts/lesions that protrude into these open spaces during bypass. Y-Family DNAPs grip DNA with an additional domain, which is usually called the “little finger domain” or the “polymerase-associated domain” (PAD) [23–25].

Y-Family DNAPs are found in all three domains of life (bacteria, archaea, and eukaryotes), which undoubtedly reflects the fact that all cells face the same issues when confronting the need to replicate past DNA damage. The pattern of TLS is often strikingly similar in different cell types. For example, human DNAP κ was originally discovered because its sequence closely resembles E. coli DNAP IV [41–43], and dNTP insertion opposite a variety of adducts/lesions is remarkably similar for the DNAP IV/κ pair (Table 1), suggesting they are functional orthologs (discussed in [44]). E. coli DNAP V and human DNAP η are also functional orthologs, based on their similarity of dNTP insertion opposite a variety of adducts/lesions (Table 1, [44]). Cases have been made that the IV/κ-class is present in cells to bypass endogenously generated N2-dG adducts, and the V/η-class is present to bypass UV-induced photoproducts, as discussed below.

B-Family DNAPs can also be involved in TLS, such as DNAP II in E. coli and REV3 (the polymerase subunit of DNAP ζ), which is present in most eukaryotes [6, 7, 13–15]. B-Family TLS-DNAPs are involved in a DNA repair process involving some interstrand DNA cross-links [45–49] and in TLS of some adducts/lesions (see below).

Herein, we reflect principally on how the structural architecture of Y-Family DNAPs might affect their mechanism as it relates to cellular function, in particular why lesion bypass is sometimes nonmutagenic and at other times mutagenic. Extensive reviews that focus more on the cell biology, regulation and phenomenology have appeared recently for Y-Family DNAPs from bacteria [21, 22] and eukaryotes [7–12].

2. Translesion Synthesis DNA Polymerases in E. coli

E. coli has proven to be an excellent model system to study many aspects of the bypass of DNA adducts/lesions by TLS-DNAPs. E. coli has two Y-Family DNA polymerases: DNAP IV (dinB gene, 351 aa, 39.5 kDa) and DNAP V, which consists of one subunit of UmuC (umuC gene, 422 aa, 47.7 kDa) and two subunits of UmuD’ (see below). UmuD’ is derived from UmuD (umuD gene, 139 aa, 15 kDa) following autodigestive removal of its 24 N-terminal aa, when stimulated by RecA* [13, 14, 21, 22, 50]. DNAP II (polB gene, 783 aa, 90 kDa) is a B-Family lesion-bypass DNAP. DNAPs II, IV and V are each induced as part of the SOS response, which is triggered by DNA damage and leads to the induction of 40 proteins that help E. coli cope with the damage [50]. The basal and SOS-induced levels are different for each polymerase, where the [uninduced/induced] levels are [40/280] for DNAP II, [250/2500] for DNAP IV and [~15/~200] for DNAP V [51–53]. It seems likely that each of these TLS-DNAPs is present in E. coli principally to overcome the cellular problems presented by a lesion commonly encountered in cells, as discussed next.
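The [uninduced/induced] levels quoted above translate directly into fold-induction values. The short sketch below does that arithmetic; the dictionary name `LEVELS` and the helper names are illustrative only, and the numbers are the approximate figures from the text rather than new data.

```python
# Approximate [uninduced, induced] levels quoted in the text for each TLS-DNAP.
# The values for DNAP V are marked "~" in the text; all are treated as estimates.
LEVELS = {
    "DNAP II": (40, 280),
    "DNAP IV": (250, 2500),
    "DNAP V": (15, 200),
}


def fold_induction(uninduced, induced):
    """Return the ratio of the SOS-induced level to the basal level."""
    if uninduced <= 0:
        raise ValueError("uninduced level must be positive")
    return induced / uninduced


def summarize(levels):
    """Map each polymerase name to its fold induction, rounded to one decimal."""
    return {name: round(fold_induction(u, i), 1) for name, (u, i) in levels.items()}


if __name__ == "__main__":
    for name, fold in summarize(LEVELS).items():
        print(f"{name}: ~{fold}-fold induction")
```

On these figures the three polymerases are induced roughly 7-, 10- and 13-fold, so DNAP V has the lowest absolute level but the largest relative increase.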

Although DNAP V replicates undamaged templates with relatively low fidelity (10⁻³ to 10⁻⁴) [54], one striking quality is its ability to accurately bypass UV photoproducts; for example, it inserts dATP opposite TT-CPDs [54]. Analysis of insertion tendencies opposite a variety of adducts/lesions led to the observation that DNAP V may have two insertion modes: (i) correct dNTP insertion, and (ii) default dATP insertion [44]. UV light is a frequently encountered form of DNA damage for which a TLS-DNAP might be important, and since TT-CPDs are the major UV lesion [55], a default dATP insertion mode might help minimize UV mutagenesis. However, the utilization of this second mode in other circumstances may have drawbacks. For example, UV mutagenesis also depends on the umuD/C genes, implying that DNAP V is required for UV mutagenesis, where C → T mutations in 5′-PyC sequences predominate, which also implies dATP insertion (discussed in reference [56]). DNAP V is involved in other mutagenesis pathways; for example, it inserts dATP opposite +BP in the G → T mutational pathway [57], as discussed below. In fact, the preferential mutagenic insertion of dATP opposite a variety of DNA lesions in E. coli has been called the “A-rule” (see [58, 59] and references therein), and it seems likely that this is attributable to DNAP V’s tendency to insert dATP [44]. Based on lesion-bypass specificity (Table 1), E. coli DNAP V appears to be the functional ortholog of human DNAP η [36], which is almost certainly responsible for correct bypass of UV-lesions in human cells and minimizing UV-light mutagenesis that leads to skin cancer [60–65].

On its own, UmuC, which is the polymerase subunit of DNAP V, either misfolds or aggregates and is found in inclusion bodies [22, 50, 66]. UmuC copurifies with UmuD’, though the yield is invariably low [22, 50, 66]. RecA is also required for efficient DNAP V activity, and recently, the “DNAP V mutasome” was shown to be a UmuC/UmuD’2/RecA heterotetramer [67, 68]. The RecA monomer is added from the 3′-end of a RecA filament either in cis or in trans, where the former seems intuitively more likely, since UmuC/UmuD’2 would encounter a 3′-cis-RecA at a lesion site, given that RecA filaments coat ss-DNA on the downstream side of a lesion-blocked replication fork. To form RecA filaments on ss-DNA, SSB must first be removed, which is accomplished by RecFOR [69]. Interestingly, some evidence suggests that the RecA eukaryotic homolog Rad51 is able to stimulate DNAP η, which is the DNAP V ortholog [70]. The β-clamp also plays a significant role with DNAP V, as discussed below.

DNAP IV replicates undamaged DNA only 5-fold less accurately than the catalytic α-subunit of DNAP III [54]. It is prone to making −1 frameshift mutations in homopolymeric runs of six or more G : C base pairs, and base substitutions also result [22, 71]. DNAP IV’s most striking quality is its ability to accurately bypass a variety of N2-dG adducts [72–77]. Methylglyoxal is produced nonenzymatically from various cellular trioses and forms N2-(1-carboxyethyl)-2′-dG as its major stable adduct, which is bypassed accurately by DNAP IV [76]. Oxidative metabolism forms reactive oxygen species that generate lipid peroxidation products that give exocyclic adducts, some of which can ring-open to N2-dG adducts in ds-DNA [78] and might be bypassed by DNAP IV, though this has not been investigated experimentally. These observations have led several groups to speculate that the cellular rationale for the genesis of the IV/κ-class of Y-Family DNAPs is the accurate bypass of N2-dG adducts derived from various endogenous mechanisms [72, 76].

No analogous story vis-a-vis adducts/lesions has yet emerged to provide a rationale for the presence of B-Family DNAP II in cells, though one possibility is its involvement in an accurate DNA repair pathway for interstrand cross-links [45]. An analogous pathway involving B-Family DNAP ζ exists in eukaryotic cells, and a mechanism for it has been proposed [46–49]. As discussed below, DNAP II functions in other TLS pathways.

UmuD2C (not UmuD’2C) is thought to slow down normal DNA replication in response to DNA damage, thus allowing additional time for lesion removal, which is considered a DNA damage checkpoint analogous to what happens in eukaryotic cells [79]. Another mechanism to accomplish this was recently described: DNAP II or IV can associate with the DnaB helicase and slow down the replication fork [80].

The TLS-DNAPs also confer selective advantage on E. coli during long periods in stationary phase, the so-called “growth advantage in stationary phase” (GASP) phenotype [81]. Finally, DNAP IV is particularly elevated in stationary phase (~7500/cell) and is implicated in adaptive mutagenesis [82].

3. Eukaryotic Y-Family DNAPs

Extensive reviews of the cell biology, regulation, and phenomenology of eukaryotic Y-Family DNAPs have appeared recently [7–12]. Herein, we focus on structural considerations that relate to lesion bypass, though we briefly describe each of the four subclasses of eukaryotic Y-Family DNAPs: REV1, DNAP κ, DNAP η, and DNAP ι.

REV1 is not a traditional template-directed polymerase and does not use base-base hydrogen bonding. Rather REV1 is a dCTP insertase that flips template dGs out of the helix, after which dCTP insertion is directed by hydrogen bonding to a REV1 arginine residue [83]. REV1 seems to play a central role in many lesion bypass events as a structural component, and DNAPs κ, η, and ι each have REV1 binding domains [9].

DNAP κ is the eukaryotic ortholog of DNAP IV (Table 1), and they seem to be present in cells to accurately insert opposite N2-dG adducts [84]; for example, DNAP κ deficient cells are sensitized to killing by benzo[a]pyrene, which predominantly forms N2-dG adducts [85]. DNAP κ uniquely has an N-terminal extension of 100 aa called the “N-clasp” [35]. The N-clasp has three α-helices in a U-shape, one of which (~aa30–50) binds on the surface of the fingers domain, the second of which (aa50–75) links the fingers and thumb domains and lies diagonally across the duplex region of DNA, and the third of which traverses the thumb domain to the usual site of the N-terminus in Y-Family DNAPs. Removal of the N-clasp significantly decreases DNAP κ polymerase activity [35]. The presence of the N-clasp has implications for lesion bypass; for example, DNAP κ does not bypass the N6-dA adduct of benzo[a]pyrene [86], which has been attributed to a steric clash between the N-clasp and the pyrene moiety, as revealed in a molecular modeling study [87]. DNAP κ structure is considered below.

A major role of DNAP η is nonmutagenic bypass of UV lesions, such as TT-CPDs, and humans deficient in DNAP η have the cancer-prone syndrome Xeroderma pigmentosum variant (XPV), which leads to a high incidence of UV-induced skin cancer [60–65]. Both human and yeast DNAP η preferentially insert dATP opposite the 3′-T and 5′-T of a TT-CPD, with misinsertion being higher at the 3′-T, where dGTP is incorporated ~3% of the time [88, 89]. Recently, X-ray structures of a TT-CPD in the active site of yeast DNAP η [39] and human DNAP η [40] have emerged. These findings are presented in a separate section (Section 8), after certain principles about Y-Family DNAP structure have been discussed. DNAP η also plays a role in accurate bypass of the oxidative lesion 8oxoG [90] and adducts formed by the anticancer drug cis-platinum [91].

The role of DNAP ι in cells is more enigmatic, though the fact that deficient cells show enhanced sensitivity to oxidative damage may be revealing [92]. One interesting feature of DNAP ι is its propensity to use syn-purines in the template to form Hoogsteen base pairs syn-A : T and syn-G : C [36, 37]. A cellular rationale for this is the following. Oxidative damage leads to lipid peroxidation products, which form exocyclic adducts that block the Watson-Crick anti-face of the DNA bases. Exocyclic purine adducts in the syn-configuration can still base pair through their Hoogsteen face. The syn-A : T base pair has two hydrogen bonds, just like anti-A : T. However, both syn-G : C and syn-G : T base pairs have only one hydrogen bond, unless a proton is trapped. In a syn-GH+ : C base pair the trapped proton is between N7G and N3C, whose pKa values are relatively high (~3), while a trapped proton in a syn-GH+ : T base pair would be between O6G and O4T, whose pKa values are much lower. Thus, a syn-GH+ : T base pair is expected to be less stable than a syn-GH+ : C base pair. Evidence for DNAP ι using syn-GH+ : C base pairing exists [37]. 1,N6-etheno-A directs both dCTP and dTTP incorporation, and in this case both require a trapped proton [38], where the pKa values of the relevant atoms trapping the proton are more equal. While such thinking is considered satisfying [12], one finding suggests that the situation can be more complex. DNAP ι preferentially incorporates dCTP opposite the major adduct of 2-acetylaminofluorene (AAF-C8-dG) [93]. Syn-AAF-C8-dG places the bulky AAF-moiety in the minor groove, where molecular modeling showed that it does not fit, while anti-AAF-C8-dG : C pairing, which places the AAF-moiety in the spacious major groove side of DNAP ι, is possible [94]. The authors propose that purine adducts with bulk on the minor groove side probably use syn-purine pairing, but that purine adducts with bulk on the major groove side probably use anti-purine pairing. Anti-AAF-C8-dG : C pairing requires a modest change in sugar pucker (from C -endo to C -exo), as noted in modeling studies with both DNAP ι [94] and Dpo4 [95].

Base substitution rates on undamaged templates are relatively high with all of these polymerases: yDNAP η (10⁻²), hDNAP η (~3.5 × 10⁻³), hDNAP κ (~6 × 10⁻⁴), and hDNAP ι actually prefers to form template-dT : dGTP; indel mutation rates are all in the same range (~1–2.4 × 10⁻³) (reviewed in [7]).

4. Y-Family DNAP Mechanistic Steps

A number of comprehensive reviews have appeared that analyze the structures of Y-Family DNAPs [10–12, 32, 33]. In this section, we focus on what is known about protein structural changes that occur during DNA synthesis as probed via X-ray structural analysis and other techniques, principally with Dpo4. The chemistry of catalysis is also considered.

Upon DNA binding to Apo-Dpo4, the thumb/palm/fingers domains do not change their structure dramatically. However, the little finger domain acts like a door, which is open in Apo-Dpo4, and then rotates ~130° to close around DNA; in particular, it binds in the major groove in the duplex region from about L + 3 to L + 8 [33]. This motion is facilitated by the fact that the little finger is connected to the rest of the protein by a simple ten amino acid tether. Once binary-Dpo4 is formed, the palm, fingers and little finger translate ~3.3 Å along the helix as the next template base slides into the active site, which opens the space into which the complementary dNTP binds to give ternary-Dpo4 [32]. The thumb domain, however, does not move in this step, but, rather, moves either before, during, or after the subsequent covalent reaction step. A variety of subtler changes in Dpo4 structure are also reported to accompany these steps [32, 33]. Kinetic studies reveal that Y-Family DNAPs have a rate-determining conformational change before dNTP incorporation [10, 96], and three conformational states of the enzyme have been reported, with one transition between them being rate determining, though the nature of these states has not been identified. Recently, hydrogen-deuterium exchange in tandem with mass spectrometry has been used to study conformational changes in Dpo4 brought about by dNTP binding [30]. Correct dNTP binding affects the structure of a loop between the B-helix and the C-helix above the Dpo4 active site. (The positioning of these features can be inferred from the UmuC(V) sequence in Figure 2.) Another conformational change was also detected in the H-helix, which contacts the primer strand and was proposed to move away from the active site in conjunction with ds-DNA movement to permit room for correct dNTP binding. The F-helix also moves, but this motion is not specific for the correct dNTP. In terms of lesion bypass, Dpo4 showed decreased catalytic efficiency with increasing bulk of N2-dG adducts, which was attributed more to the effects of the bulky lesion on the rate of the catalytic step than on the rate of the conformational steps [31].

Several studies have shown that dNTP incorporation is more dependent on base-base hydrogen bonding for Y-Family DNAPs than for DNAPs in other families during the replication of both undamaged and damaged DNA [97–99].

The steps in covalent catalysis by Dpo4 have been explored using a combination of molecular modeling/dynamics and ab initio QM/MM minimizations; a novel water-mediated and substrate-assisted mechanism was proposed [100]. In the first step, a water molecule in the active site serves as a conduit to deprotonate the primer 3′-OH and protonate an oxygen on the α-phosphate of the dNTP. In the second step, a second water molecule in the active site serves as a conduit to deprotonate the oxygen on the α-phosphate of the dNTP and to protonate an oxygen on the γ-phosphate. Following these two steps, the deprotonated 3′-O of the primer is a stronger nucleophile and attacks the α-phosphate, while the second water molecule serves as a conduit again, this time to deprotonate the γ-phosphate of the dNTP and to protonate the β-phosphate, which is on the pyrophosphate leaving group, thus facilitating its removal.

5. The Steps Leading to Translesion Synthesis in E. coli

A well-developed model for the steps in translesion synthesis has emerged for E. coli [107]. Replicative DNAP III stalls at many adducts. For example, in the case of AAF-C8-dG, the exonuclease activity of DNAP III competes with its polymerase activity, such that the [L − 1] : [L0] ratio of primers is ~10 : 1, as determined in vitro [66]. A TLS-DNAP probably helps dissociate a stalled DNAP III from the lesion site (see below). DNAP III reinitiates replication hundreds to thousands of base pairs downstream of the adduct/lesion at the next primosome assembly site in a process called “replication restart,” either on the lagging strand using the normal lagging strand machinery (i.e., PriA/B/C, DnaB/C/T, and primase), or on the leading strand, for which the details are still being worked out [108, 109]. This leaves an ss-gap between the lesion site and the site where DNAP III restarted replication. This gap is filled either via recombination or via DNA replication, which begins with the action of TLS-DNAPs [15, 108, 109].

DNAP IV binds β-clamp to help release a stalled DNAP III from the same β-clamp, leaving DNAP IV/β-clamp at the site of the lesion [110]. This process is rapid (  s). Presumably, a similar mechanism operates for each TLS-DNAP (II, IV and V), which all have β-clamp binding sites (consensus: QLxLF) that are required for them to be active in E. coli [111]. An X-ray structure shows that amino acids within the sequence QLVLGL at the C-terminus of DNAP IV form the main interactions with a “cleft” in the β-clamp [112]. The α-subunit of DNAP III and the δ-subunit of the γ-complex also bind to the cleft in the β-clamp. DNAP IV and V can also bind to a site in the “rim” of the β-clamp, but this seems unimportant for TLS [113–115]. In vitro studies show that β-clamp stimulates both polymerase activity and processivity of TLS-DNAPs: the addition of β-clamp in vitro increases DNAP IV activity ~2000-fold and processivity from 1 nucleotide to ~400 nucleotides, and also increases DNAP V activity ~100-fold and processivity from 1-2 nucleotides to ~18 nucleotides [22].
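As a small illustration of how the DNAP IV C-terminal sequence relates to the clamp-binding consensus, the sketch below slides the consensus QLxLF along a sequence and counts agreeing positions. It assumes that "x" in the consensus stands for any residue; the function names are hypothetical, and a simple position count says nothing about binding strength.

```python
CONSENSUS = "QLxLF"  # consensus from the text; "x" is assumed to mean any residue


def motif_matches(window, consensus=CONSENSUS):
    """Count positions where `window` agrees with `consensus` ("x" always agrees)."""
    if len(window) != len(consensus):
        raise ValueError("window and consensus must have the same length")
    return sum(1 for w, c in zip(window, consensus) if c == "x" or w == c)


def best_window(sequence, consensus=CONSENSUS):
    """Return (start index, window, score) for the best-scoring window.

    Ties are resolved in favour of the earliest window.
    """
    k = len(consensus)
    if len(sequence) < k:
        raise ValueError("sequence is shorter than the consensus")
    starts = range(len(sequence) - k + 1)
    best = max(starts, key=lambda i: motif_matches(sequence[i:i + k], consensus))
    window = sequence[best:best + k]
    return best, window, motif_matches(window, consensus)


if __name__ == "__main__":
    # C-terminal residues of DNAP IV as quoted in the text.
    print(best_window("QLVLGL"))
```

For QLVLGL the best window is QLVLG, which agrees with the consensus at four of five positions and differs only at the final position (G rather than F).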

What factors affect the choice about which TLS-DNAP will insert opposite a particular lesion? Several lines of evidence suggest that E. coli has a hierarchy for the replication of normal, unadducted DNA when DNAP III is inactivated: DNAP II > IV > V [116]. (The assays did not permit an assessment of DNAP I.) Since this order (III > II > IV > V) does not reflect the relative concentration of these DNAPs in cells (see above), another mechanism for decision making was suggested, such as relative DNAP affinity for the β-clamp. This order does reflect the relative fidelity of these DNAPs and would be a sensible order for E. coli to allow TLS-DNAPs to initially sample adducts/lesions prior to a decision about which will do TLS. But the ultimate decision is probably predominantly controlled by which TLS-DNAP is most efficient at bypassing a particular adduct/lesion biochemically.

After insertion opposite the lesion, additional extension synthesis by a TLS-DNAP is required, or else DNAP III’s proofreading exonuclease activity will remove the inserted nucleotides back to the site of the lesion [66, 117]. The amount of extension required before DNAP III can resume normal synthesis appears to be pathway dependent, where it is [L + 4] for the AAF-C8-dG nonmutagenic pathway with DNAP V, and [L + 3] for the AAF-C8-dG −2 frameshift pathway with DNAP II [66, 83, 117].

6. Two Case Studies Showing the Interplay of Y-Family DNAPs in Translesion Synthesis

More is known about the details of TLS for the major adduct of N-2-acetylaminofluorene (AAF-C8-dG) and for N2-dG adducts in E. coli than for any other adducts/lesions in any other model system. In these cases, multiple translesion DNAPs are involved in both the nonmutagenic and mutagenic pathways, as outlined in this section.

AAF was originally developed as a potential pesticide, but it was abandoned when it was found to be a potent rat carcinogen [118]. Following activation, AAF principally binds at C8-dG, as do most aromatic amine mutagens/carcinogens, and AAF and AAF-C8-dG have frequently been used as models to probe the mutagenic and carcinogenic mechanisms of aromatic amines [117]. In E. coli AAF has a major mutational hot spot in 5′-CG1CG2 sequences in which it induces −2 frameshift mutations [117]. AAF-C8-dG at G2 (but not G1) causes a −2 frameshift mutation in a DNAP II-dependent process, or causes no mutation in a DNAP V-dependent process [117]. The current model is that AAF-C8-dG at a replication fork exists in two different conformations [66, 117]. In one conformation, the adducted dG moiety is in a −2 slipped intermediate, which DNAP II uses for insertion, and then in the presence of β-clamp an additional three extension steps (to L + 3) are accomplished, at which point replication can be successfully continued by DNAP III [66]. From a nonslipped intermediate, DNAP V inserts dCTP opposite AAF-C8-dG and then extends by adding four more dNTPs (to L + 4), after which DNAP III can successfully continue replication [66]. These two pathways are followed approximately equally in cells, though by manipulating the concentration of DNAP II versus DNAP V, the ratio [−2 frameshift : no mutation] can be modulated, suggesting that the two conformations interconvert [93]. In vitro, in 5′-CG1CG2 sequences, DNAP II also does TLS to give a bypass product that should ultimately yield a −1 frameshift mutation, which is not, however, observed in vivo; recent in vitro studies suggest that DNAP II cannot extend far enough from the −1 frameshift intermediate, and, thus, the exonuclease activity of DNAP III degrades the intermediates in the −1 frameshift pathway [66].

Molecular modeling has provided insights about how lesion bypass might occur; for example, a modest alteration in sugar pucker (from C -endo to C -exo) is required before AAF-C8-dG can Watson-Crick base pair with dCTP [94, 119]. Though this work was done in Dpo4 and hDNAP ι, there is every reason to think that a similar conclusion would be reached for DNAP V. Recently, X-ray structures of the corresponding deacetylated adduct AF-C8-dG have been reported [32]. Regrettably, the structures do not reveal insights about how dCTP might be inserted opposite AF-C8-dG, but they do offer a glimpse of more-or-less normal Watson-Crick AF-C8-dG : dC base pairing in the L + 1 position, which has the AF moiety in the opening on the major groove side of Dpo4, and in the L + 2 position, in which the AF-moiety is accommodated by a modest rearrangement in the little finger domain.

Benzo[a]pyrene (B[a]P) is a well-studied DNA damaging agent that is a potent mutagen/carcinogen and an example of a polycyclic aromatic hydrocarbon (PAH), a class of ubiquitous environmental substances produced by incomplete combustion [120, 121]. PAHs in general and B[a]P in particular induce the kinds of mutations thought to be relevant to carcinogenesis and may be important in human cancer [122–128]. B[a]P mutational spectra were established with the major metabolite that reacts with DNA (i.e., (+)-anti-B[a]PDE), in E. coli [129], yeast [130, 131] and mammalian (CHO) cells [132]. Mutagenesis has also been studied with [+ta]-B[a]P-N2-dG (+BP, Figure 1), the major adduct of (+)-anti-B[a]PDE, and G → T mutations predominate in most cases (see [133] and references therein).

DNAPs IV and V of E. coli are both involved in TLS with B[a]P-N2-dG adducts, although they play very different roles. In studies with purified proteins, DNAP IV inserted dCTP (99%) opposite both +BP and its mirror image −BP ([-ta]-B[a]P-N2-dG) in a 5′-CGA sequence, while DNAP V inserted dATP (99%) [77]. This tendency is evident in E. coli. DNAP IV is required in the nonmutagenic pathway with +BP [72–75], −BP [75] and other N2-dG adducts [72, 76]. An amino acid change (F12I) at the conserved “steric gate” (which excludes rNTPs) decreases dCTP insertion in vitro opposite several N2-dG adducts and similarly decreases TLS in vivo, which argues that DNAP IV does dCTP insertion in vivo [72]. In the nonmutagenic pathway DNAP V is required in addition to DNAP IV with +BP [73–75]. Why are two DNAPs required for nonmutagenic TLS with +BP? Certain lesions need one DNAP for insertion and a second for extension [134, 135]. Thus, if DNAP IV does dCTP insertion [72–77], then DNAP V must do extension, which is sensible given that kinetic findings with purified proteins show that DNAP V can be significantly better than DNAP IV at the step directly following adduct-G : C formation (i.e., extension) in the case of +BP compared to −BP (discussed in greater detail in reference [75]). Regarding the nonmutagenic pathway with −BP, only DNAP IV is required for efficient TLS [75], suggesting it does both insertion and extension. In a 5′-TGT sequence, DNAP V is required in the G → T pathway for +BP, while DNAPs II and IV are not, implying that DNAP V must do insertion and extension [57]. However, in a 5′-GGA sequence, G → T mutations were shown not to depend on DNAP V and were not enhanced by SOS induction, which implies no lesion-bypass DNAP involvement and led the authors to propose that DNAP III was involved in dATP insertion opposite +BP [73, 74]. Random mutagenesis studies with [+anti]-B[a]PDE also showed the existence of a non-SOS-inducible G → T pathway (discussed in [57]), though the major G → T pathway did require SOS-induction, implying involvement of a lesion-bypass DNAP.

7. Architecture of Y-Family DNAPs

Table 1 [44] shows that dNTP insertion opposite a variety of adducts/lesions, including +BP, is remarkably similar for the DNAP IV and DNAP κ pair, suggesting they are functional orthologs. Insertion is also remarkably similar for the DNAP V and DNAP η pair, suggesting they are also functional orthologs. There must be structural reasons for the insertion preferences of these DNAPs, though the key elements are not obvious, given that in alignments, for example, UmuC(V) shares only 20% amino acid identity with its functional ortholog hDNAP η, which is about the same as the 21% identity that it shares with hDNAP κ, which is not its functional ortholog [44]. The extent of this dilemma is further revealed by the fact that hDNAP η is no more identical to scDNAP η (24%) than it is to hDNAP κ (24%). Nevertheless, a careful examination of Y-Family DNAP structure suggests that key structural features do exist.
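Percent identity figures such as those quoted above come from pairwise alignments. The sketch below shows one common way to compute such a figure, counting identical residues over the columns where neither sequence has a gap. That choice of denominator is an assumption (alignment tools differ, and the text does not say which convention was used), and the two short sequences are invented purely for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two pre-aligned sequences.

    Assumption: columns in which either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real polymerase sequences.
    seq_a = "MKV-LDAGY"
    seq_b = "MRVALD-GF"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

In the toy alignment, seven columns are ungapped and five of them are identical, which is why two proteins can share a fold and a function while scoring only 20–24% on this measure when most positions have diverged.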

A variety of architectural features are revealed by considering how B[a]P-N2-dG adducts must sit in the active sites of Y-Family DNAPs and how these structures might relate to adduct processing [136]. To form an adduct-dG : dCTP base pair, the B[a]P moiety must be in the developing minor groove, since the adduction site (N2-dG) is in the minor groove in a Watson-Crick base pair. On the minor groove side, Y-Family DNAPs have an opening (or gap) next to the active site between the fingers and little finger domains. This opening looks like an elliptical hole of varying sizes in Dpo4 [24–32], Dbh [23], hDNAP ι [36–38] and in models of DNAP IV and UmuC(V) [44], while it looks like a slot in hDNAP κ [35]. It is not unreasonable to think that the size and shape of this opening might influence dNTP insertional mechanism given that the bulky B[a]P moiety must interact with this opening on the minor groove side.

The character of this opening can be analyzed based on a simple analogy to a “chimney.” Three regions of the protein contribute to the chimney as shown in Figure 3(a) for our model of DNAP IV: an upper lip (aa33–36, turquoise) and a left lip (aa73–76, blue), which are in the fingers domain, while the lower lip (blue, aa244–247), is in the little finger domain [136]. The UmuC(V) chimney is shown in Figure 3(b).

Two features control the size and character of this opening. (1) The amino acid side chains in the upper lip (aa33–36 in DNAP IV) can be thought of as a “flue,” which either plugs the chimney, leaving a small opening, or does not plug the chimney, leaving a large opening. The flue amino acids are present in all Y-Family DNAPs. (2) A “cap” may lie over the top of the chimney opening. The cap is formed by an insert of amino acids in the left lip of the chimney and is only present in DNAP η [39, 40].

First, we consider how the “flue” amino acid side chains influence the character of the “chimney.” Why do we think that the chimney is a key structural feature that is likely to be important for protein function? If the chimney is important, then evidence for its importance should exist, even in the case of UmuC where no X-ray structure exists. We aligned 408 UmuC(V) sequences, and Figure 2 shows the total number of each of the twenty amino acids that are found at each aa position. Positions with 90% aa homology or with clusters of high homology are highlighted in pink and red. Many of the first ~20 aa show high homology, including the presumptive catalytically essential aspartate (D6), and the steric gate (Y11). The region around the catalytically essential aspartate/glutamate pair (D101/E102) is also highly conserved. These and other regions that are conserved in all Y-Family DNAPs are highlighted in pink in Figure 2. Regions conserved in UmuC(V), but not in other Y-Family DNAPs, are highlighted in red in Figure 2. One such conserved region is V29-C36, which is part of a loop that includes one edge of the chimney lip (S31-D34). (This loop is discussed at greater length in Section 7.1.) A second conserved region is S71-Y77, which includes the second lip of the chimney, as well as other features discussed below. This conservation is strong evidence that the nature of the chimney opening is important. The third edge of the chimney (E255-T258) is in the little finger domain, and in our model of UmuC the third lip is farther from the active site and appears less likely to impinge on adducts protruding from the minor groove. Consistent with this view, the third lip of the chimney is less well conserved. Preliminary analysis of the chimney lips of large collections of DNAP IV, κ, and η sequences also reveals considerable amino acid conservation of the chimney lips.
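The per-position tally described for Figure 2 can be sketched as follows: count each residue in every alignment column, then flag columns where the most common residue reaches a threshold fraction. The 0.9 cutoff mirrors the 90% figure in the text but is applied here as "at least 90%", which is an assumption; the ten four-residue sequences are a toy alignment, not UmuC(V) data.

```python
from collections import Counter


def column_counts(alignment):
    """Return one Counter per alignment column, tallying the residues found there."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(column) for column in zip(*alignment)]


def conserved_positions(alignment, threshold=0.9, gap="-"):
    """List (position, residue, fraction) for columns meeting the threshold.

    Positions are 1-based. The fraction is computed over all sequences,
    so gaps count against conservation.
    """
    total = len(alignment)
    conserved = []
    for position, counts in enumerate(column_counts(alignment), start=1):
        residues = [(res, n) for res, n in counts.items() if res != gap]
        residue, count = max(residues, key=lambda rc: rc[1], default=(None, 0))
        if residue is not None and count / total >= threshold:
            conserved.append((position, residue, count / total))
    return conserved


if __name__ == "__main__":
    # Hypothetical toy alignment for illustration only.
    toy = ["MDLS", "MDLT", "MDLS", "MDVS", "MDLS",
           "MDLA", "MDLS", "MDLS", "MDLS", "MDIS"]
    print(conserved_positions(toy))
```

In the toy alignment only the first two columns are flagged; the third and fourth columns have a dominant residue in 8 of 10 sequences and fall below the cutoff.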

7.1. Structural Basis for a Large versus a Small Chimney Opening

DNAP IV has a large chimney opening (Figure 3(a)), which can accommodate the pyrene, thus allowing +BP to readily pair with dCTP when dCTP adopts the canonical shape observed in all other families of DNAPs [136, 137]. In contrast, UmuC(V) has a small chimney opening (Figure 3(b)), which forces +BP downward in the active site into a position where catalysis seems unlikely to be facile [136, 137]. What structural differences between DNAP IV and UmuC(V) might result in a large versus a small chimney opening, and are these differences conserved in other Y-Family DNAPs in the IV/κ-class versus the V/η-class?

The chimney upper lip (turquoise, Figure 3(a)) is closest to the active site and principally defines whether the chimney can accommodate the bulky B[a]P moiety. The first amino acid in the upper lip of E. coli DNAP IV is glycine (G32). We have collected 434 DNAP IV sequences from the literature, and 418 have glycine at this position. Furthermore, 13/13 DNAP κ proteins from different species have glycine at this position. The one X-ray structure for the IV/κ-class is hDNAP κ [29], which shows that this glycine (G131, turquoise, Figure 3(c)) is followed by upward curvature of the chimney upper lip (red arrow, Figure 3(c)/left). This glycine can be thought of as a “flue-handle” whose φ/ψ-angles permit this upward curvature (see below), with the consequence being that the R-groups on the next several amino acids (the “flue”; S132/R133, blue in Figure 3(c)/middle) point away from the chimney opening, which remains open. Our models of DNAP IV also have this upward curvature (Figure 3(a)) with an open flue, which depends on the analogous glycine flue-handle (G32).

In contrast, leucine (L30 in Figure 2) is the flue-handle in UmuC(V) in 370/408 cases. Furthermore, 11/11 DNAP η proteins from different species have a bulky valine at the flue-handle position. The X-ray structure of scDNAP η [25], which is in the V/η-class, shows that its bulky V54 flue-handle (turquoise, Figure 3(d)/left) is associated with downward curvature of the chimney upper lip (red arrow), which forces the “flue” (Q55/Y56, blue in Figure 3(d)/middle) to plug the chimney. Figure 3(e) shows the upward curvature of the upper lip of hDNAP κ (yellow) superimposed on the downward curvature for scDNAP η (green). In UmuC(V) the sequence is slightly different (VLSN), though the outcome is the same: the bulky L30 flue-handle causes downward curvature, and an asparagine (N32) plugs the chimney, giving a closed flue (Figure 3(b)).

Upward versus downward curvature of the chimney upper lip can be traced to the φ/ψ-angles adopted by the flue-handle [136]. The φ/ψ-angles for the nonglycine flue-handles in scDNAP η, hDNAP η, hDNAP ι, UmuC(V), and Dpo4 are all similar, resulting in downward curvature of the chimney’s upper lip, causing the flue to plug the chimney and the chimney opening to be small. In contrast, glycine has greater flexibility in its φ/ψ-angles compared to all other amino acids, and the glycine flue-handles in hDNAP κ and DNAP IV adopt φ/ψ-angles unique to glycine that allow upward curvature of the chimney’s upper lip, which keeps the nearby flue amino acids away from the chimney opening.
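For readers who want to measure backbone angles of a flue-handle in a structure or model, the following Python sketch shows how a dihedral angle is obtained from four atomic positions. It uses the standard definitions (φ from C(i−1)–N–Cα–C, ψ from N–Cα–C–N(i+1)); the function names are hypothetical, the test points are simple geometric placeholders rather than coordinates from any structure discussed here, and nothing below reproduces the analysis in [136].

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(prev_c, n, ca, c, next_n):
    """Backbone (phi, psi) for one residue from five backbone atom positions."""
    return dihedral_deg(prev_c, n, ca, c), dihedral_deg(n, ca, c, next_n)


# Placeholder geometry only: a quarter turn about the z axis gives 90 degrees.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the backbone atoms flanking a flue-handle residue, `phi_psi` would give the pair of angles that is compared across structures; which structures and residues to use is left to the reader.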

7.2. Roof-aa and Roof-Neighbor-aa

Another key difference between the IV/κ-class and the V/η-class is the bulk of the roof-aa (pink in Figures 3(c)/right and 3(d)/right), which is a positionally conserved residue that lies above the nucleobase of the dNTP, as seen in the active site of Dpo4 [24–32], yDNAP η [34], hDNAP ι [36–38], hDNAP κ [35], and hDNAP η [39, 40]. Isoleucine is the dominant roof-aa in UmuC(V) (227/408), with valine (156/408) being the next most prevalent aa (Figure 2). In fact, compared to E. coli wt-UmuC (100%), the mutant I38V-UmuC (137%) is slightly more active in the nonmutagenic pathway with +BP, while amino acids that do not branch at the β-carbon, including leucine, show much lower activity [138]. Immediately after the roof-aa in UmuC(V), principally alanine is found (346/408). In the case of DNAP η from different species, the [roof-aa/next-aa] is [I/A] in 10/11 cases. In the yDNAP η X-ray structure, [I60/A61] form a hydrophobic layer above the nucleobase of the dNTP.

In the collection of 434 DNAP IV sequences, there is more variability at the equivalent [roof-aa/next-aa] positions: [S/T] is preferred (238/434, 59%), though any of the nonbulky amino acids S, A, or T can be found at both the roof-aa (434/434) and the next-aa (406/434). For DNAP κ, the roof position is also principally S, A, or T (10/13) and the next amino acid is always threonine (13/13). In X-ray structures, the threonine methyl group in Dpo4 (T45) and in hDNAP κ (T138) sits near the roof-aa (A44 and S137, resp.), and the hydroxyl of the threonine forms a hydrogen bond with a nonbonded oxygen on Pβ of the dNTP.

When the [roof-aa/next-aa] pair in wt-UmuC(V) was mutated from [I38/A39] to the single mutants [I38A/A39] or [I38/A39T], polymerase activity declined significantly; however, the double mutant [I38A/A39T] is nearly as active as wild type UmuC(V) [138]. I38A/A39T-UmuC has the same sequence at these positions as wt-Dpo4 (A44/T45), which is in the IV/κ-class. These findings show that the coupled identity of the [roof-aa/next-aa] pair is important for activity.

7.3. The Interconnected Architecture of the Chimney and Roof Regions

To understand the interconnected architecture of the chimney/roof regions of Y-Family DNAPs, it is useful to focus on a bulky aliphatic amino acid, which is the highly conserved V29 (374/408) in our collection of UmuC(V) sequences. (In the equivalent position, all 434 DNAP IV sequences have either valine or isoleucine; valine is present in 10/11 DNAP η sequences and in 13/13 DNAP κ sequences.) This amino acid plays a scaffolding role as revealed in X-ray structures [23–38] and in models [44, 136, 138]. Using hDNAP κ [35] as an example, this scaffolding valine (V130, white in Figure 3(c)/right) is the beginning of a loop that ends with the roof-aa, and the two form a backbone hydrogen bond (scaffold-C=O : HN-roof). This backbone hydrogen bond is also observed in X-ray structures from Dpo4 [24–32], scDNAP η [35], hDNAP κ [35], and hDNAP ι [36–38]. The scaffold-V130 also contacts the flue-handle (G131) and L136 (Figure 3(c)/right, gray). Thus, the base of this loop is anchored by a square of four amino acids (V130/G131/L136/S137). In scDNAP η, this region looks similar with C53/V54/I59/I60 (Figure 3(d)/right). The square includes I31/G32/I41/S42 in DNAP IV, and V29/L30/V37/I38 in UmuC(V). Evidence suggests that V29 and I38 are likely to be in contact in UmuC(V) [138].

Scaffold-V130 in hDNAP κ (white, Figure 3(c)/right) also helps organize the steric gate (Y12, red), which face-stacks with Y174 (brown), a highly conserved tyrosine whose other aromatic face contacts the backbone of the left lip of the chimney (i.e., aa168–171 in hDNAP κ), thus helping to orient it. A tyrosine is found at this position in 406/408 UmuC(V) sequences, in 432/434 DNAP IV sequences, in 11/11 DNAP η sequences, and in 13/13 DNAP κ sequences.

7.4. The Chimney “Cap”

The interconnection between the roof and chimney regions is similar in scDNAP η (Figure 3(d)/right). However, the left chimney lip has an insert (aa93–127), which is not shown in Figure 3(c) but is indicated by a circle. In spite of this insert/loop, the chimney left lip of scDNAP η resembles the left lip of hDNAP κ (Figure 3(c)/right) and of other Y-Family DNAPs. This insert serves as a “cap” over the chimney opening, such that the chimney is completely closed. DNAP η always has a cap, though its size varies (e.g., aa81–87 in hDNAP η). Speculation about a role for the DNAP η chimney cap is in the next section.

8. DNAP η Structures with TT-CPDs

Recently, X-ray structures of yeast DNAP η [39] and human DNAP η [40] with a TT-CPD were published, and remarkable insights have emerged. Two template bases are in the active site, with the base on the 3′-side base pairing with the dNTP. When the 3′-T of the TT-CPD interacts with dATP, the 5′-T of the TT-CPD is also in the active site, and when the 5′-T of the TT-CPD is interacting with dATP, then the normal base on its 5′-side is in the active site. Undamaged DNA appears similar. The T-bases of a TT-CPD lie at an angle of ~30° with respect to each other and lack the usual twist between the base pairs; however, the impact of these distortions is minimized by the protein, such that the TT-CPD looks remarkably similar to a normal pair of adjacent thymines. Watson-Crick pairing is observed between the dATP and each T-base of the TT-CPD.

dATP adopts the canonical chair-like shape found in all families of DNAPs (see Section 10), though the shape is nuanced; for example, the angle of the A-base is tilted slightly downward compared to other dNTPs in Y-Family DNAPs in order to pair with the -T of the TT-CPD. In hDNAP η, the guanidinium of R61 interacts with phosphate-oxygens on both the - and -positions of dATP, which is a unique interaction among Y-Family DNAPs. Interestingly, the equivalent R73 in yDNAP η is flexible and can be in this position, or it can face the opposite direction and pair with the extra template base in the active site; that is, the base not paired with the dNTP. This arginine is one of the most conserved amino acids in DNAP η, though the R73A mutation in yDNAP η retains normal kinetics with respect to both undamaged and damaged DNA, which suggests that its most important role is not being revealed in studies with a TT-CPD. In UmuC(V), which is the bacterial ortholog of DNAP η, methionine (M51) is usually at the equivalent position, though arginine or lysine is frequently present (42/408) (Figure 2). The other highly conserved amino acid in DNAP η is a glutamine (Q38 in hDNAP η and Q55 in yDNAP η), which sits in the minor groove and interacts with both O2-positions on the T-bases of the TT-CPD.

Why is DNAP κ inactive on CPDs, while DNAP η is active? A number of structural elements no doubt contribute, but one important feature is that M135 in DNAP κ, which lies two positions before the roof-aa, is too bulky to accommodate both template-Ts of the TT-CPD. In the equivalent position, hDNAP η has a glycine (G46) and yDNAP η has a serine (S58), whose smaller size permits a TT-CPD in the active site. Thus, DNAP κ may have an amino acid (i.e., M135) to minimize its activity on substrates meant for the V/η-class of DNAPs. Similarly, one of the functions of the flue and the cap for V/η-class DNAPs may be to minimize their activity on adducts that protrude into the minor groove, which are substrates for IV/κ-class DNAPs.

9. Architecture of the Y-Family Little Finger Domain

Y-Family DNAPs show considerable amino acid homology in the thumb/palm/fingers domains (approximately aa1–230), which makes alignment in this region, including for UmuC(V), unambiguous [44]. However, alignment of the little finger domain is more problematic. X-ray structures exist for the little finger domain of seven Y-Family DNAPs, and their fundamental structure is similar. They show a conserved secondary structure of β1-α1-β2-β3-α2-β4, where the four β-strands are aligned and antiparallel, while the two α-helices are aligned, antiparallel, and cross diagonally over the β-strands. In spite of this structural conservation, standard sequence alignment algorithms (e.g., ClustalW or MUSCLE) do not correctly align the little finger domains of these seven proteins.

Figure 4 shows the correct alignment based on the X-ray structures (as described in the legend), including the little finger domain of DNAP IV [112]. What features of these sequences allow the structures to be conserved, even though their primary amino acid sequences are not conserved? An inspection of the X-ray structures reveals that little finger domains are held together by a core of about twenty-one hydrophobic residues, which are highlighted in turquoise in the alignment in Figure 4 (H1–H21) and are shown in a Dpo4 structure (Figure 5). Though hydrophobicity is conserved at these twenty-one positions, the exact amino acid is not. Dpo4, Dbh, DNAP κ, DNAP IV, DNAP ι, yDNAP η, and hDNAP η have 19, 19, 17, 20, 20, 20, and 20 hydrophobic residues, respectively, at these 21 positions (Figure 4).
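The idea that hydrophobic spacing, rather than residue identity, is what is conserved can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the set of residues treated as hydrophobic (the text does not list which residues it counts), the function names, the two toy peptides, and the example core columns, none of which come from Figure 4.

```python
# Assumed hydrophobic set; the source does not specify its classification.
HYDROPHOBIC = frozenset("AVLIMFWYC")


def hydrophobic_pattern(seq):
    """Reduce a sequence to H (hydrophobic) / P (other, including gaps)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in seq.upper())


def core_hydrophobic_count(aligned_seq, core_columns):
    """Count hydrophobic residues at the given 0-based alignment columns."""
    return sum(1 for i in core_columns if aligned_seq[i].upper() in HYDROPHOBIC)


def identical_positions(seq_a, seq_b):
    """Number of aligned positions carrying the same residue."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


# Hypothetical toy peptides, NOT taken from any little finger domain.
seq_a = "VKLTEF"
seq_b = "IRMSDW"

print(identical_positions(seq_a, seq_b))    # no shared residues
print(hydrophobic_pattern(seq_a))           # same H/P spacing ...
print(hydrophobic_pattern(seq_b))           # ... as this one
print(core_hydrophobic_count(seq_a, [0, 2, 5]))
```

The two toy peptides share no residues yet reduce to the same H/P pattern, which is the kind of signal an identity-based aligner can miss. Counting hydrophobic residues at a fixed list of core columns mirrors the per-protein tallies quoted above, given a structure-based alignment to supply the columns.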

A comparison of these seven proteins reveals that the little finger domain has thirteen positions where an amino acid side-chain can interact with a phosphate-oxygen. Nine positions have a consensus lysine, arginine, asparagine, or glutamine, which can interact with a phosphate-oxygen in the DNA backbone; they are designated L1–L9 in the alignment in Figure 4 (red) to indicate that their R-groups are “long.” They are also shown in the Dpo4 structure in Figure 5 (red). Four positions have a consensus serine or threonine that can interact with a phosphate-oxygen; they are designated S1–S4 in Figure 4 (pink) to indicate that their R-groups are “short.” They are also shown for Dpo4 in Figure 5 (pink).

In terms of DNA interactions, there are some nuances. In several cases amino acids with longer R-groups can also serve at the S1–S4 positions (e.g., K301 in Dbh). Regarding S3, S297/Dpo4, S297/Dbh and S359/DNAP clearly interact with the P + 6 phosphate-oxygen; however, T469/DNAP κ looks like a rotation would be required for it to interact properly, though it was counted as a positive. Q296 in DNAP IV might be able to interact with P + 5, though DNA is not present for definitive assessment and it is noncanonical, so it is not counted. R285/DNAP IV and R283/Dbh (instead of N340) might interact with P + 8, though DNA is not present for definitive assessment and it is non-canonical, so neither is counted.

Of the thirteen sites that can interact with phosphate-oxygens (i.e., L1–L9 and S1–S4), Dpo4 and Dbh have an appropriate amino acid at 13/13 sites and 11/13 sites, respectively. In contrast, hDNAP κ, DNAP IV, hDNAP ι, yDNAP η, and hDNAP η have an appropriate amino acid at 9, 8, 7, 8, and 7 sites, respectively. The higher level of conformity for Dpo4 and Dbh undoubtedly reflects the need for more interactions with DNA given that they are from thermophilic archaea and must operate at elevated temperatures. The similar number of residues (~8) for the other DNAPs probably reflects that they operate at a similar but lower temperature (i.e., ), and if they had more interactions they might bind DNA too tightly. We note that an increase in hydrophobic residues in the hydrophobic core was not expected for Dpo4 and Dbh, because hydrophobic interactions strengthen as temperature increases.

10. How Y-Family Architecture Influences dCTP versus dATP Insertion Opposite B[a]P Adducts

DNAP IV can pair dCTP with the dG moiety of +BP, importantly because the bulky pyrene can be accommodated in DNAP IV’s large chimney opening (Figure 3(a)). For phosphoester bond formation to occur, the distance between primer-O3′ and Pα-dCTP must be reaction-ready and can be compared to the closest possible distance, a van der Waals’ contact (~3.5 Å). In models of +BP in DNAP IV [136], the distance between primer-O3′ and Pα-dCTP was ~3.7 Å, which approximates a van der Waals’ contact, and, thus, can be thought of as being “reaction-ready.” The no-adduct control had a similar primer-O3′ and Pα-dCTP distance (~3.7 Å).

In contrast, UmuC(V) does not give a satisfactory structure when +BP is paired with dCTP (Figure 3(b)/center), because UmuC(V)’s small chimney opening forces the bulky pyrene moiety downward. Asparagine-32 is the main problem, and its side chain plugs the UmuC(V) chimney, leading to a clash with +BP. In the unadducted structure (Figure 3(b)/left), N32 adopts its lowest energy rotational conformer with respect to the Cα–Cβ bond. The presence of +BP leads to a rotation about the C –C bond (Figure 3(b)/center); however, no other rotation can get N32 any farther out of the way. Consequently, UmuC(V)’s small chimney forces the +BP-dG in the template and its paired dCTP to move downward such that the primer-O3′/dCTP-Pα distance is elongated to ~5.0 Å, which is a nonreaction-ready distance.

These observations provide a reasonable rationale for why DNAP IV preferentially inserts dCTP in cells: DNAP IV’s large chimney opening permits a reasonable adduct-dG : dCTP structure with reaction-ready distances between primer-O3′ and Pα-dCTP.

If UmuC(V)’s small chimney enforces a nonreaction-ready primer-O3′/dNTP-Pα distance, then how could UmuC(V) insert any dNTP opposite +BP? Recently, we offered a hypothesis [136].

X-ray structures from all DNAP families show a canonical dNTP shape that has been called “chair-like”; its structure from T7 DNA polymerase is shown in Figure 6 (blue insert). This shape is also observed in most of the X-ray structures of Y-Family DNAPs, including Dpo4 (“S1-dNTP shape,” Figure 6, green). However, a second noncanonical “goat-tail-like” shape (“S2-dNTP,” Figure 6, yellow) has also been observed [26]. The goat-tail-like S2-dNTP shape can lie lower down in the active site, and +BP paired with dCTP in the S2-dNTP shape allows the pyrene to lie comfortably under the small chimney opening of UmuC(V), which allows the primer-O3′/dNTP-Pα distance to be reaction-ready (~3.8 Å).

Interestingly, the S2-dATP shape allows the syn-conformation, such that syn-dATP can pair with adduct-dG via a Hoogsteen base pair. In contrast, the syn-adenine base in the S1-dNTP shape has steric clashes with atoms in the deoxyribose and the -phosphate. Adduct-dG : syn-dATP pairing in UmuC(V) gave reasonable structures with primer-O3′/dNTP-Pα distances of ~3.6 Å [136]. The S2-dNTP shape appears to have accompanying protein components that should allow phosphoester bond making and breaking [136, 138].
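The modeled primer-O3′/dNTP-Pα distances quoted in this section can be collected and compared against the van der Waals’ contact in a short Python sketch. The distances and the ~3.5 Å contact come from the text; the 4.0 Å cutoff, the labels, and the function names are hypothetical choices made only for this illustration, since the text gives no numerical cutoff for “reaction-ready.”

```python
VDW_CONTACT_A = 3.5    # approximate van der Waals' contact quoted in the text
READY_CUTOFF_A = 4.0   # hypothetical cutoff, chosen only for this sketch

# Modeled primer-O3'/dNTP-P-alpha distances (in angstroms) quoted in the text.
MODELED_DISTANCES_A = {
    "DNAP IV, +BP : dCTP": 3.7,
    "DNAP IV, no adduct : dCTP": 3.7,
    "UmuC(V), +BP : dCTP (canonical shape)": 5.0,
    "UmuC(V), +BP : S2-dCTP": 3.8,
    "UmuC(V), +BP : syn-dATP (S2 shape)": 3.6,
}


def is_reaction_ready(distance_a, cutoff_a=READY_CUTOFF_A):
    """True when the distance is at or below the assumed cutoff."""
    return distance_a <= cutoff_a


def excess_over_contact(distance_a):
    """How far (in angstroms) a distance exceeds the van der Waals' contact."""
    return round(distance_a - VDW_CONTACT_A, 2)


for label, dist in MODELED_DISTANCES_A.items():
    status = "reaction-ready" if is_reaction_ready(dist) else "not reaction-ready"
    print(f"{label}: {dist:.1f} A (+{excess_over_contact(dist):.1f} A) -> {status}")
```

Under this assumed cutoff, only the canonical dCTP pairing in UmuC(V) falls outside the reaction-ready range, matching the qualitative argument above. A different cutoff between 3.8 and 5.0 Å would give the same classification for these five values.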

Other G : A mispairings are also possible. In principle, anti-dATP can pair with syn-guanine in adduct-dG, which requires the pyrene moiety to be in the major groove. Anti-dATP can also pair with anti-guanine in adduct-dG in an elongated mispair. Thus, there are scenarios other than the one involving our syn-dATP : anti-adduct-dG hypothesis.

11. Unusual Architectural Features of Dpo4

Dpo4 is by far the best studied Y-Family DNAP, both structurally and biochemically. Based on biochemical and X-ray findings [28], Dpo4 insertion opposite +BP was proposed to follow a “dislocation” or “templated” pathway. Dislocation/templated insertion (see [135, 136] and references therein) involves DNAP stalling at an adduct, followed by slippage to the next 5′-base along the template, which directs incorporation (e.g., dATP insertion opposite the 5′-T in a 5′-TG sequence context), whereupon the newly incorporated dA slips back to form an adduct-G : A mispair, from which extension yields the mispair that ultimately gives a G→T mutation. Dpo4 preferentially inserted dCTP, dTTP, dATP, and dGTP opposite +BP in 5′-GG, 5′-AG, 5′-TG, and 5′-CG sequences, respectively [28], which is consistent with a dislocation/templated mechanism.

Though the dislocation/templated mechanism is attractive for Dpo4, considerable evidence both in vitro and in vivo suggests that neither DNAP IV nor DNAP V follows a dislocation/templated mechanism with +BP, as discussed extensively in [136] and references therein.

Why might Dpo4 be different from DNAPs IV and V? Dpo4 is in the IV/κ-class, and it has a nonbulky roof-aa and roof-neighbor-aa [A44/T45], as expected for the IV/κ-class. However, Dpo4 has a very small chimney opening (discussed below), which is associated with the V/η-class. Thus, Dpo4 is a hybrid with a roof similar to the IV/κ-class and a chimney similar to the V/η-class.

Dpo4 has a small chimney opening, because its bulky flue-handle (C31) causes downward curvature of the chimney upper lip and leads to a closed flue (V32) [136]. In fact, Dpo4’s chimney is exceedingly blocked: (1) the V32 flue is inserted deeper into the chimney than, for example, the N32 flue of UmuC(V), and (2) M76, which is the second amino acid in Dpo4’s left lip, also plugs the chimney. DNAP IV and UmuC(V) have nonbulky G74 and S72, respectively, in the position equivalent to M76 in Dpo4. Thus, the excessively plugged chimney of Dpo4 forces the pyrene moiety of +BP so far from the active site that base pairing via either S1-dCTP or S2-dCTP is impossible; consequently, both the pyrene and the dG moieties of +BP are forced to be extrahelical, with the consequence being that pairing cannot occur with its complementary dC [28].

As mentioned above, DNAPs κ, IV, ι, and η have an appropriate amino acid at 9, 8, 7, and 8, respectively, of the thirteen sites that can interact with phosphate-oxygens (L1–L9 and S1–S4). In contrast, Dpo4 and Dbh conform at 13/13 and 11/13 sites, respectively, which undoubtedly reflects the need for more interactions with DNA given that they are from thermophilic archaea and must operate at an elevated temperature. Thus, Dpo4 studied at , which is typical, may give results that do not reflect correctly on aspects of the mechanism of other Y-Family DNAPs, which have evolved to operate at .

This analysis suggests reasons for caution when applying conclusions from Dpo4 to other Y-Family DNAPs, especially those purely in the IV/κ-class or the V/η-class. Perhaps Dpo4 evolved its hybrid roof/chimney structure to bypass a unique set of lesions encountered by a thermophilic archaeon. Alternatively, perhaps the structure of Dpo4 at physiologically relevant elevated temperatures is different from its structure at the temperatures at which it was crystallized (room temperature) and assayed ( ), and this affects its behavior.

12. Structure of B-Family Lesion-Bypass DNAPs

This paper has focused on Y-Family DNAPs, though some lesion-bypass DNAPs are in the B-Family, including DNAP II in E. coli and REV3 of DNAP ζ in many eukaryotic cells. DNAP II inserts and extends the −2 frameshift intermediate of AAF-C8-dG [66, 117], which must have two looped out nucleotides as well as the AAF moiety protruding into the major groove. Data also suggest that DNAP II and DNAP ζ are involved in the bypass of interstrand crosslinks [45–49], which must have a large oligonucleotide protruding into the major groove during TLS. Though B-Family DNAPs completely surround DNA, the structures on the minor and major groove sides are very different, as revealed in structures of both E. coli DNAP II [139] and Rb69 DNAP [140]. B-Family DNAPs have a helical protein component that follows and contacts the minor groove side of duplex DNA. On the major groove side, however, a protein dome is present that leaves a large open cavity. Though Y-Family DNAPs are open to solvent on their major groove side, the solvent-exposed DNA surface inside the cavity for DNAP II (~400 Å², when considering, for example, the template : dNTP base pair plus the L + 1 base pair) is actually larger than with either DNAP IV (~230 Å²) or UmuC(V) (~130 Å²). It seems likely that the large cavity and solvent-exposed region on the major groove side of B-Family DNAPs may be essential for their ability to accomplish TLS on lesions having bulky protrusions into the major groove.

Abbreviations

B[a]P: Benzo[a]pyrene
+BP: [+ta]-B[a]P-N2-dG (Figure 1)
−BP: [−ta]-B[a]P-N2-dG
TT-CPD: Thymine-thymine cyclobutane pyrimidine dimer
TLS: Translesion synthesis, which includes the insertion of a base opposite a DNA adduct, as well as subsequent elongation
DNAP: DNA polymerase
S1-dNTP: The “chair-like” dNTP shape
S2-dNTP: The “goat-tail-like” dNTP shape
aa: Amino acid.