Abstract

In addition to the canonical B-form structure first described by Watson and Crick, DNA can adopt a number of alternative structures. These non-B-form DNA secondary structures form spontaneously on tracts of repeat sequences that are abundant in genomes. In addition, structured forms of DNA with intrastrand pairing may arise on single-stranded DNA produced transiently during various cellular processes. Such secondary structures have a range of biological functions but also induce genetic instability. Increasing evidence suggests that genomic instabilities induced by non-B DNA secondary structures result in predisposition to diseases. Secondary DNA structures also represent a new class of molecular targets for DNA-interactive compounds that might be useful for targeting telomeres and transcriptional control. The equilibrium between the duplex DNA and formation of multistranded non-B-form structures is partly dependent upon the helicases that unwind (resolve) these alternate DNA structures. With special focus on tetraplex, triplex, and cruciform, this paper summarizes the incidence of non-B DNA structures and their association with genomic instability and emphasizes the roles of RecQ-like DNA helicases in genome maintenance by resolution of DNA secondary structures. In future, RecQ helicases are anticipated to be additional molecular targets for cancer chemotherapeutics.

1. Introduction

Structure of the right-handed B-form DNA has been known since 1953 [1]. Instead of being a conformationally homogenous molecule, DNA has the capability of adopting several types of conformations as dictated by its sequence [2]. As early as 1957, association of ribonucleic poly-A and poly-U polymers into three-stranded complexes was revealed using sedimentation coefficient and optical absorption measurements [3]. It was later shown by atomic resolution single-crystal X-ray diffraction analysis that the DNA hexamer d(CpGpCpGpCpG) forms a left-handed conformation (Z-DNA) with altered helical parameters relative to the right-handed B-form [4]. This was followed by the identification of cruciform structures formed by inverted repeats [5, 6]. Finally, guanine-rich motifs in DNA were discovered to form parallel four-stranded complexes called tetraplex, G-quadruplex, or G4 DNA [7]. More than ten different DNA conformations have now been discovered [8], and these are often referred to as secondary structures, alternative DNA, or non-B DNA. A non-B database has been developed for prediction of alternative DNA structures including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats, and their associated subsets of cruciforms, triplex, and slipped structures, respectively [9].

Non-B DNA structures are functional genomic elements that play a variety of roles in the cell [10]. These include gene function and regulation [11], immune response [12], telomere maintenance [13], recombination [14], antigenic variation in human pathogens [15], and the generation of genomic diversity [16]. The DNA secondary structures are suggested to be involved in regulation at both transcriptional and translational levels; however, when the subtle balance between the replication, transcriptional, and repair machinery is impaired, these secondary structures may induce genetic instability. Alternate structure-forming sequences are known to be unstable and represent hotspots for deletion or recombination in bacteria, yeast, and mammals [17–20]. This genetic instability has generally been related to DNA replication because non-B structures cause DNA polymerase pausing in vitro and replication fork pausing in vivo [21]. Slow replication was observed in an inverted repeat sequence in Escherichia coli [22], and inverted repeats lead to deletions or chromosomal rearrangements more frequently in yeast that are deficient in DNA polymerase activity [23, 24]. Slow progression of the replication fork could facilitate formation of secondary structures at long tracts of single-stranded DNA in the lagging-strand template [25]. These secondary structures pose an obstacle to replication fork progression, causing fork arrest and/or collapse and ultimately leading to double-strand breaks (DSBs) and genome rearrangements [26, 27]. The formation of alternative DNA structures can also activate nucleotide excision and SOS pathways resulting in segments of single-stranded DNA (ssDNA) [28]. Such ssDNA regions can be converted to DSBs during replication and lead to mutations through mechanisms such as homologous recombination or nonhomologous end joining [29]. Conditions that favor the structural transitions from B-DNA to non-B DNA lead to genetic instability in model systems [27]. Alternative structure-mediated mutagenesis has been implicated in the incidence of gross rearrangements and deletions as well as point mutations [30–32]. There is significant circumstantial evidence for the involvement of DNA secondary structures in association with genetic instability leading to human disease [33, 34].

The vastness of mutagenic capability would be predicted to reduce the prevalence of secondary structure-forming tracts in genomes; however they are abundant and often enriched in the regulatory regions of genes [9]. With such an array of challenging sequence elements, it is evident that cells have developed the capacity for controlling the potential of these sequences for genome destabilization. Among several elucidated mechanisms to resolve secondary DNA structures, RecQ helicases represent an important class of enzymatic activities that are utilized to counteract such challenge to genomic stability.

2. RecQ Helicases

The RecQ family represents one of the most highly conserved groups of DNA helicases [35–39]. Bacteria and budding yeast have one RecQ homolog, RecQ and SGS1, respectively. The RecQ helicase family has 5 homologs in the human genome: RECQ1, WRN, BLM, RecQ4, and RecQ5β (Figure 1). RecQ helicases share a centrally located helicase domain that couples nucleotide hydrolysis to DNA unwinding and defines the RecQ family. Many, but not all, RecQ helicases contain additional conserved RQC (RecQ C-terminal) and HRDC (helicase and RNaseD C-terminal) domains, which are implicated in protein interactions [40, 41] and DNA binding [42]. Eukaryotic RecQ helicases have, in addition, further N- and C-terminal extensions that are involved in protein-protein interactions and postulated to lend unique functional characteristics to each helicase [43, 44]. Certain RecQ homologues also have strongly acidic regions that have been shown to mediate interaction with single-strand DNA-binding proteins such as RPA [37]. Furthermore, a nuclear localization signal (NLS) has been identified for several RecQ proteins (Figure 1) [37].

RecQ helicases unwind duplex DNA with 3′-5′ polarity in a reaction that is dependent on NTP hydrolysis [37]. Curiously, several RecQ proteins have also been shown to promote annealing of complementary strands in a reaction that is inhibited by the presence of ATP [37]. It was confirmed that ATP binding to the protein modulates the oligomeric state of RECQ1 and regulates these apparently conflicting biochemical activities [45, 46].

RecQ helicases are remarkable among all DNA helicases for two primary reasons. First, in addition to unwinding duplex DNA, they are capable of unwinding a variety of DNA substrates containing noncanonical structures including forked duplexes, displacement loops (D-loops; an intermediate in homologous recombination reactions), triple helices, 3- or 4-way junctions, and G-quadruplex DNA [37, 47]. In fact, in many instances, they prefer these substrates to standard duplex DNA. Second, germline mutations in three human RecQ helicase homologs WRN, BLM, and RECQ4, which are located on chromosomes 8p12, 15q26.1, and 8q24.3, respectively, give rise to the rare genetic disorders Werner, Bloom, and Rothmund-Thomson/RAPADILINO/Baller-Gerold syndromes, respectively, all of which are characterized by chromosomal instability and predisposition to cancer [48–51]. Distinct clinical features of these disorders indicate that human RecQ homologs perform unique cellular functions. Cellular studies point to a critical requirement for RECQ1 and RecQ5β [52, 53], but defects in these have not yet been associated with a human disease. As reviewed in the next sections, collective biochemical, cellular, and genetic findings signify a pivotal role of RecQ helicases in resolution of non-B DNA structures and genome maintenance.

3. Prevalence, Consequences, and Unraveling of Non-B Secondary DNA Structures

Genomic maintenance entails highly regulated interaction of intrinsic factors such as the nature of sequence or the action of DNA replication and repair proteins and extrinsic factors such as environmental mutagens. Repetitive sequences in the genomes have the propensity to form complex secondary structures which could lead to diverse types of genomic instability. One of the common mechanisms of alternative structure-induced instability is obstruction of replication fork progression leading to fork stalling and/or collapse [54]. RecQ helicases are proposed as genome caretakers and guardians of DNA replication forks [55]. The following sections summarize impact of certain specific non-B DNA structures on genomic stability and review the roles of RecQ helicases in resolving these structures.

3.1. Cruciforms
3.1.1. Cruciform DNA Structures and Their Occurrence in Genomes

A cruciform structure is formed by intrastrand base pairing of inverted repeat sequences and is characterized by the presence of a four-way junction in which two of the branches are hairpin structures formed on each strand of the inverted repeat [5] (Figure 2(a)). The bases located between the inverted repeats do not self-pair and instead form the apical loops of the hairpins; however the overall structure is stabilized by the free energy of negative supercoiling [56]. Cruciform is structurally similar to a Holliday Junction (HJ) recombination intermediate [57]. In fact, cruciform structures formed by extrusion of an inverted repeat sequence in supercoiled plasmids have been extensively used to study the mechanistic properties of HJ-resolving enzymes [58, 59].
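The sequence requirement described above (two arms that are reverse complements of each other, separated by a short spacer that forms the apical loop) can be illustrated with a minimal Python sketch. The function name, the arm length, and the maximum spacer length are hypothetical choices made for illustration only; they are not parameters taken from this review or from any published prediction tool, and a sequence match says nothing about whether extrusion actually occurs under a given superhelical density.

```python
# Illustrative sketch: scan a DNA string for perfect inverted repeats,
# i.e. candidate cruciform-forming sites. arm_len and max_spacer are
# arbitrary assumptions, not values from the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int = 6, max_spacer: int = 4):
    """Return (start, arm, spacer) tuples for every position where an arm of
    arm_len bases is followed, after 0..max_spacer bases, by its reverse
    complement on the same strand."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        arm = seq[start:start + arm_len]
        target = reverse_complement(arm)
        for spacer in range(max_spacer + 1):
            right = start + arm_len + spacer
            # Slicing past the end yields a shorter string, which never matches.
            if seq[right:right + arm_len] == target:
                hits.append((start, arm, seq[start + arm_len:right]))
    return hits
```

For example, the toy sequence `AAACGGTCTTTGACCGTAA` contains the arm `ACGGTC`, a three-base spacer `TTT`, and the reverse-complement arm `GACCGT`; the spacer bases would correspond to the unpaired apical loop of each hairpin.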

The existence of cruciforms has been demonstrated in vitro [60] and in vivo [61–63]. Cruciform structures have been reported in the genome of E. coli [64] as well as mammalian cells [63]. Cruciform-forming inverted repeat sequences have been found at the operator and transcription termination regions [65], as well as at the replication origin region [63, 66]. The distribution of such sequences often overlaps with chromosomal regions prone to gross rearrangements [67]. Because cruciform structures are energetically unfavorable, they are thought to form only transiently in vivo rather than as stable structures. The action of cellular factors such as junction-specific nucleases, binding proteins, and DNA helicases is suggested to affect the equilibrium and the rate of formation of cruciform structures in vivo [68].

3.1.2. Cruciforms and Genomic Instability

Palindromes, a specific type of inverted repeat separated by only very few base pairs, are poorly tolerated in E. coli cells and are underrepresented in the S. cerevisiae and human genomes [69, 70] presumably due to their tendency to form hairpin and cruciform structures, which could be recognized and cleaved by a nuclease [71] or could affect or slow down DNA replication [72]. Cruciform structures are found in mutagenic hotspots, and their presence has been suggested to be etiologic in causing rearrangements and chromosomal instability in humans [2, 73, 74]. The AT-rich palindromic repeats involved in the recurrent t(11;22) constitutional translocation favor adopting a cruciform structure in vitro and involve frequent DSBs [75, 76]. Direct Alu repeats artificially inserted in an inverted orientation in the yeast genome undergo DSBs and enter a break-fusion cycle resulting in dicentric chromosomes [77]. It has been proposed that the break might be caused by a cruciform-specific resolution activity similar to HJ resolvase, which generally is thought to act on intermediates produced through homologous recombination in the repair of DSBs or stalled replication [67]. However, recent studies show that nearby inverted repeats in budding and fission yeasts recombine spontaneously and frequently to form dicentric and acentric chromosomes independent of DSB formation, possibly by a replication mechanism involving template switching [78, 79].

3.1.3. Cruciform Resolvases

Four-way DNA joint molecules, termed HJs, are key intermediates in recombination [80]. Proteins with the enzymatic ability to cleave synthetic HJs in vitro have been termed HJ “resolvases,” and these DNA junction-resolving enzymes exhibit considerable selectivity for the structure of their substrates [81]. E. coli RuvC and its associated proteins RuvA and RuvB constitute the archetypal resolvase system [82]. RuvC is a dimeric protein that promotes HJ resolution by introducing a pair of symmetrically related nicks in two diametrically opposed strands across the junction point [82]. The ongoing search for the eukaryotic equivalent of the bacterial RuvC HJ resolvase has led to the discovery of a number of DNA endonucleases, including Mus81-Mms4/EME1 [83], Slx1-Slx4/BTBD12/MUS312 [84–86], XPF-ERCC1 [87], and Yen1/GEN1 [88, 89]. Furthermore, MUS81-EME1 also forms part of a larger nuclease complex containing SLX1-SLX4 and XPF-ERCC1, raising the possibility that these nucleases cooperate to process HJs [81]. Thus, it appears that eukaryotes possess alternative, and mechanistically varied, ways to process HJs, perhaps reflecting the critical importance of this step for cell viability and mutation avoidance.

3.1.4. Metabolism of Cruciform-Like Structures by RecQ Helicases

Cellular and biochemical studies have established that RecQ helicases are vital in the metabolism of cruciform-like structures [90]. Loss of SGS1 in S. cerevisiae results in the accumulation of HR-dependent replication intermediates that resemble HJs [91]. In humans, DNA processing defects during replication and/or recombination have been suggested to contribute to the molecular pathology of Werner and Bloom syndromes [92]. Werner syndrome cells fail to resolve recombination intermediates [93], and the expression of wild-type WRN protein or RusA, a bacterial enzyme that cleaves four-way junctions, was both shown to rescue the WRN recombination defect and to improve cell survival following DNA damage [94]. Cytogenetic phenotype of elevated sister chromatid exchanges (SCEs) in Bloom syndrome cells has suggested hyperrecombination or aberrant resolution of DNA recombination intermediates in the absence of active BLM protein [95]. Thus, a potential role for both WRN and BLM would be to prevent DNA structures including HJ that arise at blocked or collapsed replication forks from being processed into mature recombinants [55].

Indeed, RecQ helicases preferentially resolve 4-way HJs which are formally analogous to cruciform structures [37]. Several RecQ helicase proteins, including human BLM, WRN, RECQ5, and RECQ1, and the yeast homolog SGS1 were shown to selectively bind HJ structures and to promote ATP-dependent branch migration in vitro [44, 45, 96–99]. Furthermore, BLM [97], WRN [96], RECQ5 [44], and RECQ1 [100] are capable of branch-migrating HJs over several kilobases which is remarkable given that these helicases normally display poor processivity in the absence of RPA [37]. The bacterial HJ core recognition protein RuvA inhibits HJ branch migration by BLM, WRN, RECQ1 or RECQ5β suggesting that these RecQ helicases specifically recognize the HJ core where they initiate unwinding [37]. It is conceivable that RecQ helicases may promote branch migration with a mechanism similar to the oligomeric RuvAB branch migration motor wherein two RuvA tetramers bind the junction and promote the loading of two RuvB hexamers on the two arms of the junction [101]. This notion is supported by the fact that BLM, WRN, and RECQ1 form oligomeric structures in solution [102]. The WRN protein binds HJ as an oligomer [103], and an N-terminal fragment of BLM is known to form hexamers and dodecamers [104]. Remarkably, the N-terminal region (residues 1–56) of RECQ1 was found to be essential for both oligomerization and HJ resolution activity [105].

Slipped strand hairpins or cruciforms with single-stranded regions are also formed at tracts of trinucleotide repeats during replication and stall replication forks [10]. A structure-specific nuclease, Flap endonuclease 1 (FEN-1), has been implicated in resolution of such structures [54, 106, 107]. The nuclease activity of FEN-1 is robustly stimulated by physical interaction with WRN [108, 109]. WRN and FEN-1 directly interact at the sites of arrested replication forks suggesting that the formation of a functional FEN-1/WRN complex is important for resolving stalled DNA replication forks [110]. Cleavage of an HJ intermediate of fork regression by FEN-1 requires WRN branch fork migration [110]; WRN helicase activity initiating from the HJ core provides a suitable DNA molecule with a free 5′ ssDNA end on which FEN-1 can load to ultimately catalyze structure-specific cleavage of the unwound 5′ ssDNA arm [111]. However, stimulation of FEN-1 is mediated by direct protein-protein interaction and does not require WRN catalytic activity [108, 112]. In fact, expression of a conserved noncatalytic C-terminal domain of WRN that is necessary and sufficient for the physical and functional interaction with FEN-1 is sufficient to rescue the yeast dna2-1 mutant phenotypes [113]. The conserved C-terminal domain in BLM was subsequently also found to mediate a physical and functional interaction with FEN-1 [109], and the phenotypes of yeast dna2 mutants can be rescued by expression of BLM [114]. Importantly, BLM stimulated FEN-1 cleavage of foldback flaps, bubbles, or triplet repeats in a helicase-dependent manner [115, 116]. Thus, WRN and BLM helicases likely act as very effective removers of structures that inhibit FEN-1 and thereby prevent duplications, expansions, and other genome disruptions [111].

Cruciform metabolism is also among critical functions of certain RecQ family helicases that are mediated by the species-specific interaction with topoisomerase III homologs [90]; the Rmi1 protein serves as an additional component of the heterotrimeric functional complex [117]. The major role of the concerted helicase-topoisomerase complex is to catenate or decatenate dsDNA, resulting in the resolution or “dissolution” of double HJs [118, 119]. Double HJs are shown to exist in vivo and are thought to arise when both ends of a DSB invade a homologous sequence at the final steps of homologous recombination [120, 121]. Human and Drosophila BLM proteins, but not other RecQ helicases, together with topoisomerase IIIα, have the ability to catalyze double HJ dissolution on model DNA substrates in a reaction that requires BLM-mediated ATP hydrolysis and the active-site tyrosine residue of topoisomerase IIIα [118, 122]. BLAP75/RMI1 promotes the BLM-dependent dissolution reaction by recruiting topoisomerase IIIα to the double HJ [123, 124]. Notably, this functional interaction is highly specific, as the BLAP75-topoisomerase IIIα pair has no effect on either WRN or E. coli RecQ helicase activity, and E. coli Top3 cannot substitute for topoisomerase IIIα in the enhancement of the BLM helicase activity [125]. Dissolution of dHJs in S. cerevisiae is performed by the SGS1-Top3-Rmi1 complex [126]. Recent data are consistent with SGS1 and Top3 acting together in vivo because cells lacking SGS1 or Top3 exhibited persistent HJ-containing DNA structures following exposure to DNA damage [127].

Collectively, RecQ helicases constitute a remarkable group of enzymes that promote resolution of HJs via nonresolvase mechanisms, and this is believed to be one of their critical functions in genome maintenance.

3.2. Triplex
3.2.1. Triplex Structures and Their Occurrence in Genomes

Naturally occurring homopurine/homopyrimidine sequences can fold into triplex configuration by binding a third strand of DNA or RNA in the major groove of Watson-Crick duplex DNA through Hoogsteen or reversed Hoogsteen hydrogen bonds [128] (Figure 2(c)). Intermolecular triplexes are formed when the triplex-forming strand originates from a second DNA molecule, for example, triplex-forming oligonucleotides (TFOs) [128] (Figure 2(b)). Intramolecular triplexes are the major elements of H-DNAs in which the third strand is provided by one of the strands of the same duplex DNA molecule at homopurine:homopyrimidine sequences with mirror symmetry [129]. Unfavourable charge repulsion between the three negatively charged DNA strands contributes to the low stability of triplexes under physiological conditions. At physiological pH, triplex formation usually involves a purine-rich third strand that is antiparallel to the complementary strand and is stabilized by negative supercoiling, modification with phosphorothioate groups, or polyvalent cations such as Mg2+ or polyamines such as spermine and spermidine [129].
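As a rough illustration of the sequence features named above for H-DNA (a homopurine or homopyrimidine tract with mirror symmetry), the following Python sketch lists candidate tracts in a single strand. The function name and the minimum tract length are assumptions introduced here, and requiring the whole tract to be a perfect mirror repeat is a deliberate simplification; real H-DNA-forming sequences may tolerate interruptions and imperfect symmetry, and formation also depends on supercoiling, pH, and cations as described in the text.

```python
import re


def find_hdna_candidates(seq: str, min_len: int = 10):
    """Return (start, tract, is_mirror) for each maximal homopurine (A/G) or
    homopyrimidine (C/T) run of at least min_len bases.

    is_mirror is True when the tract reads the same left-to-right and
    right-to-left on the same strand (a perfect mirror repeat).
    min_len is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    candidates = []
    for match in pattern.finditer(seq):
        tract = match.group()
        candidates.append((match.start(), tract, tract == tract[::-1]))
    return candidates
```

Note that mirror symmetry is tested on one strand only, which distinguishes it from the reverse-complement symmetry of the inverted repeats that form cruciforms.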

Triplexes have been shown to exist in chromosomes and nuclei, and the existence of H-DNA structures has been evidenced both in vitro and in vivo [130]. Triplex formation in vivo is supported by the identification of mammalian proteins that bind specifically to them [131, 132] and to the polypyrimidine [133] and polypurine single strands [18]. H-DNA conformations have been identified in vivo by using triplex-specific monoclonal antibodies [134, 135] and fluorescent “in situ nondenaturing” hybridization [136]. The presence of an H-DNA conformation in vitro has been demonstrated in constructs containing the sequences of interest from E. coli and mammalian genomic DNA or by using chemicals that modify nucleotides specifically in single-stranded DNA or double-stranded DNA [137, 138]. The sequence-specific DNA recognition and binding characteristics of synthetic TFOs have been extensively studied because of their potential applications in genome modification and therapy [139]. Most annotated genes in both the mouse and human genomes are predicted to contain at least one unique potential TFO binding site [140]. Similarly, naturally occurring sequences capable of adopting H-DNA structures are very abundant in mammalian cells (~1 in every 50,000 bp in humans) [129, 141]. The majority of polypurine/polypyrimidine sequences are located in introns, promoters, and 5′ or 3′ untranslated regions and are enriched in genes involved in cell signaling and cell communication [142]. Importantly, H-DNA structure-forming sequences are found flanking protooncogenes [143, 144].

3.2.2. Triplex and Genomic Instability

Naturally occurring triplexes are sources of genomic instability, and TFOs can induce targeted mutagenesis, recombination, or DNA repair, and can inhibit proliferation and induce apoptosis in cultured cells [14, 31]. Genomic instability of human DNA sequences that can form triplexes is associated with the etiology of several diseases including neurological disorders [34]. For example, the triplex-forming potential of a repeat has been correlated with the genomic instability and reduced frataxin gene expression in Friedreich’s ataxia, a triplet repeat disorder [145]. The repeated sequence was shown to inhibit DNA polymerization in vitro and progression of replication forks in vivo suggesting that the triplex formation by the Friedreich’s ataxia repeat inhibits DNA replication [146, 147]. In addition to posing a block to replication progression [148], naturally occurring triplex-forming sequences have been shown to interfere with transcription [144]. Many breakpoints on the translocated c-myc gene in Burkitt's lymphoma and t(12;15) BALB/c plasmacytomas are clustered around the H-DNA-forming sequences in the promoter regions [149]. Indeed, the naturally occurring H-DNA structure-forming sequence from the human c-MYC gene was shown to induce DSBs within these sequences in mammalian cells [32] and cause genomic instability in mice [149]. Collectively, these studies imply that the triplex structures result in fragile sites or mutation hotspots causing DSBs and subsequent translocation of the gene [129]. As part of the mechanisms whereby cells prevent the deleterious effects of alternate DNA structures, triplex formation in vivo is likely to be at least partly inhibited by destabilizing proteins or helicases.

3.2.3. Triplex Unwinding by RecQ Helicases

Triplex-forming sequences have been demonstrated to block replication in vitro [148]. Purified recombinant E. coli RecQ protein partially alleviates triplex formation and facilitates fork progression through triplex-forming DNA in vitro; loss of RecQ significantly increases the mutations caused by triplex-forming DNA in vivo [150]. RecQ-deficient E. coli utilize RecG helicase for fork regression upon encountering triplex structures and thereby restart replication [150]. A RecG equivalent in humans has not yet been identified, but human BLM and WRN helicases can unwind a DNA triple helix [151] and also catalyze fork regression [152]. In vitro studies with triple helices formed by a pyrimidine motif third strand demonstrated that WRN and BLM catalyze triplex unwinding in a nucleoside triphosphate hydrolysis-dependent reaction and require a free 3′-ssDNA overhang attached to the third strand [151]. Triplex unwinding by BLM and WRN does not require a single strand:double strand fork at the junction of the third strand and the triplex [151] indicating that the unwinding is promoted by inherent structural elements of the triplex substrate; this might also be facilitated by the oligomeric structures of these helicases [102]. More recently it was demonstrated that DHX9 (nuclear DNA helicase II (NDH II) or RNA helicase A (RHA)) protein, a superfamily 2 helicase, preferentially unwinds intermolecular triplex DNA substrates in vitro with a specific 3′-5′ polarity with respect to the displaced third strand (similar to BLM and WRN helicase); this activity required a 3′-ssDNA overhang on the third strand and was dependent on ATP hydrolysis [153]. In contrast, a 5′-ssDNA overhang on the triplex-forming oligonucleotide is required to unwind the third strand of the triplex structure by FANCJ, a superfamily 2 helicase that unwinds DNA in the 5′-3′ direction [154]. This is consistent with the observations that FANCJ requires a preexisting 5′-ssDNA to unwind conventional B-form duplex DNA substrates [155] and G-quadruplex DNA substrates [156]. Triplex unwinding by other human RecQ proteins has not been reported yet; however, examination of individual RecQ homologs might reveal differential preference for non-B DNA structures [105].

Identification and characterization of triplex unwinding helicases have signified the critical importance of triplex resolution for genomic stability. This is highlighted by the fact that mutations in WRN, BLM, FANCJ lead to genomic instability and certain cancer in humans [157], whereas homozygous DHX9 knockout mice are embryonic lethal [158]. Cellular DNA metabolic processes involve transient formation of ssDNA which can possibly interact with other strands to form secondary and tertiary structures [159]. If triplexes are not resolved, they can potentially interfere with processes such as DNA replication, recombination, and repair [129]. Further investigations to uncover the roles of helicases in resolving triplex DNA structures are necessary for understanding the cellular mechanism(s) for genome stability maintenance.

3.3. G4 DNA
3.3.1. G4 DNA Structures and Their Occurrence in Genomes

G-quadruplexes form in vitro in guanine-rich sequences that contain four tracts of at least three guanines separated by other bases and are stabilized by G-quartets [160]. The G-quartets arise from the association of four guanines into a cyclic Hoogsteen hydrogen-bonding arrangement in which each guanine base makes two hydrogen bonds with its neighbor using different hydrogen-bonding positions to the canonical Watson-Crick base pairing. The planar G-quartets stack on top of each other, giving rise to four-stranded helical structures (Figure 2(d)). These structures, called G-tetraplex, G-quadruplex, or G4 DNA, may involve intramolecular or intermolecular interactions, and the phosphodiester backbones of the four participating strands may be in parallel or antiparallel orientation [161] (Figures 2(e) and 2(f)). The formation of G4 structures is strongly dependent on monovalent cations such as K+ and Na+; hence, physiological buffer conditions favor their formation, and it has been suggested that G4 DNA may be routinely assembled and disassembled within cells [162].
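The sequence rule stated above (four tracts of at least three guanines separated by other bases) can be expressed as a regular expression. The Python sketch below is one possible rendering, not a validated predictor: the loop length bounds of 1 to 7 nucleotides are an assumption borrowed from a commonly used folding rule rather than from this text, and the names `G4_PATTERN` and `find_g4_motifs` are hypothetical. A match indicates only sequence potential for intramolecular G4 formation, not that the structure forms in vivo.

```python
import re

# Four runs of >= 3 guanines separated by three loops.
# ASSUMPTION: loops of 1-7 nt; the text only says "separated by other bases".
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g4_motifs(seq: str):
    """Return (start, motif) for each non-overlapping putative G4 motif
    on the given strand. The C-rich complementary strand is not scanned;
    pass its reverse complement separately to cover both strands."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq)]
```

Applied to four copies of the vertebrate telomeric repeat, `find_g4_motifs("TTAGGGTTAGGGTTAGGGTTAGGG")` reports a single motif beginning at index 3. Because `finditer` returns non-overlapping matches, long G-rich arrays are reported as consecutive motifs rather than as every possible overlapping combination of G-tracts.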

The human genome contains nearly 376,000 distinct sites with the potential to form G4 DNA [163, 164], and the evidence for in vivo formation of G4 DNA has emerged in recent years [165]. Notably, G4 DNA has been observed by electron microscopy from transcribed human G-rich DNA arrays in bacteria [166] and by immunochemistry at the end of the ciliate Oxytricha telomeres [167]. The G-rich chromosomal domains predicted to form G4 DNA include four classes of repetitive regions: telomeres, rDNA, immunoglobulin heavy chain switch region, and G-rich minisatellites [168]. Replication, recombination, transcription, and telomeric DNA elongation involve steps in which two strands of duplex DNA can be unwound transiently, providing an opportunity for the G-rich strand to form quadruplex structures during these DNA metabolic events [165]. Formation of G4 DNA modulates key cellular processes such as immunoglobulin gene rearrangement, promoter activation, and telomere maintenance [169].

3.3.2. G4 DNA and Genomic Instability

A direct link between potential G4-forming sequences and genomic instability has been provided by genetic studies in model organisms. DOG-1 (deletions of guanine-rich DNA), an ortholog of mammalian FANCJ helicase, is essential for the stability of G-tracts in the genome of C. elegans [170, 171]. Worms defective in DOG-1 accumulate deletions in regions of the genome containing long G-tracts [171] whereas the introduction of a G-quadruplex-forming DNA sequence into C. elegans is highly mutagenic and is removed from genomes lacking DOG-1 [170]. FANCJ is one of the 13 known genes which lead to Fanconi anemia, and cells from patients lacking functional FANCJ accumulate large genomic DNA deletions that map to potential G4-forming sequences [172]. Moreover, FANCJ preferentially unwinds G quadruplexes over other DNA substrates in vitro suggesting that the FANCJ helicase, like DOG-1, functions in resolving potential replication impediments caused by DNA G-quadruplexes [156]. The RTEL (regulator of telomere length) helicase, another DOG-1 homolog, has a very clear role at telomeres in mice, and RTEL-deficient embryonic stem cells exhibit chromosome-end fusions lacking detectable telomere signals [173]. It is suggested that G-quadruplexes impose a structural barrier to DNA replication and various nucleic acid processing enzymes and are a potential source of genetic instability if not resolved. Identification of several DNA helicases that efficiently unwind and disrupt G4 DNA indicates that eukaryotic cells possess the mechanism for resolution of G4 DNA structures.

3.3.3. Resolution of G4 DNA by RecQ Helicases

RecQ family members are prominent in that they preferentially unwind tetraplex DNA [37]. The E. coli RecQ [174], yeast SGS1 [175], and human WRN [176] and BLM [177] proteins have been demonstrated to melt synthetic G4 DNA constructs. Both SGS1 and BLM unwind G4 DNA with at least 15-fold preference relative to duplex substrates [175, 178, 179] and HJ structures [178]. This substrate preference correlates with the binding affinity and maps to the conserved RQC region of the RecQ proteins [41]. The G4 DNA unwinding activity is proposed to contribute to the maintenance of two G-rich genomic domains, rDNA and telomeres. SGS1 is required for recombination-mediated lengthening of telomeres in telomerase-deficient S. cerevisiae [180–182]. Furthermore, SGS1-deficient cells are characterized by nucleolar fragmentation and production of rDNA circles suggesting a role of SGS1 in rDNA metabolism [183, 184]. A possible role of WRN in rDNA metabolism is indicated by the fact that a significant fraction of WRN is nucleolar [185]. Notably, cells from Werner syndrome patients show premature senescence and accelerated rates of telomere shortening [186]. WRN helicase was shown to be necessary for preventing dramatic telomere loss during lagging-strand replication of the G-rich strand and the consequent accumulation of chromosome aberrations such as chromosome fusions [187]. Consistent with a role in telomere maintenance, the WRN helicase is localized to telomeres, possibly via its interaction with TRF2 which also binds BLM [188, 189].

By resolving tetraplex and other non-B DNA structures, RecQ proteins might clear the way for DNA polymerase during replication or repair synthesis. In support of this hypothesis, Kamath-Loeb et al. demonstrated that physical association of DNA polymerase δ with WRN enables unwinding of tetraplex (and hairpin) structures by the helicase and allows the polymerase to pass through the roadblock [190, 191]. WRN was also shown to physically interact with the p50 subunit of human pol δ, which together with the p125 subunit constitutes the active dimeric core of the enzyme [192]. Thus a possible function of WRN (and presumably other RecQ proteins) might be the recruitment of this polymerase to complex secondary structures and the restoration of stalled DNA synthesis. Indeed, stimulation of the DNA polymerase activity of pol δ by BLM and stimulation of BLM helicase activity by pol δ have been demonstrated [193]. Cellular phenotypes of genetic mutants and the demonstration of robust G4-unwinding activity in vitro support the notion that failure to unwind G4 DNA contributes in part to the genetic instability observed in Bloom and Werner syndrome cells.

Yet, RecQ proteins are not the only helicases known to resolve G4 structures. Besides FANCJ and its orthologs, Pif1 (petite integration frequency 1), a 5′-3′ helicase, processes G4-forming sequences in vivo and in vitro [194]. Human Pif1 helicase has been shown to bind and unwind G-quadruplex DNA [195]. In yeast, the involvement of Pif1 in telomere stability has been well established [196], and the association of hPif1 with telomeres and telomerase [197] indicates that hPif1 is a telomere G4 DNA-binding protein. Using genome-wide chromatin immunoprecipitation and Pif1-deficient cells, the Zakian group has recently demonstrated that G4 motifs are a significant subset of the in vivo binding sites of S. cerevisiae Pif1 and that DNA replication through G4 motifs is promoted by this helicase [198]. The significance of the G4 DNA resolving activity of mammalian Pif1 is questionable, since Pif1-null mice are normal [199], in contrast to WRN- or BLM-deficient cells, in which genomic instability can be readily detected. It is conceivable that Pif1 activity is normally unnecessary, with sufficient G4 resolvase activity provided by other helicases (e.g., WRN, BLM, and FANCJ). A requirement for Pif1 in mammalian cells might become apparent only when one or more of these other G4 resolvase systems are compromised.

G4 resolution is, nevertheless, not a common characteristic of all RecQ helicases. Recently, it was demonstrated that RECQ1 does not unwind G-quadruplex substrates [105]. The inability to resolve this particular form of alternate DNA structure distinguishes RECQ1 from the WRN, BLM, SGS1, and E. coli RecQ helicases, which proficiently unwind a variety of G-quadruplex DNA substrates [45]. Furthermore, the telomere lengths of RECQ1 wild-type, knockout, or heterozygous mouse cells show no significant difference, suggesting minimal to no role of RECQ1 in telomere maintenance [200]. However, RECQ1 was purified with human telomeric chromatin specifically in cells that use a recombination-mediated pathway known as Alternative Lengthening of Telomeres (ALT) for telomere maintenance [201]. It is possible that RECQ1 plays an indirect role in telomere metabolism via its interacting partners. Supporting this notion, recent evidence suggests that SGS1 regulates processing of telomeres by the 5′-3′ exonuclease EXO1 [202]. Interestingly, RECQ1 and EXO1 exhibit physical and functional interaction in human cells [203]; however, it remains to be tested whether they collaborate in a complex for accurate processing of chromosome ends.

Regardless, various studies have demonstrated that certain RecQ helicases are crucial for the metabolism of G4 DNA structures at specific genome locations such as telomeres and rDNA. Mutant phenotypes in yeast and humans affirm the vital importance of this function of RecQ helicases in genome maintenance.

4. Concluding Remarks and Outlook

The proficiency of RecQ helicases in unwinding alternate DNA structures has implicated them as roadblock removers for replication fork progression, since DNA sequences that can form unusual, non-B-form structures have been shown to block polymerases in vitro [21]. It has been proposed that at least one function of the RecQ DNA helicases is to prevent aberrant, deleterious recombinogenic pathways when replication is perturbed by DNA damage, alternate DNA structure, or impaired DNA synthesis [204]. The processing of aberrant DNA structures by RecQ helicases is likely to counter the potential toxicity that these structures incur through recombinogenic pathways [205] (Figure 3). The ability of helicases to unwind non-B DNA structures would be expected to increase access for repair and replication proteins. The RecQ helicases work in close coordination with other proteins (e.g., topoisomerases) to resolve various secondary structures. Associations of RecQ helicases with proteins critical for various steps of DNA replication and repair [37] suggest that RecQ helicases might cooperate with them to ensure faithful progression of replication forks through natural impediments posed by non-B DNA structures. It is likely that there is competition in vivo between RecQ helicases and the proteins that promote non-B secondary structure formation. Future studies will uncover how the activities of RecQ helicases are regulated (via protein-protein interactions, posttranslational modifications, etc.) to maintain genomic stability in general and to prevent non-B DNA structure-induced instability in particular.

As noted above, RecQ proteins exhibit subtle but noteworthy differences among themselves with respect to their ability to unwind, or their preference for, certain alternate DNA structures. Current data imply that individual human RecQ homologs are uniquely required to unwind specific DNA structures in vivo. Further investigation is essential to elucidate the cellular environment, genomic contexts, and/or protein factors that license a specific RecQ protein to metabolize specific DNA structures in vivo. It is clear that non-B structures both perform physiological roles and potentiate genomic instability. Analyses of the mutation spectrum and genomic rearrangements in RecQ-deficient cells will illustrate the significance of RecQ helicases in the mutational mechanisms associated with non-B DNA structures [206].

The unique nature of naturally occurring non-B DNA conformations offers an opportunity to use them as potential targets for cancer therapy, since these sequence-specific structures are proposed to affect gene expression and telomere activation [207, 208]. Expression of oncogenes could be selectively inhibited by using drugs or small molecules targeted to specific non-B DNA conformations present in their regulatory regions [129, 209]; stabilizing the secondary structures would be predicted to prevent access of nucleic acid binding proteins and interfere with critical cellular processes. Considering the demonstrated roles of RecQ helicases in resolving such non-B DNA structures, specific inhibition of RecQ and other non-B resolving helicases via small molecules [210], DNA-binding drugs, or gene silencing might be a promising strategy to explore for anticancer therapy [207]. Development of new methodologies to investigate specific functions of non-B DNA structures, and identification of novel structure-specific DNA helicases involved in resolution of such secondary structures, should expand the array of molecular targets available for drug development and therapeutic intervention.

Acknowledgment

This work is made possible in part by NIH/NCRR Grant 2 G12 RR003048 and NIH/NIGMS Grant 7SC1GM093999-02.