Abstract

Genomic imprinting in mammals results in the expression of genes from only one parental allele. Imprinting occurs as a consequence of epigenetic marks set down either in the father's or the mother's germ line and affects a very specific category of mammalian gene. A greater understanding of this distinctive phenomenon can be gained from studies using large genomic clones, called bacterial artificial chromosomes (BACs). Here, we review the important applications of BACs to imprinting research, covering physical mapping studies and the use of BACs as transgenes in mice to study gene expression patterns, to identify imprinting centres, and to isolate the consequences of altered gene dosage. We also highlight the significant and unique advantages that rapid BAC engineering brings to genomic imprinting research.

1. Introduction

Genomic imprinting affects a unique class of genes that are expressed from only one parental allele as a consequence of epigenetic marks set down either in the father’s or the mother’s germ line [12] (Figure 1). Essentially, although two gene copies are physically present within each diploid somatic cell, only one gene copy is transcriptionally active, producing an RNA product. The first evidence that individual genes were imprinted came from studies on the mouse Insulin-like growth factor 2 (Igf2) gene [13]. An embryonic growth restriction phenotype was apparent in heterozygous offspring after paternal transmission of a targeted deletion of this locus, initially suggesting haploinsufficiency. However, heterozygous animals also had unexpectedly low levels of expression of Igf2 rather than the anticipated 50% reduction. Imprinting of the locus was subsequently demonstrated genetically [14]. Soon afterwards, a receptor for Igf2, Igf2r, and one of the most abundant RNAs in the developing mouse embryo, H19, were found to be maternally expressed [15, 16]. Thus, in short succession, allele-specific gene expression was demonstrated for three genes in mice. We now know of at least 80 protein-coding genes that are imprinted in both mouse and human. Many of these genes play important roles in early development, and many are physically linked within domains of both maternally- and paternally-expressed genes. This work is summarized at http://www.mousebook.org/. Imprinted genes within domains are regulated by discrete genomic regions called imprinting centres (ICs) [17]. These regions, which can also be referred to as imprint control elements (ICEs) or imprint control regions (ICRs), are functionally defined by engineering targeted deletions in mice [18–32]. Inheritance of an IC deletion through one parental germ line releases all the genes within the domain from their imprinted expression (loss of imprinting, LOI), whereas inheritance through the other parent’s germ line generally, but not always, has no consequence. These ICs carry a DNA methylation imprint on one parental allele only, and studies on the DNA methyltransferases (Dnmts) Dnmt3a and Dnmt3b and the accessory protein Dnmt3L demonstrate the necessity of de novo DNA methylation for the establishment of allele-specific gene expression [33–37].

Much of our understanding of the mechanism and function of genomic imprinting is based on data from the targeted deletion of imprinted genes, trans-factors or ICs in mice. However, while studies on loss of function are important for our understanding of gene function per se, imprinting is a dosage-related phenomenon. Altering the dosage of the imprinted gene is more informative with respect to the function of the imprint. Furthermore, while targeted deletions of ICs may confirm the requirement of a region for imprinting, further painstaking work is required to dissect the function of these regions and to identify each element of the imprinting process. Transgenes add an essential tool to our armory. A transgene-based approach can provide information on both the mechanism of imprinting and the functional consequences of increased gene expression in a single model. In this respect, transgenes based on bacterial artificial chromosomes (BACs) have been of particular value.

2. BACs

BACs were first developed as a large insert clone system to facilitate the construction of an orderly set of overlapping clones as tools for the Human Genome Project [38]. BACs are single copy replicons based on the naturally occurring Escherichia coli fertility plasmid. This vector system is capable of cloning and propagating large DNA fragments with an average insert size of 150 kb and a maximum insert size of 700 kb. The key advantage of BACs over other large insert technologies is their stability in culture and ease of manipulation. These qualities initially rendered them an ideal resource for physically mapping genomes and they have been used in almost all the genome sequencing projects [38]. One major advantage that large insert clones bring to transgenic research is that they are more likely to contain the necessary promoter, enhancer, and silencer combination to mimic the natural expression of the gene of interest. The advantages of the BAC transgenic approach compared to a conventional transgenic approach have been discussed extensively elsewhere [39]. However, there are several advantages that BACs bring that are specific to imprinting research. Firstly, the imprinting capacity of BACs carrying both target genes and putative ICs can be examined outside the normal chromosomal context. Secondly, BACs can be used to study the developmental consequence of accurate but excess expression of single genes. Thirdly, their amenability to modification techniques that insert or delete sequences, or alter sequences as discrete as a single point mutation [40–44], makes them a powerful tool for addressing both mechanistic and functional questions. BACs can be modified rapidly in vitro, with an average construction time of less than 4 weeks. Once the modified BAC is made, transgenic founders can be generated by pronuclear injection of the construct into fertilised eggs. Thus, modified BACs can provide an important additional tool, alongside traditional targeting of endogenous loci in embryonic stem (ES) cells.

3. Physical Maps of Imprinted Domains

BACs were first applied to imprinting research in order to generate physical maps of imprinted domains. Early work suggested that imprinted genes were not randomly scattered throughout the chromosomes but localized to discrete domains containing both maternally- and paternally-expressed genes. This characteristic organization is ideally suited to the construction of contigs of large genomic clones, which serve three purposes: firstly, to provide detailed information about the physical organization of the known genes within the domain; secondly, to extend the contigs to identify physically linked genes and test their imprint status; and thirdly, and perhaps most importantly, to identify nongenic regions of conservation between different species as likely ICs.

The first imprinted region to be physically mapped using BACs was human chromosome 11p15.5 [45]. This region has been the focus of intense study because of the association with the classic imprinting disorder, Beckwith-Wiedemann Syndrome (OMIM 130650; BWS). First human and then mouse contigs [2, 46–48] provided templates for sequencing to reveal information on the location of genes, their physical structure, and also the location of genomic sequence features conserved between mouse and human. This approach was important in identifying both a conserved putative IC (IC2) carrying a germ line imprint and also novel genes located inside and outside the previously proposed boundaries of the imprinting cluster [49, 50]. BACs have subsequently been similarly employed to physically characterize additional imprinted loci including SNRPN, MEST/PEG1, Peg3/Zim1, Dlk1/Gtl2, Gnas, and Neuronatin [1, 3, 51–56].

4. BACs and Studies on the Evolution of Imprinting

The construction of BAC libraries for a variety of species provides unprecedented resources for advancing the ongoing research into genomic imprinting. These resources have particular relevance to studies on imprinting because, while imprinting has been demonstrated in marsupial and eutherian mammals, it has not been reported in monotremes (platypus and echidna) or nonmammalian vertebrates. This suggests that imprinting arose sometime after the divergence of monotremes (prototherians) from therian mammals, which has important implications for understanding the rationale for this phenomenon [57]. Comparing genomic regions in key representatives of mammalian diversity and phylogeny will be of great value in unlocking further secrets of genomic imprinting.

As the most intensively investigated imprinted domain in the human and mouse genomes, the BWS imprinted region has also been scrutinized in nonmammalian vertebrates through the isolation of BAC clones [58, 59]. This work was important in establishing that the imprinted genes within this domain were physically linked prior to acquisition of their imprint. Orthologues of a number of imprinted genes have also been isolated from BAC libraries constructed from the genomes of the tammar wallaby (marsupial) and the platypus (monotreme) [60]. Mapping these BACs to their respective chromosomes demonstrated that these imprinted gene orthologues existed on separate chromosomes prior to the evolution of imprinting. This finding suggests that genomic imprinting evolved independently from X-inactivation, despite similarities in the epigenetic mechanisms directing these two processes.

In eutherian mammals, imprinted domains are regulated by imprinting centres [17]. Using these DNA sequences directly as probes to identify orthologous regions in other species has not been successful. This is most likely due to their high GC content (sticky probes) and the relatively low conservation of DNA sequence, even between human and mouse. However, BACs can be identified that span regions predicted to contain ICs by using sequences from nearby protein-coding genes as baits. As an example, an SGCE/PEG10 tammar wallaby BAC was isolated using a mouse Sgce cDNA as a probe [61]. The BAC also contained a differentially methylated region (DMR) equivalent to that observed in eutherians, providing the first evidence that DNA methylation is a conserved feature of the imprinting mechanism. This work highlighted another added value of BACs in cross-species studies. The tammar wallaby PEG10 gene was not identified in a direct screen but by primer walking from the physically-linked SGCE. Employing the same methodology, researchers were unable to identify any sequence with similarity to PEG10 within the platypus SGCE BAC contig, demonstrating that PEG10 was inserted into the genome after monotremes diverged from the other lineages. Thus, not only has the mechanism for imprinting genes arisen at a critical time in the evolution of modern mammals, but also new genes have been added to the genome with entirely novel functions.

5. BACs as Transgenes

In addition to physical mapping studies, BACs have provided useful tools for establishing the expression and imprinting capabilities of specific genomic regions within imprinted loci (Table 1). Putative ICs for imprinted domains can initially be identified by their epigenetic characteristics (differential DNA methylation). These regions can subsequently be functionally defined by targeted deletion of the endogenous locus (see earlier). Another stringent test for imprint control regions is to examine their function at ectopic loci. If a sequence can direct imprinting when integrated randomly into the genome, this suggests that all the component parts of the imprinting mechanism are contained within this sequence.

The first transgenic studies of this sort relied on plasmid-based transgenes. However, the smaller transgenes were of limited use, primarily because the site of integration can influence both expression and imprinting [62] but also because small transgenes were less likely to carry all the required elements to recapitulate expression of the endogenous locus. These disadvantages were overcome by increasing the size of the genomic region included in the transgene. The Igf2/H19 locus on mouse distal chromosome 7 and the Igf2r locus on mouse chromosome 17 were the first imprinted regions to be transgenically dissected using large genomic clones [63, 64]. YACs were used in both cases. Imprinted expression was reliably established away from the respective domains, indicating that the cis-sequences required to establish imprinting lay within the genomic regions encompassed by these YACs. BACs have largely rendered YACs obsolete for these types of study because of their relative stability in culture and ease of DNA preparation [65] and, most importantly, the development of technologies to insert or remove specific sequences, reviewed recently [39, 66]. Sequences, such as reporters, can be homologously recombined into BACs to provide useful information on both spatial and imprinted expression. Where BACs contain more than one gene or where overexpression might be non-viable, homologous recombination can be used to inactivate gene loci [1, 2, 9, 10]. As a future goal, putative ICs could be mutated at the single nucleotide level to explore mechanistic questions or conditionally targeted in order to examine the temporal requirement for these sequences. Although these latter procedures can be performed at the endogenous IC, the major advantage that BAC modification protocols have over targeted homologous recombination in ES cells is speed. Once a targeting vector is constructed, BACs can be modified in a matter of weeks and then injected into fertilized mouse oocytes to generate founders within a few months.

Cdkn1c (previously known as p57Kip2) and Neuronatin (previously known as Peg5) were the first imprinted loci to be mechanistically explored using BACs [1, 2]. Cdkn1c maps to the BWS imprinted region on mouse distal chromosome 7/human chromosome 11p15. This is one of the most complex imprinted regions in mice, containing at least 18 maternally- and paternally-expressed genes and three DMRs (Figure 2(a)). In mice, the region can be separated mechanistically into two distinct domains, termed the IC1 and IC2 domains [20–22, 26, 67]. Cdkn1c maps within the IC2 region. The fact that the regulatory elements were located at a distance from Cdkn1c was first suggested by studies on a 38 kb cosmid-based transgene spanning the human CDKN1C locus [68]. Despite containing 20 kb of sequence upstream and 15 kb of sequence downstream of the gene, CDKN1C was not expressed from the human transgene in multiple lines. Transgenic expression of Cdkn1c was only achieved using larger murine BAC-based transgenes, suggesting the existence of distantly-located enhancers [2]. Two BACs spanning the murine Cdkn1c gene, of 85 and 260 kb, respectively, were engineered to include a β-galactosidase reporter under the control of the Cdkn1c promoter (Figure 2(b)). Whole-mount LacZ analyses provided easy access to the expression pattern of the Cdkn1c gene under control of regulatory elements within the BAC. From the smaller BAC, Cdkn1c-lacZ was expressed in a subset of tissues in which the endogenous Cdkn1c locus was expressed, whereas the much larger 260 kb BAC drove expression in all embryonic tissues (Figure 2(c)). Enhancers for extraembryonic tissues lay outside the 320 kb region scanned. The murine Cdkn1c gene is spanned by a DMR [68], but neither BAC transgene autonomously imprinted Cdkn1c [2]. A second DMR within the IC2 region, called KvDMR1, has now been functionally defined as the imprinting centre for the region controlling imprinted expression of Cdkn1c and the other maternally-expressed genes [26]. This imprinting centre is contained within the 260 kb Cdkn1c BAC so, in theory, it should imprint at ectopic chromosomal loci. This was not the case, suggesting that KvDMR1 requires additional elements to function as an IC. Currently, only an 800 kb YAC, which encompasses almost the entire IC2 domain, has the capacity to imprint Cdkn1c [67].

Adjacent to the IC2 domain lies the IC1 domain, which spans Igf2, Ins2, and H19 (Figure 2(a)). Igf2/H19 were initially shown to imprint at ectopic loci from a 130 kb YAC [63], and a 137 kb BAC was used to further refine the minimal region required to imprint H19 to −7 kb and +35 kb of the H19 promoter [69]. This region contains the functionally defined IC just upstream of H19 [20–22]. Although smaller transgenes can drive imprinted expression of H19, they do so unreliably and without inducing germline DNA methylation at the IC, whereas the IC within the larger BAC clone does become DNA methylated in the male germline [70]. These data suggest that, while ICs initiate the imprinting mechanism, the surrounding sequence is important in interpreting and maintaining the process.

The paternally expressed Neuronatin gene maps to one of the least complex imprinted domains, located on mouse distal chromosome 2/human chromosome 20 (Figure 3(a)). Neuronatin is not located within a cluster of imprinted genes but lies within the intron of a second gene, Blcap (previously known as Bc10) [1]. Blcap shows a maternal-allele bias in expression in tissues where Neuronatin is highly expressed, suggesting transcriptional interference rather than a direct imprint [71]. The body of the Neuronatin gene carries direct differential DNA methylation on the maternal allele and, within this DMR, there is a smaller region that exhibits the biochemical characteristics of an IC [1, 72]. Transgenic mice engineered with a series of BAC clones modified to include a β-galactosidase reporter under the control of the Neuronatin promoter were used to demonstrate that the minimum sequence required to imprint Neuronatin was approximately 30 kb and, indeed, encompassed the putative IC (Figures 3(b) and 3(c)). In addition, these studies revealed that enhancers for tissue-specific expression of Neuronatin were primarily located upstream of the putative IC and that some of them lay at a significant distance from the body of the gene.

Overlapping BAC transgenes have also been used to explore the Delta-like1 (Dlk1)/Gene-trap locus2 (Gtl2) imprinted domain on mouse chromosome 12. Imprinted expression of Gtl2 was reported from a 178 kb BAC that spans a region from 3.5 kb upstream of the physically linked Dlk1 gene to 69 kb downstream of Gtl2 [4]. This Dlk1/Gtl2 BAC drove expression of Gtl2 in a subset of tissues in which the endogenous locus is expressed, but Dlk1 was not expressed from this BAC in any tissue. Dlk1 was expressed from a smaller 70 kb BAC encompassing more sequence upstream of the Dlk1 gene, but expression was not imprinted, thus confirming the location of the IC linked to Gtl2 [8].

Some BAC studies are more difficult to interpret. The 120 kb BAC spanning the Peg3/Zim1 locus, which contains 20 kb of sequence upstream of Peg3 and 80 kb of sequence downstream, showed imprinted expression of Peg3 in one transgenic line but not in two others [3]. Peg3 is spanned by a germline DMR [73, 74] which suggests that the IC for Peg3 is contained within this 120 kb BAC. Like the ICs for H19/Igf2 and Cdkn1c, perhaps the Peg3 IC is reliant on additional sequences to fully communicate the imprinting signal.

6. Functional Studies of Imprinted Genes

Provided that the appropriate regulatory elements are also present, genes are expressed from BACs with spatial and temporal accuracy and at similar levels to the endogenous loci, predominantly without being affected by the site of integration [75]. Consequently, BACs can be used to precisely engineer increased dosage of gene loci. This has particular relevance to studies on imprinted loci because gene dosage is key to the phenomenon. Genomic imprinting alters the expression level of a particular gene from one parental allele without altering its essential function. Therefore, engineering altered dosage of an imprinted gene addresses the function of the imprint as well as the function of the gene. Modifying the endogenous locus by targeting an IC is one route to engineering biallelic expression, recently reviewed [76]. These models can provide excellent tools for understanding the consequences of increased gene dosage, particularly where the IC controls a few well-characterized targets. However, LOI can mean increased dosage of some genes (gene activation) and loss of expression of others (gene silencing). Furthermore, the majority of imprinted domains are complex and not fully characterized. Interpreting the results of these studies with respect to individual genes is not straightforward.

The critical advantage that BAC transgenes provide over LOI models for exploring the function of imprinting is that the exact nature and number of genes is precisely defined by the transgenic sequence, allowing phenotypes to be assigned unequivocally to the gene sequences within the transgene. BACs may also be useful in “rescuing” phenotypes associated with LOI of complex domains, particularly in cases where loss of expression of one gene within the domain precludes a phenotypic assessment of other genes. Most importantly, the ability to rapidly modify BACs is particularly helpful in situations where more than one gene is present in close proximity.

Functional studies performed on the imprinted locus containing the three closely linked genes, Phlda2, Slc22a18, and Cdkn1c, provide a textbook example of the advantages that BACs bring to imprinting research [9–11]. These three genes are all contained within a 40 kb region of the IC2 domain. Their close proximity means that it would likely be impossible to separate the genes on individual genomic fragments and still maintain appropriate temporal and spatial expression. However, when an 85 kb transgene spanning this locus was found to contain the placental enhancers for Phlda2 and Slc22a18 but not for Cdkn1c, this allowed the assignment of a placental stunting phenotype to overexpression of just Phlda2 and/or Slc22a18 [11]. The placental stunting phenotype is reciprocal to placentomegaly induced by loss of expression of Phlda2 [77]. This suggested that Phlda2 acts as a rheostat for placental growth, with overgrowth after gene deletion and growth retardation after loss of imprinting [11]. A key role for Phlda2 in regulating placental weight and glycogen storage was genetically verified by combining a single copy of the BAC transgene with a maternally-inherited targeted deletion of Phlda2 to rescue Phlda2 overexpression [10]. Essentially, these double transgenic mice have wild-type levels of Phlda2, but Slc22a18 remains in excess. Their placentae were phenotypically indistinguishable from wild type, thus excluding a role for Slc22a18 in placental growth restriction.

In addition to causing placental growth restriction, the Cdkn1c/Phlda2/Slc22a18 BAC transgene also restricted embryonic growth from E13.5. This early growth restriction phenotype was genetically assigned to excess Cdkn1c and not excess Phlda2 or Slc22a18 by engineering a modification to abolish Cdkn1c expression from the BAC. Mice carrying the modified BAC, with excess Phlda2 and Slc22a18 expression but normal levels of Cdkn1c, were not growth restricted at E13.5, thus providing genetic evidence that Cdkn1c encodes a potent negative regulator of embryonic growth [9]. In addition to assigning the early embryonic phenotype to excess Cdkn1c, ablation of Cdkn1c function from the BAC transgene uncovered a second distinct growth restriction phenotype. Mice carrying the modified BAC (no Cdkn1c overexpression) were the same weight as nontransgenic embryos at E13.5 but showed a progressive loss of growth potential later in gestation, being 13% lighter than controls by birth [10, 11]. This suggests a role for Phlda2 and/or Slc22a18 in regulating late embryonic growth. This phenotype would have been missed by any other approach as the earlier Cdkn1c-induced growth restriction phenotype effectively obscures the later phenotype. This study perfectly illustrates the way in which subtle phenotypes associated with altered expression of one imprinted gene within a domain can be masked by other closely linked genes.
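The genetic logic of these crosses can be followed with a simple piece of dosage bookkeeping. The sketch below is illustrative only: it assumes that each active gene copy contributes one unit of expression (wild type = 1, from the single active maternal allele) and that the BAC copy is not imprinted. The names `Genotype`, `relative_dosage`, and `bac_silent` are hypothetical and do not come from the studies cited above.

```python
from dataclasses import dataclass

# The three closely linked, maternally-expressed genes carried on the BAC.
GENES = ("Cdkn1c", "Phlda2", "Slc22a18")


@dataclass(frozen=True)
class Genotype:
    """Hypothetical description of one animal, for one tissue of interest."""
    bac_copies: int = 0
    # Genes NOT expressed from the BAC in this tissue, either because they
    # were inactivated by BAC engineering or because their enhancers are absent.
    bac_silent: frozenset = frozenset()
    # Genes removed by a maternally-inherited deletion of the endogenous locus.
    maternal_deletions: frozenset = frozenset()


def relative_dosage(genotype: Genotype) -> dict:
    """Return expected dosage per gene, relative to wild type (= 1).

    Simplifying assumption: one unit per active copy, no feedback regulation.
    """
    dosage = {}
    for gene in GENES:
        endogenous = 0 if gene in genotype.maternal_deletions else 1
        transgenic = 0 if gene in genotype.bac_silent else genotype.bac_copies
        dosage[gene] = endogenous + transgenic
    return dosage


scenarios = {
    "wild type": Genotype(),
    "single-copy BAC": Genotype(bac_copies=1),
    "BAC with Cdkn1c silenced": Genotype(
        bac_copies=1, bac_silent=frozenset({"Cdkn1c"})
    ),
    "BAC (Cdkn1c silent) + maternal Phlda2 deletion": Genotype(
        bac_copies=1,
        bac_silent=frozenset({"Cdkn1c"}),
        maternal_deletions=frozenset({"Phlda2"}),
    ),
}

for label, genotype in scenarios.items():
    print(label, relative_dosage(genotype))
```

Under these assumptions, the last scenario returns Phlda2 to the wild-type level while Slc22a18 stays in excess, which is the configuration used to separate the contributions of the two genes.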

Functional BAC-based studies have also been performed on the Dlk1 locus [8] (Table 2). Dlk1, also known as Preadipocyte factor 1 (Pref-1), encodes an inhibitor of adipocyte differentiation. Loss of expression of this paternally-expressed gene in mice results in growth retardation, obesity, abnormal eyelids, skeletal malformation, and increased serum lipid metabolites [78]. Examining the consequence of overexpression of Dlk1 in isolation was only achievable using a 70 kb BAC transgene containing 49.4 kb of sequence upstream of Dlk1 and 18 kb downstream of the Dlk1 transcriptional start site. In contrast to the studies on Cdkn1c and Phlda2, Dlk1 was expressed at approximately the endogenous level from the BAC transgene regardless of copy number. However, a triple dose of Dlk1 was achieved by generating mice homozygous for the transgene. The 70 kb BAC recapitulated the spatiotemporal expression of Dlk1 in embryonic tissues but not in the placenta. The study revealed an intriguing dual role for Dlk1 in driving embryonic overgrowth but with significantly reduced fitness after birth, demonstrating that Dlk1 is a dosage-critical gene within its domain, a key principle of imprinting.
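The dosage arithmetic behind the "triple dose" can be written out explicitly. This is a minimal sketch, assuming that the endogenous locus contributes one unit from its single active paternal allele and that each transgene-bearing chromosome contributes approximately one further unit regardless of copy number, as described above. The function name `dlk1_relative_dose` is hypothetical.

```python
def dlk1_relative_dose(transgene_alleles: int) -> int:
    """Expected Dlk1 dose relative to wild type (= 1).

    transgene_alleles: 0 (nontransgenic), 1 (hemizygous), or 2 (homozygous).
    Assumption: each transgene-bearing chromosome adds about one endogenous
    equivalent of expression, independent of BAC copy number at that site.
    """
    if transgene_alleles not in (0, 1, 2):
        raise ValueError("transgene_alleles must be 0, 1, or 2")
    endogenous = 1  # single active (paternal) allele of the imprinted locus
    return endogenous + transgene_alleles


for label, alleles in (("nontransgenic", 0), ("hemizygous", 1), ("homozygous", 2)):
    print(f"{label}: {dlk1_relative_dose(alleles)}x Dlk1")
```

On this simplified view, breeding to homozygosity is what takes the dose from double to triple, since adding copies at a single integration site did not raise expression.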

7. Future Work

Initial studies on BACs have demonstrated their importance in both dissecting imprinting mechanisms and understanding imprinting function. Our ability to target BAC transgenes to specific loci as single copies would improve this technology. Such an approach would be useful in rigorously testing the imprinting capacity of different BAC clones within a single chromosomal location. Single-copy BACs would also “restore” biallelic expression to specific imprinted loci, a critical component of functional studies. Current approaches to generate BAC transgenic mice involve either pronuclear microinjection into fertilised eggs or electroporation into ES cells. Both these techniques result in the random integration of BAC clones into the mouse genome. This can cause variability due to differences in the copy number of the BAC and also position effects caused by the site of integration, albeit with low frequency compared to plasmid-based clones. As a result, multiple independent founder lines must be analysed. In addition, multiple copy number integrations can result in high levels of gene expression, which are less relevant to studies on genomic imprinting. Recently, BACs have been modified to contain the sequences necessary for homologous recombination into, and complementation of, the partially deleted hypoxanthine phosphoribosyltransferase (Hprt) locus in ES cells with positive selection for Hprt to achieve single-copy integrations [79]. Further developments could be based on recombining pre-existing loxP sites within some BAC vectors and one inserted at the Rosa26 locus [80], bypassing the necessity for any modification of the BAC.

BACs may be useful in addressing further important questions. For example, we know that ICs are required to establish imprinted expression within their domains. However, what happens if these ICs are deleted after the imprint has been established? Can domains maintain their imprinted status in the absence of continued signaling from their ICs? The ability to conditionally target ICs will provide important information on their role in initiation of the imprint versus maintenance of imprinting. This is not a specific advantage of BACs since loxP sites can be targeted at the endogenous locus or to a BAC. However, performing these studies on a BAC clone would circumvent cases where loss of imprinting at the endogenous locus results in embryonic lethality, allowing studies in the adult.

Our ability to rapidly modify BACs in vitro would also make two-step sequential modifications more practical. BAC recombineering could then be used to generate a single clone containing two different regions flanked either by loxP-loxP sites or by FRT-FRT/loxP sites, allowing sequential deletion of these regions in vivo by Cre- and FLPe-recombinases, respectively.

In addition, modified BAC clones are now being used themselves as targeting vectors. Plasmid-based targeting vectors cover relatively short regions of the genome of a few kilobases. BACs can be used to generate targeting vectors where the two loxP sites are placed far apart. Such an approach would facilitate the generation of models aimed at the conditional deletion of larger genomic regions spanning two or more genes.

In summary, we have provided key examples of how BAC transgenesis has so far provided a powerful tool to study genomic imprinting. BACs can be used to address both mechanistic and functional questions. Our ability to rapidly modify BACs in vitro suggests that they have the potential to significantly further our understanding of genomic imprinting.

Abbreviations

IC: Imprinting centre
LOI: Loss of imprinting
DMR: Differentially methylated region
YAC: Yeast artificial chromosome
BAC: Bacterial artificial chromosome

Acknowledgments

S. J. Tunster was supported by BBSRC Grant no. BB/G015465, and M. Van De Pette holds a BBSRC Ph.D. studentship.