Abstract

Artificial chromosomes and minichromosome-like episomes are large DNA molecules that can contain whole genomic loci and be maintained as nonintegrating, replicating molecules in proliferating human somatic cells. Authentic human artificial chromosomes are very difficult to engineer because of the difficulties associated with centromere structure, so they are not widely used for gene-therapy applications. However, OriP/EBNA1-based episomes, although they lack true centromeres, can be maintained stably in dividing cells because they bind to mitotic chromosomes and segregate into daughter cells. These episomes are more easily engineered than true human artificial chromosomes and can carry entire genes along with all their regulatory sequences. Thus, these constructs may facilitate the long-term persistence and physiological regulation of the expression of therapeutic genes, which is crucial for some gene therapy applications. In particular, they are promising vectors for gene therapy in inherited diseases that are caused by recessive mutations, for example haemophilia A and Friedreich's ataxia. Interestingly, an episome carrying the frataxin gene (deficient in Friedreich's ataxia) has been demonstrated to rescue the susceptibility to oxidative stress that is typical of fibroblasts from Friedreich's ataxia patients. This provides evidence of the potential of these episomes to treat genetic diseases linked to recessive mutations through gene therapy.

1. The Rationale behind the Use of Large DNA Molecules in Gene Therapy

Gene therapy is defined as the transfer of nucleic acid molecules (usually DNA) to a patient's somatic cells in order to prevent, treat, or alleviate a specific condition. Different gene therapy strategies have been designed to suit different types of diseases, the most “classical” of which involves gene delivery to target cells in order to obtain optimal expression of the gene introduced. This therapeutic approach is particularly well suited for inherited diseases that are caused by recessive mutations, since these are typically associated with the absence of a functional gene product or a drastic decrease in the expression of a gene. In these cases, the “therapeutic gene” must be inserted within a DNA molecule (usually a bacterial plasmid) along with all its essential regulatory sequences in order to ensure the correct expression of the gene in the target cells. To facilitate adequate cellular uptake, these DNA molecules must be packaged within appropriate “gene delivery vehicles”.

Viral vectors have become the preferred “gene delivery vehicles” in the field of gene therapy due to their extremely high efficiency of gene transfer in somatic cells. However, viral vectors fail to fulfil all the requirements for an “optimal” gene therapy vector. Ideally, the best gene therapy protocol would involve a single administration of the therapeutic gene to the organism, such that it could replicate and segregate alongside the endogenous chromosomes, if necessary, thereby permitting long-term maintenance in any dividing cell. However, most viral vectors are either nonintegrating episomes that are unable to replicate, segregate, and persist in dividing cells, or they are integrating vectors. These latter vectors insert into the chromosomes of the host cell, and although they can therefore persist in proliferating cells, they carry the risk of producing insertional mutagenesis.

Adenoviruses are DNA viruses that have a linear double-stranded 36 kb genome. These viruses can be used as nonintegrating vectors since their genome is maintained as an episome once it is released into the nucleus of the host cell. However, these episomes are lost after several cell divisions. Moreover, the first-generation adenoviral vectors elicited a strong immune response that led to elimination of the transduced cells, and the humoral response excluded them after future administrations [1, 2]. To diminish or overcome this reaction, multiple deletions have been introduced into the subsequent generations of high-capacity [3] “gutless” adenoviral vectors [4, 5]. Nevertheless, while these vectors can now infect a wide range of cells, their lack of persistence in dividing cells represents an important weakness.

Retroviral-based vectors are capable of transducing a wide variety of dividing cells and they are efficiently integrated in the genome of the host cells [6]. However, this latter characteristic represents their major disadvantage since random integration may cause insertional mutagenesis, and the ensuing activation and/or silencing of undesirable genes might even trigger oncogenic transformation in the host cells [7, 8]. Indeed, some patients treated with retroviral vectors in a clinical trial for severe combined immunodeficiency (SCID) developed leukemia as a consequence of vector integration [9].

Lentiviruses belong to the retrovirus family, but they are able to transduce both proliferating and non-proliferating cells, which widens the range of target cells [10, 11]. These viruses are currently the most widely used gene therapy vectors, as they sometimes permit prolonged gene expression, although gradual silencing of gene expression has also been reported on occasion [12].

Finally, most viral vectors have a limited packaging capacity and they cannot accommodate large genes. For this reason, most constructs used for gene therapy are “minigenes”, consisting of cDNA sequences (instead of entire genes) under the control of heterologous promoters (usually small compacted promoter sequences of viral origin). These “minigene constructs” often exhibit variable expression or lack tissue specificity, and they may not be properly regulated in somatic cells, being completely “silenced” in many cases, which obviously restricts their usefulness in gene therapy. Indeed, studies performed in transgenic mice have demonstrated that these “minigene constructs” are very susceptible to being switched off by neighbouring chromatin [13]. On the other hand, there are cases where the cDNA is too big to be contained in one of these vector systems and some “nonessential” regions have to be eliminated [14].

By contrast, studies performed in transgenic mice have highlighted the advantages of using large fragments of genomic DNA that permit the delivery of intact mammalian genes with all their introns, promoters, enhancers, and long-range controlling elements. When a gene's own promoter and controlling elements are used, the level and control of expression are comparable to those of the endogenous gene. The experience gained from using yeast artificial chromosomes (YACs) and bacterial artificial chromosomes (BACs) has shown that large genomic fragments drive tissue-specific expression at endogenous levels [15–20]. For example, transgenic mice carrying the intact human cystic fibrosis transmembrane conductance regulator gene (CFTR), which spans 200 kb of a 300 kb YAC [21], express CFTR in an appropriate tissue-specific manner and complement the cftr defect in null mice. Similar physiological expression has also been achieved with the Huntingtin gene, which has been difficult to express from minigene constructs [22]. Likewise, the use of a YAC or BAC carrying the entire frataxin gene has rescued frataxin knock-out mice from embryonic lethality [16, 19]. In view of these data, large DNA molecules seem to be better candidates as vectors for gene therapy than the typical viral vectors, with their limited capacity.

As a result of the Human Genome Project, BACs are available that cover the whole genome. Given that the average size of a BAC is approximately 150 kb, most genes, together with all their regulatory elements, can be included in such a construct. A system for linking overlapping BACs by homologous recombination in E. coli has been described for cases where a BAC containing a whole locus is not available, or where new regions need to be added upstream or downstream of a given gene within a BAC [23–25]. Recombination has also been used to modify BACs with different tags, which is very useful in studies of protein-protein or DNA-protein interactions, and of protein localization [26].

This paper deals with mammalian artificial chromosomes and minichromosome-like episomes. We focus on recent advances in the use of BACs containing genomic loci as a platform for the design of artificial minichromosome-like vectors for gene therapy applications, and on the “delivery vehicles” that can accommodate such large DNA molecules. We also analyse two examples of applications of these artificial minichromosome-like vectors for rare genetic diseases (haemophilia A and Friedreich’s ataxia).

2. Mammalian Artificial Chromosomes

An obvious vector to stably carry large genomic DNA fragments would be an artificial minichromosome. Such a molecule must have a replication origin and a centromere, enabling it to replicate during S phase and to segregate correctly during mitosis. In addition, if the DNA molecule is linear, telomeres must be present at both ends to guarantee its stability. The centromere is the most important chromosomal element since it is crucial for the attachment of chromosomes to the mitotic spindle and their correct segregation in mitosis. Centromeric DNA binds the CENP proteins that form the scaffold for the kinetochore, the structure connecting the centromere to the spindle microtubules. Artificial chromosomes have been built in yeast without any difficulty, since yeast chromosomes contain small centromeres. However, mammalian centromeres are very large and extraordinarily difficult to engineer, which has complicated the construction of mammalian artificial chromosomes (MACs) or human artificial chromosomes (HACs) [27].

2.1. Chromosomal Elements

All normal human centromeres contain large arrays of alphoid satellite DNA (α-DNA), a family of repetitive DNA sequences first discovered in the African green monkey [28]. Satellite DNA corresponds to only a small fraction of the human genome (less than 2%) whereas in other species, like the rat, it is much more abundant [29]. Human α-DNA consists of tandem repeats of a 171 bp monomer arranged in highly ordered repeats [30, 31]. Sequence divergence between individual monomers ranges from 20% to 40%, whereas it is around 5% in the higher order repeats [32]. Despite this monomeric variation, each human centromere is thought to be formed by arrays of a unique monomer spanning from several kilobases up to megabases [33, 34].

Several attempts to generate artificial chromosomes have used alphoid DNA arrays from different chromosomes (X, Y, 17, and 21), although not all have succeeded. Alphoid DNA from chromosome Y has failed to form active centromeres in some cases [35, 36], probably because of the lack of CENP-B boxes [37]. Masumoto and collaborators have also shown that two different alphoid DNA loci from the same chromosome can behave in completely opposite ways. They found that the alphoid-derived YACs unable to form HACs lack CENP-B boxes and also contain more diverged alphoid repeats [38].

In terms of specific DNA sequences, alphoid DNA may not be the only element needed for centromeres to fulfil their function, implicating additional mechanisms [39]. This view is supported by the lack of a consensus sequence for α-DNA [32], the existence of dicentric chromosomes with two identical regions of α-DNA but only one active centromere [40–42], and the fact that functional neocentromeres can be formed from noncentromeric DNA [43]. However, it is clear that no DNA other than alphoid DNA can efficiently form a centromere after transfection into human cells. As already mentioned, not all alphoid DNAs can form active centromeres with the same efficiency, and it appears that functional centromeres require a minimum size, estimated to be around 100 kb [44]. Lo and collaborators analysed all chromosomes in order to determine the minimum length of alphoid DNA that is tolerated by chromosomes and still able to seed de novo centromere formation [45]. In general, chromosomes cannot bear a substantial reduction in the length of the alphoid DNA array, with the exception of chromosome 21, whose alphoid DNA could range between 50 and 100 kb, with an average size of 78 kb.

Despite all this evidence, there are several reports where introducing different arrays of alphoid DNA does indeed lead to centromere formation. Several studies have shown how when large ( 50 kb) arrays of alphoid DNA are transfected into a human cell line (HT1080) in culture, de novo chromosomes may form with a functional centromere [35, 46, 47]. Nevertheless, much work is still needed in order to elucidate the mechanisms underlying de novo centromere formation.

2.2. MAC Formation

There are two different strategies to form “de novo” HACs. The “bottom-up” approach consists of assembling the elements needed for HAC formation in a vector. With this aim, a universal system has been designed to introduce alphoid DNA into any BAC using in vivo recombineering [48]; it contains a 70 kb array of alphoid DNA proven to be capable of forming de novo MACs [38, 46]. In the second strategy, the “top-down” approach, an existing chromosome is manipulated in order to reduce its size, eliminating unneeded elements except for the centromere region [49].

In the first studies carried out on the formation of centromeres and de novo human artificial chromosomes (HACs) [35], synthetic alpha satellite arrays were cotransfected with telomeric DNA and other genomic sequences to obtain linear minichromosomes of 6 to 10 Mb in size. Because of the sequence divergence between alphoid DNAs already mentioned, arrays from two different chromosomes were used to obtain four minichromosome-containing cell lines, two with alphoid DNA from the Y chromosome and the other two with alphoid DNA from chromosome 17. However, three of these minichromosomes had acquired DNA from other centromeres or neocentromeres, and only one of them contained alphoid DNA exclusively from the transfected α-DNA. The HACs were cytogenetically stable in both the presence and the absence of selection, and their copy number and size did not change over time.

Only one year later, YAC-based mammalian artificial chromosomes (MACs) were generated [47]. Yeast artificial chromosomes had already been widely used to introduce mammalian genes into mice, as they can harbour large DNA fragments [15, 50, 51]. In these studies, a recombination-deficient yeast host was used in which alphoid arrays from chromosome 21 (α21-I and α21-II), telomere sequences and selectable markers were introduced into the YAC, and the modified YAC was transfected into HT1080 cells. The proportion of cell lines containing minichromosomes was variable (from 18% to 68%), although the number of MACs per cell line was most frequently one. The MACs obtained segregated accurately, they bound centromeric proteins, and they were about 1–5 Mb in size. In addition, they did not gain host DNA and presented a loss rate of less than 1% per generation in the absence of selection.
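
To put a per-generation loss rate into perspective, the following minimal sketch computes the fraction of cells expected to retain a MAC after a given number of generations without selection. It assumes a constant, independent probability of loss per generation and ignores any growth advantage or disadvantage of MAC-containing cells; this simple model and the function name `retained_fraction` are illustrative assumptions, not taken from the cited studies.

```python
def retained_fraction(loss_rate_per_generation: float, generations: int) -> float:
    """Fraction of cells still carrying the MAC after `generations` divisions.

    Illustrative model only: a constant, independent loss probability r per
    generation gives retained = (1 - r) ** n.
    """
    if not 0.0 <= loss_rate_per_generation <= 1.0:
        raise ValueError("loss rate must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - loss_rate_per_generation) ** generations


if __name__ == "__main__":
    # 1% per generation is the upper bound quoted in the text for these MACs.
    for n in (10, 50, 100):
        print(n, round(retained_fraction(0.01, n), 3))
```

Under this model, even a loss rate at the 1% upper bound would leave roughly a third of the population carrying the MAC after 100 generations, which is one way to read "stable in the absence of selection".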

A further advance came when it was demonstrated that no telomere sequences are needed when the input DNA is circular [46]. Accordingly, it was established that MACs can be formed in HT1080 cells from either linear or circular input DNA, with circular alphoid constructs being very effective at forming MACs, whether or not human telomere sequences are added. Indeed, the presence of telomere arrays did not favour MAC formation. By contrast, linear DNA molecules were found to require telomere sequences to maintain and protect minichromosome ends.

When considering the de novo formation of HACs, an important drawback is that these structures are not stable in murine cell lines [52], making it difficult to perform studies in mice as a model system. When a previously established HAC was transferred into several murine cell lines (RAG, STO and LA-9), the HAC DNA rearranged and differed from the original HAC. However, murine DNA was not acquired and no integration events were observed. When selection was withdrawn, the artificial chromosomes were rapidly lost at a rate of 2%–5% per day, depending on the cell line. These results show that centromere activity is retained to some extent, as the HACs were still able to bind a marker of active centromeres, yet, as pointed out above, they also suggest that something else is needed to ensure correct segregation.

2.3. Gene Expression from Artificial Chromosomes

The next advance in the development of human artificial chromosomes was probably the demonstration that any gene cloned into those constructs could be expressed normally, as addressed in several reports. There are two studies where the entire human HPRT gene was expressed from such mammalian artificial chromosomes (MACs) [53, 54], using slightly different strategies but with the same results. A PAC containing about 140 kb of genomic DNA, including the human HPRT gene, and a second PAC including a 70 kb array of α-DNA derived from chromosome 21 (α21-I) [46] were transfected into HPRT-deficient HT1080 cells [53]. Two of the three MAC-containing cell lines generated were mitotically stable, and HPRT expression was maintained over long periods of time, even in the absence of selection. In a second study, a 404 kb HAC made by in vivo recombination and introduced into HPRT-deficient HT1080 cells successfully ensured expression of the gene after two months without any selection [55].

More recently, a chromosome 21-derived minichromosome made in DT-40 chicken cells using the top-down approach [56] was introduced into human HT1080 cells by microcell-mediated chromosome transfer to check its stability [57]. Subsequently, the erythropoietin gene was added by Cre-loxP recombination, and transgene expression and vector stability were finally tested in HFL-1 cells. The vector was mainly maintained as a single copy per cell with no translocation and/or insertion events, and epo expression was sustained over 12 weeks without any loss in the absence of selection.

Using the same strategy, the dystrophin gene (about 2.4 Mb in length) has been cloned into a HAC [58], a construct that enabled all the tissue-specific isoforms of the human dystrophin gene to be detected, overcoming the problem shared by other vector systems that only produce one of the isoforms. One year later, the same group used the dystrophin HAC to correct patient-derived fibroblasts and then convert the corrected cells into human iPS (induced pluripotent stem) cells [59]. This is a great achievement, as autologous corrected cells would probably be the best candidate cells for use in gene therapy protocols.

2.4. Conclusions

It seems that the size and/or structure of the HAC matters since artificial chromosomes have a higher rate of missegregation than normal chromosomes [60, 61]. It is also notable that different HACs behave in a different way in different cell lines [62], suggesting that centromere activity could probably be maintained by trans-acting factors with different specificities between different cell lines, even within the same species.

A better understanding of centromere function seems crucial in order to be able to generate “synthetic” centromeres, which should improve the development of MACs. Thus, despite some of the advances and successes, the engineering of HACs/MACs to carry genes of interest is evidently quite difficult, which has restricted their use as vectors for gene therapy.

3. Stable Replicating Episomes

In view of the difficulties associated with MACs, other alternative vectors have been developed to carry large genomic constructs. Among them, stable non-integrating and replicating episomes have emerged as very promising gene therapy vectors for dividing cells. Interestingly, these stable episomes do not become integrated within host cell chromosomes and are therefore not associated with any risk of insertional mutagenesis.

3.1. OriP/EBNA-1 Vectors: Episomal Maintenance

Perhaps the best characterized artificial minichromosome-like episomes of this type are the so called OriP/EBNA1 vectors. These vectors are bacterial plasmids or bacterial artificial chromosomes in which episomal replication and segregation are achieved through DNA sequences derived from the Epstein-Barr virus (EBV). This herpes virus has a genome consisting of dsDNA of approximately 172 kb and it is characterized by a latent phase during which it is maintained for life asymptomatically in lymphocytes of about 90% of the human population. In the latent stage, the viral genome is maintained as a circular extrachromosomal replicating episome within the nucleus of infected cells [63].

The only viral elements needed to confer replication and segregation on this episome are the latent origin of replication, oriP, and the viral gene encoding the EBNA1 protein [64–66]. The origin of plasmid replication (oriP) from Epstein-Barr virus is a cis-acting element which has been proven to confer replication autonomy and maintenance to recombinant plasmids in cells harbouring latent EBV. OriP is formed by two elements about 1 kb apart: the family of repeats (FR), a 20-member family of 30 bp direct repeats, and the dyad symmetry (DS) element, a 65 bp DS containing four copies of the repeat [67, 68]. OriP resides within a 1.8 kb segment in the short unique region of the EBV genome, and it appears to be the only element needed in cis [65, 68]. The trans-acting gene allowing oriP function lies in a 2.6 kb region encoding the Epstein-Barr Nuclear Antigen 1 (EBNA-1) [66]. The EBNA-1 protein can bind to both the FR and DS sequences [69], whereby the DS element is the initiation site of episomal DNA replication whereas FR acts as a replication fork barrier and termination site [70]. EBNA-1 also activates the 30 bp repeats of oriP in trans, the latter acting as a transcriptional enhancer for genes linked to the repeat [71].

Plasmids containing these two elements (oriP and EBNA-1) are replicated once per cell cycle and they segregate passively by attachment to the mitotic chromosomes [72, 73]. Such plasmids are maintained in human cell lines in tissue culture [63, 66], as well as in mouse cell lines if the plasmid contains large fragments of genomic DNA that permit replication [74]. Although standard EBV-derived vectors usually carry both oriP and EBNA-1, it is well established that vectors containing only the oriP FR and EBNA-1 as well as inserts of genomic DNA greater than 100 kb can be replicated and maintained as episomes, and can thus be used for studies in human and rodent cells [63]. As the segregation of episomes is passive, occurring by binding to endogenous chromosomes rather than by direct attachment to the microtubules of the mitotic spindle, selection is sometimes required to maintain a population of cells that carry the episomes (especially when the copy number of episomes per cell is low). The typical structure of OriP/EBNA1 vectors and their localization in interphase and mitotic cells are shown in Figure 1.

The replication and segregation of OriP/EBNA1 vectors depends only on the presence of EBNA-1 protein. As mentioned above, the binding of EBNA-1 to the DS element of OriP destabilizes nucleosomes and aids the recruitment of cellular replication factors to OriP. The segregation function of EBNA-1 depends on its ability to bind to both the FR element of OriP and to the periphery of chromosomes. It is thought that EBNA-1 interacts with chromosomes in two different ways. During interphase (and probably also in mitosis), EBNA-1 associates with chromatin through its binding to either cellular DNA or chromatin-associated proteins. During metaphase/anaphase, there is additional binding of EBNA-1 to a cellular protein referred to as EBP2 (EBNA1-binding protein 2), which strengthens the association of EBNA-1 with mitotic chromosomes [75].

3.2. Applications

OriP/EBNA-1-based vectors have been widely used for a variety of purposes. One of the strategies for cloning a eukaryotic gene relies on the ability to select for the functional expression of the gene of interest in mammalian cells [76]. A subcloning vector has been designed containing oriP/EBNA-1 that enables the expression of very rare clones to be selected directly [77]. These clones were stably maintained as episomes, 2 to 10 copies per cell, as long as selection was applied.

Other studies using oriP/EBNA-1 vectors have expressed cDNA from heterologous promoters, all producing high levels of expression for several weeks or months, and all showing stable retention of the episomes in the cells for long periods of time [78–81]. Such vectors have even been used to produce large amounts of recombinant proteins [82].

There are also several studies showing that these two elements are sufficient for large plasmids to be maintained for long periods in cells, expressing the gene they contain in a physiological fashion. For example, two YACs of 90 and 660 kb were created that contained oriP, and when they were introduced into human cells expressing EBNA-1, these constructs were maintained as stable episomes for more than 8 months in the presence of selection, and for up to 5.5 months without selection [83]. Three cell lines were produced with the 90 kb YAC, all of which contained unrearranged episomes. However, of the three cell lines containing the 660 kb YAC, only two presented the intact form of the 660 kb molecule. FISH analysis also showed that episomes, some visible as pairs, associated with the host cell chromosomes. This association may explain why they are so efficiently maintained, even in the absence of selection, indicating that stability is achieved by attachment to host chromosomes as proposed for the EBV genome [84]. An intact CFTR gene has also been expressed from an oriP/EBNA1 episome in mouse cells [85]. By introducing circularized YACs into CMT-93 and LA-9 cells by fusion with yeast spheroplasts, nonrearranged YACs were obtained and maintained as episomes of either 320 or 640 kb. The copy number varied from 2 to 56 in the different cell lines obtained and the rate of loss varied from 0.4% to 5% in the absence of selection, very similar to the data from the 293-cell lines expressing oriP-YACs [83]. The level of expression of the exogenous CFTR gene was dependent on its copy number, and each copy of the YAC produced about 20% of the level of each endogenous gene. A further advance was made by changing from constructs based on YACs to others based on BACs, as the latter are easier to use and manipulate, allowing purification of much larger quantities of DNA.
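
The copy-number dependence reported for the CFTR episomes lends itself to a back-of-the-envelope calculation. The sketch below assumes that expression scales linearly with episome copy number and that "each endogenous gene" means a single endogenous allele; both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
import math

# ~20% of one endogenous allele per episome copy, as quoted in the text.
PER_COPY_FRACTION = 0.20


def allele_equivalents(copy_number: int, per_copy: float = PER_COPY_FRACTION) -> float:
    """Expected expression in units of one endogenous allele (linear scaling assumed)."""
    if copy_number < 0:
        raise ValueError("copy number must be non-negative")
    return copy_number * per_copy


def copies_needed(target_allele_equivalents: float,
                  per_copy: float = PER_COPY_FRACTION) -> int:
    """Smallest whole copy number whose expected expression reaches the target."""
    if target_allele_equivalents < 0:
        raise ValueError("target must be non-negative")
    # Small tolerance so that exact multiples are not rounded up by float error.
    return math.ceil(target_allele_equivalents / per_copy - 1e-9)


if __name__ == "__main__":
    for copies in (2, 10, 56):  # 2 and 56 are the extremes reported in the text
        print(copies, round(allele_equivalents(copies), 2))
    print(copies_needed(2.0))  # copies needed to match two endogenous alleles
```

On these assumptions, about ten episome copies would be needed to match the output of two endogenous alleles, which illustrates why copy number matters when per-copy expression is below endogenous levels.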

In order to modify BACs for their use in gene therapy protocols, an efficient system for retrofitting BACs with oriP/EBNA1 and reporter genes has been described using loxP/Cre recombination [86, 87]. These vectors not only contain the elements needed for episomal maintenance, they also include suitable marker genes for selection and monitoring of the mammalian cells into which the constructs are introduced.

4. Delivery Methods

As mentioned above, the type of DNA molecule used for gene therapy is crucial, but the “DNA delivery vehicles” are also very important. Indeed, the transfer of large DNA molecules to mammalian cells is not an easy task. Medium-sized BACs can be delivered into mammalian cells using Lipofectamine 2000, and more than half of the transfected cells contain the intact BAC [87, 88]. However, when working with BACs, the purification of supercoiled DNA may be quite cumbersome and it becomes increasingly difficult with larger BACs. Different methods involving less manipulation and/or higher efficiency of delivery should be tested in order to optimize the whole system. To date, two methods appear to be quite promising for transferring BAC-based vectors to mammalian cells: bactofection and packaging within Herpes simplex virus 1 (HSV-1) particles.

4.1. Bactofection

Bactofection refers to the use of bacteria for the delivery of DNA molecules to mammalian cells. One advantage of this method is that it minimizes the need to manipulate the DNA molecules, which is especially useful in the case of very large molecules such as BACs. Indeed, different studies have shown that DNA transfer can occur from bacteria to mammalian cells [89–91]. Bacterial strains that are naturally invasive or engineered to be invasive, but that are attenuated to prevent pathogenesis, are particularly useful for bactofection. Accordingly, attenuated intracellular bacteria engineered to lyse after cell invasion have been shown to transfer functional genes to a very broad range of mammalian cells [92, 93]. The main bacterial species used as delivery vectors are facultative intracellular pathogens such as Salmonella sp. [94], Shigella sp. [95], Listeria sp. [96], or Yersinia sp. [97]. However, more recently, it was demonstrated that almost any species can be used provided it is appropriately engineered [98]. Anaerobic or facultative anaerobic bacteria are also being applied as anticancer agents to target solid tumours, taking advantage of their ability to grow in the hypoxic region of tumours [96, 98–103]. Bacterial delivery has also been assessed as a way to administer DNA vaccines using different intracellular pathogen strains [104, 105]. For certain purposes, particular bacterial toxins have been used as vectors to deliver DNA into mammalian cells [106].

A different approach for bacterial delivery relies on the use of a genetically modified E. coli strain that expresses two additional genes: inv from Yersinia pseudotuberculosis, and hly from Listeria monocytogenes [107, 108]. The inv product permits binding to the integrins expressed on the surface of mammalian cells, and the hly product permits escape from the phagolysosome into the cytosol of the mammalian cell. The E. coli strain is also auxotrophic for diaminopimelic acid (DAP-), which means that it is not able to form a new cell wall once inside the mammalian cell. For this reason it lyses and releases the DNA it contains. The Y. pseudotuberculosis invasin protein (inv) is encoded by a 3.8 kb gene expressing a 103 kDa protein localized to the outer membrane, part of a family of adhesins encoded by enteropathogenic bacteria [109]. All the members of this family have a region of similarity in the amino terminal 500 amino acids of the Y. pseudotuberculosis invasin (36% identity). This conserved domain is required for protein localization in the outer membrane and for export of the carboxyl termini of these proteins. Indeed, the carboxy-terminal 192 amino acids of invasin represent the most divergent region, and it is this region that binds to integrin receptors [110, 111]. Invasin is the only protein needed to invade mammalian cells and it can be cloned into non-invasive strains of bacteria, making them invasive [89, 112]. The second gene introduced in the modified E. coli strain is the hly locus from L. monocytogenes, encoding listeriolysin O. This protein is a member of the cytolysin family produced by various Gram-positive species [113]. This 58 kDa cholesterol-dependent pore-forming toxin [114] allows the bacteria to escape from phagolysosomes into the cytosol of the infected cell [115].
In order to promote lysis of the internalized bacteria, the strain used is unable to synthesise a new cell wall when it divides, since it is auxotrophic for diaminopimelic acid (dap), a substrate not present in mammalian cells. This ensures that the bacteria will not survive in the cells, and that the plasmid DNA will be released into the cytosol and eventually reach the nucleus.

Using this bacterial vehicle, functional DNA can be transferred into a variety of mammalian cell lines by simple coincubation with the engineered E. coli strain [107]. Different studies have shown that this strain is also able to transfer large DNA fragments [116–118], establishing the proof-of-principle for the use of this kind of delivery method as a true alternative to efficiently deliver intact BACs into recipient mammalian cells.

Currently, the use of bactofection seems to be limited to ex vivo gene transfer to cultured mammalian cells, although some in vivo applications to cells of the gastrointestinal tract are envisaged. However, there are very serious concerns about the biosafety of bacteria as gene delivery vehicles, owing to the risk of significant adverse immunological and inflammatory reactions, which will probably restrict their wider application in gene therapy.

4.2. Packaging into HSV-1 Virions

Vectors based on Herpes simplex virus 1 (HSV-1) represent one of the most promising alternatives for the delivery of large DNA molecules, including many artificial minichromosome-like episomes. HSV-1 has a genome of 152 kb, so viral vectors based on HSV-1 have a notably larger capacity to accommodate DNA than other conventional viral vectors used for gene therapy, such as retroviruses and adenoviruses [119].

HSV-1-derived amplicon vectors are plasmids (or BACs) bearing only two sequences of viral origin: an oriS to permit replication in packaging cells, and a pac sequence to permit packaging into HSV-1 viral particles. Hence, the remaining genomic capacity is available for exogenous DNA (up to approximately 150 kb), which may include entire genomic loci with all their regulatory elements to ensure persistent and physiologically regulated levels of expression [120–123]. This is probably the most promising characteristic of this type of vector as, to date, they are the only viral vectors that combine a high efficiency of gene transduction with the capacity to carry DNA fragments large enough to include a whole genomic locus. Other relevant aspects of HSV-1 amplicons as gene delivery vehicles include their ability to transduce both dividing and non-dividing cells, and their ability to persist as non-integrating episomes, which eliminates the risk of insertional mutagenesis.

As amplicon vectors only contain the oriS and the pac packaging signal as viral DNA sequences, generating them requires a helper virus encoding all the viral genes necessary for the assembly and packaging of DNA into viral particles. Although the helper virus is replication-defective, the resulting amplicon stocks are quite often contaminated with helper virus, which can trigger immune and inflammatory responses [124]. Recently, a new system has been developed in which all the genes needed for packaging are encoded in a large BAC that cannot itself be packaged into virions, and from which all the genes unnecessary for generating amplicon-containing particles have been eliminated [123, 125, 126].

HSV-1 amplicons can play an important role in gene therapy protocols for neurological disorders due to their notable ability to deliver genes into neurons, both in vitro and in vivo [127–131]. Although foreign genes driven by viral promoters are expressed for only a period of several days, genes under the control of a neuronal promoter can be stably expressed [132]. In this context, it is not surprising that expression from a genomic locus, where the gene is under the control of its own promoter, yields physiological levels of expression that are maintained stably [121]. As neurons are postmitotic cells, there would be no need to readminister the therapeutic gene if a gene therapy protocol used this kind of vector. However, for dividing cells some modifications would have to be made to the amplicon vectors in order for them to persist during cell divisions. One approach is to use hybrid amplicons containing elements from HSV-1 and elements from the Epstein-Barr virus that confer episomal retention [121, 122].

Figure 2 shows the structure of an HSV-1-derived amplicon vector encompassing an EBV-based minichromosome-like episome, and the procedure to package it into HSV-1 virions. Given the high capacity of these vectors, even artificial minichromosomes may be delivered to mammalian cells with an efficiency higher than that obtained by other methods [133]. Thus, HSV-1 HAC amplicons containing alphoid DNA for episomal maintenance and an HPRT minigene have efficiently transduced cultured cells, producing stable HPRT expression for over three months. Curiously, not all HACs behaved in the same way in the different cell lines tested, which was probably due to differences in the level of expression of some proteins implicated in cell cycle control [133].

Whereas HSV-1 virions are unable to package DNA molecules larger than 160 kb, viral vectors based on other herpes viruses may have a greater capacity to accommodate exogenous DNA. Thus, cytomegalovirus, which has a 250 kb genome, may permit larger DNA fragments to be used [134]. In this way, herpes virus-based vectors appear to be highly promising “delivery vehicles” of artificial minichromosome-like episomes for gene therapy applications.
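
The size constraints discussed in this section can be summarised in a minimal sketch. It simply compares the size of a construct with the approximate limits quoted above (160 kb for HSV-1 virions, and the 250 kb cytomegalovirus genome taken here as a purely hypothetical ceiling, since no packaging limit for cytomegalovirus-based vectors is given in the text). The function and constant names are illustrative assumptions, not part of any published tool.

```python
# Approximate figures quoted in the text, in kilobases (kb).
HSV1_MAX_PACKAGED_KB = 160  # largest DNA molecule an HSV-1 virion can package
CMV_GENOME_KB = 250         # cytomegalovirus genome size; ASSUMED ceiling, not a measured limit

# Hypothetical lookup table: vector name -> assumed maximum construct size (kb).
ASSUMED_LIMITS_KB = {
    "HSV-1": HSV1_MAX_PACKAGED_KB,
    "CMV (hypothetical)": CMV_GENOME_KB,
}


def candidate_vectors(construct_kb):
    """Return the vectors whose assumed size limit can hold the whole construct."""
    return [name for name, limit in ASSUMED_LIMITS_KB.items()
            if construct_kb <= limit]


if __name__ == "__main__":
    # Example construct sizes (kb), chosen only for illustration.
    for size in (135, 188, 370):
        print(size, "kb ->", candidate_vectors(size) or "no herpes virus vector")
```

Under these assumptions, a construct of 135 kb would fit within an HSV-1 virion, one of 188 kb would exceed the HSV-1 limit, and one of 370 kb would exceed both ceilings.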

4.3. Gene Therapy of Haemophilia A

Haemophilia A is an X-linked recessive bleeding disorder caused by mutations in the Factor VIII gene (FVIII), which encodes a clotting factor [135]. It affects about 1 in 5,000 male births in all populations and it is currently treated by infusion of plasma-derived or recombinant Factor VIII protein. The frequency of the disorder, combined with the high cost of recombinant Factor VIII, makes it a major burden on healthcare systems. Therefore, gene therapy for FVIII deficiency is clearly a very attractive option. There are different degrees of severity for this disease depending on the levels of the circulating factor (severe, less than 1%; moderate, between 2%–5%; and mild, between 6%–30%). Thus, even low levels of expression may have a large positive effect on the health of the individual. Interestingly, FVIII is normally expressed in the liver, an organ that is particularly easy to access for gene delivery. Much work has been carried out towards preparing gene therapies for both FVIII (haemophilia A) and Factor IX (FIX) deficiencies (haemophilia B). FIX is a much smaller protein encoded by a 1385 bp open reading frame that can be easily packaged as a minigene in retroviral, adenoviral and adeno-associated virus (AAV) vectors, driven by either a viral or a mammalian promoter [136–139]. Long-term delivery and physiological expression have been achieved for FIX in mouse models, although many problems remain and gene therapy suitable for human patients is still not available [135]. By contrast, the open reading frame of FVIII is 7055 bp long and it has been much harder to express in viral vectors [140]. Different strategies have been applied to shorten the FVIII cDNA, and hence make it easier to introduce into different vectors, including the use of a B-domain-deleted version of the FVIII cDNA [141, 142], as this domain does not appear to be essential for coagulation [14, 143–145].

In view of these data, it seems that the choice of the vector into which the FVIII gene is introduced is not a minor issue for haemophilia A gene therapy. First-generation adenovirus-based vectors still contained viral gene sequences that provoked immune reactions and, as a consequence, transgene expression was only short-lived because the vector was cleared [146]. Nevertheless, even with these vectors it may be possible to achieve curative levels of FVIII for several months in haemophilic mice [147, 148]. However, the results from larger animals were not so promising [149, 150] and, although some improvements in adenoviral vectors have been made, particularly regarding transgene expression, some toxicity still persists when these vectors are used [151, 152].

There is evidence that the B-domain-deleted FVIII cDNA can be expressed from retroviral vectors in different cell types, from human skin fibroblasts [153, 154] to bone marrow stromal cells [155]. In each case, functional FVIII is expressed but at low levels, due to the repressive sequences found in the gene [156–159]. However, this difficulty has been overcome by using slightly different retroviral vectors based on the MFG retroviral vector system [160]. Yet, in these systems, therapeutic levels of FVIII in circulation were maintained for just one week [161]. An important drawback of retroviral vectors is their inability to transduce nondividing cells, which is particularly important if gene therapy is directed to the liver. One way to bypass this limitation has been to use FVIII-expressing retroviral vectors to transduce neonatal mice, in which hepatocytes are undergoing rapid cell division, thereby obtaining complete correction of the disease [162]. A different approach is to employ lentivirus-based vectors, which can transduce both dividing and non-dividing cells [163, 164].

Due to the small packaging capacity of adeno-associated virus (AAV) vectors, they have rarely been used with the FVIII gene. Nonetheless, there are reports of therapeutic levels of FVIII expression for a limited period of time using this viral system [165, 166]. Perhaps due to the disappointing results obtained with other viral vectors, delivery systems based on AAV vectors have recently attracted significant attention [167].

Despite the limitations presented by viral delivery systems, three phase I trials have been launched to test the safety of the procedure and to check the levels of expression from these vectors. In an ex vivo gene therapy protocol, a B-domain-deleted FVIII cDNA expressed from a non-viral plasmid was electroporated into dermal fibroblasts and, after selection and expansion, the fibroblasts were reintroduced into the patients [168]. FVIII levels increased in four of the six subjects studied without producing adverse events. In the patient with the highest levels of FVIII expression, the therapeutic effects lasted for 10 months, although FVIII expression finally disappeared. In a second trial, a B-domain-deleted cDNA was expressed from a Moloney murine leukaemia virus-based (retroviral) vector [169], which only sporadically produced levels of FVIII expression above 1%. The authors concluded that efficient retroviral transduction would probably require higher doses and some degree of mitotic induction. In 2004, a third trial using an adenovirus vector encoding the complete FVIII cDNA was carried out in just a single patient. Despite sustained expression of FVIII above 1% of the normal levels for several months, adverse effects were observed, namely thrombocytopenia and an elevation of transaminases [170].

There is only one study in which Factor VIII has been expressed from a BAC containing the entire genomic locus [118]. As no single FVIII-containing BAC was available, a new one was constructed by homologous recombination [171] and then retrofitted with different elements in order to provide episomal maintenance [48, 88]. The different vectors obtained were introduced into hepatic and nonhepatic human cell lines to check whether the construct could drive expression of the transgene. Detectable FVIII levels were achieved with all of them and, in some cases, expression was even stronger than that of the endogenous gene. Although a demonstration of functional FVIII expression from the BAC awaits the use of cell lines or mice with null endogenous expression, this is the first characterized genomic clone carrying the intact locus which could potentially be useful for therapeutic applications [118].

4.4. Gene Therapy for Friedreich’s Ataxia

Friedreich’s ataxia (FA) is the most common form of autosomal recessive ataxia and it is caused by a decrease in the levels of frataxin, a mitochondrial protein encoded by the nuclear FRDA (FXN) gene [172]. FA is a predominantly neurodegenerative disease with an estimated prevalence of 1–2 in 50,000 individuals, and it is characterized by the progressive loss of large sensory neurons and degeneration of the spinocerebellar tracts. Other clinical symptoms may include hypertrophic cardiomyopathy and diabetes mellitus.

The most common cause of the disease is an expansion of the GAA triplet within the first intron of the FRDA gene, which has a dramatic effect in reducing mRNA levels and consequently frataxin protein levels [173]. Although most patients have both chromosomes affected by the GAA expansion, it is possible to find some cases with one expanded allele and a point mutation in the other [174].

Frataxin is an 18 kDa protein [175] implicated in functions such as mitochondrial iron homeostasis [176, 177], iron-sulfur cluster biosynthesis [178] and oxidative phosphorylation [179]. Frataxin is expressed in all cell types, although cells from the nervous system, heart and muscle present the highest levels of the protein [180].

Homozygous frda knockout is embryonic lethal in mice a few days after implantation, demonstrating the essential role of frataxin during early mammalian embryonic development [181]. Thus, it appears that the milder phenotype associated with the human disease is due to the residual frataxin expression observed in patients with the expansion mutations. These results may also explain why no patients with homozygous point mutations have been encountered.

It has been difficult to generate mouse models for FA because of the embryonic lethality associated with the null mutations. Neuron-specific and conditional knock-out models generally exhibit a wider and more prominent neurodegenerative phenotype than the human disease [182, 183]. Some knock-in mice bearing the human gene with an expansion mutation surprisingly failed to develop any clinical phenotype, despite the significant reduction in frataxin levels to 25%–36% of wild-type levels [184]. More recently, representative FA mouse models have been generated by crossbreeding lines of human FXN YAC transgenic mice that contain unstable GAA repeat expansions with heterozygous frda knock-out mice [185]. The resultant “transgenic-knockout” mice express comparatively little human-derived frataxin, and they exhibit a neurodegenerative and cardiac pathological phenotype similar to that of human patients, although significantly milder. Despite their limitations, these mice currently constitute the most useful model to study the physiopathology of FA and to test possible therapeutic approaches.

There is no effective cure for FA, although antioxidants may have a mildly positive effect on some clinical signs [186]. There has been great interest in the possibility of boosting frataxin gene expression as a therapeutic approach and indeed, erythropoietin [187] has been shown to increase frataxin protein levels in cultured cells from FA patients. In a pilot clinical trial, 12 patients were treated with recombinant human erythropoietin (rHuEPO) to establish whether it could produce an increase in frataxin levels [188]. After 8 weeks of treatment, only two patients experienced a net increase in frataxin levels in peripheral blood lymphocytes; these patients also showed a reduction in markers of oxidative stress. However, it is not yet clear whether or not erythropoietin raises frataxin levels within affected neurons. Since chromatin condensation around the GAA repeat is thought to be responsible, to some extent, for the low levels of frataxin mRNA expression in FA, histone deacetylase (HDAC) inhibitors have also been investigated, with similarly encouraging results [189, 190].

FA is a good candidate for treatment with gene therapy as it is a monogenic disease, and an increase in frataxin levels could substantially improve the symptoms, given that healthy carriers may have around 40%–50% of normal frataxin levels [191].

A human frataxin cDNA has been delivered into FA patient fibroblasts using lentiviral or adeno-associated viral (AAV) vectors, which resulted in a partial correction of their sensitivity to oxidative stress [192]. However, the nonphysiological overexpression of frataxin driven by these vectors has been shown to be cytopathic.

Interestingly, the expression of frataxin cDNA to “physiological” levels driven by an HSV-1 amplicon vector can rescue the neurodegeneration triggered by frataxin deficiency both in cultured neurons and in vivo [193]. These results constitute the first “proof of principle” that neurological function can be recovered through a gene therapy approach aimed at correcting frataxin deficiency.

As mentioned above, it seems that optimal frataxin levels are required for correct cell function, since both deficiency and massive overexpression are associated with cell pathology. Thus, physiologically regulated expression of the frataxin gene seems to be important for gene therapy to succeed in FA. An interesting possibility is the use of the entire genomic locus of frataxin, since a large genomic fragment is likely to include all the regulatory elements required for proper gene expression.

Previous reports of transgenic mice indeed support the use of the frataxin genomic locus to correct frataxin deficiency in vivo. Thus, the entire FXN gene within a human YAC clone of 370 kb was reported to rescue the embryonic lethality of frataxin knock-out mice [19]. Likewise, a human BAC clone of 188 kb containing the entire genomic FXN locus has also been demonstrated to rescue the lethal phenotype of frataxin deficiency [16]. This indicates that the FRDA-BAC contains all the regulatory elements needed for the FRDA gene to be expressed in a physiological fashion.

A slightly smaller FRDA-BAC of 135 kb containing the entire 80 kb FXN genomic locus has been used to generate an HSV-1 amplicon (iBAC-FRDA) in order to test its ability to correct the FA phenotype [194]. This vector is capable of restoring physiological levels of frataxin in fibroblasts from FA patients and hence, of almost completely rescuing the cell phenotype of susceptibility to oxidative stress. Another key advantage of this kind of vector is its ability to infect cells that are hard to transfect with other techniques [195]. Now that FRDA expression from the iBAC has been demonstrated, the iBAC might be used for in vivo delivery of the FRDA gene to FA-affected regions, since delivery of HSV-1 amplicons to specific brain regions has already been shown [127]. Indeed, we have results indicating that HSV-1 amplicons containing the entire frataxin genomic locus produce persistent gene expression in the nervous system in vivo (Corona et al., manuscript in preparation). All these results support the promise of these vectors for gene therapy of FA.

In this regard, it is particularly interesting that herpes viral vectors are now in clinical trials for gene therapy of chronic pain. The results of these trials will provide the first clues about the safety and efficacy of herpes viral vectors as “gene-delivery vehicles” in the human nervous system [196].

5. Conclusions

Artificial chromosomes and minichromosome-like episomes based on BACs containing entire genomic loci are very promising tools for gene therapy of inherited diseases caused by recessive mutations, such as haemophilia or Friedreich’s ataxia. The development of vehicles capable of accommodating whole genes emerges as a priority, given the importance of correct and accurate expression of most genes. Even if this is the key question, we should not forget that the way these molecules are delivered into target cells is equally important. Improved delivery methods should therefore also be crucial for the success of gene therapy using these large DNA molecules. In this respect, amplicon vectors based on herpes viruses are currently a very useful tool for BACs of average size, but much work is still needed to increase their DNA capacity or to find alternative methods of delivery.

Acknowledgments

The work on gene therapy in the authors’ laboratory is supported by grants from the “Plan Nacional de Investigación en Biomedicina” (SAF 2006-12782-C03-02 and SAF2009-10757), the “Comunidad Autónoma de Madrid” (Neurodegmodels, Ref S-SAL-0202-2006) and the “Fundación Alicia Koplowitz”. The Biomedical Network Research Centre for Rare Diseases “Centro de Investigación Biomédica en Red sobre Enfermedades Raras” (CIBERER) is supported by the “Instituto de Salud Carlos III” (ISC III). The authors also wish to thank Dr. Mark Sefton for revising the manuscript.