Abstract

The effort to develop a tuberculosis (TB) vaccine more effective than the widely used Bacille Calmette-Guérin (BCG) has led to the development of two novel fusion protein subunit vaccines: Ag85B-ESAT-6 and Ag85B-TB10.4. Studies of these vaccines in animal models have revealed their ability to generate protective immune responses. Yet, previous work on another TB fusion subunit vaccine candidate, Mtb72f, has suggested that genetic diversity among M. tuberculosis strains may compromise vaccine efficacy. In this study, we sequenced the esxA, esxH, and fbpB genes of M. tuberculosis, which encode the ESAT-6, TB10.4, and Ag85B proteins, respectively, in a sample of 88 clinical isolates representing 57 strains from Arkansas, USA, and 31 strains from Turkey, to assess the genetic diversity of the two vaccine candidates. We found no DNA polymorphism in the esxA and esxH genes in the study sample and only one synonymous single nucleotide change (C to A) in the fbpB gene among 39 (44.3%) of the 88 strains sequenced. These data suggest that it is unlikely that the efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccines will be affected by the genetic diversity of the M. tuberculosis population. Future studies should include a broader pool of M. tuberculosis strains to validate the current conclusion.

1. Introduction

The need for an improved vaccine against tuberculosis (TB) has never been more urgent. One in three people today are infected with Mycobacterium tuberculosis, the causative agent of TB, and worldwide approximately three million people die from TB annually. The currently available TB vaccine, Bacille Calmette-Guérin (BCG), has failed to consistently protect against the most contagious form of the disease, adult pulmonary TB, despite its widespread use [14]. Developing a new vaccine, which may serve as a booster or a replacement for BCG, is of critical importance in the fight against worldwide TB-related morbidity and mortality [4, 5].

Of the various vaccine candidates proposed, fusion subunit vaccines have received considerable attention in the recent literature, especially those composed of the antigenic proteins ESAT-6, Ag85B, and TB10.4 [38]. It appears that the multiple epitopes that fusion subunit vaccines offer make them more effective than single-peptide vaccines in interacting with the complexity of the host immune response against TB and with the genetic restriction imposed by major histocompatibility complex molecules [3, 9]. Two fusion subunit vaccines, Ag85B-ESAT-6 and Ag85B-TB10.4, which are the focus of the present study, have been found to induce protective cell-mediated immunity in animal models [3, 6, 7, 9, 10]. Ag85B-ESAT-6 is currently in expanded Phase I studies in which the vaccine is tested in BCG-vaccinated individuals, latently infected individuals, and individuals from TB-endemic regions [4]. As these candidates move forward in or toward clinical trials, it will be critically important to evaluate their protective potential as global vaccines via bioinformatic approaches built upon the comparative genomics of the pathogen population and the immunomics of the host population.

Bioinformatics approaches are invaluable to the development of effective vaccine candidates. Comparative genomics of the pathogen population, for instance, allows vaccine candidates that are potentially ineffective due to genetic diversity of the pathogen population to be discredited before they reach the costly stages of clinical trials. In other words, bioinformatic approaches can provide information on which a rational selection of clinical trial sites can be based. As for subunit vaccines, comparative genomics can help determine whether antigenic targets are conserved among infectious strains of an organism, in order to ensure their protective efficacy across the diverse pathogen populations circulating in different geographic regions [8].

Although previous studies have suggested that M. tuberculosis has a relatively stable genome in comparison with other bacteria [11, 12], recent genomic studies have revealed biologically significant variation among clinical strains [13]. Hebert and colleagues, for instance, revealed considerable genetic variation in the PPE18 gene of M. tuberculosis, with important implications for the ability of the Mtb72f vaccine candidate to provoke protective immunity against diverse populations of M. tuberculosis [14]. Furthermore, the interaction of the genetic variation of the PPE18 component of Mtb72f with the allelic variation of human MHC-II DRB1 proteins negatively affects vaccine epitope binding to DRB1 proteins [15]. Taken in the context of vaccine development, revelations like these are crucial to the survival of vaccine candidates as potential clinical vaccines. A similar comparative genomics study on Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines may provide useful information for predicting the protective efficacy of these candidates in the pre- or early stages of their clinical evaluation.

Little information has been documented on the genetic variation of the genes encoding the ESAT-6, Ag85B, and TB10.4 proteins. If any of these three genes is highly variable, the protective efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines might be compromised on the global stage. To further assess the ability of these two vaccine candidates to cover naturally occurring M. tuberculosis strains, we investigated the genetic diversity of the esxA, esxH, and fbpB genes of M. tuberculosis, which encode the components of the two new subunit vaccines, in a sample of 88 M. tuberculosis strains collected from Turkey and Arkansas, USA.

2. Materials and Methods

2.1. M. tuberculosis Isolates

The clinical strains used in the present study are from Arkansas, USA, and Turkey. Following the work of Hebert and colleagues, the isolates were selected to represent different geographical regions, namely Arkansas and Malatya, Turkey, to assess the impact of regional genetic variability on the two subunit vaccine candidates [14]. Each of the selected isolates represents a different strain of M. tuberculosis, defined either by a distinct IS6110 restriction fragment length polymorphism (RFLP) pattern with more than five bands or by a distinct combination of a common IS6110 RFLP pattern with five or fewer bands and a unique spoligotyping pattern (Table 1). The rationale for including isolates from two geographical regions was to discern the potential impact of genetic variation on future vaccination with Ag85B-ESAT-6 and Ag85B-TB10.4 in separate populations.
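The strain definition above amounts to a simple grouping rule, which can be sketched in Python as follows. This is an illustration only: the record fields (`rflp_pattern`, `band_count`, `spoligotype`), the helper names, and the four example isolates are hypothetical and are not taken from the study data or from Table 1.

```python
from typing import NamedTuple


class Isolate(NamedTuple):
    isolate_id: str
    rflp_pattern: str  # hypothetical label for an IS6110 RFLP pattern
    band_count: int    # number of IS6110 bands in that pattern
    spoligotype: str   # hypothetical spoligotype label


def strain_key(iso: Isolate) -> tuple:
    """Return the key that distinguishes strains under the rule in the text."""
    if iso.band_count > 5:
        # More than five bands: the RFLP pattern alone defines the strain.
        return (iso.rflp_pattern,)
    # Five or fewer bands: the spoligotype is needed as a second marker.
    return (iso.rflp_pattern, iso.spoligotype)


def one_isolate_per_strain(isolates):
    """Keep the first isolate encountered for each distinct strain key."""
    chosen = {}
    for iso in isolates:
        chosen.setdefault(strain_key(iso), iso)
    return list(chosen.values())


# Hypothetical isolates, for illustration only.
EXAMPLE = [
    Isolate("A1", "P12", 12, "S1"),
    Isolate("A2", "P12", 12, "S2"),  # same high-copy pattern as A1: same strain
    Isolate("A3", "P03", 3, "S1"),
    Isolate("A4", "P03", 3, "S2"),   # same low-copy pattern, other spoligotype
]
SELECTED = one_isolate_per_strain(EXAMPLE)
print([iso.isolate_id for iso in SELECTED])
```

In this toy input, A2 is dropped because it shares a high-copy RFLP pattern with A1, whereas A3 and A4 are both retained because their shared low-copy pattern is resolved by different spoligotypes.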

Our initial intent was to analyze the same set of clinical isolates (n = 225) used by Hebert and colleagues in their work on PPE18 and pepA [14]. However, after initial sequencing of a randomly selected subset of the 225 isolates revealed no genetic variation in the genes encoding ESAT-6, TB10.4, and Ag85B, we selected only 47 of the remaining 84 isolates that had previously shown variation in the PPE18 gene for the current study. This decision was made to keep laboratory procedures cost-effective, on the reasoning that local genetic variation would be indicative of broader genomic variability. The 88 study isolates represent 57 different strains from Arkansas, USA, and 31 distinct strains from Turkey. Together, they comprised 49 different spoligotypes representing 47 different spoligo international types, as determined by using the query tool of the fourth international spoligotyping database (SpolDB4) [16]. The present sample includes 80 of the 84 strains (95.2%) that showed PPE18 variation in Hebert's study.

2.2. PCR of esxA, esxH, and fbpB Genes

The three genes under study were amplified using the Invitrogen Platinum Taq PCRx Polymerase kit (Invitrogen, Carlsbad, CA) with primers published previously [17]. The primers used for esxA, encoding the ESAT-6 protein, were esxA-F (5′-GCAATCCGGCGGCTCCACCAG-3′), located 533 bp upstream of the esxA gene, and esxA-R (5′-TCGGCCGCCATGACAACCTCTC-3′), located 124 bp downstream from the end of the esxA gene. The primers used for esxH, encoding the TB10.4 protein, were esxH-F (5′-GAGAGGGGGAGGCGACGGCTTACC-3′), located 411 bp upstream of the esxH gene, and esxH-R (5′-TCCCCGCCCCAATGGTTTCAGC-3′), located 86 bp downstream from the end of the esxH gene. Finally, the primers used for fbpB, encoding the Ag85B protein, were fbpB-F2 (5′-ACTCGGCTAACTGGCTGGTGC-3′), located 217 bp upstream of the fbpB gene, and fbpB-R (5′-CATACCGCCATACCGTTTGTGAGC-3′), located 164 bp downstream from the end of the fbpB gene. The inclusion of the regions flanking the esxA, esxH, and fbpB genes allowed further confirmation that the PCR products were specific. The positive control in all the PCR reactions was M. tuberculosis H37Rv, and the negative control was PCR-grade water. The 50 μL PCR reaction mixture used was composed of 5 μL of amplification buffer, 1.5 μL of 50 mM , 1 μL of 10 mM deoxyribonucleoside triphosphate mixture, 20 pmol of each primer in 1 μL, 0.5 μL of Invitrogen Platinum Taq polymerase mixture, 4 μL of a DNA solution containing 50 ng of DNA template, and 40 μL of PCR-grade water. The thermocycling program used for all genes was one cycle at for 1 minute; 30 cycles of for 30 seconds, for 30 seconds, and for 2.5 minutes; and a final cycle of for 10 minutes. The sizes of the PCR products were verified by 1.0% (wt/vol) agarose gel electrophoresis in Tris-borate-EDTA buffer.
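Readers who wish to check the primers listed above in silico may find the following minimal Python sketch useful. It only computes the length, GC fraction, and reverse complement of each primer sequence as printed in the text; the helper names are our own and the sketch is not part of the laboratory workflow used in the study.

```python
# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "esxA-F": "GCAATCCGGCGGCTCCACCAG",
    "esxA-R": "TCGGCCGCCATGACAACCTCTC",
    "esxH-F": "GAGAGGGGGAGGCGACGGCTTACC",
    "esxH-R": "TCCCCGCCCCAATGGTTTCAGC",
    "fbpB-F2": "ACTCGGCTAACTGGCTGGTGC",
    "fbpB-R": "CATACCGCCATACCGTTTGTGAGC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name:8s} length={len(seq):2d} "
          f"GC={gc_fraction(seq):.2f} revcomp={reverse_complement(seq)}")
```

The reverse complement of each reverse primer is the sequence one would search for on the forward strand of the H37Rv reference when locating the expected amplicon.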

2.3. Automated DNA Sequencing

PCR products were sequenced to identify any insertions/deletions or single nucleotide polymorphisms (SNPs) in the esxA, esxH, and fbpB genes in the selected isolates. PCR products were purified using the Invitrogen PureLink PCR Purification Kit according to the manufacturer's instructions (Invitrogen, Carlsbad, CA). DNA sequencing was performed with Applied Biosystems DNA sequencers 3700 and 3730 at the University of Michigan Sequencing Core, using the same primers that were used for the PCR of the three genes. The sequences of esxA, esxH, and fbpB of the study strains were compared to those of the M. tuberculosis laboratory reference strain H37Rv (GenBank accession number BX842575) using the BLAST and BLAST2 nucleotide sequence alignment programs of the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/BLAST) and the Sequencher 4.9 DNA Sequence Assembly software (Demo version) of Gene Codes Corporation (www.genecodes.com). DNA sequences were translated to amino acid sequences using the in silico simulation of molecular biology experiments tool of the University of the Basque Country (http://insilico.ehu.es/).
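The core of the comparison against the reference can be illustrated with a short Python sketch. It is a deliberately simplified stand-in for the BLAST and Sequencher workflow described above: it assumes the two sequences are already aligned and of equal length, so it reports substitutions only and cannot detect insertions or deletions. The function name and the toy sequences are hypothetical and are not H37Rv or study sequences.

```python
def find_substitutions(reference: str, query: str):
    """List (1-based position, reference base, query base) for each mismatch.

    Assumes the sequences are pre-aligned and of equal length, so only
    substitutions are reported; indels require a real alignment tool.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (index + 1, ref_base, query_base)
        for index, (ref_base, query_base)
        in enumerate(zip(reference.upper(), query.upper()))
        if ref_base != query_base
    ]


# Toy sequences for illustration only.
TOY_REFERENCE = "ATGGCCGGCTAA"
TOY_ISOLATE = "ATGGCAGGCTAA"

SUBSTITUTIONS = find_substitutions(TOY_REFERENCE, TOY_ISOLATE)
print(SUBSTITUTIONS)
```

For the toy pair above, the single difference is a C to A change at position 6.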

3. Results and Discussion

3.1. Genetic Diversity of esxA, esxH, and fbpB and Corresponding Amino Acid Sequences

Among the 88 strains investigated, sequence analysis revealed no nucleotide polymorphisms in esxA and esxH, the genes encoding the ESAT-6 and TB10.4 proteins. Of the 88 strains, 38 (43.2%) belong to principal genetic group 1, 29 (33.0%) belong to principal genetic group 2, and 21 (23.9%) belong to principal genetic group 3. The principal genetic groups were defined by SNPs in the katG and gyrA genes as described previously by Sreevatsan and colleagues [12]. Unlike the study by Hebert et al., in which genetic group 1 strains were found to have the highest frequency of DNA polymorphisms in the gene encoding the PPE18 protein, a component of the Mtb72f vaccine [14], strains in all three principal genetic groups showed no DNA variation in either esxA or esxH. This observation suggests that these gene regions might be conserved among M. tuberculosis strains of different geographic origins and among different genetic groups of the pathogen. In fact, Gey Van Pittius and colleagues have posited interspecies conservation of the ESAT-6 gene region as part of a novel Gram-positive secretion system with distant homologues in Bacillus subtilis, Bacillus anthracis, Staphylococcus aureus, and Clostridium acetobutylicum [18].

The analysis of fbpB, the gene encoding Ag85B, revealed only one synonymous C to A SNP, located at nucleotide position 714 of the gene sequence, among 39 (44.3%) of the 88 strains sequenced. Double-strand sequencing was conducted on the 39 isolates to confirm the existence of this SNP. Although this SNP had no effect on the amino acid sequence of the translated peptide, it indicates allelic variation in the M. tuberculosis gene pool. Of the 39 strains, 15 (17.0% of the 88 strains) belong to principal genetic group 1, 11 (12.5%) belong to principal genetic group 2, and 13 (14.8%) belong to principal genetic group 3. Furthermore, the SNP was not found to be associated with any specific geographic origin of the study strains, suggesting that this particular nucleotide polymorphism is of ancestral origin. These data suggest that the M. tuberculosis Ag85B antigen is highly conserved in at least certain populations of M. tuberculosis clinical strains.
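Why a C to A change at position 714 can be silent follows from simple codon arithmetic, sketched below in Python. The sketch assumes that position 714 is counted from the first base of the start codon and that the standard genetic code applies; the actual fbpB codon affected is not given in the text, so the code enumerates which codons ending in C remain synonymous when that base becomes A. All names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_coordinates(position: int):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    return (position - 1) // 3 + 1, (position - 1) % 3 + 1


def is_synonymous(codon: str, offset: int, new_base: str) -> bool:
    """True if replacing the base at 1-based `offset` keeps the amino acid."""
    mutated = codon[:offset - 1] + new_base + codon[offset:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


# Assumption: position 714 is numbered from the first base of the start codon.
CODON_NUMBER, OFFSET = codon_coordinates(714)

# Codons ending in C whose third-position C to A change is silent.
SILENT = [codon for codon in CODON_TABLE
          if codon.endswith("C") and is_synonymous(codon, 3, "A")]

print(CODON_NUMBER, OFFSET)
print(SILENT)
```

Under these assumptions, position 714 is the third base of codon 238, the position at which the genetic code is most degenerate; for example, GCC and GCA both encode alanine, whereas TAC to TAA would convert tyrosine to a stop codon.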

3.2. Implications for Immunization

TB remains one of the deadliest infectious diseases of our times. Despite widespread use of the BCG vaccine, the disease continues to claim 2-3 million lives per year. The need for a new vaccine has never been more urgent. To gain insight into the likely efficacy of two new vaccine candidates, the fusion proteins Ag85B-ESAT-6 and Ag85B-TB10.4 [6, 9, 10], we analyzed the variability of the gene regions encoding the ESAT-6, TB10.4, and Ag85B proteins. Experiments involving these two candidates have shown them to be effective in generating protective immunity in animal models [6, 9, 10].

Yet, while animal models have played an important role in the development of new TB vaccines so far, they are not always representative of the internal human biological environment. As Flynn noted earlier, an important drawback of the murine model is that the pathology of pulmonary TB in mice is quite different from that in humans [19]. Specifically, the heterogeneity of granuloma types observed in the human host is not displayed in the mouse lung [19]. Similar difficulties arise when using other animal models. In the case of bovine infection, the pathology of TB is quite similar to the human host response in granulomatous reactions, but differs with respect to cavitation [20]. Non-human primate models also represent the human pathology of TB quite well but, like cattle models, are limited by economic and infrastructural factors [20].

Furthermore, current preclinical studies of the protection that new TB vaccines confer against M. tuberculosis infection in animal models do not take the population diversity of M. tuberculosis into consideration. However, as Hebert and colleagues noted previously [14], genetic diversity of M. tuberculosis genes can be found among clinical isolates, and such diversity may have important implications for the efficacy of the new vaccines. Thus, comparative genomics of the pathogen population stands as an additional useful tool for pre-clinical evaluation of new vaccines, providing information complementary to that from current in vivo and in vitro studies. Given the resource-demanding nature of clinical trials, comparative genomics serves as a method for predicting the potential protection of proposed vaccine candidates in the general population. Hebert and colleagues' work revealed that the PPE18 protein, part of the Mtb72f subunit vaccine, was quite variable among isolates collected from Turkey and Arkansas [14]. Analyzing the variability of antigens targeted by potential vaccines in a diverse set of isolates at the genomic level may indeed allow researchers to avoid developing a vaccine that is only variably effective, like the current BCG. The findings of such studies can also inform the rational selection of study populations, with a view to covering the diverse pathogen populations in clinical trials of new vaccines.

Our observation that the ESAT-6, TB10.4, and Ag85B proteins were highly conserved in our study sample, which comprised strains from two geographically distant regions and three different principal genetic groups, suggests that it is unlikely that the efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines will be affected by the genetic diversity of the M. tuberculosis population. Thus, the protective efficacy of these two novel vaccine candidates may have a wider reach than that of the Mtb72f vaccine, which contains a highly variable antigen of M. tuberculosis [14]. However, our findings also indicate the need for further bioinformatics research on the three genes investigated and their specific interaction with the host immune system. While highly conserved genes are indicative of homologous protein antigens, they may also suggest a lack of selective pressure by the host immune system and thus a lack of recognition by the host's immune response. Previous examples of the inability of animal models to accurately represent the human host environment suggest that the degree to which the human host system interacts with these important peptides remains to be studied.

The goal of pre-clinical evaluation of these two vaccines may be furthered by future studies that include a larger sample of isolates from a greater range of geographic origins, making the data even more representative of the diversity of M. tuberculosis worldwide. The finding of this study that esxA, esxH, and fbpB were conserved across 88 clinical strains from Arkansas and Turkey does not confirm that they are conserved globally. Including a larger and more genetically diverse sample of isolates would address the two primary limitations of this study—the number and diversity of the isolates used.

Another factor that must be taken into account is the potential impact that host diversity may have on the global coverage of these two new TB vaccines. This study looked at the diversity of pathogen genes coding for vaccine proteins, but even uniformly conserved proteins may fail to induce protective immunity if host diversity impedes their ability to effectively bind effector immune cells. McNamara and colleagues provide an elegant bioinformatic approach to studying the impact that host diversity may have on Mtb72f vaccine coverage by analyzing the allelic variation of human class II MHC DRB1 proteins and its impact on proper vaccine epitope binding [15]. A similar study on Ag85B-ESAT-6 and Ag85B-TB10.4 may provide insightful information on host diversity and its impact on the coverage of these vaccines.

With these future directions in mind, the results of the present study represent an important first step in the pre-clinical bioinformatic assessment of the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccine candidates and of the impact that genetic diversity among their respective antigenic protein targets has on their potential success as global vaccines. The finding that the esxA, esxH, and fbpB genes are highly conserved in two distinct populations suggests that the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccine candidates may be effective in geographically distinct areas of the world.

Acknowledgments

This study was supported by Grant NIH-R01-AI151975 from the National Institutes of Health and the Research Fund of the Office of the Vice President for Research of the University of Michigan. The Arkansas isolates and the genotyping data of the study isolates were kindly provided by Dr. Joseph H. Bates at the Arkansas Department of Health and Dr. Donald M. Cave at the University of Arkansas for Medical Sciences.