Research Article | Open Access
Analysis of Structures and Epitopes of Surface Antigen Glycoproteins Expressed in Bradyzoites of Toxoplasma gondii
Toxoplasma gondii is a protozoan parasite capable of infecting humans and animals. The surface antigen glycoproteins SAG2C, -2D, -2X, and -2Y are expressed on the surface of bradyzoites. These antigens have been shown to protect bradyzoites against immune responses during chronic infection. We studied the structures of the SAG2C, -2D, -2X, and -2Y proteins using bioinformatics methods. Protein sequences were aligned with T-Coffee. Secondary structures and functional domains were predicted using PSIPRED v3.0 and SMART, and 3D models of the proteins were constructed and compared using the I-TASSER server, VMD, and Swiss-PdbViewer. Our results showed that SAG2C, -2D, -2X, and -2Y are highly homologous proteins that share conserved peptides and HLA-I restricted epitopes. The similarity in structure and domains suggests putative common functions that might stimulate similar immune responses in hosts. The conserved peptides and HLA-restricted epitopes could provide important insights for vaccine research and for the diagnosis of this disease.
1. Introduction
Toxoplasma gondii (T. gondii) is a species of parasitic protozoa in the genus Toxoplasma that can be carried by many warm-blooded animals, including humans. There are three infectious stages in the complex life cycle of T. gondii: tachyzoites, bradyzoites, and sporozoites. The bradyzoite is a slowly replicating form of the parasite that is responsible for chronic T. gondii infection. In chronic toxoplasmosis, the parasitophorous vacuoles containing the reproductive bradyzoites form cysts in the tissues of the muscles and brain.
The surface antigens of T. gondii, which play roles in host cell attachment and host immune evasion, are dominated by the SRS (SAG1-related sequence) family of proteins, which includes the SAG1-like sequence branch and the SAG2-like sequence branch. SRS proteins are expressed in a stage-specific manner. SAG1, SAG2A, SAG2B, SAG3, SRS1, SRS2, and SRS3 are mainly expressed on the tachyzoite surface. Studies have indicated that SAG2 members participate in parasite invasion of the host and that antibodies against them can block further attachment of T. gondii to host cells [7, 8]. Previous studies have demonstrated that T. gondii parasites with a deletion of the SAG2C, -2D, -2X, and -2Y gene cluster are less capable of maintaining a chronic infection in the brain. This finding suggests that SAG2CDXY are important for the persistence of cysts in the brain and that these antigens might protect bradyzoites against the immune response. In contrast to SAG2A and SAG2B, which are expressed in tachyzoites, SAG2C, -2D, -2X, and -2Y appear to be expressed exclusively on the surface of bradyzoites [9, 10]. However, among the 160 members of the SRS family, structures have been reported for only three proteins: (i) the tachyzoite-expressed SAG1, (ii) the bradyzoite-expressed BSR4, and (iii) the sporozoite-expressed SporoSAG. The structures and functional domains of SAG2C, -2D, -2X, and -2Y remain unclear.
In this study, we sought to predict the structures and functional domains of SAG2C, -2D, -2X, and -2Y by bioinformatics methods. Protein sequences were aligned with T-Coffee. Secondary structures and functional domains were predicted using PSIPRED v3.0 and SMART. A 3D structural model of each protein was built using the I-TASSER server. The structural similarities among these proteins were summarized, and possible functions of some key amino acids were predicted by comparing their spatial conformations in VMD and Swiss-PdbViewer. Furthermore, HLA-restricted epitopes of the SAG2C, -2D, -2X, and -2Y proteins were predicted using epitope prediction algorithms.
2. Materials and Methods
2.1. Data Resources
The protein sequences were derived from ToxoDB 5.1 (http://toxodb.org/toxo/). Toxoplasma gondii has three common types: type I, T. gondii GT1 (TGGT1_chrX 7,429,598); type II, T. gondii ME49 (TGME49_chrX 7,419,075); type III, T. gondii VEG (TGVEG_chrX 7,553,721). The original resources are listed in Table 1.
Notes to Table 1:
(a) Size is the number of amino acids in the protein.
(b) Coding gene is the location of the gene that encodes the protein.
(c) SRS domain-containing protein number.
2.2. Modular Architecture Identification
The multiple sequence alignment tool T-Coffee (http://www.tcoffee.org/) [14, 15] was used to align SAG2C, SAG2D, SAG2X, and SAG2Y. Secondary structures were predicted using PSIPRED v3.0 (http://bioinf.cs.ucl.ac.uk/psipred/) [16, 17]. Domain architectures were identified and annotated with the web-based Simple Modular Architecture Research Tool, SMART (http://smart.embl-heidelberg.de/).
The 3D models of the proteins were constructed with I-TASSER, a protein structure prediction server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) that is considered suitable for predicting the 3D structures of proteins longer than 100 amino acids [19–21]. VMD is molecular visualization software for displaying, animating, and analyzing large biomolecular systems using 3D graphics and built-in scripting (http://www.ks.uiuc.edu/Research/vmd/). VMD was used to read standard Protein Data Bank (PDB) files and display the structures they contain [22–25]. Swiss-PdbViewer (http://www.expasy.org/spdbv/) is an application with a user-friendly interface that allows several proteins to be analyzed at the same time. The proteins can be superimposed to obtain structural alignments and compare their active domains. We deduced amino acid mutations, H bonds, angles, and distances between atoms from its graphical and menu interface. 3D structural fit (superposition) analysis was performed for SAG2C versus SAG2D and for SAG2X versus SAG2Y [22, 23].
2.3. Conserved HLA-Restricted Epitopes Prediction
The consensus approach of the Immune Epitope Database, IEDB (http://www.immuneepitope.org/), which combines the ANN, SMM, and CombLib-Sidney methods, was used to predict HLA-restricted epitopes [26–28]. We used this tool to estimate the ability of each peptide sequence to bind to a specific HLA class I molecule.
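The consensus step can be illustrated with a minimal sketch. It assumes that each method reports a percentile rank per peptide (lower means a stronger predicted binder) and that the consensus rank is taken as the median of the per-method ranks; the function names, the placeholder peptide labels, and all numeric values below are hypothetical and are not results from this study. The cutoff of 3 mirrors the percentile rank threshold used later in Section 3.4.

```python
from statistics import median

# Hypothetical per-method percentile ranks for three placeholder peptides.
# Lower rank = stronger predicted binding. These values are illustrative only.
example_ranks = {
    "PEP1": {"ann": 0.8, "smm": 1.5, "comblib": 2.9},
    "PEP2": {"ann": 4.0, "smm": 2.0, "comblib": 10.0},
    "PEP3": {"ann": 0.2, "smm": 0.4, "comblib": 6.0},
}


def consensus_rank(method_ranks):
    """Return the median percentile rank across the prediction methods."""
    return median(method_ranks.values())


def select_binders(ranks_by_peptide, cutoff=3.0):
    """Keep peptides whose consensus percentile rank is below the cutoff."""
    selected = {}
    for peptide, method_ranks in ranks_by_peptide.items():
        rank = consensus_rank(method_ranks)
        if rank < cutoff:
            selected[peptide] = rank
    return selected


if __name__ == "__main__":
    for peptide, rank in select_binders(example_ranks).items():
        print(peptide, rank)
```

Using the median rather than the mean keeps a single outlying method (such as the value 6.0 for PEP3 above) from dominating the consensus.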
3. Results and Discussion
3.1. Amino Acid Sequence Alignment Analysis
SAG2C, SAG2D, SAG2X, and SAG2Y are positioned next to each other on chromosome X. The molecular masses of SAG2C, -2D, -2X, and -2Y are 32–38 kDa, 18–20 kDa, 31–34 kDa, and 28–30 kDa, respectively. Multiple sequence alignment of SAG2C, -2D, -2X, and -2Y shows that the four protein sequences have 97% similarity (Figure 1). In particular, SAG2C (residues 184 to 364) has 98% sequence identity to SAG2D (residues 14 to 196), and SAG2X (residues 184 to 367) has 99% sequence identity to SAG2Y (residues 128 to 300). The alignment indicates that SAG2C, -2D, -2X, and -2Y are highly homologous. However, when SAG2A and SAG2B were included in the alignment, the consensus score dropped to 73%, even though the consensus between SAG2A and SAG2B alone was high (84%). This indicates that SAG2A and -2B differ substantially from SAG2C, -2D, -2X, and -2Y.
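For readers who want to reproduce a pairwise identity figure from an alignment, the following is a minimal sketch rather than the scoring used by T-Coffee. It assumes two rows of an alignment with `-` as the gap character and defines identity as matches divided by the number of columns in which neither row has a gap; other definitions of the denominator exist and give different percentages. The function name is hypothetical, and the two toy inputs are the 9-residue peptides LPSSPQNIF and LPSSAQTFF reported in Section 3.4, not full protein sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either row
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy example: two 9-mers from the epitope tables, 6 of 9 positions match.
    print(round(percent_identity("LPSSPQNIF", "LPSSAQTFF"), 1))
```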
3.2. 2-D Structure Alignment for SAG2C, -2D, -2X, and -2Y Proteins
PSIPRED v3.0 was used to predict the secondary structures of the SAG2C, -2D, -2X, and -2Y proteins. Figure 2 shows that SAG2C has two α-helices, 19 β-strands, and 20 coils; SAG2D has one α-helix, 9 β-strands, and 10 coils; SAG2X has three α-helices, 14 β-strands, and 18 coils; and SAG2Y has two α-helices, 15 β-strands, and 18 coils. All four proteins have a long α-helix at the C-terminus. The secondary structure elements of SAG2D are similar to those of SAG2C residues 169 to 364. SAG2X and SAG2Y also have very similar secondary structures, with one small discrepancy: SAG2X has one more helix and one fewer strand than SAG2Y.
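Element counts of this kind can be derived from a per-residue secondary structure string. The sketch below assumes the three-state alphabet H (helix), E (strand), and C (coil), and counts each uninterrupted run of one state as a single element; the function name and the toy input string are hypothetical and do not correspond to any of the four proteins.

```python
from itertools import groupby


def count_elements(ss_string):
    """Count contiguous helix (H), strand (E), and coil (C) segments."""
    counts = {"H": 0, "E": 0, "C": 0}
    for state, _run in groupby(ss_string):
        if state not in counts:
            raise ValueError(f"unexpected secondary structure state: {state!r}")
        counts[state] += 1  # each run of identical states is one element
    return counts


if __name__ == "__main__":
    # Toy prediction string: coil, strand, coil, helix, coil, strand, coil.
    print(count_elements("CCEEEECCHHHHHCCEEECC"))
```

Because coils separate the other elements, a string that starts and ends in coil always contains one more coil segment than the total number of helices and strands, which matches the pattern reported above for SAG2C and SAG2D (for example, 2 helices + 19 strands and 20 coils would differ by one only if two structured elements were adjacent).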
Furthermore, we used SMART to identify the domains of these proteins (Figure 3). SAG2C, SAG2X, and SAG2Y each have two domains, whereas SAG2D has only one. SAG2D has an insertion of an adenosine that causes a frame shift and a premature stop codon, presumably leading to a truncated protein. SAG2C and SAG2D have transmembrane segments, whereas no transmembrane segments were identified in SAG2X and SAG2Y. Figure 3 also shows that these proteins have no signal peptides, indicating that they are mature proteins. Members of the SAG2 family also differ in open reading frame size: the smaller SAG2D protein consists of only one SAG domain, whereas SAG2C, SAG2X, and SAG2Y contain two SAG domains interrupted by a single intron. Thus SAG2C, SAG2X, and SAG2Y have similar domain structures, while SAG2D differs in having a single domain.
3.3. Construction of 3D Model for SAG2C, -2D, -2X, and -2Y Proteins
3D models of the SAG2C, -2D, -2X, and -2Y proteins were constructed with the I-TASSER server. Five models were generated for each protein by the server of Dr. Zhang’s lab. For each protein, we selected the model with the highest C-score, a confidence score with which I-TASSER estimates the quality of its predicted models. It is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. The C-score is typically in the range [−5, 2], and a higher C-score indicates a model with higher confidence.
Low-temperature replicas (decoys) generated during the simulation were clustered by SPICKER, and the top five cluster centroids were selected to generate full-atom protein models. The cluster density is defined as the number of structure decoys per unit of space in the SPICKER cluster. A higher cluster density means that the structure occurs more often in the simulation trajectory and therefore indicates a better quality model. Table 2 shows the parameters for the construction of the 3D model of each protein.
Notes to Table 2:
(a) C-score is a confidence score for estimating the quality of models predicted by I-TASSER. C-score is typically in the range [−5, 2], where a higher value signifies a model with higher confidence and vice versa.
(b) TM-score and RMSD are standard measures of structural similarity between two structures and are usually used to measure the accuracy of structure modeling when the native structure is known.
(c) Number of decoys is the number of structural decoys used in generating each model.
(d) Cluster density is the density of the cluster.
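To make the two similarity measures concrete, here is a minimal sketch of how TM-score and RMSD are computed once two structures have been superimposed and their residues paired. It uses the standard TM-score definition, with the length-dependent scale d0 = 1.24·(L − 15)^(1/3) − 1.8 for a target of L residues. The sketch assumes that the per-residue distances (in Å) from one fixed superposition are already available, whereas tools such as TM-align also search for the superposition that maximizes the score. The function names are hypothetical.

```python
import math


def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by TM-score."""
    if target_length < 19:
        # Below this length the formula gives a non-positive d0.
        raise ValueError("formula is only meaningful for longer targets")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distance (angstroms) for each structurally aligned residue pair.
    target_length: number of residues in the target (query) protein.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length  # unaligned residues contribute zero


def rmsd(distances):
    """Root-mean-square deviation over the aligned residue pairs."""
    if not distances:
        raise ValueError("at least one aligned pair is required")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


if __name__ == "__main__":
    # Perfect overlap of half the residues of a 100-residue target.
    print(tm_score([0.0] * 50, 100), rmsd([0.0] * 50))
```

The example shows why the two measures are reported together: RMSD is computed only over aligned pairs and ignores coverage, while TM-score is normalized by the full target length, so a perfect match over half the protein gives an RMSD of 0 but a TM-score of only 0.5.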
The best model of each protein was selected and viewed in VMD (Figure 4). SAG2C, -2X, and -2Y each have two distinct domains, D1 and D2, which are formed by two β-strands separated by one α-helix; SAG2D has one domain, which is formed by one β-strand separated by one α-helix. The β-strands rotate to form a sheet tube, which is a common characteristic of these proteins. Furthermore, the binding-site residues in each model were predicted and are shown in Table 3.
Notes to Table 3:
(a) C-scoreLB is the confidence score of the predicted binding site. C-scoreLB values range between [0, 1], where a higher score indicates a more reliable ligand-binding site prediction.
(b) TM-score is a measure of global structural similarity between the query and template proteins.
(c) RMSD is the RMSD between residues that are structurally aligned by TM-align.
(d) IDEN is the percentage sequence identity in the structurally aligned region.
(e) Cov. is the coverage of the global structural alignment and is equal to the number of structurally aligned residues divided by the length of the query protein.
(f) BS-score is a measure of local similarity (sequence and structure) between the template binding site and the predicted binding site in the query structure.
Structural analysis of SAG2C, -2D, -2X, and -2Y revealed that their five-on-three sandwich fold was most similar to that of the T. gondii bradyzoite-expressed BSR4, with TM-scores of 0.583, 0.661, 0.672, and 0.670, respectively (Table 4). BSR4 is a prototypical bradyzoite surface antigen encoded in a cluster of SRS genes on chromosome IV, which includes the closely related paralogs SRS6 and SRS9 [8, 9]. Sequence alignment shows that SAG2C, -2D, -2X, and -2Y share 71% sequence identity with the bradyzoite-expressed BSR4. This observation is consistent with the prediction that stage-specific structural features might play an important role in infection, dissemination, and pathogenesis in T. gondii. In BSR4, two strands are organized in an antiparallel fashion, followed by another strand on the lower face of the sandwich. The dimeric structure of SAG1 shows a sandwich with two parallel outside strands and an opposite one in between. The overall topology of the five-on-three sandwich D2 domain is conserved between SAG2C, -2D, -2X, -2Y and BSR4. A detailed comparison of SAG2C, -2D, -2X, and -2Y with BSR4 reveals a similarity in the topology of the D1 and D2 domains, consistent with the lower Z-score from the Dali search.
Notes to Table 4:
(a) TM-score is that of the structural alignment between the query structure and known structures in the PDB library.
(b) RMSD is the RMSD between residues that are structurally aligned by TM-align.
(c) IDEN is the percentage sequence identity in the structurally aligned region.
(d) Cov. is the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by the length of the query protein.
By comparison, the next most similar structure is SporoSAG (a surface antigen glycoprotein), with a substantially lower TM-score [30, 31]. SporoSAG is a dominant coat protein expressed on the surface of sporozoites. SporoSAG crystallized as a monomer and displayed unique features of the SRS sandwich fold compared to SAG1 and BSR4. Intriguingly, the structural diversity is localized to the upper sheets of the sandwich fold and may have important implications for multimerization and host cell ligand recognition. In the fit analysis, SAG2D superimposes well on the C-terminal region of SAG2C, and SAG2X and SAG2Y superimpose well along their entire length, from the C-terminus to the N-terminus (Figure 5).
3.4. Conserved HLA-Restricted CD8+ T Cells Epitope Prediction
A consensus epitope prediction algorithm was used to predict peptides that could stimulate an effective and protective human immune response against T. gondii. We wanted to determine whether the proteins have similar epitopes distributed over their surfaces. The epitopes of SAG2C, -2D, -2X, and -2Y were predicted using the IEDB tools (http://www.immuneepitope.org/), which can identify novel HLA class I restricted T cell epitopes derived from T. gondii. Sixteen peptides were selected based on a high HLA allele binding score (percentile rank < 3).
From Table 5, we can see three HLA-A*0201-restricted peptides: VVLGSAFMI, FMIAFISCF, AFISCFALV; four HLA-A*1101-restricted peptides: QVTVAVTSK, SSPQNIFYK, QVGTQTECK, KVLINIEEK; and two HLA-B*0702-restricted peptides: LPSSPQNIF, KPEAETPAT shared by SAG2C and SAG2D. From Table 6, we can see two HLA-A*0201-restricted peptides: ALVPNSSLV, VLSSSFMIV; three HLA-A*1101-restricted peptides: ALAITSTTK, SSAQTFFYK, KVLISVEKR; and two HLA-B*0702-restricted peptides: LPSSAQTFF, RPDSDATAT shared by SAG2X and SAG2Y.
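Finding peptides that two proteins share can be sketched as a sliding-window comparison. The code below is an illustration under stated assumptions, not the procedure used in this study: it enumerates every overlapping 9-mer in two sequences and reports the ones present in both, without predicting HLA binding. The function names are hypothetical. The first toy fragment is assembled from the overlapping peptides VVLGSAFMI, FMIAFISCF, and AFISCFALV listed above; the second fragment is a hypothetical variant with its last residue changed, included only to show how a substitution removes a window.

```python
def kmers(sequence, k=9):
    """Return the set of all overlapping k-residue windows in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_peptides(seq_a, seq_b, k=9):
    """Return the sorted k-mers that occur in both sequences."""
    return sorted(kmers(seq_a, k) & kmers(seq_b, k))


if __name__ == "__main__":
    # Fragment assembled from the three overlapping peptides in the text.
    fragment_a = "VVLGSAFMIAFISCFALV"
    # Hypothetical variant: final residue V replaced by A (illustration only).
    fragment_b = "VVLGSAFMIAFISCFALA"
    for peptide in shared_peptides(fragment_a, fragment_b):
        print(peptide)
```

In this toy case VVLGSAFMI and FMIAFISCF are still shared, while AFISCFALV is lost because it spans the substituted position. Candidate shared peptides found this way would still need to be scored for HLA binding.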
More interestingly, when we marked the HLA-restricted epitopes on the aligned protein sequences, we found that epitopes restricted by the same HLA allele are located in the same domains of the proteins (Figure 6). Our results suggest that the epitopes from SAG2C, -2D, -2X, and -2Y can be recognized by the appropriate MHC-I molecules and presented on the cell surface to induce a T cell response in the host, which might be helpful for vaccine research and for the diagnosis of this parasitic disease. Some of the peptides identified from these proteins have been shown to be recognized by PBMCs from T. gondii seropositive individuals with the appropriate HLA type and to significantly induce IFN-γ production in T cells from immunized mice [32, 33], which supports our predictions.
4. Conclusions
In this study, we have conducted a detailed bioinformatic and structural characterization of the bradyzoite proteins SAG2C, -2D, -2X, and -2Y. This characterization provides a structural view of T. gondii SRS family members at the chronic bradyzoite stage. Our bioinformatic analysis showed that SAG2C, -2D, -2X, and -2Y are homologous members of the SAG2 subfamily. Consistently, our structural analysis indicated that SAG2C, -2D, -2X, and -2Y are more similar to the bradyzoite-expressed BSR4 and to SporoSAG than to the tachyzoite-expressed SAG1. This result suggests that the SAG2 family has a conserved structure at the bradyzoite stage that differs considerably from that of SAG1 at the tachyzoite stage. Furthermore, the predicted conserved peptides and HLA-restricted epitopes may inform vaccine research and the diagnosis of this parasitic disease.
Conflict of Interests
The authors wish to declare that there is no known conflict of interests associated with this publication and there has been no other significant financial support for this work that could have influenced its outcome.
Acknowledgments
This study was supported by a Grant from the National Natural Science Foundation Project of China (no. 81171604) and no. 49 China postdoc foundation.
References
[1] R. McLeod, F. Kieffer, M. Sautter, T. Hosten, and H. Pelloux, “Why prevent, diagnose and treat congenital toxoplasmosis?” Memorias do Instituto Oswaldo Cruz, vol. 104, no. 2, pp. 320–344, 2009.
[2] A. M. Tenter, A. R. Heckeroth, and L. M. Weiss, “Toxoplasma gondii: from animals to humans,” International Journal for Parasitology, vol. 30, no. 12, pp. 1217–1258, 2000.
[3] W. K. Chew, M. J. Wah, S. Ambu, and I. Segarra, “Toxoplasma gondii: determination of the onset of chronic infection in mice and the in vitro reactivation of brain cysts,” Experimental Parasitology, vol. 130, no. 1, pp. 22–25, 2012.
[4] D. Soldati and J. C. Boothroyd, “Transient transfection and expression in the obligate intracellular parasite Toxoplasma gondii,” Science, vol. 260, no. 5106, pp. 349–352, 1993.
[5] C. Jung, C. Y. Lee, and M. E. Grigg, “The SRS superfamily of Toxoplasma surface proteins,” International Journal for Parasitology, vol. 34, no. 3, pp. 285–296, 2004.
[6] I. D. Manger, A. B. Hehl, and J. C. Boothroyd, “The surface of Toxoplasma tachyzoites is dominated by a family of glycosylphosphatidylinositol-anchored antigens related to SAG1,” Infection and Immunity, vol. 66, no. 5, pp. 2237–2244, 1998.
[7] C. Lekutis, D. J. P. Ferguson, and J. C. Boothroyd, “Toxoplasma gondii: identification of a developmentally regulated family of genes related to SAG2,” Experimental Parasitology, vol. 96, no. 2, pp. 89–96, 2000.
[8] A. V. Machado, B. C. Caetano, R. P. Barbosa et al., “Prime and boost immunization with influenza and adenovirus encoding the Toxoplasma gondii surface antigen 2 (SAG2) induces strong protective immunity,” Vaccine, vol. 28, no. 18, pp. 3247–3256, 2010.
[9] J. P. J. Saeij, G. Arrizabalaga, and J. C. Boothroyd, “A cluster of four surface antigen genes specifically expressed in bradyzoites, SAG2CDXY, plays an important role in Toxoplasma gondii persistence,” Infection and Immunity, vol. 76, no. 6, pp. 2402–2410, 2008.
[10] S. K. Kim, A. Karasov, and J. C. Boothroyd, “Bradyzoite-specific surface antigen SRS9 plays a role in maintaining Toxoplasma gondii persistence in the brain and in host control of parasite replication in the intestine,” Infection and Immunity, vol. 75, no. 4, pp. 1626–1634, 2007.
[11] X. L. He, M. E. Grigg, J. C. Boothroyd, and K. C. Garcia, “Structure of the immunodominant surface antigen from the Toxoplasma gondii SRS superfamily,” Nature Structural Biology, vol. 9, no. 8, pp. 606–611, 2002.
[12] J. Crawford, O. Grujic, E. Bruic, M. Czjzek, M. E. Grigg, and M. J. Boulanger, “Structural characterization of the bradyzoite surface antigen (BSR4) from Toxoplasma gondii, a unique addition to the surface antigen glycoprotein 1-related superfamily,” Journal of Biological Chemistry, vol. 284, no. 14, pp. 9192–9198, 2009.
[13] J. Crawford, E. Lamb, J. Wasmuth, O. Grujic, M. E. Grigg, and M. J. Boulanger, “Structural and functional characterization of SporoSAG: a SAG2-related surface antigen from Toxoplasma gondii,” Journal of Biological Chemistry, vol. 285, no. 16, pp. 12063–12070, 2010.
[14] C. Notredame, D. G. Higgins, and J. Heringa, “T-coffee: a novel method for fast and accurate multiple sequence alignment,” Journal of Molecular Biology, vol. 302, no. 1, pp. 205–217, 2000.
[15] P. Di Tommaso, S. Moretti, I. Xenarios et al., “T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension,” Nucleic Acids Research, vol. 39, no. 2, pp. W13–W17, 2011.
[16] D. W. A. Buchan, S. M. Ward, A. E. Lobley, T. C. O. Nugent, K. Bryson, and D. T. Jones, “Protein annotation and modelling servers at University College London,” Nucleic Acids Research, vol. 38, no. 2, Article ID gkq427, pp. W563–W568, 2010.
[17] M. I. Sadowski and D. T. Jones, “The sequence-structure relationship and protein function prediction,” Current Opinion in Structural Biology, vol. 19, no. 3, pp. 357–362, 2009.
[18] J. Schultz, R. R. Copley, T. Doerks, C. P. Ponting, and P. Bork, “SMART: a web-based tool for the study of genetically mobile domains,” Nucleic Acids Research, vol. 28, no. 1, pp. 231–234, 2000.
[19] A. Roy, D. Xu, J. Poisson, and Y. Zhang, “A protocol for computer-based protein structure and function prediction,” Journal of Visualized Experiments, vol. 3, no. 57, p. e3259, 2011.
[20] Y. Zhang, “I-TASSER server for protein 3D structure prediction,” BMC Bioinformatics, vol. 9, article 40, 2008.
[21] A. Roy, A. Kucukural, and Y. Zhang, “I-TASSER: a unified platform for automated protein structure and function prediction,” Nature Protocols, vol. 5, no. 4, pp. 725–738, 2010.
[22] W. Humphrey, A. Dalke, and K. Schulten, “VMD: visual molecular dynamics,” Journal of Molecular Graphics, vol. 14, no. 1, pp. 33–38, 1996.
[23] T. Schwede, J. Kopp, N. Guex, and M. C. Peitsch, “SWISS-MODEL: an automated protein homology-modeling server,” Nucleic Acids Research, vol. 31, no. 13, pp. 3381–3385, 2003.
[24] N. Guex and M. C. Peitsch, “SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling,” Electrophoresis, vol. 18, no. 15, pp. 2714–2723, 1997.
[25] N. Guex, M. C. Peitsch, and T. Schwede, “Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: a historical perspective,” Electrophoresis, vol. 30, no. 1, pp. S162–S173, 2009.
[26] M. Moutaftsi, B. Peters, V. Pasquetto et al., “A consensus epitope prediction approach identifies the breadth of murine T(CD8+)-cell responses to vaccinia virus,” Nature Biotechnology, vol. 24, no. 7, pp. 817–819, 2006.
[27] J. Sidney, E. Assarsson, C. Moore et al., “Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries,” Immunome Research, vol. 4, no. 1, article 2, 2008.
[28] C. Lundegaard, M. Nielsen, and O. Lund, “The validity of predicted T-cell epitopes,” Trends in Biotechnology, vol. 24, no. 12, pp. 537–538, 2006.
[29] A. Barragan and L. D. Sibley, “Transepithelial migration of Toxoplasma gondii is linked to parasite motility and virulence,” Journal of Experimental Medicine, vol. 195, no. 12, pp. 1625–1633, 2002.
[30] B. Wallner and A. Elofsson, “Can correct protein models be identified?” Protein Science, vol. 12, no. 5, pp. 1073–1086, 2003.
[31] M. Graille, E. A. Stura, M. Bossus et al., “Crystal structure of the complex between the monomeric form of Toxoplasma gondii surface antigen 1 (SAG1) and a monoclonal antibody that mimics the human immune response,” Journal of Molecular Biology, vol. 354, no. 2, pp. 447–458, 2005.
[32] H. Cong, E. J. Mui, W. H. Witola et al., “Towards an immunosense vaccine to prevent toxoplasmosis: protective Toxoplasma gondii epitopes restricted by HLA-A*0201,” Vaccine, vol. 29, no. 4, pp. 754–762, 2011.
[33] H. Cong, E. J. Mui, W. H. Witola et al., “Human immunome, bioinformatic analyses using HLA supermotifs and the parasite genome, binding assays, studies of human T cell responses, and immunization of HLA-A*1101 transgenic mice including novel adjuvants provide a foundation for HLA-A03 restricted CD8+T cell epitope based, adjuvanted vaccine protective against Toxoplasma gondii,” Immunome Research, vol. 6, no. 1, article 12, 2010.
Copyright © 2013 Hua Cong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.