Abstract

Sexual stages of Plasmodium such as the zygote, ookinete, and young oocyst express the 25 kDa surface protein P25, which, along with P28 proteins, protects the parasite from the harmful environment inside the mosquito midgut. Vaccines against these proteins induce antibodies in the vertebrate host that can inhibit parasite development in the mosquito midgut and thus prevent transmission of the parasite from the mosquito to another human host. Transmission-blocking vaccines help reduce the malaria burden. The purpose of this study was the in silico structural characterization of P25 family proteins and the prediction of their phylogenetic relationships with other proteins. Results indicate that members of the P25 family have four EGF domains arranged in a triangular fashion, with the major variations lying in the loop regions. All 22 cysteines are conserved and form 11 disulphide bonds. The C-loop of EGF domain IV in P25 proteins is smaller than that in P28 proteins. The B loop of EGF domain II showed the largest RMSD variations, followed by the loops of EGF domain III. P25 proteins are tile-like, triangular, flat proteins that protect the parasite inside the mosquito midgut. The structures obtained will help in understanding the biology of the parasite inside the mosquito midgut. These structures may also help in designing transmission-blocking vaccines against malaria in the absence of experimentally determined structures.

1. Introduction

During the sexual stages of the Plasmodium parasite, many surface proteins are synthesized de novo. P25 proteins, which have a molecular weight of 25 kDa, are the major surface proteins of Plasmodium ookinetes. They are present on the ookinete surface of all known Plasmodium species [1–3]. Many proteins have been identified as promising vaccine candidates against malaria, including two proteins of the P25 family [4]. Expression of P25 proteins begins immediately after fertilization and continues through the zygote, ookinete, and young oocyst stages of Plasmodium [5]. These proteins are abundant and evenly distributed over the entire ookinete surface [6–8]. Gene knockout experiments suggest that these proteins, along with P28 proteins, are essential for the survival of the parasite inside the mosquito midgut [9]. Structurally, all P25 proteins contain a signal sequence, four epidermal growth factor (EGF) domains, and a C-terminal glycosylphosphatidylinositol (GPI) anchor [10, 11]. EGF domains are found especially in surface proteins, where they participate in recognition and adhesion-like processes [12], which suggests that they play important roles in host-parasite interactions. P25 proteins show very few sequence polymorphisms, presumably because they are never exposed to the vertebrate immune system [13]. Transmission-blocking vaccines are being developed using P25 proteins. Antisera induced against Pfs25, the member of this family from P. falciparum, significantly blocked malaria transmission in nonhuman primates [14]. Experiments show that transmission-blocking antibodies arrest or slow down the movement of the ookinete inside the mosquito midgut [15], and an ookinete that is delayed in forming an oocyst cannot survive the harsh proteolytic environment inside the mosquito midgut [16]. The level of antibody is directly related to the transmission-blocking activity [17].
Recombinant versions of transmission-blocking antibodies also block transmission [18, 19]. To date, the structure of only one P25 protein is available, namely, Pvs25 from P. vivax [20]. Because P25 proteins are important vaccine candidates against malaria, their 3D structures will help in understanding the drug and disease mechanisms. These structures can be used to study the interaction of P25 proteins with their respective transmission-blocking antibodies and with the receptor molecules present in the mosquito midgut, which may help in understanding the biology of the parasite inside the mosquito midgut. Homology modeling of eight P25 proteins suggested that all the mature P25 proteins are triangular flat molecules, each having four EGF domains arranged in the form of a triangle. A comparative analysis of the structures revealed that all P25 proteins share a structural scaffold of 22 conserved cysteines, which form 11 disulphide bonds in every member of the family. These disulphide bonds act as a skeleton for all members of the P25 family, leading to similarities in their overall structures. Dissimilarities in structure are located primarily in the loop regions. These loop regions are likely responsible for the variable molecular recognition of P25 proteins among different species, whereas the conserved regions are likely responsible for the similarity in function of the P25 proteins among different Plasmodium species.

2. Methods

2.1. Template Selection

The amino acid sequences of the eight selected P25 proteins, listed in Table 1, were downloaded from the nonredundant protein sequence database Swiss-Prot [21], release 50.4. To select suitable templates for modeling the eight P25 ookinete surface proteins, a BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) [22] search was performed against the protein structure database PDB (http://www.rcsb.org/pdb/Welcome.do, 52103 structures) [23] with an e-value cutoff of 0.001. This homology search resulted in only one significant hit, namely, the Pvs25 protein (PDB id = 1Z27) from Plasmodium vivax. The sequence identity and sequence similarity between the P25 proteins and the Pvs25 protein are shown in Table 1. Sequence alignment was done with the ClustalX and MUSCLE programs [24, 25]. All the P25 proteins contain 22 conserved cysteines in oxidized form, resulting in the formation of 11 disulphide bonds. Within an EGF domain, disulphide bonds are formed between cysteines 1–3, 2–4, and 5–6. Of the four EGF domains, EGF domains II, III, and IV contain six cysteines each, whereas EGF domain I contains only four cysteines, which form disulphide bonds between cysteines 1–3 and 2–4.
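The cysteine bookkeeping described above can be checked with a short script. The following is a minimal sketch, not part of the original workflow: the function name `disulphide_pairs` and the cysteine positions in `example_domains` are hypothetical and chosen only to illustrate the 1–3, 2–4, 5–6 connectivity rule.

```python
# Connectivity rule stated in the text: within an EGF domain the
# cysteines pair as 1-3, 2-4 and 5-6 (counted in sequence order).
FULL_PATTERN = [(1, 3), (2, 4), (5, 6)]
TRUNCATED_PATTERN = [(1, 3), (2, 4)]  # EGF domain I has only four cysteines


def disulphide_pairs(cys_positions):
    """Return (residue, residue) pairs for one EGF domain.

    cys_positions: residue numbers of the cysteines in that domain.
    """
    if len(cys_positions) == 6:
        pattern = FULL_PATTERN
    elif len(cys_positions) == 4:
        pattern = TRUNCATED_PATTERN
    else:
        raise ValueError("expected 4 or 6 cysteines per EGF domain")
    ordered = sorted(cys_positions)
    return [(ordered[i - 1], ordered[j - 1]) for i, j in pattern]


# Hypothetical residue numbers, for illustration only; they are NOT
# taken from any P25 sequence.
example_domains = {
    "EGF I": [5, 12, 20, 30],
    "EGF II": [35, 44, 50, 62, 70, 80],
    "EGF III": [90, 96, 102, 110, 118, 126],
    "EGF IV": [134, 140, 147, 155, 162, 170],
}

all_pairs = {name: disulphide_pairs(pos) for name, pos in example_domains.items()}
total_cysteines = sum(len(pos) for pos in example_domains.values())
total_bonds = sum(len(pairs) for pairs in all_pairs.values())
print(total_cysteines, "cysteines ->", total_bonds, "disulphide bonds")
```

With one four-cysteine domain and three six-cysteine domains, the rule yields the 22 cysteines and 11 disulphide bonds reported for the family.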

2.2. Homology Modeling

In the present study, eight sequences of the P25 family were modeled with Modeller version 8v2. Modeller implements homology modeling of proteins by satisfaction of spatial restraints [26]. The query P25 sequences and the template Pvs25 sequence, aligned using the MUSCLE program, were provided to Modeller, and 100 models were generated for each P25 protein. Modeller derives restraints from known related structures automatically. These restraints include bond angles, distances, dihedral angles, pairs of dihedral angles, and some other spatial restraints. Bond and angle values are taken from the CHARMM-22 force field. All the 3D models were generated by optimization of the molecular probability density function. The X-ray crystallographic structure of Pvs25 from Plasmodium vivax (PDB id 1Z27, http://www.rcsb.org/pdb/explore/explore.do?structureId=1z27) [20] was used as the template for homology modeling. This template shows a cysteine arrangement similar to that of the P25 proteins. Cysteines are among the most conserved residues during evolution, and all P25 proteins contain four EGF domains with all 22 cysteines in oxidized form; therefore, the disulphide bonds were supplied to the program as constraints during modeling. From the 100 models generated for each of the eight members, the best structures were selected on the basis of standard evaluation methods. In all eight P25 proteins studied, the loops were the most variable regions, especially the B loop of EGF domain II; therefore, the loop regions of the P25 proteins were carefully modeled with the ModLoop server (http://modbase.compbio.ucsf.edu/modloop/) [27, 28].

2.3. Model Evaluation

The P25 protein models obtained with Modeller were evaluated using ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php) [29, 30], PROCHECK (http://nihserver.mbi.ucla.edu/SAVS/) [31], and WHATIF (http://swift.cmbi.ru.nl/servers/html/index.html) [32]. Of all the models generated, only those showing satisfactory ProSA-web, PROCHECK, and WHATIF profiles were retained, and the best model was then selected from among them. During modeling, we observed that the loop regions of the P25 family members showed the largest RMSD variations and therefore needed special attention. Taking this into account, the large flexible loops of the P25 proteins were modeled with the ModLoop server (http://modbase.compbio.ucsf.edu/modloop/) [33, 34]. 3D structural superimposition of the structures was done using the program STAMP, which is part of VMD (Visual Molecular Dynamics) version 1.8.4 [35]. Residue-residue interactions between the domains within the P25 proteins were calculated with PDBsum (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/upload.html) [36].
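For readers unfamiliar with the RMSD measure used to compare the models, the following minimal sketch shows how it is computed once two structures have already been superimposed. It is an illustration under stated assumptions, not the STAMP/VMD procedure used in this study: the function names are hypothetical, the coordinates are toy values, and no structural fitting is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates that are ALREADY superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def per_residue_deviation(coords_a, coords_b):
    """Distance between equivalent atoms, one value per residue.
    Large values flag variable regions such as loops."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


# Toy C-alpha coordinates (hypothetical, in angstroms).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

print(round(rmsd(model, template), 3))
print(per_residue_deviation(model, template))
```

Computing the deviation per residue, rather than a single global value, is what allows variable loops to be distinguished from the conserved core.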

2.4. Phylogenetic Analysis

To analyze the P25 protein sequences from an evolutionary perspective, phylogenetic trees were constructed using both the Neighbor Joining (NJ) and Minimum Evolution (ME) methods in the MEGA4 (http://www.megasoftware.net/) software [37]. The settings of MEGA4 were as follows: 125 bootstrap replicates with a random seed of 42535; close neighbor interchange was used as the search option (level = 1) with an initial neighbor-joining tree, MaxTrees = 1. Gaps were treated as complete deletions. Evolutionary rates among sites were considered uniform.
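The effect of treating gaps as complete deletions can be illustrated with a short sketch. The code below removes every alignment column that contains a gap in any sequence and then computes a pairwise distance matrix of the kind that distance-based tree methods take as input. The use of the simple p-distance (proportion of differing sites) is an assumption made for illustration; the substitution model used in MEGA4 is not stated here, and the function names and toy alignment are hypothetical.

```python
def complete_deletion(alignment):
    """Drop every column that contains a gap ('-') in ANY sequence.

    alignment: dict mapping sequence name -> aligned sequence string.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


def p_distance_matrix(alignment):
    """Pairwise proportion of differing sites after complete deletion."""
    cleaned = complete_deletion(alignment)
    names = list(cleaned)
    n_sites = len(cleaned[names[0]])
    if n_sites == 0:
        raise ValueError("no gap-free columns left")
    dist = {}
    for a in names:
        for b in names:
            diffs = sum(x != y for x, y in zip(cleaned[a], cleaned[b]))
            dist[(a, b)] = diffs / n_sites
    return dist


# Toy alignment (hypothetical sequences, not P25 data).
toy = {"s1": "ACD-K", "s2": "ACE-K", "s3": "GCDEK"}
cleaned = complete_deletion(toy)
distances = p_distance_matrix(toy)
print(cleaned)
print(distances[("s1", "s2")], distances[("s2", "s3")])
```

Complete deletion keeps all pairwise comparisons on the same set of sites, at the cost of discarding any column in which even one sequence has a gap.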

3. Results and Discussion

3.1. The P25 Proteins

Two members of the P25 family, Pfs25 from P. falciparum and Pvs25 from P. vivax, are already in Phase II clinical trials of vaccine development, which underlines their importance as vaccine candidate proteins. The members of the P25 family have a signal sequence followed by four tandem epidermal growth factor- (EGF-) like domains and a glycosylphosphatidylinositol (GPI) anchor, which anchors the protein to the parasite surface [4]. Single-letter amino acid sequences of the P25 proteins in FASTA format were downloaded from Swiss-Prot release 50.4 (http://au.expasy.org/sprot/) [21]. To identify similarities between the sequences of the P25 family, multiple sequence alignment was done with the MUSCLE and ClustalX programs. Figure 1 shows the multiple sequence alignment of the eight members of the P25 family with the template protein Pvs25 of the same family, whereas Figure 2 shows the web logo [38, 39] of the P25 family of ookinete surface proteins. The web logo clearly indicates that all twenty-two cysteines are completely conserved in the P25 family. The models obtained for the eight P25 proteins were found to be similar to the template Pvs25 [20]. This similarity was expected because of the 22 conserved cysteine residues (Figure 1), the 45–85% sequence identity, and the 61–91% sequence similarity (Table 1) between the target P25 protein sequences and the template Pvs25 (http://www.rcsb.org/pdb/explore/explore.do?structureId=1z27) protein sequence. The lengths of the protein sequences used for modeling are given in Table 1. Apart from the structural skeleton provided by the cysteines, P25 proteins have many other conserved and semiconserved residues within the family. These conserved residues are important in making specific functional contacts within the molecule among the four EGF domains, as well as with other nearby molecules when forming a sheet on the surface of the parasite.
Table 2 lists the conserved domain-domain interactions of the eight members of the P25 family. All the modeled members were compared with the template protein Pvs25. EGF domains have a high tolerance to polymorphisms/mutations and to insertions/deletions; therefore, a high level of similarity between the members of the P25 family was expected. Our findings confirmed this. Because P25 proteins contain three complete EGF domains, they show even less polymorphism than P28 proteins. In other words, P25 proteins are more resistant to mutations and deletions than P28 proteins. Table 2 explains why the structures of the P25 proteins are similar to that of the Pvs25 protein. The P25 models, when superimposed on the Pvs25 template, showed similarity in the overall fold, as shown in Figure 3. Many other residues, along with the cysteines, were found to be conserved in the eight P25 proteins modeled in this study. In EGF domain I, residues V/I2, T/S3, T6, K9, G11, L13, Q15, M16, S17, H19, E21, and N/S32 were conserved. In EGF domain II, K/E37, K39, K/E42, V/L48, G53, D/E54, F/Y55, and Y/F78 were conserved. In EGF domain III, residues S114, I117, G118, V120, N/S122, D/N125, T/S130, G133, and T135 were conserved, and in EGF domain IV, the conserved residues were L139, D/E144, Q/E146, K/R149, Y/F155, D/E161, G162, F/Y163, and E168. The high degree of sequence conservation within the P25 family was also reflected in the conserved domain-domain interactions found in the models, as shown in Table 2. Many of these conserved interactions are thought to be involved in the binding of one P25 molecule to another P25/P28 molecule to form a sheet over the parasite surface.
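The sequence identity and similarity percentages quoted above are derived from pairwise alignments. The following minimal sketch shows one way such values can be computed from two aligned sequences. It rests on two assumptions that are not taken from this study: gapped columns are excluded from the denominator, and similarity is judged with a simple grouping of amino acids (`SIMILAR_GROUPS`, hypothetical) rather than a substitution matrix. Values obtained with other conventions will differ.

```python
# Assumed, simplified similarity groups; real tools typically use a
# substitution matrix such as BLOSUM62 instead.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {aa: idx for idx, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_and_similarity(aln_a, aln_b):
    """Percent identity and percent similarity of two aligned sequences.

    Columns with a gap in either sequence are skipped (assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in GROUP_OF and GROUP_OF.get(b) == GROUP_OF[a]:
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy aligned pair (hypothetical, not P25 sequences).
identity, similarity = identity_and_similarity("ACDK-", "ACER-")
print(identity, similarity)
```

Because identical residues are counted as similar, similarity is always at least as high as identity, which matches the relationship between the two ranges reported in Table 1.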

3.2. The P25 Models

Homology modeling of the eight P25 proteins showed that all of them contain four EGF domains arranged in the form of a triangle, as shown in Figure 4. EGF domains are typically 30–40 residue domains containing two central beta strands followed by two smaller beta strands. A typical EGF domain contains three disulphide bonds formed by six cysteines linked in the pattern 1–3, 2–4, and 5–6. In P25 proteins, EGF domains II, III, and IV are typical EGF domains with six cysteines and three disulphide bonds. EGF domain I, however, contains only four cysteines, which form two disulphide bonds instead of three. Thus, three of the EGF domains in P25 proteins are complete, whereas the fourth, EGF domain I, is incomplete. The four EGF domains of P25 proteins bear a remarkable resemblance to each other. The number and spacing of the twenty-two cysteines are conserved within the P25 family of ookinete surface proteins.

3.3. Validation of P25 Protein Models

Ramachandran plot statistics of the P25 protein models are shown in Figure 5. No residue was found in the disallowed region for any of the eight modeled sequences of the P25 family. Figures 6 and 7 show the ProSA-web analysis of the P25 members, which indicates that the Z-scores of all the P25 proteins lie within the standard defined values; the energy plots show that the P25 protein models have an overall negative energy and are therefore energetically stable. P25 proteins contain very few polymorphisms, as they are expressed only inside the mosquito host and are never exposed to the vertebrate immune system. This property makes them suitable vaccine candidate proteins against malaria. The low polymorphism found in P25 proteins is attributed to the extent of disulphide cross-linking within the EGF domains. P25 proteins have very few hydrophobic residues in their hydrophobic core, and sequence changes are accommodated by the solvent-accessible atoms, leading to the same structural fold in each domain, with variations lying in the loops of the EGF domains. The models of the P25 family showed that the loops have the largest RMSDs, as shown in Figures 2 and 3. The overall similarity among the P25 family members indicates that these proteins may have similar major functions, whereas the limited variations in the loop regions point towards different functional requirements of the loops in different species.

3.4. Electrostatic Representation

Electrostatic representation of the different faces of the P25 family members revealed that most of the charged residues are present on the surface of the proteins, as shown in Figures 7(a)–7(e). All the P25 members studied carry an overall negative charge. The details of the charge, area, and volume of the eight P25 proteins studied are given in Table 3. Despite their very similar structures, the P25 family members differ considerably in charge distribution. Negatively charged residues are differentially distributed on the surfaces of the different P25 members (Figure 7). The central pores of all eight P25 proteins are surrounded by negatively charged residues. Positively charged residues are few and mostly located towards the edges of the molecules. Pcs25 from P. cynomolgi was the most negatively charged molecule, with only three positively charged residues on the dorsal surface (Figure 7(a)) and two on the ventral surface (Figure 7(a)). To determine whether these proteins differ in their amino acid composition in terms of charged residues, we analyzed the composition of positively and negatively charged residues of the eight P25 member proteins and compared it with that of the template protein Pvs25. The percentage of positively charged amino acids (K, R, and H) was almost constant among the P25 proteins and was similar to that of the template Pvs25, as shown in Figure 8, whereas the percentage of negatively charged residues (D and E) was variable. Pcs25 showed the highest percentage of negatively charged residues among the members of the P25 family. Overall, the P25 proteins carry relatively more negatively charged and fewer positively charged residues.
Pcs25 shows an overall negatively charged surface in the GRASP surface presentation. The negatively charged residues form large patches on the surface of the molecules. P25 molecules have a minimal hydrophobic core with comparatively few buried residues; most residues are solvent accessible. Charged residues are mainly situated on the surface of the molecule (with a relatively very high accessible molecular surface), indicating the significance of these residues in the parasite's interaction with mosquito midgut components. The conserved cysteines are located in the core region of the P25 molecules and have the least relative surface exposure, which is consistent with their conservation in all members of the P25 family. A similar cysteine arrangement and high sequence and structural identity were observed among the eight members studied. Although members of the P25 family have similar functions and all of them are predicted to have four EGF domains, they differ in the loop regions of the EGF domains, and each of them may therefore recognize different molecules present on the surface of the mosquito midgut. The largest variations were seen in the B loop of EGF domain II, followed by the loop regions of EGF domain III, as shown in Figures 2 and 3. EGF domains I and IV showed the greatest similarity, with minimal RMSD variations across the P25 family members (Figures 2 and 3). The charge distribution on the surface of the P25 molecules also suggests species-specific interactions with the molecules present in the mosquito midgut.
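The charged-residue composition analysis described in this section can be sketched in a few lines. The example below counts the positively charged (K, R, H) and negatively charged (D, E) residues named in the text and reports them as percentages of sequence length. The function name and the toy sequence are hypothetical, and the `formal_net_charge` value is only a rough count that treats every K, R, and H as +1 and every D and E as -1; it is not a substitute for the electrostatic calculations reported in Table 3.

```python
POSITIVE = set("KRH")  # residue classes as named in the text
NEGATIVE = set("DE")


def charged_composition(sequence):
    """Percentage of positively and negatively charged residues.

    formal_net_charge is a crude count (+1 per K/R/H, -1 per D/E);
    treating histidine as fully charged is an assumption.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    pos = sum(aa in POSITIVE for aa in seq)
    neg = sum(aa in NEGATIVE for aa in seq)
    return {
        "positive_pct": 100.0 * pos / len(seq),
        "negative_pct": 100.0 * neg / len(seq),
        "formal_net_charge": pos - neg,
    }


# Toy sequence (hypothetical, not a P25 protein).
result = charged_composition("KRHDEDEAAA")
print(result)
```

Applying the same function to each family member and to the template gives a table of percentages that can be compared directly across species.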

3.5. Phylogenetic Studies

To determine the phylogenetic relationships of P25 proteins with other ookinete surface proteins, we performed an NCBI BLAST analysis of the P25 family members. Sequences showing similarity were downloaded, and a phylogenetic tree was constructed. P25 proteins show significant sequence similarity to the P28 family of ookinete surface proteins, as also shown in our previous publication [40]. Analysis of the phylogenetic tree for the genus Plasmodium, based on the amino acid sequences of ookinete surface proteins, suggests that all the human Plasmodium P25 and P28 proteins are situated between the rodent and avian P25 and P28 homologues, respectively, as shown in Figure 9. The position of P. knowlesi can be explained by the fact that it is now considered the fifth human malaria parasite [41], whereas P. reichenowi causes P. falciparum-like cerebral malaria in chimpanzees, the primate considered evolutionarily closest to humans. It has also been observed that the Guanine + Cytosine (G+C) content of the genomic DNA of P. falciparum is very similar to that of both rodent and avian parasites, which suggests a closer relationship between them. The similar structures observed for P28 proteins indicate convergent evolution of the family. Our studies on P28 proteins indicated that the rodent parasites are distant from the rest of the members of the family. The results are also in agreement with those reported by Waters et al. [42], based on phylogenetic analysis of asexually expressed SSU rRNA genes, which provides experimental support for our study. Apart from their similarity to the P28 proteins, the P25 proteins are distantly related to Fibrillins and latent transforming growth factor proteins. Interestingly, the Plasmodium P25 and P28 proteins do not share similarity with proteins of any prokaryotic genera. All the other proteins shown in Figure 9 belong to higher chordates. It is likely that the P25 proteins originated from some higher chordate protein.

4. Conclusion

In this study, we have proposed eight models for different members of the P25 family of ookinete surface proteins of Plasmodium sp. All these models were developed by homology modeling, on the basis of their high sequence identity with the template Pvs25 sequence. All the models obtained for the eight P25 proteins contain four EGF domains arranged in triangular-prism-shaped structures. When compared with each other and with the template protein Pvs25, the P25 proteins showed the greatest variability in the B loop of EGF domain II, as shown in Figures 2(b) and 3. P25 proteins, along with P28 proteins, are thought to form sheets on the parasite surface, which protect the parasite from the harsh external environment inside the mosquito midgut. Parasites lacking both P25 and P28 proteins are not able to survive. P25 proteins interact with transmission-blocking antibodies, leading to the blocking of malaria transmission, and the structures obtained may therefore help in vaccine development against malaria. As such therapeutic antibodies become more important, models of this kind are being used to predict low-resolution structures of protein-protein interaction complexes, which can then be used to guide experimental methods. Our study also indicates that the P25 proteins are evolutionarily closely related to the P28 proteins. It is expected that the P25 and P28 proteins evolved by duplication. P25 proteins do not show any similarity to prokaryotic proteins; instead, they show distant relationships to higher chordate proteins such as Fibrillins and latent transforming growth factor proteins. As shown in Figure 9, almost all the other sequences belong to higher chordates, especially mammals. The models obtained in this study may serve many significant purposes. Such models can be used to predict the mechanism of transmission blocking, as done in our previous two publications [43, 44].
A protein-protein docking study with proteins present in mosquito gut, like laminin and annexin, will help in understanding the host parasite interactions inside the mosquito midgut.

Acknowledgments

A fellowship from the Council of Scientific and Industrial Research (CSIR), India, to Sharma B. during her Ph.D. is acknowledged. The authors thank Kushwaha, H. R., Yennamalli, R., and Patel, A., Centre for Computational Biology and Bioinformatics, Jawaharlal Nehru University, for their suggestions, discussions, and software support. Sharma B. thanks her advisors Professor P. K. Yadav, Professor K. C. Upadhaya, Professor R. Madhubala, Dr. S. Gaurinath, and Dr. A. Pareek for their help and guidance during the Ph.D. Sharma B. thanks her supervisor Dr. A. K. Saxena for allowing her to design and implement the protocols of this study independently.