Research Article | Open Access
In Silico Characterization of Histidine Acid Phytase Sequences
Histidine acid phytases (HAPhy) are enzymes widely distributed among bacteria, fungi, plants, and some animal tissues. They play a significant role as animal feed enzymes and in the solubilization of insoluble phosphates and minerals present in the form of phytic acid complexes. A set of 50 reference protein sequences representing HAPhy was retrieved from the NCBI protein database and characterized for various biochemical properties, multiple sequence alignment (MSA), homology search, phylogenetic analysis, motifs, and superfamily search. MSA using MEGA5 revealed the presence of conserved sequences at the N-terminal “RHGXRXP” and C-terminal “HD.” Phylogenetic tree analysis indicated the presence of three clusters representing different HAPhy, that is, PhyA, PhyB, and AppA. Analysis of the 10 most commonly distributed motifs in the sequences indicated the presence of a signature sequence for each class. Motif 1 “SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF” was present in 38 protein sequences representing clusters 1 (PhyA) and 2 (PhyB). Cluster 3 (AppA) contains motif 9 “KKGCPQSGQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDP” as a signature sequence. The superfamily search showed that all sequences belong to the histidine acid phosphatase family. No conserved sequence representing 3- or 6-phytase could be identified using multiple sequence alignment. This in silico analysis might contribute to the classification and future genetic engineering of this most diverse class of phytase.
1. Introduction
Phytate (myo-inositol 1,2,3,4,5,6-hexakisphosphate; IP6) is the major storage form of phosphorus (P), representing approximately 80% of P in soil, 65–80% of total P in grains, and up to 80% of P in manures from monogastric animals. Phytate exists primarily as metal-phytate complexes with nutritionally important cations such as Ca2+, Fe2+, and Zn2+.
Phytases (IP6 phosphohydrolases) are a class of phosphatases that catalyse the hydrolysis of phytate to inositol phosphates, inorganic phosphorus, and myo-inositol. They also lower the affinity of phytate for associated minerals and proteins and thus increase the bioavailability of P, minerals, and proteins for the growth and development of plants and animals [7–9].
Phytases are widely distributed among plants [10, 11], certain animal tissues, and microbial cells [12–15]. To date, four classes of phytases have been characterized in terrestrial organisms: histidine acid phytase (HAPhy), cysteine phytase (CPhy), purple acid phosphatase (PAP), and β-propeller phytase (BPPhy) [16, 17]. HAPhys are the most studied and most diverse class of phytase. Most bacterial, fungal, and plant phytases belong to the histidine acid phosphatases (EC 3.1.3.2), which are further classified as 3-phytase (EC 3.1.3.8) or 6-phytase (EC 3.1.3.26) on the basis of their high specific activity for phytate and the position at which they initiate phytate hydrolysis.
Phytases have been extensively reviewed for various industrial and biotechnological applications [18–21], biochemical properties, and consensus phytase constructs. Conserved amino acid residues are reported in HAPhy sequences at the N-terminal “RHGXRXP” and C-terminal “HD,” together with eight cysteine residues located throughout the sequence [16, 24, 25]. It is well established that phytases do not all share a common active site; hence, the initial classification system is based on catalytic mechanism. Still, there is a need to devise a taxonomic system that can accommodate new types of phytases with novel catalytic mechanisms.
The in silico characterization of protein sequences of industrially important enzymes has been reported recently [26–28]. Biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution of alkaline proteases have been analyzed using various bioinformatics tools. A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. Malviya et al. collected forty-seven full-length amino acid sequences of polyphenol oxidase (PPO) from bacteria, fungi, and plants and subjected them to multiple sequence alignment (MSA), domain identification, and phylogenetic tree construction.
In the present study, we performed in silico analysis of 50 HAPhy protein sequences. The biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution have been analyzed using various bioinformatics tools.
2. Materials and Methods
Representative genes from histidine acid phytases (E. coli AppA, GenBank accession number P07102; Aspergillus niger PhyA and PhyB, P34752 and P34754) were used as queries in BLAST searches of the microbial genome database at NCBI (http://www.ncbi.nlm.nih.gov/). Protein sequences in FASTA format from RefSeq entries that were shown to exhibit phytase activity were selected for further in silico study.
Physicochemical data were generated with various tools on the ExPASy proteomics server (ClustalW, ProtParam, protein calculator, Compute pI/Mw, ProtScale). The molecular weights (kDa) of the histidine acid phytases were calculated by summing the average isotopic masses of the amino acid residues in the protein and adding the average isotopic mass of one water molecule. The pI of each enzyme was calculated using the pK values of the amino acids according to Bjellqvist et al.
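The molecular weight calculation can be illustrated with a minimal Python sketch. It assumes the usual convention of summing residue masses (each amino acid less one water) and adding one water for the free termini; the table holds standard average isotopic residue masses, and the function name `molecular_weight` is a hypothetical helper, not part of ProtParam.

```python
# Average isotopic masses (Da) of amino acid residues,
# i.e. each free amino acid minus one water molecule.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # average isotopic mass of H2O (Da)


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of an unmodified protein chain."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    # Residue masses plus one water for the free N- and C-termini.
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    # A single glycine should reproduce the mass of free glycine (about 75.07 Da).
    print(f"{molecular_weight('G'):.3f} Da")
```

Dividing the result by 1000 gives the value in kDa. The sketch ignores signal peptide cleavage and post-translational modifications such as glycosylation, so it describes the unprocessed chain only.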
The evolutionary history was inferred using the Neighbor-Joining method. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method and are in units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated, leaving a total of 303 positions in the final dataset. Evolutionary analyses were conducted in MEGA5. For the domain search, the Pfam site (http://www.sanger.ac.uk/resources/software/) was used. Motif analysis was done using MEME (http://meme.nbcr.net/meme/). The conserved protein motifs deduced by MEME were characterized for biological function using protein BLAST, and domains were studied with InterProScan, which provides the best possible match based on the highest similarity score.
3. Results and Discussion
The 50 protein sequences of HAPhy were retrieved from NCBI. The accession numbers of the retrieved sequences, along with species names, are listed in Table 1. The sequences were characterized for homology search, multiple sequence alignment, biochemical features, phylogenetic tree construction, motifs, and superfamily search using various bioinformatics tools. Of the 50 sequences, 12 belong to the HAPhy gene AppA, 26 to PhyA, and 12 to PhyB.
Multiple sequence alignment showed the presence of the conserved HAPhy sites N-terminal “RHG/NXRXP” and C-terminal “HD” in all sequences, as reported by other workers. This is consistent with the Pfam analysis of predicted active site residues, which in all sequences are shown to be the N-terminal histidine residue in the conserved region and the C-terminal aspartic acid. The histidine in the N-terminal region appears to act as a nucleophile in the formation of a covalent phosphohistidine intermediate. The aspartic acid in the C-terminal “HD” sequence acts as a proton donor to the oxygen atom of the scissile phosphomonoester bond [36, 37]. No conserved sequence representing 3- or 6-phytase could be identified using multiple sequence alignment.
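The distance step described above can be sketched in Python under stated assumptions: sequences are already aligned, gaps are written as "-" and missing data as "?", and the Poisson-corrected distance is d = -ln(1 - p), where p is the proportion of differing sites. The function names are hypothetical and this is not MEGA5's implementation; the resulting matrix is the kind of input a Neighbor-Joining routine would take.

```python
import math

GAP_CHARS = {"-", "?"}  # assumed symbols for alignment gaps and missing data


def complete_deletion(aligned):
    """Remove every alignment column in which any sequence has a gap or missing residue."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    keep = [i for i in range(len(aligned[0]))
            if all(s[i] not in GAP_CHARS for s in aligned)]
    return ["".join(s[i] for i in keep) for s in aligned]


def poisson_distance(a, b):
    """Poisson-corrected distance (amino acid substitutions per site)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)  # observed proportion of differences
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p) if p else 0.0


def distance_matrix(aligned):
    """Pairwise Poisson distances after complete deletion of gapped columns."""
    clean = complete_deletion(aligned)
    return [[poisson_distance(a, b) for b in clean] for a in clean]


if __name__ == "__main__":
    toy_alignment = ["AC-DE", "ACGDE", "TC-DE"]  # toy data, not phytase sequences
    for row in distance_matrix(toy_alignment):
        print(["%.4f" % d for d in row])
```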
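A simple way to screen a sequence for these two conserved sites is a regular expression search. The Python sketch below assumes the pattern `RH[GN].R.P` for the N-terminal site (X read as any residue) and reports every "HD" dipeptide downstream of it; the function name and the toy sequence are hypothetical. Because "HD" is only two residues long, downstream hits are candidates and not confirmed catalytic sites.

```python
import re

# N-terminal signature RHG/NXRXP, with X taken to mean any residue.
N_TERMINAL = re.compile(r"RH[GN].R.P")
# C-terminal dipeptide; very short, so matches are only candidates.
C_TERMINAL = re.compile(r"HD")


def find_hap_signatures(sequence):
    """Return (start, matched_text, downstream_HD_starts) using 0-based positions,
    or None when the N-terminal signature is absent."""
    seq = sequence.upper()
    n_hit = N_TERMINAL.search(seq)
    if n_hit is None:
        return None
    hd_starts = [m.start() for m in C_TERMINAL.finditer(seq, n_hit.end())]
    return n_hit.start(), n_hit.group(), hd_starts


if __name__ == "__main__":
    toy = "MKARHGERAPTSSGGHDLLK"  # invented sequence for illustration only
    print(find_hap_signatures(toy))
```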
The phylogenetic tree based on protein sequences revealed three major clusters. Cluster 1, the largest cluster with 26 of the sequences under study, includes the majority of Aspergillus sp., Penicillium sp., Ajellomyces sp., Arthroderma sp., Trichophyton sp., Sclerotinia sp., Uncinocarpus sp., and Coccidioides sp. (Figure 1). Biochemical features for this cluster are listed in Table 2. The total number of amino acid residues ranged from 441 to 539, with variable molecular weights. The pI values of this cluster ranged from 4.87 to 8.53. Variations among the phytases in this group in other physicochemical parameters, such as positively and negatively charged residues and hydropathicity (GRAVY), are given in Table 2.
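The GRAVY value mentioned above is the grand average of hydropathicity: the sum of the Kyte and Doolittle hydropathy values of all residues divided by the sequence length. A minimal Python sketch follows; the function name `gravy` is a hypothetical helper and not the ProtParam code.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # Positive values indicate a more hydrophobic protein, negative a more hydrophilic one.
    print(round(gravy("AAAA"), 2))
```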
Aliphatic index analysis reveals uniformity in this group of phytases, except for some sequences of Arthroderma sp. (XP_002849736.1, XP_003169494.1, XP_003015622.1) and Trichophyton sp. (XP_003021635.1). The aliphatic index of a protein measures the relative volume occupied by the aliphatic side chains of alanine, valine, leucine, and isoleucine. Globular proteins with a high aliphatic index have high thermostability, and an increase in aliphatic index increases protein thermostability [38, 39].
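For reference, the aliphatic index defined by Ikai is AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. The Python sketch below computes it from a sequence string; the function name is a hypothetical helper.

```python
def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: AI = X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    with X expressed as mole percent of each residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    # Mole fraction of each aliphatic residue in the sequence.
    x_ala = seq.count("A") / n
    x_val = seq.count("V") / n
    x_ile = seq.count("I") / n
    x_leu = seq.count("L") / n
    # Multiply by 100 to convert mole fractions to mole percent.
    return 100.0 * (x_ala + 2.9 * x_val + 3.9 * (x_ile + x_leu))


if __name__ == "__main__":
    print(round(aliphatic_index("AVIL"), 1))
```

The coefficients 2.9 and 3.9 weight valine and leucine/isoleucine by the volume of their side chains relative to alanine, which is why a sequence rich in Leu and Ile scores higher than one rich in Ala.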
Cluster 2 includes 12 protein sequences and represents PhyB gene sequences, including the majority of Candida sp., S. cerevisiae, C. posadasii, and D. hansenii. The total number of amino acid residues in this group ranges from 457 to 479, and the pI values range from 4.41 to 5.82. This cluster shows less variation in pI than the cluster 1 sequences (PhyA). The aliphatic index of the sequences in this cluster is uniform, except for Candida tropicalis (XP_002546108.1) with a value of 67.74 and Komagataella pastoris (XP_002490985.1) with a value of 84.19.
Cluster 3 represents protein sequences from the phytase gene AppA, also abbreviated as PhyC, and includes mostly E. coli along with various Shigella sp. and Citrobacter freundii. The biophysical parameters for this group show amino acid residues ranging from 428 to 523, while the pI values of most sequences are in the range of 5.5 to 6.5, except for E. albertii (9.35) and E. fergusonii (8.37). The aliphatic index of this group suggests the highest thermostability among the three clusters. Predominantly positively charged amino acids are present in all three clusters.
The instability index is used to estimate the in vivo half-life of a protein. Proteins reported to have an in vivo half-life of less than 5 hours showed an instability index greater than 40, whereas those with a half-life of more than 16 hours have an instability index of less than 40. The instability index is higher than 40 for 15 of the HAP sequences under study (Table 2), including the fully characterized E. coli and A. niger phytases, which suggests an in vivo half-life of less than 5 hours. Superfamily analysis of the phytase sequences with the Superfam tool on the ExPASy server assigns all sequences to the histidine acid phosphatase family, which belongs to the phosphoglycerate mutase-like superfamily (Table 3).
Histidine acid phytases from all three clusters share a large α/β-domain and a small α-domain. MEME analysis yielded the 10 most frequently observed motifs (Table 4). A set of 41 amino acid residues, “SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF,” representing motif 1, was conserved and uniformly observed in 38 phytase protein sequences from clusters 1 and 2, that is, PhyA and PhyB, revealing their identity with the HP_HAP like, histidine acid phosphatase superfamily. Other motifs are associated with the HAP superfamily (Table 2). Cluster 3, representing AppA, does not have motif 1 in its sequences, but it does contain a unique motif 9 of 50 amino acid residues, “KKGCPQSGQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDP.” Motif 5 “YAFLKTYNYSLGADDLTPFGEQQLVDSGIKFYQRYESLAKDIVPFIRASG” is present in all protein sequences of PhyA cluster 1. PhyB protein sequences also contain a unique motif 8 of 41 amino acid residues, “ETSPENSEGPYAGTTNALRHGAAFRARYGSLYDENSTLPVF.”
Phylogenetic clustering and variation in the biochemical features of different phytases might contribute to further classification of the highly diverse HAPhys and to their selection for various applications. Conserved sequences in motifs may be used to design specific degenerate primers for identifying and isolating a given type and class of phytase (HAPhy), as numerous phytases are being isolated to meet the need for efficient phytases for feed application in various systems. Variation in biochemical features may be a key source of information for screening novel phytases and for comparison with other classes of phytases. The functional attributes of the conserved motifs found here need to be verified experimentally. This in silico analysis might be used for future genetic engineering of industrially important phytases.
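As an illustration of the degenerate primer idea, the Python sketch below estimates how many distinct nucleotide sequences would encode a peptide stretch under the standard genetic code and picks the least degenerate window of a motif. It is a rough aid under stated assumptions, not a primer design method: "X" is assumed to be back-translated as NNN (64 combinations), the function names are hypothetical, and melting temperature, codon usage, and primer length constraints are ignored.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
# "X" (any residue) is assumed to be back-translated as NNN, i.e. 64 combinations.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4, "X": 64,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


def least_degenerate_window(motif: str, width: int):
    """Return (degeneracy, 0-based start, window) for the best window of the given width.
    Ties are resolved in favour of the earliest start."""
    if not 0 < width <= len(motif):
        raise ValueError("width must be between 1 and the motif length")
    windows = [(degeneracy(motif[i:i + width]), i, motif[i:i + width])
               for i in range(len(motif) - width + 1)]
    return min(windows)


if __name__ == "__main__":
    motif_1 = "SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF"  # motif 1 from the text
    print(degeneracy("RHGXRXP"))
    print(least_degenerate_window(motif_1, 7))
```

Windows rich in Met, Trp, and two-codon residues keep the primer pool small, whereas Leu, Ser, Arg, and unspecified positions inflate it quickly.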
- B. L. Turner, M. J. Papházy, P. M. Haygarth, and I. D. McKelvie, “Inositol phosphates in the environment,” Philosophical Transactions of the Royal Society B, vol. 357, no. 1420, pp. 449–469, 2002.
- J. N. A. Lott, I. Ockenden, V. Raboy, and G. D. Batten, “Phytic acid and phosphorus in crop seeds and fruits: a global estimate,” Seed Science Research, vol. 10, no. 1, pp. 11–33, 2000.
- G. M. Barnett, “Phosphorus forms in animal manure,” Bioresource Technology, vol. 49, no. 2, pp. 139–147, 1994.
- K. Asada, K. Tanaka, and Z. Kasai, “Formation of phytic acid in cereal grains,” Annals of the New York Academy of Sciences, vol. 165, no. 2, pp. 801–814, 1969.
- M. Wyss, R. Brugger, A. Kronenberger et al., “Biochemical characterization of fungal phytases (myo-inositol hexakisphosphate phosphohydrolases): catalytic properties,” Applied and Environmental Microbiology, vol. 65, no. 2, pp. 367–373, 1999.
- D. B. Mitchell, K. Vogel, B. J. Weimann, L. Pasamontes, and A. P. G. M. Van Loon, “The phytase subfamily of histidine acid phosphatases: isolation of genes for two novel phytases from the fungi Aspergillus terreus and Myceliophthora thermophila,” Microbiology, vol. 143, no. 1, pp. 245–252, 1997.
- N. R. Augspurger, D. M. Webel, X. G. Lei, and D. H. Baker, “Efficacy of an E. coli phytase expressed in yeast for releasing phytate-bound phosphorus in young chicks and pigs,” Journal of Animal Science, vol. 81, no. 2, pp. 474–483, 2003.
- O. A. Olukosi, A. J. Cowieson, and O. Adeola, “Age-related influence of a cocktail of xylanase, amylase, and protease or phytase individually or in combination in broilers,” Poultry Science, vol. 86, no. 1, pp. 77–86, 2007.
- S. M. Rutherfurd, T. K. Chung, and P. J. Moughan, “The effect of microbial phytase on ileal phosphorus and amino acid digestibility in the broiler chicken,” British Poultry Science, vol. 43, no. 4, pp. 598–606, 2002.
- D. M. Gibson and A. H. J. Ullah, “Purification and characterization of phytase from cotyledons of germinating soybean seeds,” Archives of Biochemistry and Biophysics, vol. 260, no. 2, pp. 503–513, 1988.
- C. E. Hegeman and E. A. Grabau, “A novel phytase with sequence similarity to purple acid phosphatases is expressed in cotyledons of germinating soybean seedlings,” Plant Physiology, vol. 126, no. 4, pp. 1598–1608, 2001.
- R. Greiner, M. L. Alminger, and N. G. Carlsson, “Stereospecificity of myo-inositol hexakisphosphate dephosphorylation by a phytate-degrading enzyme of baker's yeast,” Journal of Agricultural and Food Chemistry, vol. 49, no. 5, pp. 2228–2233, 2001.
- Y. O. Kim, J. K. Lee, H. K. Kim, J. H. Yu, and T. K. Oh, “Cloning of the thermostable phytase gene (phy) from Bacillus sp. DS11 and its overexpression in Escherichia coli,” FEMS Microbiology Letters, vol. 162, no. 1, pp. 185–191, 1998.
- Y. H. Tseng, T. J. Fang, and S. M. Tseng, “Isolation and characterization of a novel phytase from Penicillium simplicissimum,” Folia Microbiologica, vol. 45, no. 2, pp. 121–127, 2000.
- A. H. Ullah and D. M. Gibson, “Extracellular phytase (E.C. 3.1.3.8) from Aspergillus ficuum NRRL 3135: purification and characterization,” Preparative Biochemistry, vol. 17, no. 1, pp. 63–91, 1987.
- E. J. Mullaney and A. H. J. Ullah, “The term phytase comprises several different classes of enzymes,” Biochemical and Biophysical Research Communications, vol. 312, no. 1, pp. 179–184, 2003.
- H. M. Chu, R. T. Guo, T. W. Lin et al., “Structures of Selenomonas ruminantium phytase in complex with persulfated phytate: DSP phytase fold and mechanism for sequential substrate hydrolysis,” Structure, vol. 12, no. 11, pp. 2015–2024, 2004.
- S. Afinah, A. M. Yazid, M. H. Anis Shobirin, and M. Shuhaimi, “Phytase: application in food industry,” International Food Research Journal, vol. 17, no. 1, pp. 13–21, 2010.
- U. Konietzny and R. Greiner, “Bacterial phytase: potential application, in vivo function and regulation of its synthesis,” Brazilian Journal of Microbiology, vol. 35, no. 1-2, pp. 11–18, 2004.
- L. Cao, W. Wang, C. Yang et al., “Application of microbial phytase in fish feed,” Enzyme and Microbial Technology, vol. 40, no. 4, pp. 497–507, 2007.
- X. G. Lei and J. M. Porres, “Phytase enzymology, applications, and biotechnology,” Biotechnology Letters, vol. 25, no. 21, pp. 1787–1794, 2003.
- B. C. Oh, W. C. Choi, S. Park, Y. O. Kim, and T. K. Oh, “Biochemical properties and substrate specificities of alkaline and histidine acid phytases,” Applied Microbiology and Biotechnology, vol. 63, no. 4, pp. 362–372, 2004.
- M. Lehmann, C. Loch, A. Middendorf et al., “The consensus concept for thermostability engineering of proteins: further proof of concept,” Protein Engineering, vol. 15, no. 5, pp. 403–411, 2002.
- R. L. Van Etten, R. Davidson, P. E. Stevis, H. MacArthur, and D. L. Moore, “Covalent structure, disulfide bonding, and identification of reactive surface and active site residues of human prostatic acid phosphatase,” Journal of Biological Chemistry, vol. 266, no. 4, pp. 2313–2319, 1991.
- E. J. Mullaney and A. H. J. Ullah, “Conservation of cysteine residues in fungal histidine acid phytases,” Biochemical and Biophysical Research Communications, vol. 328, no. 2, pp. 404–408, 2005.
- A. K. Dubey, S. Yadav, M. Kumar, V. K. Singh, B. K. Sarangi, and D. Yadav, “In silico characterization of pectate lyase protein sequences from different source organisms,” Enzyme Research, vol. 2010, Article ID 950230, 14 pages, 2010.
- N. Malviya, M. Srivastava, S. K. Diwakar, and S. K. Mishra, “Insights to sequence information of polyphenol oxidase enzyme from different source organisms,” Applied Biochemistry and Biotechnology, vol. 165, pp. 397–405, 2011.
- V. K. Morya, S. Yadav, E. K. Kim, and D. Yadav, “In silico characterization of alkaline proteases from different species of Aspergillus,” Applied Biochemistry and Biotechnology, vol. 166, pp. 243–257, 2012.
- J. Kyte and R. F. Doolittle, “A simple method for displaying the hydropathic character of a protein,” Journal of Molecular Biology, vol. 157, no. 1, pp. 105–132, 1982.
- B. Bjellqvist, G. J. Hughes, C. Pasquali et al., “The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences,” Electrophoresis, vol. 14, no. 10, pp. 1023–1031, 1993.
- N. Saitou and M. Nei, “The neighbor-joining method: a new method for reconstructing phylogenetic trees,” Molecular Biology and Evolution, vol. 4, no. 4, pp. 406–425, 1987.
- E. Zuckerkandl and L. Pauling, “Evolutionary divergence and convergence in proteins,” in Evolving Genes and Proteins, pp. 97–166, 1965.
- K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei, and S. Kumar, “MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods,” Molecular Biology and Evolution, vol. 28, pp. 2731–2739, 2011.
- T. L. Bailey and C. Elkan, “Fitting a mixture model by expectation maximization to discover motifs in biopolymers,” in Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology, pp. 28–36, AAAI Press, Menlo Park, Calif, USA, 1994.
- E. J. Mullaney and A. H. J. Ullah, “Phytases: attributes, catalytic mechanisms and applications,” in Inositol Phosphates: Linking Agriculture and the Environment, pp. 97–110, CAB International, Oxfordshire, UK, 2007.
- Y. Lindqvist, G. Schneider, and P. Vihko, “Crystal structures of rat acid phosphatase complexed with the transition-state analogs vanadate and molybdate. Implications for the reaction mechanism,” European Journal of Biochemistry, vol. 221, no. 1, pp. 139–142, 1994.
- K. S. Porvari, A. M. Herrala, R. M. Kurkela et al., “Site-directed mutagenesis of prostatic acid phosphatase. Catalytically important aspartic acid 258, substrate specificity, and oligomerization,” Journal of Biological Chemistry, vol. 269, no. 36, pp. 22642–22646, 1994.
- A. Ikai, “Thermostability and aliphatic index of globular proteins,” Journal of Biochemistry, vol. 88, no. 6, pp. 1895–1898, 1980.
- N. D. Rawlings, F. R. Morton, and A. J. Barrett, “MEROPS: the peptidase database,” Nucleic Acids Research, vol. 34, pp. D270–D272, 2006.
- K. Guruprasad, B. V. B. Reddy, and M. W. Pandit, “Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence,” Protein Engineering, vol. 4, no. 2, pp. 155–161, 1990.
- S. Rogers, R. Wells, and M. Rechsteiner, “Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis,” Science, vol. 234, no. 4774, pp. 364–368, 1986.
- J. Gough, “The SUPERFAMILY database in structural genomics,” Acta Crystallographica Section D, vol. 58, no. 11, pp. 1897–1900, 2002.
Copyright © 2012 Vinod Kumar et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.