Enzyme Research
Volume 2012 (2012), Article ID 845465, 8 pages
Research Article

In Silico Characterization of Histidine Acid Phytase Sequences

1Department of Biochemistry, G. B. Pant University of Agriculture & Technology, Pantnagar 263145, India
2Akal School of Biotechnology, Eternal University, Baru Sahib, Sirmour 173101, India

Received 27 August 2012; Accepted 19 November 2012

Academic Editor: Jose M. Guisan

Copyright © 2012 Vinod Kumar et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Histidine acid phytases (HAPhy) are widely distributed enzymes among bacteria, fungi, plants, and some animal tissues. They have a significant role as an animal feed enzyme and in the solubilization of insoluble phosphates and minerals present in the form of phytic acid complex. A set of 50 reference protein sequences representing HAPhy was retrieved from the NCBI protein database and characterized by analysis of biochemical properties, multiple sequence alignment (MSA), homology search, phylogenetic analysis, motifs, and superfamily search. MSA using MEGA5 revealed the presence of conserved sequences at N-terminal “RHGXRXP” and C-terminal “HD.” Phylogenetic tree analysis indicates the presence of three clusters representing different HAPhy, that is, PhyA, PhyB, and AppA. Analysis of 10 commonly distributed motifs in the sequences indicates the presence of a signature sequence for each class. Motif 1 “SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF” was present in 38 protein sequences representing clusters 1 (PhyA) and 2 (PhyB). Cluster 3 (AppA) contains motif 9 “KKGCPQSGQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDP” as a signature sequence. The superfamily search showed that all sequences belong to the histidine acid phosphatase family. No conserved sequence representing 3- or 6-phytase could be identified using multiple sequence alignment. This in silico analysis might contribute to the classification and future genetic engineering of this most diverse class of phytase.

1. Introduction

Phytate (myo-inositol 1,2,3,4,5,6-hexakisphosphate; IP6) is the major storage form of phosphorus (P), representing approximately 80% of P in soil [1], 65–80% of total P in grains [2], and up to 80% of P in manures from monogastric animals [3]. Phytate exists primarily as metal phytate complex with nutritionally important cations, that is, Ca2+, Fe2+, and Zn2+ [4].

Phytases (IP6 phosphohydrolases) are a class of phosphatases that catalyse the hydrolysis of phytate to inositol phosphates, inorganic phosphorus, and myo-inositol [5]. They also lower the affinity of phytate for associated minerals and proteins [6] and thus increase the bioavailability of P, minerals, and proteins for growth and development of plants and animals [7–9].

Phytases are widely distributed among plants [10, 11], certain animal tissues, and microbial cells [12–15]. To date, four classes of phytases have been characterized in terrestrial organisms: histidine acid phytase (HAPhy), cysteine phytase (CPhy), purple acid phosphatase (PAP), and β-propeller phytase (BPPhy) [16, 17]. HAPhys are the most studied and diverse class of phytase. Most bacterial, fungal, and plant phytases belong to histidine acid phosphatases (EC which are further classified as 3-phytase (EC or 6-phytase (EC due to their high specific activity for phytate and position specific initial hydrolysis of phytate.

Phytases have been extensively reviewed for various industrial and biotechnological applications [18–21], biochemical properties [22], and consensus phytase construct [23]. Conserved features reported in HAPhy sequences are the N-terminal “RHGXRXP” motif, the C-terminal “HD” motif, and eight cysteine residues distributed across the sequence [16, 24, 25]. It is well established that phytases do not all share a similar, common active site; hence the initial classification system is based on catalytic mechanism [22]. Still, there is a need to devise a taxonomic system to accommodate new types of phytases with novel catalytic mechanisms.

The in silico characterization of protein sequences of industrially important enzymes has been reported recently [26–28]. Biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution of alkaline proteases have been analyzed using various bioinformatics tools [28]. A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis [26]. Malviya et al. [27] collected forty-seven full-length amino acid sequences of polyphenol oxidase (PPO) from bacteria, fungi, and plants and subjected them to multiple sequence alignment (MSA), domain identification, and phylogenetic tree construction.

In the present study, we performed in silico analysis of 50 HAPhy protein sequences. The biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution have been analyzed using various bioinformatics tools.

2. Materials and Methods

Representative genes from histidine acid phytases (E. coli AppA, GenBank accession number P07102; Aspergillus niger PhyA and PhyB, P34752 and P34754) were used as queries for BLAST searches of the microbial genome database at NCBI. Protein sequences in FASTA format from RefSeq entries that were shown to exhibit phytase activity were selected for further in silico study.

Physicochemical data were generated using various tools on the ExPASy proteomics server (ClustalW, ProtParam, protein calculator, Compute pI/Mw, ProtScale) [29]. The molecular weights (kDa) of the various histidine acid phytases were calculated by summing the average isotopic masses of the amino acid residues in the protein and adding the average isotopic mass of one water molecule. The pI of each enzyme was calculated using the pK values of amino acids according to Bjellqvist et al. [30].
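The mass calculation can be illustrated with a minimal sketch. It assumes a table of average residue masses (each amino acid with the water lost on peptide-bond formation already removed), so the mass of one water molecule is added back for the free termini. The function name and the table layout are illustrative and are not taken from the ExPASy tools.

```python
# Average isotopic masses (Da) of amino acid residues, i.e. amino acid minus one water.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # average isotopic mass of H2O (Da)


def molecular_weight_kda(sequence):
    """Return the average molecular weight of a protein sequence in kDa."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue symbols: {sorted(unknown)}")
    # Sum of residue masses plus one water for the N- and C-terminal groups.
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS) / 1000.0


if __name__ == "__main__":
    # Toy peptide used only to exercise the function; not a phytase sequence.
    print(round(molecular_weight_kda("RHGARYP"), 4))
```

Modified residues, signal peptide cleavage, and glycosylation are ignored here, so the value is only the theoretical mass of the unmodified chain.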

The evolutionary history was inferred using the Neighbor-Joining method [31]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method [32] and are in units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated, leaving a total of 303 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [33]. For domain search, the Pfam site was used. Motif analysis was done using MEME [34]. The conserved protein motifs deduced by MEME were characterized for biological function using protein BLAST, and domains were studied with InterProScan, which provides the best possible match based on the highest similarity score.
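The distance step can be sketched as follows. This is an illustration under stated assumptions, not the MEGA5 implementation: gaps are written as "-" and missing data as "?", the Poisson-corrected distance is taken as d = -ln(1 - p) with p the proportion of differing sites, and the last function shows only the pair-selection criterion of one Neighbor-Joining iteration. All function names are hypothetical.

```python
import math
from itertools import combinations


def complete_deletion(aligned):
    """Drop every alignment column that contains a gap ('-') or missing data ('?')."""
    keep = [i for i in range(len(aligned[0]))
            if all(seq[i] not in "-?" for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]


def poisson_distance(a, b):
    """Poisson-corrected distance: d = -ln(1 - p), p = proportion of differing sites."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


def distance_matrix(aligned):
    """Pairwise Poisson distances after complete deletion of gapped columns."""
    seqs = complete_deletion(aligned)
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = poisson_distance(seqs[i], seqs[j])
    return d


def nj_pick_pair(d):
    """Return the pair (i, j) minimising Q(i, j) = (n - 2) * d[i][j] - r[i] - r[j]."""
    n = len(d)
    r = [sum(row) for row in d]  # total distance from each taxon to all others
    return min(combinations(range(n), 2),
               key=lambda ij: (n - 2) * d[ij[0]][ij[1]] - r[ij[0]] - r[ij[1]])


if __name__ == "__main__":
    # Toy alignment used only to exercise the functions; not phytase data.
    toy = ["ACDE-GHIK", "ACDEFGHIR", "ACNEFGHLR", "TCNEFGWLR"]
    print(nj_pick_pair(distance_matrix(toy)))
```

A full Neighbor-Joining run would repeat the pair selection, replace the joined pair with a new internal node, and recompute distances until the tree is resolved.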

3. Results and Discussion

The 50 protein sequences of HAPhy were retrieved from NCBI. The accession numbers of the retrieved sequences, along with species names, are listed in Table 1. The sequences were characterized by homology search, multiple sequence alignment, biochemical features, phylogenetic tree construction, motifs, and superfamily search using various bioinformatics tools. Of the 50 sequences, 12 belong to the HAPhy gene AppA, 26 to PhyA, and 12 to PhyB.

Table 1: List of retrieved protein sequences from NCBI/Entrez and their accession number.

Multiple sequence alignment showed the presence of the conserved HAPhy sites, N-terminal “RHG/NXRXP” and C-terminal “HD,” in all sequences, as reported by other workers [25]. This is consistent with the Pfam analysis of predicted active site residues, which in all sequences are the N-terminal histidine residue present in the conserved region and the C-terminal aspartic acid. The histidine in the N-terminal region appears to act as a nucleophile in the formation of a covalent phosphohistidine intermediate [35]. The aspartic acid of the C-terminal “HD” sequence acts as a proton donor to the oxygen atom of the scissile phosphomonoester bond [36, 37]. No conserved sequence representing 3- or 6-phytase could be identified using multiple sequence alignment.
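A simple way to screen a sequence for these two conserved sites is a pattern search. The sketch below assumes that “RHG/NXRXP” can be written as the regular expression RH[GN].R.P (X standing for any residue) and that only “HD” dipeptides downstream of the first N-terminal hit are of interest. The function name and the toy sequence are hypothetical.

```python
import re

# "RHG/NXRXP": Arg-His, then Gly or Asn, any residue, Arg, any residue, Pro.
N_TERMINAL_MOTIF = re.compile(r"RH[GN].R.P")
MOTIF_LENGTH = 7


def find_hap_motifs(sequence):
    """Return (N-terminal motif hits, downstream HD positions), 1-based."""
    seq = sequence.upper()
    n_hits = [(m.start() + 1, m.group()) for m in N_TERMINAL_MOTIF.finditer(seq)]
    if not n_hits:
        return n_hits, []
    # 0-based index of the first residue after the first N-terminal hit.
    first_end = n_hits[0][0] - 1 + MOTIF_LENGTH
    hd_hits = [i + 1 for i in range(first_end, len(seq) - 1) if seq[i:i + 2] == "HD"]
    return n_hits, hd_hits


if __name__ == "__main__":
    # Toy sequence used only for illustration; not a real phytase.
    print(find_hap_motifs("MKAARHGARYPTSLLLHDAAA"))
```

Because “HD” is only two residues long, chance matches are expected in any long protein, so hits from such a scan would still need to be checked against the alignment.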

The phylogenetic tree based on protein sequences revealed three major clusters. Cluster 1, the largest cluster, contains 26 of the sequences under study and includes the majority of Aspergillus sp., Penicillium sp., Ajellomyces sp., Arthroderma sp., Trichophyton sp., Sclerotinia sp., Uncinocarpus sp., and Coccidioides sp. (Figure 1). Biochemical features for this cluster are listed in Table 2. The total number of amino acid residues ranged from 441 to 539 with variable molecular weights. pI values of this cluster ranged from 4.87 to 8.53. Variations among the phytases in this group in other physicochemical parameters, such as the numbers of positively and negatively charged residues and hydropathicity (GRAVY), are given in Table 2.
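Two of these parameters are easy to reproduce from sequence alone. The sketch below assumes the Kyte and Doolittle hydropathy scale [29] for GRAVY (the mean hydropathy over all residues) and counts Arg plus Lys as positively charged and Asp plus Glu as negatively charged; these conventions and the function names are assumptions of the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: sum of hydropathy values / sequence length."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def charged_residue_counts(sequence):
    """Return (positive, negative) counts, taken as (Arg + Lys, Asp + Glu)."""
    seq = sequence.upper()
    positive = seq.count("R") + seq.count("K")
    negative = seq.count("D") + seq.count("E")
    return positive, negative


if __name__ == "__main__":
    # Toy peptide used only to exercise the functions; not a phytase sequence.
    toy = "RHGARYPTDE"
    print(round(gravy(toy), 3), charged_residue_counts(toy))
```

A negative GRAVY value points to an overall hydrophilic protein and a positive value to a hydrophobic one.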

Table 2: Biochemical characteristics of HAPhy protein sequences.
Figure 1: Phylogenetic tree constructed by NJ method based on HAPhy protein sequences.

Aliphatic index values are uniform across this group of phytases, except for some sequences of Arthroderma sp. (XP_002849736.1, XP_003169494.1, XP_003015622.1) and Trichophyton sp. (XP_003021635.1). The aliphatic index of a protein measures the relative volume occupied by the aliphatic side chains of alanine, valine, leucine, and isoleucine. Globular proteins with a high aliphatic index have high thermostability, and an increase in aliphatic index increases protein thermostability [38, 39].
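For reference, the aliphatic index can be computed directly from amino acid composition. The sketch below assumes the formula of Ikai [38], AI = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue; the function name is illustrative.

```python
def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    # Coefficients 2.9 and 3.9 weight Val and Ile/Leu side-chain volumes relative to Ala.
    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    # Toy peptide used only to exercise the function; not a phytase sequence.
    print(round(aliphatic_index("MKAVLLIAGD"), 2))
```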

Cluster 2 includes 12 protein sequences and represents PhyB gene sequences, including the majority of Candida sp., S. cerevisiae, C. posadasii, and D. hansenii. The total number of amino acid residues in this group ranges from 457 to 479, and the pI values range from 4.41 to 5.82. Its pI varies less than that of the cluster 1 sequences (PhyA). The aliphatic index of the sequences in this cluster is uniform, except for Candida tropicalis (XP_002546108.1) with a value of 67.74 and Komagataella pastoris (XP_002490985.1) with a value of 84.19.

Cluster 3 represents protein sequences from the phytase gene AppA, also abbreviated as PhyC [22], and includes mostly E. coli along with various Shigella sp. and Citrobacter freundii. In this group the number of amino acid residues ranges from 428 to 523, while the pI value of the majority of sequences is in the range of 5.5 to 6.5, except for E. albertii (9.35) and E. fergusonii (8.37). The aliphatic index of this group of sequences suggests the highest thermostability among all three clusters. Predominantly positively charged amino acids are present in all three clusters.

The instability index is used to estimate the in vivo half-life of a protein [40]. Proteins reported to have an in vivo half-life of less than 5 hours showed an instability index greater than 40, whereas those with a half-life of more than 16 hours [41] have an instability index of less than 40. The instability index of the HAP sequences under study is higher than 40 (Table 2) for 15 sequences, including the fully characterized E. coli and A. niger phytases, suggesting an in vivo half-life of less than 5 hours. Superfamily analysis of the phytase sequences with the Superfam tool on the ExPASy server assigns all sequences to the histidine acid phosphatase family, which belongs to the phosphoglycerate mutase-like superfamily [42] (Table 3).

Table 3: Distribution of superfamily among HAPhy protein sequences determined using superfam server.

Histidine acid phytases from all three clusters share a large α/β-domain and a small α-domain [22]. MEME analysis identified 10 frequently observed motifs (Table 4). A set of 41 amino acid residues, “SPFCDLFTHEEWIQYDYLQSLGKYYGYGAGNPLGPAQGIGF,” representing motif 1 was conserved and uniformly observed in 38 phytase protein sequences from clusters 1 and 2, that is, PhyA and PhyB, revealing their identity with HP_HAP like, histidine acid phosphatase superfamily. Other motifs are associated with HAP superfamily (Table 2). Cluster 3, representing AppA, does not have motif 1 in its sequences, but it does contain a unique motif 9 that is 50 amino acid residues long, “KKGCPQSGQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDP.” Motif 5, “YAFLKTYNYSLGADDLTPFGEQQLVDSGIKFYQRYESLAKDIVPFIRASG,” is present in all protein sequences of cluster 1 (PhyA). PhyB protein sequences also contain a unique motif 8 that is 41 amino acid residues long, “ETSPENSEGPYAGTTNALRHGAAFRARYGSLYDENSTLPVF.”

Table 4: Distribution of commonly observed motifs in different HAPhy protein sequences along with their functional domains.

4. Conclusion

Phylogenetic clustering and variation among biochemical features of different phytases might contribute to further classification of the highly diverse HAPhys and to their selection for various applications. Conserved sequences in motifs may be used to design specific degenerate primers for identifying and isolating a given type and class of phytase (HAPhy), as numerous phytases are being isolated to meet the need for efficient phytases for feed application in various systems. Variation in biochemical features may be a key source of information for the screening of novel phytases and for comparison with other classes of phytases. The functional attributes of the conserved motifs found need to be verified experimentally. This in silico analysis might be used for future genetic engineering of industrially important phytases.


  1. B. L. Turner, M. J. Papházy, P. M. Haygarth, and I. D. McKelvie, “Inositol phosphates in the environment,” Philosophical Transactions of the Royal Society B, vol. 357, no. 1420, pp. 449–469, 2002.
  2. J. N. A. Lott, I. Ockenden, V. Raboy, and G. D. Batten, “Phytic acid and phosphorus in crop seeds and fruits: a global estimate,” Seed Science Research, vol. 10, no. 1, pp. 11–33, 2000.
  3. G. M. Barnett, “Phosphorus forms in animal manure,” Bioresource Technology, vol. 49, no. 2, pp. 139–147, 1994.
  4. K. Asada, K. Tanaka, and Z. Kasai, “Formation of phytic acid in cereal grains,” Annals of the New York Academy of Sciences, vol. 165, no. 2, pp. 801–814, 1969.
  5. M. Wyss, R. Brugger, A. Kronenberger et al., “Biochemical characterization of fungal phytases (myo-inositol hexakisphosphate phosphohydrolases): catalytic properties,” Applied and Environmental Microbiology, vol. 65, no. 2, pp. 367–373, 1999.
  6. D. B. Mitchell, K. Vogel, B. J. Weimann, L. Pasamontes, and A. P. G. M. Van Loon, “The phytase subfamily of histidine acid phosphatases: isolation of genes for two novel phytases from the fungi Aspergillus terreus and Myceliophthora thermophila,” Microbiology, vol. 143, no. 1, pp. 245–252, 1997.
  7. N. R. Augspurger, D. M. Webel, X. G. Lei, and D. H. Baker, “Efficacy of an E. coli phytase expressed in yeast for releasing phytate-bound phosphorus in young chicks and pigs,” Journal of Animal Science, vol. 81, no. 2, pp. 474–483, 2003.
  8. O. A. Olukosi, A. J. Cowieson, and O. Adeola, “Age-related influence of a cocktail of xylanase, amylase, and protease or phytase individually or in combination in broilers,” Poultry Science, vol. 86, no. 1, pp. 77–86, 2007.
  9. S. M. Rutherfurd, T. K. Chung, and P. J. Moughan, “The effect of microbial phytase on ileal phosphorus and amino acid digestibility in the broiler chicken,” British Poultry Science, vol. 43, no. 4, pp. 598–606, 2002.
  10. D. M. Gibson and A. H. J. Ullah, “Purification and characterization of phytase from cotyledons of germinating soybean seeds,” Archives of Biochemistry and Biophysics, vol. 260, no. 2, pp. 503–513, 1988.
  11. C. E. Hegeman and E. A. Grabau, “A novel phytase with sequence similarity to purple acid phosphatases is expressed in cotyledons of germinating soybean seedlings,” Plant Physiology, vol. 126, no. 4, pp. 1598–1608, 2001.
  12. R. Greiner, M. L. Alminger, and N. G. Carlsson, “Stereospecificity of myo-inositol hexakisphosphate dephosphorylation by a phytate-degrading enzyme of baker's yeast,” Journal of Agricultural and Food Chemistry, vol. 49, no. 5, pp. 2228–2233, 2001.
  13. Y. O. Kim, J. K. Lee, H. K. Kim, J. H. Yu, and T. K. Oh, “Cloning of the thermostable phytase gene (phy) from Bacillus sp. DS11 and its overexpression in Escherichia coli,” FEMS Microbiology Letters, vol. 162, no. 1, pp. 185–191, 1998.
  14. Y. H. Tseng, T. J. Fang, and S. M. Tseng, “Isolation and characterization of a novel phytase from Penicillium simplicissimum,” Folia Microbiologica, vol. 45, no. 2, pp. 121–127, 2000.
  15. A. H. Ullah and D. M. Gibson, “Extracellular phytase (E.C. from Aspergillus ficuum NRRL 3135: purification and characterization,” Preparative Biochemistry, vol. 17, no. 1, pp. 63–91, 1987.
  16. E. J. Mullaney and A. H. J. Ullah, “The term phytase comprises several different classes of enzymes,” Biochemical and Biophysical Research Communications, vol. 312, no. 1, pp. 179–184, 2003.
  17. H. M. Chu, R. T. Guo, T. W. Lin et al., “Structures of Selenomonas ruminantium phytase in complex with persulfated phytate: DSP phytase fold and mechanism for sequential substrate hydrolysis,” Structure, vol. 12, no. 11, pp. 2015–2024, 2004.
  18. S. Afinah, A. M. Yazid, M. H. Anis Shobirin, and M. Shuhaimi, “Phytase: application in food industry,” International Food Research Journal, vol. 17, no. 1, pp. 13–21, 2010.
  19. U. Konietzny and R. Greiner, “Bacterial phytase: potential application, in vivo function and regulation of its synthesis,” Brazilian Journal of Microbiology, vol. 35, no. 1-2, pp. 11–18, 2004.
  20. L. Cao, W. Wang, C. Yang et al., “Application of microbial phytase in fish feed,” Enzyme and Microbial Technology, vol. 40, no. 4, pp. 497–507, 2007.
  21. X. G. Lei and J. M. Porres, “Phytase enzymology, applications, and biotechnology,” Biotechnology Letters, vol. 25, no. 21, pp. 1787–1794, 2003.
  22. B. C. Oh, W. C. Choi, S. Park, Y. O. Kim, and T. K. Oh, “Biochemical properties and substrate specificities of alkaline and histidine acid phytases,” Applied Microbiology and Biotechnology, vol. 63, no. 4, pp. 362–372, 2004.
  23. M. Lehmann, C. Loch, A. Middendorf et al., “The consensus concept for thermostability engineering of proteins: further proof of concept,” Protein Engineering, vol. 15, no. 5, pp. 403–411, 2002.
  24. R. L. Van Etten, R. Davidson, P. E. Stevis, H. MacArthur, and D. L. Moore, “Covalent structure, disulfide bonding, and identification of reactive surface and active site residues of human prostatic acid phosphatase,” Journal of Biological Chemistry, vol. 266, no. 4, pp. 2313–2319, 1991.
  25. E. J. Mullaney and A. H. J. Ullah, “Conservation of cysteine residues in fungal histidine acid phytases,” Biochemical and Biophysical Research Communications, vol. 328, no. 2, pp. 404–408, 2005.
  26. A. K. Dubey, S. Yadav, M. Kumar, V. K. Singh, B. K. Sarangi, and D. Yadav, “In silico characterization of pectate lyase protein sequences from different source organisms,” Enzyme Research, vol. 2010, Article ID 950230, 14 pages, 2010.
  27. N. Malviya, M. Srivastava, S. K. Diwakar, and S. K. Mishra, “Insights to sequence information of polyphenol oxidase enzyme from different source organisms,” Applied Biochemistry and Biotechnology, vol. 165, pp. 397–405, 2011.
  28. V. K. Morya, S. Yadav, E. K. Kim, and D. Yadav, “In silico characterization of alkaline proteases from different species of Aspergillus,” Applied Biochemistry and Biotechnology, vol. 166, pp. 243–257, 2012.
  29. J. Kyte and R. F. Doolittle, “A simple method for displaying the hydropathic character of a protein,” Journal of Molecular Biology, vol. 157, no. 1, pp. 105–132, 1982.
  30. B. Bjellqvist, G. J. Hughes, C. Pasquali et al., “The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences,” Electrophoresis, vol. 14, no. 10, pp. 1023–1031, 1993.
  31. N. Saitou and M. Nei, “The neighbor-joining method: a new method for reconstructing phylogenetic trees,” Molecular Biology and Evolution, vol. 4, no. 4, pp. 406–425, 1987.
  32. E. Zuckerkandl and L. Pauling, “Evolutionary divergence and convergence in proteins,” in Evolving Genes and Proteins, pp. 97–166, 1965.
  33. K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei, and S. Kumar, “MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods,” in Molecular Biology and Evolution, vol. 28, pp. 2731–2739, 2011.
  34. T. L. Bailey and C. Elkan, “Fitting a mixture model by expectation maximization to discover motifs in biopolymers,” in Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology, pp. 28–36, AAAI Press, Menlo Park, Calif, USA, 1994.
  35. E. J. Mullaney and A. H. J. Ullah, “Phytases: attributes, catalytic mechanisms and applications,” in Inositol Phosphates: Linking Agriculture and the Environment, pp. 97–110, CAB International, Oxfordshire, UK, 2007.
  36. Y. Lindqvist, G. Schneider, and P. Vihko, “Crystal structures of rat acid phosphatase complexed with the transition-state analogs vanadate and molybdate. Implications for the reaction mechanism,” European Journal of Biochemistry, vol. 221, no. 1, pp. 139–142, 1994.
  37. K. S. Porvari, A. M. Herrala, R. M. Kurkela et al., “Site-directed mutagenesis of prostatic acid phosphatase. Catalytically important aspartic acid 258, substrate specificity, and oligomerization,” Journal of Biological Chemistry, vol. 269, no. 36, pp. 22642–22646, 1994.
  38. A. Ikai, “Thermostability and aliphatic index of globular proteins,” Journal of Biochemistry, vol. 88, no. 6, pp. 1895–1898, 1980.
  39. N. D. Rawlings, F. R. Morton, and A. J. Barrett, “MEROPS: the peptidase database,” Nucleic Acids Research, vol. 34, pp. D270–D272, 2006.
  40. K. Guruprasad, B. V. B. Reddy, and M. W. Pandit, “Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence,” Protein Engineering, vol. 4, no. 2, pp. 155–161, 1990.
  41. S. Rogers, R. Wells, and M. Rechsteiner, “Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis,” Science, vol. 234, no. 4774, pp. 364–368, 1986.
  42. J. Gough, “The SUPERFAMILY database in structural genomics,” Acta Crystallographica Section D, vol. 58, no. 11, pp. 1897–1900, 2002.