BioMed Research International
Volume 2014 (2014), Article ID 753428, 10 pages
Integration of Residue Attributes for Sequence Diversity Characterization of Terpenoid Enzymes
Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan
Received 1 November 2013; Accepted 21 February 2014; Published 11 May 2014
Academic Editor: Samuel Kuria Kiboi
Copyright © 2014 Nelson Kibinge et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.