Volume 35 (2013), Issue 5, Pages 513–523
A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification
1. Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA
2. Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
3. Knowledge Discovery and Informatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA
4. Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT 84132, USA
5. Department of Biochemistry and Molecular Biology, University of Texas Medical School, Houston, TX 77030, USA
Received 19 March 2013; Accepted 13 August 2013
Academic Editor: Sheng Pan
Copyright © 2013 Jing Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.