Advances in Bioinformatics
Volume 2013 (2013), Article ID 790567, 10 pages
http://dx.doi.org/10.1155/2013/790567
Research Article

Comparing Imputation Procedures for Affymetrix Gene Expression Datasets Using MAQC Datasets

1Department of Biostatistics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
2Center for Computational Research, University at Buffalo, NYS Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY 14203, USA
3Department of Biostatistics, SUNY University at Buffalo, Buffalo, NY 14214, USA

Received 26 June 2013; Accepted 28 August 2013

Academic Editor: Shandar Ahmad

Copyright © 2013 Sreevidya Sadananda Sadasiva Rao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. O. Troyanskaya, M. Cantor, G. Sherlock et al., “Missing value estimation methods for DNA microarrays,” Bioinformatics, vol. 17, no. 6, pp. 520–525, 2001.
  2. H. Wold, “Path models with latent variables: the NIPALS approach,” in Quantitative Sociology: International Perspectives on Mathematical and Statistical Modeling, pp. 307–357, 1975.
  3. S. Oba, M. Sato, I. Takemasa, M. Monden, K. Matsubara, and S. Ishii, “A Bayesian missing value estimation method for gene expression profile data,” Bioinformatics, vol. 19, no. 16, pp. 2088–2096, 2003.
  4. T. H. Bø, B. Dysvik, and I. Jonassen, “LSimpute: accurate estimation of missing values in microarray data with least squares methods,” Nucleic Acids Research, vol. 32, no. 3, p. e34, 2004.
  5. H. Kim, G. H. Golub, and H. Park, “Missing value estimation for DNA microarray gene expression data: local least squares imputation,” Bioinformatics, vol. 21, no. 2, pp. 187–198, 2005.
  6. M. Ouyang, W. J. Welsh, and P. Georgopoulos, “Gaussian mixture clustering and imputation of microarray data,” Bioinformatics, vol. 20, no. 6, pp. 917–923, 2004.
  7. J. C. Miecznikowski, S. Damodaran, K. F. Sellers, D. E. Coling, R. Salvi, and R. A. Rabin, “A comparison of imputation procedures and statistical tests for the analysis of two-dimensional electrophoresis data,” Proteome Science, vol. 9, p. 14, 2011.
  8. G. N. Brock, J. R. Shaffer, R. E. Blakesley, M. J. Lotz, and G. C. Tseng, “Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes,” BMC Bioinformatics, vol. 9, no. 1, p. 12, 2008.
  9. M. Celton, A. Malpertuy, G. Lelandais, and A. G. de Brevern, “Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments,” BMC Genomics, vol. 11, no. 1, p. 15, 2010.
  10. S. Oh, D. D. Kang, G. N. Brock, and G. C. Tseng, “Biological impact of missing-value imputation on downstream analyses of gene expression profiles,” Bioinformatics, vol. 27, no. 1, Article ID btq613, pp. 78–86, 2011.
  11. R. Mei, X. Di, T. B. Ryder et al., “Analysis of high density expression microarrays with signed-rank call algorithms,” Bioinformatics, vol. 18, no. 12, pp. 1593–1599, 2002.
  12. R. Gentleman, V. Carey, W. Huber, R. Irizarry, and S. Dudoit, “Bioinformatics and computational biology solutions using R and Bioconductor,” Statistics for Biology and Health, 2005.
  13. L. Gautier, L. Cope, B. M. Bolstad, and R. A. Irizarry, “Affy-Analysis of Affymetrix GeneChip data at the probe level,” Bioinformatics, vol. 20, no. 3, pp. 307–315, 2004.
  14. L. Shi, “The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements,” Nature Biotechnology, vol. 24, no. 9, pp. 1151–1161, 2006.
  15. J. J. Chen, H. Hsueh, R. R. Delongchamp, C. Lin, and C. Tsai, “Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data,” BMC Bioinformatics, vol. 8, no. 1, p. 412, 2007.
  16. L. Shi, W. D. Jones, R. V. Jensen et al., “The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies,” BMC Bioinformatics, vol. 9, supplement 9, p. S10, 2008.
  17. S. E. Choe, M. Boutros, A. M. Michelson, G. M. Church, and M. S. Halfon, “Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset,” Genome Biology, vol. 6, no. 2, p. R16, 2005.
  18. Q. Zhu, J. C. Miecznikowski, and M. S. Halfon, “Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset,” BMC Bioinformatics, vol. 11, no. 1, p. 285, 2010.
  19. Q. Zhu, J. C. Miecznikowski, and M. S. Halfon, “A wholly defined Agilent microarray spike-in dataset,” Bioinformatics, vol. 27, no. 9, Article ID btr135, pp. 1284–1289, 2011.
  20. Affymetrix, Inc., “Statistical algorithms description document,” Technical Paper, 2002.
  21. C. L. Wilson and C. J. Miller, “Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis,” Bioinformatics, vol. 21, no. 18, pp. 3683–3685, 2005.
  22. R. C. Gentleman, V. J. Carey, D. M. Bates et al., “Bioconductor: open software development for computational biology and bioinformatics,” Genome Biology, vol. 5, no. 10, p. R80, 2004.
  23. T. Hastie, R. Tibshirani, B. Narasimhan, and G. Chu, Impute: Imputation for Microarray Data, 1999, R package version 1.10.0.
  24. T. H. Bø, B. Dysvik, and I. Jonassen, “LSimpute: accurate estimation of missing values in microarray data with least squares methods,” 2005, http://www.ii.uib.no/~trondb/imputation/.
  25. D. V. Nguyen, N. Wang, and R. J. Carroll, “Evaluation of missing value estimation for microarray data,” Journal of Data Science, vol. 2, no. 4, pp. 347–370, 2004.
  26. W. Stacklies and H. Redestig, PcaMethods: A Collection of PCA Methods, 2007, R package version 1.18.0.
  27. S. S. Sadasiva Rao, L. A. Shepherd, A. E. Bruno, S. Liu, and J. C. Miecznikowski, “A full analysis of imputation procedures for Affymetrix gene expression datasets,” Technical Report 1202, SUNY University at Buffalo-Department of Biostatistics, Buffalo, NY, USA, 2012.
  28. T. A. Patterson, E. K. Lobenhofer, S. B. Fulmer-Smentek et al., “Performance comparison of one-color and two-color platforms within the MicroArray Quality Control (MAQC) project,” Nature Biotechnology, vol. 24, no. 9, pp. 1140–1150, 2006.
  29. Z. Wen, C. Wang, Q. Shi et al., “Evaluation of gene expression data generated from expired Affymetrix GeneChip® microarrays using MAQC reference RNA samples,” BMC Bioinformatics, vol. 11, supplement 6, p. S10, 2010.
  30. J. Luo, M. Schumacher, A. Scherer et al., “A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data,” Pharmacogenomics Journal, vol. 10, no. 4, pp. 278–291, 2010.
  31. K. Kadota and K. Shimizu, “Evaluating methods for ranking differentially expressed genes applied to microArray quality control data,” BMC Bioinformatics, vol. 12, no. 1, p. 227, 2011.
  32. T. Aittokallio, “Dealing with missing values in large-scale studies: microarray data imputation and beyond,” Briefings in Bioinformatics, vol. 11, no. 2, Article ID bbp059, pp. 253–264, 2009.
  33. J. Tuikkala, L. L. Elo, O. S. Nevalainen, and T. Aittokallio, “Missing value imputation improves clustering and interpretation of gene expression microarray data,” BMC Bioinformatics, vol. 9, no. 1, p. 202, 2008.
  34. A. Liew, N. Law, and H. Yan, “Missing value imputation for gene expression data: computational techniques to recover missing data from available information,” Briefings in Bioinformatics, vol. 12, no. 5, Article ID bbq080, pp. 498–513, 2011.
  35. B. M. Bolstad, R. A. Irizarry, M. Åstrand, and T. P. Speed, “A comparison of normalization methods for high density oligonucleotide array data based on variance and bias,” Bioinformatics, vol. 19, no. 2, pp. 185–193, 2003.
  36. R. A. Irizarry, B. M. Bolstad, F. Collin, L. M. Cope, B. Hobbs, and T. P. Speed, “Summaries of Affymetrix GeneChip probe level data,” Nucleic Acids Research, vol. 31, no. 4, p. e15, 2003.
  37. R. A. Irizarry, B. Hobbs, F. Collin et al., “Exploration, normalization, and summaries of high density oligonucleotide array probe level data,” Biostatistics, vol. 4, no. 2, pp. 249–264, 2003.
  38. Z. Wu, R. A. Irizarry, R. Gentleman, F. Martinez-Murillo, and F. Spencer, “A model-based background adjustment for oligonucleotide expression arrays,” Journal of the American Statistical Association, vol. 99, no. 468, pp. 909–917, 2004.
  39. A. R. Dabney and J. D. Storey, “A reanalysis of a published Affymetrix GeneChip control dataset,” Genome Biology, vol. 7, no. 3, p. 401, 2006.
  40. D. P. Gaile and J. C. Miecznikowski, “Putative null distributions corresponding to tests of differential expression in the Golden Spike dataset are intensity dependent,” BMC Genomics, vol. 8, no. 1, p. 105, 2007.
  41. J. M. Perkel, “Six things you won't find in the MAQC,” The Scientist, vol. 20, no. 11, p. 68, 2007.
  42. P. Liang, “MAQC papers over the cracks,” Nature Biotechnology, vol. 25, no. 1, pp. 27–28, 2007.
  43. L. Shi, W. D. Jones, R. V. Jensen et al., “Reply to MAQC papers over the cracks,” Nature Biotechnology, vol. 25, pp. 28–29, 2007.