Advances in Bioinformatics
Volume 2018 (2018), Article ID 9391635, 9 pages
https://doi.org/10.1155/2018/9391635
Research Article

Framework for Parallel Preprocessing of Microarray Data Using Hadoop

1Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Malaysia
2Faculty of Creative Multimedia, Multimedia University, 63100 Cyberjaya, Selangor, Malaysia

Correspondence should be addressed to Ravie Chandren Muniyandi; ravie@ukm.edu.my, Mahdi Sahlabadi; sahlabadi2002@gmail.com, and Hossein Golshanbafghy; h.golshan@gmail.com

Received 9 September 2017; Revised 29 January 2018; Accepted 13 February 2018; Published 29 March 2018

Academic Editor: Florentino Fdez-Riverola

Copyright © 2018 Amirhossein Sahlabadi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. T. D. Pham, D. Beck, and H. Yan, “Spectral pattern comparison methods for cancer classification based on microarray gene expression data,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 53, no. 11, pp. 2425–2430, 2006.
  2. A. Lehmussola, O. Yli-Harja, and S. Hautaniemi, “DNA microarray data preprocessing,” in Proceedings of the 1st International Symposium on Control, Communications and Signal Processing, pp. 751–754, 2004.
  3. Z. Chen, M. McGee, Q. Liu, and R. H. Scheuermann, “A distribution free summarization method for Affymetrix GeneChip® arrays,” Bioinformatics, vol. 23, no. 3, pp. 321–327, 2007.
  4. F. F. Millenaar, J. Okyere, S. T. May, M. van Zanten, L. A. C. J. Voesenek, and A. J. M. Peeters, “How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results,” BMC Bioinformatics, vol. 7, article no. 137, 2006.
  5. A. C. Richard, P. A. Lyons, J. E. Peters et al., “Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation,” BMC Genomics, vol. 15, no. 1, article no. 649, 2014.
  6. D. deRoos, “Hadoop integration with R,” 2014, http://www.dummies.com/programming/big-data/hadoop/hadoop-integration-with-r/.
  7. D. Bates, P. Dalgaard, R. Gentleman, and J. Chambers, “What is R?” 2000, https://www.r-project.org/about.html.
  8. P. Zikopoulos, R. B. Melnyk, B. Brown, and R. C. Dirk deRoos, “Hadoop for dummies,” 2014, http://www.dummies.com/programming/big-data/hadoop/hadoop-integration-with-r/.
  9. DNA Microarray Technology, “National Human Genome Research Institute, United States, Lab Report January,” 2015.
  10. Q. De Clerck, Analyzing and Benchmarking Genomic Preprocessing and Batch Effect Removal Methods in Big Data Infrastructure, chapters 2, 3, pp. 1–110, Vrije Universiteit Brussel, Brussels, Belgium, 2014.
  11. S. Niu, G. Yang, N. Sarma et al., “Combining Hadoop and GPU to preprocess large Affymetrix microarray data,” in Proceedings of the 2nd IEEE International Conference on Big Data, IEEE Big Data 2014, pp. 692–700, October 2014.
  12. J. M. Freudenberg, Comparison of background correction and normalization procedures for high-density oligonucleotide microarrays, Universität Leipzig, Germany: Interdisciplinary Centre for Bioinformatics, 3rd edition, 2005.
  13. R. Fajriyah, Microarray Data Analysis: Background Correction and Differentially Expressed Genes, Technische Universität Graz, Styria, Austria, 2015.
  14. Y. Abagyan and R. Zhou, “Algorithms for high-density oligonucleotide array,” Curr Opin Drug Discov Devel, vol. 6, no. 3, pp. 339–345, 2003.
  15. Anon, Summarizing Oligonucleotide Expression Data, Virginia Commonwealth University, Virginia, 2010.
  16. R. A. Irizarry, B. M. Bolstad, F. Collin, L. M. Cope, B. Hobbs, and T. P. Speed, “Summaries of Affymetrix GeneChip probe level data,” Nucleic Acids Research, vol. 31, no. 4, article e15, 2003.
  17. B. M. Bolstad, Low-level Analysis of High-density Oligonucleotide Array Data: Background, Normalization and Summarization, University of California, 2004.
  18. L. Gautier, L. Cope, B. M. Bolstad, and R. A. Irizarry, “Affy—analysis of Affymetrix GeneChip data at the probe level,” Bioinformatics, vol. 20, no. 3, pp. 307–315, 2004.
  19. M. Cannataro, Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare, IGI Global, Catanzaro, Italy, 1st edition, 2009.
  20. H. Pietro and M. G. Cannataro, “Parallel Pre-processing of Affymetrix Microarray Data,” in Euro-Par 2010 Parallel Processing Workshops: HeteroPar, HPCC, HiBB, CoreGrid, UCHPC, HPCF, PROPER, CCPI, VHPC, R. G. Mario, Ed., pp. 225–232, Springer, Ischia, Italy, 2010.
  21. M. Cannataro and P. H. Guzzi, “The role of parallelism, web services and ontologies in bioinformatics and omics data management and analysis,” EMBnet.journal, vol. 19, no. B, p. 59, 2013.
  22. A. Mohiuddin, A. S. M. Raju Chowdhury, A. Mustaq, and M. H. Rafee, “An Advanced Survey on Cloud Computing and State-of-the-art Research Issues,” International Journal of Computer Science Issues (IJCSI), vol. 9, no. 1, pp. 201–207, 2012.
  23. J. T. Dudley and A. J. Butte, “In silico research in the era of cloud computing,” Nature Biotechnology, vol. 28, no. 11, pp. 1181–1185, 2010.
  24. G. Agapito, M. Cannataro, P. H. Guzzi, F. Marozzo, D. Talia, and P. Trunfio, “Cloud4SNP: Distributed analysis of SNP microarray data on the cloud,” in Proceedings of the 2013 4th ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics, ACM-BCB 2013, pp. 468–475, Washington, DC, USA, September 2013.
  25. B. Calabrese and M. Cannataro, “Bioinformatics and microarray data analysis on the cloud,” Methods in Molecular Biology, vol. 1375, pp. 25–39, 2016.
  26. G. Agapito, P. H. Guzzi, and M. Cannataro, “Parallel processing of genomics data,” Numerical Computations: Theory And Algorithms (Numta–2016), 2016.
  27. M. Grossman, M. Breternitz, and V. Sarkar, “HadoopCL: MapReduce on distributed heterogeneous platforms through seamless integration of hadoop and OpenCL,” in Proceedings of the 2013 IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013, pp. 1918–1927, Washington, DC, USA, May 2013.
  28. Tom Preston-Werner Chris Wanstrath, 2008, https://github.com/RevolutionAnalytics/RHadoop/wiki.
  29. D. deRoos, P. C. Zikopoulos, B. Roman, B. Brown, and C. Rafael, Hadoop for Dummies, John Wiley & Sons, Hoboken, NJ, USA, 1st edition, 2014.
  30. D. Parveen Kumar, Big Data Analytics with R and Hadoop, Department of Computer Science & Engineering Yogi Vemana University, 2016.
  31. J. George, O. Senko, B. Mow et al., Genetic Reclassification of Histologic Grade Delineates New Clinical Subtypes of Breast Cancer, 2016, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4922.
  32. MOAC DTC, Reading the NCBI's GEO microarray SOFT files in R/BioConductor, Engineering and Physical Science Research Council, England, science report, 2007.
  33. U. Mansmann and M. Schmidberger, “Parallelized preprocessing algorithms for high-density oligonucleotide arrays,” in Proceedings of the Parallel and Distributed Processing, IPDPS 2008, April 2008.