BioMed Research International
Volume 2014, Article ID 213656, 15 pages
http://dx.doi.org/10.1155/2014/213656
Review Article

A Review of Feature Extraction Software for Microarray Gene Expression Data

Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia

Received 23 April 2014; Revised 24 July 2014; Accepted 24 July 2014; Published 31 August 2014

Academic Editor: Dongchun Liang

Copyright © 2014 Ching Siang Tan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
