Review Article

A Review of Feature Extraction Software for Microarray Gene Expression Data

Table 1

A summary of PCA software.

| Number | Software | Author/year | Language | Features |
| --- | --- | --- | --- | --- |
| 1 | FactoMineR | Lê et al. [5] | R | (i) Various dimension reduction methods such as PCA, CA, and MCA; (ii) Different types of variables, data structures, and supplementary information are considered; (iii) The PCA function can handle missing values |
| 2 | ExPosition | Beaton et al. [8] | R | (i) Numerous multivariate analysis methods such as PCA and Generalized Principal Component Analysis (GPCA); (ii) Can be applied to quantitative and qualitative data; (iii) Implementation of Singular Value Decomposition |
| 3 | amap | Lucas [9] | R | (i) Different types of PCA are provided: PCA, Generalized PCA, and Robust PCA; (ii) Clustering methods are provided such as hierarchical clustering and k-means clustering; (iii) Plotting function for PCA; (iv) Computing distance and dissimilarity matrices |
| 4 | ADE-4 | Thioulouse et al. [10] | R | A variety of methods such as PCA, CA, Principal Analysis Regression, PLS, and others are offered |
| 5 | MADE4 | Culhane et al. [11] | R | (i) Functions provided by ADE-4; (ii) Integration of multiple datasets for multivariate analysis; (iii) Functions for visualizing and plotting the results of analysis, including 3D plots; (iv) Addition of LLSimpute algorithm for imputation of missing values |
| 6 | XLMiner | Witten and Frank [12] | Implemented in Excel | (i) Provision of data reduction methods such as PCA; (ii) Can be used for classification, clustering, data preprocessing, data normalization, and others |
| 7 | ViSta | Young et al. [13] | C++, Fortran, XLisp, and ViDAL | (i) Multivariate analysis methods are offered such as PCA, Interactive Cluster Analysis, and Parallel Boxplots; (ii) Provision of dynamic and high-interaction visualization for displaying multiple views of data |
| 8 | imDEV | Grapov and Newman [14] | Visual Basic and R | (i) Data preprocessing: missing values imputation and data transformations; (ii) Clustering methods are offered; (iii) Dimension reduction methods: PCA and ICA; (iv) Feature selection methods; (v) Visualization of data dependencies |
| 9 | Statistics Toolbox | The MathWorks [15] | MATLAB | (i) Multivariate statistics such as PCA, clustering, and others; (ii) Statistical plots, probability distributions, linear models, nonlinear models for regression, and others are provided |
| 10 | Weka | Hall et al. [16] | Java | A variety of machine learning algorithms are provided such as feature selection, data preprocessing, regression, dimension reduction, classification, and clustering methods |
| 11 | NAG Library | NAG Toolbox for MATLAB [17] | Fortran and C | (i) Provision of more than 1700 mathematical and statistical algorithms; (ii) Multivariate analysis using PCA can be implemented using the g03aa routine |
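
PCA, the method common to the packages summarized in Table 1, amounts to finding the directions of largest variance in the centred data. As a rough illustration only, the sketch below computes the first principal component of a toy matrix in plain Python. It does not reproduce the implementation of any listed package: power iteration on the sample covariance matrix is used here purely for brevity, and the function names (`center`, `covariance`, `first_component`) and the toy data are hypothetical.

```python
import math


def center(rows):
    """Subtract each column mean; rows are samples, columns are variables."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centred data (divides by n - 1)."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iters=200):
    """Leading eigenvector and eigenvalue of cov by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero covariance matrix: nothing to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance along v (v has unit length).
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


# Hypothetical toy data: 4 samples, 2 perfectly correlated variables.
data = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]

centered = center(data)
cov = covariance(centered)
component, variance = first_component(cov)

# Project each centred sample onto the first principal component.
scores = [sum(x * c for x, c in zip(row, component)) for row in centered]

# Share of total variance (trace of cov) captured by the first component.
explained = variance / sum(cov[i][i] for i in range(len(cov)))
```

For real expression matrices with thousands of genes, the packages in Table 1 are the appropriate tools; this sketch is meant only to make the underlying computation concrete.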