Review Article

Intelligence Algorithms for Protein Classification by Mass Spectrometry

Table 1

Typical classification algorithms, their characteristics, and the samples they were applied to.

| Method | Feature | Samples |
|---|---|---|
| Logistic regression | Predicts peak intensity patterns exactly and simplifies an SVD decomposition [13]. | Tandem mass spectrometry |
| KNN algorithm | Neighbors ranked by Euclidean distance or by one minus correlation [11]. | Ovarian cancer MALDI-MS database |
| | A modification of the Euclidean distance formula [16]. | Patients with mild cognitive impairment and patients with clinical symptoms of Alzheimer’s disease [16]. |
| Support vector machines | Uses 4 genes. | Colon cancer database |
| | Suited to noisy high-throughput proteomics and microarray data; outperforms in robustness to noise. | SELDI-TOF-MS |
| | An unsupervised feature selection phase, restriction of the coefficient of variation, and wavelet analysis for classification [17]. | Ovarian cancer database [17]. |
| Decision tree algorithm | A new high-throughput proteomic classification system built on a nine-protein mass pattern [18]. | Blood samples from prostate cancer patients and a cohort of healthy men [18]. |
| Classification tree | Partitions the learning sample into smaller and smaller subsamples so that the disease status within each subsample is relatively homogeneous [19]. | Clinical specimens [19]. |
| | Combines MALDI-TOF MS with WCX magnetic beads; high sensitivity (98.3%) and high specificity (84.4%) [20]. | Patients with pulmonary tuberculosis [20]. |
| | Boosted feature extraction coupled with the nearest centroid classifier; high accuracy [21]. | OCWCX2a [21]. |
| Random forest | Used as both feature extractor and classifier; suited to small samples [4]. | Serum samples from patients with ovarian cancer [4]. |
| | Applied to a complex proteome with a wide range of protein concentrations [22]. | Signature peptides [22]. |
| | Nonlinear, random, and combined with a discrete mapping approach [23]. | Phosphorylation data set [23]. |
| Neural network algorithms | A multilayer perceptron ANN with a backpropagation algorithm [24]. | SELDI-MS data [24]. |
| | Naive Bayes used with a multilayer perceptron [25]. | Mass data set with InfoGain and Relief-F [25]. |
| | Based on SRNG and FLSOM [26]. | Breast cancer, listeria, and tissue data set [26]. |
| | Convolutional neural networks [27]. | Q-TOF and IT [27]. |
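
The two KNN distance measures listed in Table 1, Euclidean distance and one minus correlation, can be illustrated with a minimal sketch. It is not the implementation used in [11]: the function names, the class labels, and the four-peak intensity vectors below are hypothetical toy values chosen only to show how the choice of distance plugs into a majority-vote KNN classifier.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length intensity vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_minus_correlation(a, b):
    """1 - Pearson correlation; small when two spectra have the same shape."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a flat spectrum is treated as uncorrelated.
        return 1.0
    return 1.0 - cov / math.sqrt(var_a * var_b)


def knn_predict(train, query, k=3, distance=euclidean):
    """Majority vote among the k training spectra closest to the query."""
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (peak intensities, class label).
train = [
    ([1.0, 5.0, 2.0, 8.0], "cancer"),
    ([1.2, 4.8, 2.1, 7.9], "cancer"),
    ([6.0, 2.0, 7.0, 1.0], "control"),
    ([5.8, 2.2, 6.9, 1.1], "control"),
]
query = [1.1, 5.1, 1.9, 8.2]

print(knn_predict(train, query, k=3, distance=euclidean))
print(knn_predict(train, query, k=3, distance=one_minus_correlation))
```

One minus correlation compares the shape of two spectra and ignores overall intensity scale, whereas Euclidean distance is sensitive to absolute intensity, so the two measures can rank neighbors differently when spectra are not normalized.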