Table of Contents Author Guidelines Submit a Manuscript
BioMed Research International
Volume 2017 (2017), Article ID 2437608, 11 pages
Research Article

Metabolomic Biomarker Identification in Presence of Outliers and Missing Values

1Bioinformatics Lab, Department of Statistics, Rajshahi University, Rajshahi, Bangladesh
2Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh
3Department of Statistics, Begum Rokeya University, Rangpur, Bangladesh
4Institute of Biological Sciences, Rajshahi University, Rajshahi, Bangladesh

Correspondence should be addressed to Nishith Kumar;

Received 24 October 2016; Revised 15 December 2016; Accepted 18 January 2017; Published 14 February 2017

Academic Editor: Peter J. Oefner

Copyright © 2017 Nishith Kumar et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Metabolomics is the sophisticated and high-throughput technology based on the entire set of metabolites which is known as the connector between genotypes and phenotypes. For any phenotypic changes, potential metabolite (biomarker) identification is very important because it provides diagnostic as well as prognostic markers and can help to develop new biomolecular therapy. Biomarker identification from metabolomics data analysis is hampered by the use of high-throughput technology that provides high dimensional data matrix which contains missing values as well as outliers. However, missing value imputation and outliers handling techniques play important role in identifying biomarker correctly. Although several missing value imputation techniques are available, outliers deteriorate the accuracy of imputation as well as the accuracy of biomarker identification. Therefore, in this paper we have proposed a new biomarker identification technique combining the groupwise robust singular value decomposition, -test, and fold-change approach that can identify biomarkers more correctly from metabolomics dataset. We have also compared the performance of the proposed technique with those of other traditional techniques for biomarker identification using both simulated and real data analysis in absence and presence of outliers. Using our proposed method in hepatocellular carcinoma (HCC) dataset, we have also identified the four upregulated and two downregulated metabolites as potential metabolomic biomarkers for HCC disease.