BioMed Research International
Volume 2014 (2014), Article ID 145243, 14 pages
Research Article

Integration of High-Volume Molecular and Imaging Data for Composite Biomarker Discovery in the Study of Melanoma

1 Department of Computer Science and Biomedical Informatics, University of Thessaly, Papasiopoulou 2-4, 35100 Lamia, Greece
2 Department of Digital Systems, University of Piraeus, Grigoriou Lampraki 126, 18532 Piraeus, Greece
3 Metabolic Engineering and Bioinformatics Programme, Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48 Vasileos Constantinou Avenue, 11635 Athens, Greece

Received 29 April 2013; Revised 28 September 2013; Accepted 12 October 2013; Published 16 January 2014

Academic Editor: Hesham H. Ali

Copyright © 2014 Konstantinos Moutselos et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


This work studies the effect of simple imputation schemes on the integration of multimodal data originating from different patients. Two separate cutaneous melanoma datasets are used: an image analysis (dermoscopy) dataset and a transcriptomic (DNA microarray) dataset. Each modality covers a different set of patients, and four imputation methods are employed to form a unified, integrative dataset. Backward feature selection with ensemble classifiers (random forests), followed by principal component analysis and linear discriminant analysis, illustrates the impact of the imputations on feature selection and dimensionality reduction. The results suggest that expanding the feature space through data integration, made possible by imputation schemes in general, aids the classification task and lends stability to the derivation of putative classifiers. In particular, although the biased imputation methods significantly increase the predictive performance and class discrimination of the datasets, they still contribute to the study of prominent features and their relations. The fusion of separate datasets that provide a multimodal description of the same pathology is an innovative, promising avenue that enhances robust composite biomarker derivation and promotes the interpretation of the biomedical problem studied.
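
To make the integration step concrete, the following is a minimal sketch of how two modalities measured on different patients could be merged into one table by imputation. The abstract does not name the four imputation methods, so the two schemes shown here (a global mean and a class-conditional mean) are assumptions chosen for illustration. The function names `column_means` and `integrate`, the `by_class` flag, and the (label, feature vector) layout are likewise hypothetical.

```python
from statistics import mean


def column_means(rows):
    """Per-feature mean over a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def integrate(imaging, molecular, by_class=False):
    """Build one table with columns [imaging features | molecular features].

    imaging, molecular: lists of (label, feature_vector) taken from
    DIFFERENT patients, so every patient lacks the other modality entirely.
    The missing block is filled with the mean of that modality, computed
    either over all patients or only over patients of the same class.
    The class-conditional variant uses the label, so it is biased toward
    separating the classes. Both schemes are illustrative assumptions.
    """
    def make_filler(samples):
        if not by_class:
            global_fill = column_means([x for _, x in samples])
            return lambda label: global_fill
        per_class = {}
        for label, x in samples:
            per_class.setdefault(label, []).append(x)
        fills = {lab: column_means(rows) for lab, rows in per_class.items()}
        # Assumes every class appears in both modalities.
        return lambda label: fills[label]

    fill_imaging = make_filler(imaging)
    fill_molecular = make_filler(molecular)

    # Imaging patients keep their own features and receive imputed molecular ones.
    unified = [(lab, list(x) + list(fill_molecular(lab))) for lab, x in imaging]
    # Molecular patients receive imputed imaging features and keep their own.
    unified += [(lab, list(fill_imaging(lab)) + list(x)) for lab, x in molecular]
    return unified
```

Calling `integrate(imaging, molecular)` and `integrate(imaging, molecular, by_class=True)` on the same inputs shows the difference between the two schemes. In the class-conditional case the imputed block is constant within each class, which is one way an imputation could inflate apparent class discrimination. The unified rows could then be passed to a feature selection and dimensionality reduction pipeline such as the one the abstract describes.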