Computational and Mathematical Methods in Medicine
Volume 2013 (2013), Article ID 340678, 8 pages
Research Article

SNP Selection in Genome-Wide Association Studies via Penalized Support Vector Machine with MAX Test

1 Department of Statistics and Information Science, Dongguk University, Gyeongju 780-714, Republic of Korea
2 Samsung Cancer Research Institute, Samsung Medical Center, Seoul 137-710, Republic of Korea
3 Department of Medical Oncology and Hematology, Princess Margaret Hospital, University of Toronto, Toronto, ON, Canada M5G 2M9
4 Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA

Received 22 May 2013; Revised 14 August 2013; Accepted 22 August 2013

Academic Editor: Wenqing He

Copyright © 2013 Jinseog Kim et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


One of the main objectives of a genome-wide association study (GWAS) is to develop a prediction model for a binary clinical outcome using single-nucleotide polymorphisms (SNPs), which can be used for diagnostic and prognostic purposes and for a better understanding of the relationship between the disease and SNPs. Penalized support vector machine (SVM) methods have been widely used toward this end. However, because investigators often ignore the genetic models of the SNPs, the final model loses efficiency in predicting the clinical outcome. To overcome this problem, we propose a two-stage method in which the genetic model of each SNP is first identified using the MAX test and a prediction model is then fitted using a penalized SVM method. We apply the proposed method to various penalized SVMs and compare the performance of SVMs using various penalty functions. The results from simulations and real GWAS data analysis show that the proposed method performs better than prediction methods that ignore the genetic models in terms of prediction power and selectivity.
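
As a rough illustration of the first stage, the sketch below picks a genetic model for one SNP from its case/control genotype counts. It assumes the common formulation of the MAX test as the largest absolute Cochran-Armitage trend statistic over recessive, additive, and dominant scores; this summary does not spell out the authors' exact procedure, so the score sets, the function names (`trend_statistic`, `max_test`, `recode`), and the example counts are all hypothetical. The second stage, fitting a penalized SVM to the recoded SNPs, is not shown.

```python
import math

# Assumed genotype scores for 0, 1, and 2 copies of the risk allele.
MODEL_SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_statistic(case_counts, control_counts, scores):
    """Cochran-Armitage trend statistic for one SNP and one score vector."""
    R, S = sum(case_counts), sum(control_counts)
    N = R + S
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    # Score-weighted difference between case and control genotype counts.
    t = sum(x * (S * r - R * s)
            for x, r, s in zip(scores, case_counts, control_counts))
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var_term = R * S * (N * sxx - sx * sx)
    if var_term <= 0:
        # Degenerate table (for example, only one genotype observed).
        return 0.0
    return math.sqrt(N) * t / math.sqrt(var_term)


def max_test(case_counts, control_counts):
    """Return (best model, MAX statistic, per-model statistics)."""
    stats = {
        model: trend_statistic(case_counts, control_counts, scores)
        for model, scores in MODEL_SCORES.items()
    }
    best = max(stats, key=lambda m: abs(stats[m]))
    return best, abs(stats[best]), stats


def recode(genotypes, model):
    """Map genotypes coded 0/1/2 to the scores of the chosen model."""
    scores = MODEL_SCORES[model]
    return [scores[g] for g in genotypes]


# Hypothetical counts for genotypes (0, 1, 2 risk alleles).
best_model, max_stat, all_stats = max_test((30, 30, 40), (45, 45, 10))
print(best_model, round(max_stat, 3))  # recessive 4.899
print(recode([0, 1, 2, 2], best_model))  # [0.0, 0.0, 1.0, 1.0]
```

The three trend statistics are correlated, so the MAX statistic does not follow a standard normal distribution; the sketch uses it only to choose a coding for each SNP, not to compute a p-value.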