Advances in Bioinformatics
Volume 2013 (2013), Article ID 618461, 8 pages
http://dx.doi.org/10.1155/2013/618461
Research Article

Identification of Robust Pathway Markers for Cancer through Rank-Based Pathway Activity Inference

Navadon Khunlertgit and Byung-Jun Yoon

Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843-3128, USA

Received 30 November 2012; Accepted 19 January 2013

Academic Editor: Hazem Nounou

Copyright © 2013 Navadon Khunlertgit and Byung-Jun Yoon. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

One important problem in translational genomics is the identification of reliable and reproducible markers that can be used to discriminate between different classes of a complex disease, such as cancer. The typical small-sample setting makes the prediction of such markers very challenging, and various approaches have been proposed to address this problem. For example, it has been shown that pathway markers, which aggregate the activities of the genes in the same pathway, tend to be more robust than gene markers. Furthermore, the use of gene expression ranking has been shown to be robust to batch effects and to lead to more interpretable results. In this paper, we propose an enhanced pathway activity inference method that uses gene ranking to predict the pathway activity in a probabilistic manner. The main focus of this work is on identifying robust pathway markers that can ultimately lead to robust classifiers with reproducible performance across datasets. Simulation results based on multiple breast cancer datasets show that the proposed inference method identifies better pathway markers that can predict breast cancer metastasis with higher accuracy. Moreover, the identified pathway markers can lead to better classifiers with more consistent classification performance across independent datasets.