ISRN Bioinformatics
Volume 2013 (2013), Article ID 404717, 15 pages
http://dx.doi.org/10.1155/2013/404717
Research Article

Exploiting Identifiability and Intergene Correlation for Improved Detection of Differential Expression

1Department of Electrical and Computer Engineering, Michigan State University, 2120 EB, East Lansing, MI 48824, USA
2Department of Molecular Biology & Biochemistry, Carcinogenesis Laboratory, Michigan State University, 341 FST, East Lansing, MI 48824, USA

Received 13 October 2012; Accepted 19 November 2012

Academic Editors: H. Ma, K. Mizuguchi, and H.-C. Yang

Copyright © 2013 J. R. Deller et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Accurate differential analysis of microarray data strongly depends on effective treatment of intergene correlation. Such dependence is ordinarily accounted for in terms of its effect on significance cutoffs. In this paper, it is shown that correlation can, in fact, be exploited to share information across tests and reorder expression differentials for increased statistical power, regardless of the threshold. Significantly improved differential analysis is the result of two simple measures: (i) adjusting test statistics to exploit information from identifiable genes (the large subset of genes represented on a microarray that can be classified a priori as nondifferential with very high confidence), but (ii) doing so in a way that accounts for linear dependencies among identifiable and nonidentifiable genes. A method is developed that builds upon the widely used two-sample t-statistic approach and uses analysis in Hilbert space to decompose the nonidentified gene vector into two components: one correlated and one uncorrelated with the identified set. In an application to data derived from a widely studied prostate cancer database, the proposed method outperforms some of the most highly regarded approaches published to date. Algorithms in MATLAB and in R are available for public download.
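To make the decomposition idea concrete, the following is a minimal sketch of an orthogonal projection in a finite-dimensional inner-product space. It is not the authors' algorithm (their MATLAB and R implementations are the reference). It assumes that the vectors are already mean-centered, so that the ordinary dot product stands in for correlation, and the function names `orthonormal_basis` and `decompose` are hypothetical, introduced only for illustration.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: build an orthonormal basis for span(vectors).

    Vectors that are (numerically) linear combinations of earlier
    ones are skipped, so linearly dependent inputs are tolerated.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def decompose(y, identified):
    """Split y into (projected, residual).

    projected lies in the span of the identified vectors;
    residual is orthogonal to every identified vector.
    Assumption: inputs are mean-centered (illustrative only).
    """
    basis = orthonormal_basis(identified)
    projected = [0.0] * len(y)
    for q in basis:
        c = dot(y, q)
        projected = [p + c * qi for p, qi in zip(projected, q)]
    residual = [yi - pi for yi, pi in zip(y, projected)]
    return projected, residual


# Toy example with made-up numbers (not data from the paper).
identified = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
y = [1.0, 2.0, 3.0, 4.0]
projected, residual = decompose(y, identified)
```

In this toy setting, `projected` plays the role of the component linearly explained by the identified set, and `residual` the component uncorrelated with it. How the two components enter the adjusted test statistic is specified in the body of the paper, not here.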