Research Article  Open Access
WenKuei Chien, Chuhsing Kate Hsiao, "Applications of Bayesian Gene Selection and Classification with Mixtures of Generalized Singular Priors", Computational and Mathematical Methods in Medicine, vol. 2013, Article ID 420412, 11 pages, 2013. https://doi.org/10.1155/2013/420412
Applications of Bayesian Gene Selection and Classification with Mixtures of Generalized Singular Priors
Abstract
Recent advancement in microarray technologies has led to the collection of an enormous number of genetic markers in disease association studies, and yet scientists are interested in selecting a smaller set of genes to explore the relation between genes and disease. Current approaches either adopt a single-marker test, which ignores the possible interaction among genes, or consider a multistage procedure that reduces the large number of genes before evaluating the association. Among the latter, Bayesian analysis can further accommodate the correlation between genes through the specification of a multivariate prior distribution and estimate the probabilities of association through latent variables. The covariance matrix, however, depends on an unknown parameter. In this research, we suggested a reference hyperprior distribution for such uncertainty, outlined the implementation of its computation, and illustrated this fully Bayesian approach with colon cancer and leukemia studies. Comparison with other existing methods was also conducted. The classification accuracy of our proposed model is higher with a smaller set of selected genes. The results not only replicated findings in several earlier studies, but also provided the strength of association with posterior probabilities.
1. Introduction
Recent advancement in oligonucleotide microarray technologies has resulted in production of thousands of gene expression levels in a single experiment. With such a vast amount of data, one major task for researchers is to develop classification rules for the prediction of cancers or cancer subtypes based on gene expression levels of tissue samples. The accuracy of such classification rules may be crucial for diagnosis and treatment, since different cancer subtypes may require different target-specific therapies. However, the development of good and efficient classification rules has not been straightforward, either because of the huge number of genes collected from a relatively small number of tissue samples or because of the model complexity associated with the biological mechanism. The identification of a smaller set of relevant genes to characterize different disease classes, therefore, has been a challenging task. Procedures which are efficient in gene selection as well as in classification play an important role in cancer research.
Many approaches have been proposed for class classification. For example, several analyses identified a subset of classifying genes with test statistics, a regression model approach, a mixture model, the Wilcoxon score test, or the between-within classes sum of squares (BSS/WSS) [1–7]. These methods are univariate in the sense that each gene is tested individually. Others started with an initial step of dimension reduction before classification procedures, such as principal components analysis (PCA) [8–10] and the partial least squares algorithm (PLS algorithm) [11–15]. These methods may reduce dimension (the number of genes) effectively but may not be biologically interpretable. To capture the gene-gene correlations, researchers proposed the pair-based method [16], correlation-based feature selection [17], and the Markov random field prior [18]. Although these methods can model the gene-gene interaction, they can be computationally time-consuming.
A Bayesian approach can accommodate naturally the interplay between genes via prior distributions, under the setting of regression models. Examples include the Bayesian hierarchical mixture model [19–21] and a logistic or probit link with latent variables and the stochastic search variable selection (SSVS) procedure for binary and multicategorical phenotypes [22–25]. To consider all genes simultaneously, most Bayesian approaches adopt a multivariate analysis with a natural conjugate prior, called Zellner's g-prior, for the regression parameters [26]. This a priori distribution utilizes the design matrix X in the prior covariance matrix of the regression coefficients and can lead to a relatively simple posterior distribution. However, if the number of genes is much larger than the number of samples available, the dimension of X'X becomes large and a high degree of multicollinearity may occur. In that case, the covariance matrix of Zellner's g-prior becomes nearly singular. Modifications included the prior distribution with the Moore–Penrose generalized inverse matrix [27] and the use of a ridge parameter [28, 29]. Alternatively, other researchers focused on the scalar g in the g-prior, which controls the expected size of the nonzero regression coefficients. For instance, it was reported that the final results are insensitive to values of g between 10 and 100, and the value g = 100 has been suggested after extensive examinations [30]. Instead of fixing g at a constant, George and Foster [31] proposed an empirical Bayes estimate for g, while Liang and colleagues [32] suggested a hyper-g prior, a special case of the incomplete inverse-gamma prior in Cui and George [33].
The main purpose of this research is the application of fully Bayesian approaches with a hyperprior on g. Specifically, we adopted an inverse-gamma prior, which earlier work noted could lead to computational difficulty; we therefore outlined an MCMC algorithm and demonstrated its implementation. In this paper, we considered a probit regression model for classification with SSVS to identify the influential genes, augmented the response variables with latent variables Z, and converted the probit model to a Gaussian regression problem with the generalized singular g-prior. For the choice of g, we assigned a hyperprior for its uncertainty. This hyperprior is intuitive and differs from those in [32, 33]. Finally, we defined an indicator variable γ_j for the jth gene and performed MCMC methods to generate posterior samples for gene selection and class classification. The rest of the paper is arranged as follows. In Section 2, we briefly describe the model specification, including the data augmentation approach and SSVS methods. Under this hyperprior on g, we also demonstrate the implementation of the Bayesian inference. Applications to three cancer studies, acute leukemia, colon cancer, and diffuse large B-cell lymphoma (DLBCL), are presented in Section 3. Conclusion and discussion are given in Section 4.
2. Model and Notation
Let (X, Y) indicate the observed data, where x_ij denotes the expression level of the jth gene from the ith sample and Y = (y_1, …, y_n)' denotes the response vector, where y_i = 1 indicates that sample i is a cancer tissue and y_i = 0 for normal tissue. Assume that the y_i are independent Bernoulli random variables with success probabilities p_i.
2.1. Probit Model with Latent Variable
The gene expression measurements can be linked to the response outcome with a probit regression model,

p_i = P(y_i = 1 | x_i) = Φ(α + x_i'β),

where α represents the intercept, x_i is the ith row in the design matrix X, β is the vector of regression coefficients, and Φ is the standard normal cumulative distribution function.
To perform statistical inference under this probit regression model, we first adopt independent latent variables z_1, …, z_n, where z_i ~ N(α + x_i'β, 1) and the sign of z_i corresponds to the disease status: y_i = 1 if z_i > 0 and y_i = 0 if z_i ≤ 0. The use of such latent variables helps to determine the category to which the ith sample is classified. Note that multiplying both sides in (3) by a constant does not change the model; thus a unit variance is considered for z_i.
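The data-augmentation step above can be sketched in code. The snippet below is an illustrative implementation, not the authors' program; `eta` stands for the hypothetical linear predictor α + x_i'β, and the draw is from a unit-variance normal truncated by the observed label.

```python
# Illustrative draw of the latent variable z_i given its label (assumption:
# eta is the linear predictor alpha + x_i' beta). For y_i = 1 the draw is
# from N(eta, 1) truncated to (0, inf); for y_i = 0, to (-inf, 0].
from statistics import NormalDist
import random

_std = NormalDist()

def draw_latent(eta, y, rng=random):
    """Inverse-CDF sample from a unit-variance normal truncated by the label."""
    cut = _std.cdf(-eta)                  # P(z_i <= 0) when z_i ~ N(eta, 1)
    lo, hi = (cut, 1.0) if y == 1 else (0.0, cut)
    u = lo + rng.random() * (hi - lo)     # uniform over the allowed CDF range
    u = min(max(u, 1e-12), 1 - 1e-12)     # guard against inv_cdf(0) or inv_cdf(1)
    return eta + _std.inv_cdf(u)

z_pos = draw_latent(eta=0.5, y=1)         # nonnegative: cancer tissue
z_neg = draw_latent(eta=0.5, y=0)         # nonpositive: normal tissue
```

Drawing each z_i this way, conditional on the current parameter values, is what converts the probit likelihood into a Gaussian regression at every MCMC sweep.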
If a noninformative prior is assumed for β, then the posterior covariance matrix of β given Z involves (X'X)^(-1). However, due to the enormous size of microarray data, X'X may be nearly singular, and variable selection for dimension reduction is needed. For variable selection we define the vector γ = (γ_1, …, γ_p)', whose elements are all binary, where γ_j = 1 if the jth gene is selected and γ_j = 0 otherwise. Given γ, we denote by p_γ the number of 1's in γ and by β_γ a reduced vector containing the regression coefficients whose corresponding γ_j is 1. Accordingly, for all j with γ_j = 1, the corresponding columns in X are collected to build X_γ, an n × p_γ reduced gene expression matrix. Given γ, the probit regression model in (3) can be written as

z_i = α + x_γ,i'β_γ + ε_i,  ε_i ~ N(0, 1),

where x_γ,i is the ith row in X_γ.
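As a concrete illustration of the selection vector γ and the reduced matrix X_γ, the sketch below uses simulated numbers (not the study's data); the dimensions are arbitrary.

```python
# Building the reduced expression matrix X_gamma from the binary selection
# vector gamma: only columns (genes) with gamma_j = 1 are retained.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 10))            # n = 6 samples, p = 10 genes
gamma = np.array([1, 0, 0, 1, 0, 1, 0, 0, 0, 0])

p_gamma = int(gamma.sum())              # number of 1's in gamma
X_gamma = X[:, gamma.astype(bool)]      # n x p_gamma reduced matrix
assert X_gamma.shape == (6, p_gamma)    # here p_gamma = 3
```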
2.2. Choice of Prior Distributions
To complete the model specification, we assign a normal prior N(0, h) for the intercept α, with a large h indicating no a priori information. For the regression parameters, the commonly applied g-prior may not work if the sample size n is less than the number p_γ, in which case X_γ'X_γ is not of full rank and its inverse does not exist. Therefore, we consider the prior distribution β_γ | γ, g ~ N(0, g(X_γ'X_γ)^+), with (X_γ'X_γ)^+ the Moore–Penrose pseudoinverse of X_γ'X_γ. This solves the singularity problem. Next, we assign for g an inverse-gamma hyperprior and for the γ_j the priors P(γ_j = 1) = π_j, and assume that the γ_j are independent for j = 1, …, p. Note that here the π_j's are of small values, implying a small set of influential genes.
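A minimal numerical sketch of this prior covariance follows, assuming illustrative dimensions and g = 100; `numpy.linalg.pinv` supplies the Moore–Penrose pseudoinverse.

```python
# Generalized singular g-prior covariance g * (X_gamma' X_gamma)^+, well
# defined even when n < p_gamma and the Gram matrix is rank deficient.
import numpy as np

rng = np.random.default_rng(1)
n, p_gamma, g = 5, 8, 100.0             # fewer samples than selected genes
Xg = rng.normal(size=(n, p_gamma))

gram = Xg.T @ Xg                        # rank <= n < p_gamma: singular
prior_cov = g * np.linalg.pinv(gram)    # pseudoinverse replaces the inverse

# beta_gamma | gamma, g ~ N(0, prior_cov); the covariance is singular by
# construction, so tell NumPy not to reject it.
beta = rng.multivariate_normal(np.zeros(p_gamma), prior_cov,
                               check_valid="ignore")
```

The draw of β_γ is degenerate (it lives in the row space of X_γ), which is exactly why the sampler in Section 2.3 integrates β_γ out rather than sampling it directly.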
We now summarize the model specification: y_i = 1 if the ith sample is a cancer tissue; α is the intercept; β is the vector of regression coefficients; Φ is the standard normal cumulative distribution function; X is the design matrix; γ contains the binary indicators γ_j, where γ_j = 1 if the jth gene is selected; β_γ is the reduced vector containing the regression coefficients whose corresponding γ_j is 1; p_γ is the number of 1's in γ; and x_γ,i is the ith row in X_γ.
2.3. Computation and Posterior Inference
Based on the prior distributions specified in the previous sections, the joint posterior distribution of (α, β_γ, Z, γ, g) can be derived up to a normalizing constant, where the λ_k denote the nonzero eigenvalues of X_γ'X_γ. From (10), β_γ given (Z, γ, g) is a multivariate normal distribution whose covariance matrix involves (X_γ'X_γ)^+. In the case where X_γ is not of full column rank, a problem of convergence may occur in the MCMC algorithm because the covariance matrix is not positive definite and the multivariate normal distribution becomes degenerate. To avoid this problem and speed up the computations, we integrate out α and β_γ in (10) following Yang and Song's [27] suggestion and work with the marginal posterior of (Z, γ, g). As this posterior distribution is not available in explicit form, we use MCMC techniques to obtain posterior samples. The computational sampling scheme is as follows.

(1) Draw Z from its full conditional distribution given (γ, g). This conditional distribution is a multivariate truncated normal. Since it is difficult to sample from it directly, we draw each z_i, i = 1, …, n, from the univariate conditional of z_i given (z_(-i), γ, g), where z_(-i) is the vector Z without the ith element [34].

(2) Draw γ from its full conditional distribution given (Z, g). Similar to the above procedure, we draw each γ_j, j = 1, …, p, from the conditional of γ_j given (γ_(-j), Z, g); the conditional odds of γ_j = 1 compare the marginal likelihoods under the two configurations with γ_j set to 1 and to 0, respectively.

(3) Draw g from its full conditional distribution given (Z, γ). This distribution does not belong to any standard family, so we use the Metropolis–Hastings algorithm to sample g.
The iteration therefore starts with initial values of Z, γ, and g, and our MCMC procedure at the tth iteration is as follows.
Step 1. Draw z_i^(t), i = 1, …, n, from the univariate truncated normal conditional described above.
Step 2. For j = 1, …, p, calculate the conditional probability that γ_j = 1, generate a random number u from the uniform distribution on (0, 1), and let γ_j^(t) = 1 if u is less than this probability and γ_j^(t) = 0 otherwise.
Step 3. Draw g^(t) from (17) by the following steps: (i) maximize (17) to obtain the mode g*; (ii) generate a proposal value from a normal distribution centred at g* and truncated to a positive region (a, b); (iii) accept the proposal with the Metropolis–Hastings acceptance probability.
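Step 3 can be approximated in code. The sketch below substitutes a simpler random-walk Metropolis–Hastings update on log g for the paper's truncated-normal proposal centred at the mode; `log_post` stands for a hypothetical unnormalized log conditional density of g, and the toy target used here is only for exercising the sampler.

```python
# Random-walk Metropolis-Hastings on log(g): proposals stay positive, and the
# change of variables adds log(g_new) - log(g_old) to the acceptance ratio.
import math
import random

def mh_update_g(g, log_post, step=0.5, rng=random):
    g_new = g * math.exp(step * rng.gauss(0.0, 1.0))
    log_alpha = (log_post(g_new) - log_post(g)
                 + math.log(g_new) - math.log(g))   # log-scale Jacobian
    accept = math.log(max(rng.random(), 1e-300)) < log_alpha
    return g_new if accept else g

# Toy target: an inverse-gamma-like log density in g; the paper's actual
# conditional for g is different and problem specific.
def log_target(g):
    return -2.0 * math.log(g) - 1.0 / g

g = 1.0
for _ in range(500):
    g = mh_update_g(g, log_target)
```

A mode-centred truncated-normal proposal, as in the paper, typically mixes faster than this random walk but requires the extra maximization of (17) at each iteration.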
After the initial burn-in period, we obtain the MCMC samples, which are next used to estimate the posterior inclusion probability of the jth gene as the proportion of retained iterations in which γ_j^(t) = 1,
and genes with higher posterior inclusion probabilities are considered more relevant to classification.
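The estimator above is simply the average of the sampled indicators. A sketch with simulated draws (not output from the actual sampler) follows.

```python
# Posterior inclusion probabilities as the mean of the gamma draws retained
# after burn-in; genes are then ranked by these probabilities.
import numpy as np

rng = np.random.default_rng(2)
n_draws, p = 2000, 6
true_probs = np.array([0.9, 0.1, 0.1, 0.1, 0.1, 0.1])   # gene 0 is relevant
gamma_draws = rng.random((n_draws, p)) < true_probs      # simulated indicators

inclusion_prob = gamma_draws.mean(axis=0)   # estimate of P(gamma_j = 1 | data)
ranking = np.argsort(-inclusion_prob)       # most relevant genes first
```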
2.4. Classification
To assess the performance of our procedures, testing data sets are considered. For example, when a testing sample (x_new, y_new) is available, the predictive probability that y_new = 1 given the training data is
Based on the MCMC samples, we estimate this probability with the corresponding Monte Carlo average.
When there is no testing set available, we adopt the leave-one-out cross-validation (LOOCV) method to evaluate the performance with the training data. The predictive probability for y_i is then conditioned on the data without the ith sample, denoted by the subscript (-i), and we estimate this probability based on the generated MCMC samples.
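The LOOCV loop itself is model agnostic. The sketch below uses a simple nearest-centroid classifier as a stand-in for the Bayesian probit predictor, purely to illustrate the evaluation scheme on simulated data.

```python
# Leave-one-out cross-validation: each sample is held out in turn, the
# classifier is rebuilt on the remaining n - 1 samples, and the held-out
# sample is predicted.
import numpy as np

def loocv_error(X, y, classify):
    n = len(y)
    errors = 0
    for i in range(n):
        mask = np.arange(n) != i
        pred = classify(X[mask], y[mask], X[i])
        errors += int(pred != y[i])
    return errors / n

def nearest_centroid(X_tr, y_tr, x_new):
    # Stand-in classifier: assign the class whose mean profile is closer.
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    return int(np.linalg.norm(x_new - c1) < np.linalg.norm(x_new - c0))

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(2, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
err = loocv_error(X, y, nearest_centroid)   # small for well-separated classes
```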
3. Applications
In this section, we applied the fully Bayesian approach with the reference hyperprior on g to three cancer studies: a colon cancer, a leukemia, and a diffuse large B-cell lymphoma (DLBCL) study [35–37]. We also compared the performance of this approach with other existing gene selection and classification methods. These data have been extensively studied with various methods, but we included only a limited set of them; others can be found in the reference lists of the work cited here.
3.1. Colon Cancer Study
The data of the colon cancer study contained 2000 expression levels from 40 tumor and 22 normal colon tissues. These expression levels were first transformed with a base 10 logarithmic function and then standardized to zero mean and unit variance for each gene. We then performed the MCMC sampler, fixing the prior variance of the intercept at 100 and using the same small prior inclusion probability π_j for all genes. We burned in the first 12000 iterations, collected every 30th sample, and obtained 6700 posterior points in total for further analysis. The leading 20 genes with the largest posterior inclusion probabilities are presented in Table 1. This list was compared with the findings in three other studies [38–40], and overlapping genes are denoted in Table 1. The first 19 genes were identified in at least one of the three studies. For reference, Figure 1 displays the 100 largest posterior inclusion probabilities and the corresponding genes.
 
^{a}Gene also identified in Ben-Dor et al. [38]. ^{b}Gene also identified in Furlanello et al. [39]. ^{c}Gene also identified in Chu et al. [40].
For classification, we adopted the external leave-one-out cross-validation (LOOCV) procedure to evaluate the performance of classification with the selected genes. The procedure was the following: (i) remove one sample from the training set; (ii) rank the genes in terms of t statistics using the remaining samples and retain the top 50 genes as the starting set to reduce the computational burden; (iii) select the most influential genes from these 50 genes based on our Bayesian method; and (iv) use these genes to classify the previously removed sample. The procedure was repeated for each sample in the dataset. With different numbers of selected genes, such as 6, 10, and 14, the error rates were 0.1452, 0.1452, and 0.1129, respectively. The performance of other methods, including SVM [41]; classification tree followed by 1-nearest-neighbor and LogitBoost with 100 iterations [42]; MAVE-LD [43]; IRWPLS [44]; supervised group Lasso (SGLasso) [45]; MRMS [46]; and the single-marker test with probit regression, is summarized in Table 2. SVM had the smallest error rate, but it apparently included too many genes (1000 in this set). One other method, MRMS+SVM+D1, performed better, with one more correct classification, than our proposed procedure when 6 or 10 genes were selected.
 
^{a}Proposed by Furey et al. [41]. ^{b}Proposed by Dettling and Bühlmann [42]. ^{c}Proposed by Antoniadis et al. [43]. ^{d}Proposed by Ding and Gentleman [44]. ^{e}Proposed by Ma et al. [45]. ^{f}Proposed by Maji and Paul [46].
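Step (ii) of the external LOOCV above, the per-fold t-statistic pre-screening, can be sketched as follows; the data are simulated for illustration, and only the cutoff of 50 genes matches the text.

```python
# Rank genes by a two-sample t statistic (Welch form) computed on the
# training fold only, and keep the top k before the Bayesian selection step.
import numpy as np

def t_statistics(X, y):
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a)
                 + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def top_k_genes(X, y, k=50):
    return np.argsort(-np.abs(t_statistics(X, y)))[:k]

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 200))          # 30 samples, 200 genes
y = np.array([0] * 15 + [1] * 15)
X[y == 1, :3] += 2.0                    # make the first 3 genes informative
keep = top_k_genes(X, y, k=50)          # indices of the retained genes
```

Recomputing the ranking inside every fold, rather than once on the full data, is what keeps the screening step from leaking information about the held-out sample.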
3.2. Leukemia Study
Next we considered the leukemia study with gene expression levels from 72 tissues, including 47 acute lymphoblastic leukemia (ALL) patients and 25 acute myeloid leukemia (AML) subjects. These data contained 38 training and 34 testing samples. The training data contained 27 ALL cases and 11 AML cases, whereas the testing data contained 20 ALL cases and 14 AML cases. As described in other studies [2], the preprocessing steps such as thresholding and filtering were applied first, followed by a base 10 logarithmic transformation. A total of 3571 genes were left for analysis. Next, we standardized the data across samples and ranked these genes by the same MCMC procedures described earlier. The top 20 genes with the largest posterior inclusion probabilities are presented in Table 3, and genes identified by other studies [36, 41, 47, 48] are also noted. For reference, Figure 2 displays the 100 largest posterior inclusion probabilities and the corresponding genes.
 
^{a}Gene also identified in Golub et al. [36]. ^{b}Gene also identified in Ben-Dor et al. [38]. ^{c}Gene also identified in Lee et al. [22].
For the classification procedure, similar to that for the colon cancer study, we selected the most influential genes from a starting set of 50 genes and next used them to examine the testing data. With 6, 10, or 14 genes, only the 61st and 66th observations were misclassified by our procedure. We also compared the results with the weighted voting machine [36], MAVE-LD [43], two-step EBM [47], KIGP + PK [48], and the single-marker test with probit regression, as summarized in Table 4. Note that although the MAVE-LD and two-step EBM methods performed better than our proposed procedure, both methods used many more genes (50 and 512) and yet achieved only one less misclassification. Among this list, our procedure apparently considered a smaller set of genes with a satisfactory performance.
 
^{a}Proposed by Golub et al. [36]. ^{b}Proposed by Antoniadis et al. [43]. ^{c}Proposed by Ji et al. [47]. ^{d}Proposed by Zhao and Cheung [48].
3.3. Diffuse Large BCell Lymphoma (DLBCL) Study
This study collected 58 samples from DLBCL patients and 19 samples from follicular lymphoma [37]. The original dataset contained 7129 genes. After the preprocessing steps such as thresholding and filtering were applied and a base 10 logarithmic transformation was conducted, a total of 6285 genes were left for analysis. Next, we standardized the data across samples and ranked these genes by the same MCMC procedures described in earlier sections. The error rates for 6, 10, or 14 selected genes under LOOCV were 0.0519, 0.0649, and 0.0779, and the accuracy was between 0.92 and 0.95, as listed in Table 5. To achieve a smaller error rate, we considered other numbers of selected genes and obtained a smaller rate, 0.0390, the same rate achieved by the hyper-box enclosure (HBE) method [49]. Similar to the discussion in the previous two applications, our proposed model can achieve the same or a smaller error rate with a smaller set of genes.

4. Conclusion and Discussion
In this Bayesian framework, we considered a mixture of g-priors to complete a fully Bayesian analysis for gene selection and cancer classification. Different from other existing methods that treated g as a fixed value, we incorporated its uncertainty by assuming a reference inverse-gamma prior distribution. Earlier studies mentioned this prior but considered it difficult for deriving posterior inference. We therefore outlined the implementation of the computation under this model setting for future applications. This approach is more flexible in the process of model building; the model is able to evaluate how influential a gene can be with posterior probabilities that can next be used for variable selection. Such an approach is useful in biomedical interpretations for the selection of genes relevant to the disease of interest. When compared with other existing methods, our proposed procedure achieves a better or comparable accuracy in classification with fewer genes. In the analyses of the colon cancer and leukemia studies, we replicated several relevant genes identified by other research groups. The findings have accumulated evidence for further laboratory research.
In the application section, we listed only the results from 6, 10, and 14 selected genes. Other numbers of selected genes have been tried, and the performance remains good. For instance, the pink line in Figures 3 and 4 displays the accuracy of the proposed procedure when the number of selected genes varies between 5 and 20 for the colon cancer and leukemia studies, respectively. For the colon cancer study, the largest accuracy, 0.8871, occurs at 14 selected genes, while other choices lead to accuracies between 0.8387 and 0.8871. These correspond to at least 52 correctly identified subjects out of 62. For the leukemia study, the largest accuracy is 0.9706; other numbers of selected genes all lead to an accuracy larger than 90% except for one choice. In addition, we compared the results under the proposed generalized singular g-prior with g fixed at a constant. The colored lines in Figures 3 and 4 are for g fixed at 5 (red line), 10 (blue), or 20 (black), respectively. Again, results under the hyperprior assumption lead to a higher accuracy with a smaller number of selected genes. Another issue is related to the choice of the number of genes in the starting set. We have considered 50 in all three applications. This value can certainly be changed; however, the computational complexity increases as the value becomes larger. This cost in computation remains a topic for future research.
To compare the performance of a stochastic and a constant g, we also conducted a small simulation study to investigate the effect of assigning a prior on g versus fixing g at different constant values. We used the R package penalizedSVM [50, 51] to simulate three data sets; each contains 500 genes, with 15 genes associated with the disease. The numbers of training and testing samples were 200 and 40, respectively. We then conducted the gene selection procedures with a prior on g and with g fixed at three constant values, the largest being 500, and recorded the accuracy under each setting. Figure 5 plots the average accuracy, with the pink line standing for the accuracy under the mixtures of g-priors and the other colored lines for the fixed values of g. It can be observed that only when g is assigned a very large number like 500 can the corresponding accuracy be slightly better than that under a prior for the uncertainty in g. This again supports the use of the mixtures of g-priors for a better and more robust result.
Here in this paper we have focused on the analysis of binary data. However, the probit regression model can be extended to a multinomial probit model to solve multiclass problems, and the Bayesian inference can be carried out similarly. Such an analysis will involve a larger computational load, and further research in this direction is needed. Another point worth mentioning is the inclusion of interactions between genes. Further research can incorporate a power prior into the prior of the regression coefficients [52] or include information on gene-gene network structure [18] to complete the procedure for variable selection.
Acknowledgment
Part of this research was supported by NSC 100-2314-B-002-107-MY3.
References
V. T. Chu, R. Gottardo, A. E. Raftery, R. E. Bumgarner, and K. Y. Yeung, “MeV+R: using MeV as a graphical user interface for Bioconductor applications in microarray analysis,” Genome Biology, vol. 9, no. 7, article R118, 2008.
S. Dudoit, J. Fridlyand, and T. P. Speed, “Comparison of discrimination methods for the classification of tumors using gene expression data,” Journal of the American Statistical Association, vol. 97, no. 457, pp. 77–86, 2002.
A. Hirakawa, Y. Sato, D. Hamada, and I. Yoshimura, “A new test statistic based on shrunken sample variance for identifying differentially expressed genes in small microarray experiments,” Bioinformatics and Biology Insights, vol. 2, pp. 145–156, 2008.
W. Pan, J. Lin, and C. T. Le, “A mixture model approach to detecting differentially expressed genes with microarray data,” Functional and Integrative Genomics, vol. 3, no. 3, pp. 117–124, 2003.
K. Y. Yeung, R. E. Bumgarner, and A. E. Raftery, “Bayesian model averaging: development of an improved multiclass, gene selection and classification tool for microarray data,” Bioinformatics, vol. 21, no. 10, pp. 2394–2402, 2005.
A. Gusnanto, A. Ploner, F. Shuweihdi, and Y. Pawitan, “Partial least squares and logistic regression random-effects estimates for gene selection in supervised classification of gene expression data,” Journal of Biomedical Informatics, vol. 46, no. 4, pp. 697–709, 2013.
Y. Liang, C. Liu, X. Z. Luan et al., “Sparse logistic regression with a L1/2 penalty for gene selection in cancer classification,” BMC Bioinformatics, vol. 14, article 198, 2013.
G.-Z. Li, H.-L. Bu, M. Q. Yang, X.-Q. Zeng, and J. Y. Yang, “Selecting subsets of newly extracted features from PCA and PLS in microarray data analysis,” BMC Genomics, vol. 9, no. 2, article S24, 2008.
A. Wang and E. A. Gehan, “Gene selection for microarray data analysis using principal component analysis,” Statistics in Medicine, vol. 24, no. 13, pp. 2069–2087, 2005.
S. Bicciato, A. Luchini, and C. Di Bello, “PCA disjoint models for multiclass cancer analysis using gene expression data,” Bioinformatics, vol. 19, no. 5, pp. 571–578, 2003.
X. Q. Zeng, G. Z. Li, M. Q. Yang, G. F. Wu, and J. Y. Yang, “Orthogonal projection weights in dimension reduction based on partial least squares,” International Journal of Computational Intelligence in Bioinformatics and Systems Biology, vol. 1, pp. 100–115, 2009.
A.-L. Boulesteix and K. Strimmer, “Partial least squares: a versatile tool for the analysis of high-dimensional genomic data,” Briefings in Bioinformatics, vol. 8, no. 1, pp. 32–44, 2007.
D. V. Nguyen and D. M. Rocke, “Tumor classification by partial least squares using microarray gene expression data,” Bioinformatics, vol. 18, no. 1, pp. 39–50, 2002.
J. X. Liu, Y. Xu, C. H. Zheng, Y. Wang, and J. Y. Yang, “Characteristic gene selection via weighting principal components by singular values,” PLoS ONE, vol. 7, no. 7, Article ID e38873, 2012.
S. Student and K. Fujarewicz, “Stable feature selection and classification algorithms for multiclass microarray data,” Biology Direct, vol. 7, article 33, 2012.
T. Bø and I. Jonassen, “New feature subset selection procedures for classification of expression profiles,” Genome Biology, vol. 3, no. 4, pp. 1–17, 2002.
Y. Wang, I. V. Tetko, M. A. Hall et al., “Gene selection from microarray data for cancer classification—a machine learning approach,” Computational Biology and Chemistry, vol. 29, no. 1, pp. 37–46, 2005.
F. C. Stingo and M. Vannucci, “Variable selection for discriminant analysis with Markov random field priors for the analysis of microarray data,” Bioinformatics, vol. 27, no. 4, pp. 495–501, 2011.
J. G. Ibrahim, M.-H. Chen, and R. J. Gray, “Bayesian models for gene expression with DNA microarray data,” Journal of the American Statistical Association, vol. 97, no. 457, pp. 88–99, 2002.
Y.-C. Wei, S.-H. Wen, P.-C. Chen, C.-H. Wang, and C. K. Hsiao, “A simple Bayesian mixture model with a hybrid procedure for genome-wide association studies,” European Journal of Human Genetics, vol. 18, no. 8, pp. 942–947, 2010.
B. Peng, D. Zhu, and B. P. Ander, “An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways,” PLoS ONE, vol. 8, no. 7, Article ID e67672, 2013.
K. E. Lee, N. Sha, E. R. Dougherty, M. Vannucci, and B. K. Mallick, “Gene selection: a Bayesian variable selection approach,” Bioinformatics, vol. 19, no. 1, pp. 90–97, 2003.
N. Sha, M. Vannucci, M. G. Tadesse et al., “Bayesian variable selection in multinomial probit models to identify molecular signatures of disease stage,” Biometrics, vol. 60, no. 3, pp. 812–819, 2004.
X. Zhou, K.-Y. Liu, and S. T. C. Wong, “Cancer classification and prediction using logistic regression with Bayesian gene selection,” Journal of Biomedical Informatics, vol. 37, no. 4, pp. 249–259, 2004.
J. G. Liao and K.-V. Chin, “Logistic regression for disease classification using microarray data: model selection in a large p and small n case,” Bioinformatics, vol. 23, no. 15, pp. 1945–1951, 2007.
A. Zellner, “On assessing prior distributions and Bayesian regression analysis with g-prior distributions,” in Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233–243, North-Holland, Amsterdam, The Netherlands, 1986.
A.-J. Yang and X.-Y. Song, “Bayesian variable selection for disease classification using gene expression data,” Bioinformatics, vol. 26, no. 2, pp. 215–222, 2010.
M. Baragatti and D. Pommeret, “A study of variable selection using g-prior distribution with ridge parameter,” Computational Statistics and Data Analysis, vol. 56, no. 6, pp. 1920–1934, 2012.
E. Ley and M. F. J. Steel, “Mixtures of g-priors for Bayesian model averaging with economic applications,” Journal of Econometrics, vol. 171, no. 2, pp. 251–266, 2012.
M. Smith and R. Kohn, “Nonparametric regression using Bayesian variable selection,” Journal of Econometrics, vol. 75, no. 2, pp. 317–343, 1996.
E. I. George and D. P. Foster, “Calibration and empirical Bayes variable selection,” Biometrika, vol. 87, no. 4, pp. 731–747, 2000.
F. Liang, R. Paulo, G. Molina, M. A. Clyde, and J. O. Berger, “Mixtures of g priors for Bayesian variable selection,” Journal of the American Statistical Association, vol. 103, no. 481, pp. 410–423, 2008.
W. Cui and E. I. George, “Empirical Bayes versus fully Bayes variable selection,” Journal of Statistical Planning and Inference, vol. 138, no. 4, pp. 888–900, 2008.
C. P. Robert, “Convergence control methods for Markov chain Monte Carlo algorithms,” Statistical Science, vol. 10, pp. 231–253, 1995.
U. Alon, N. Barkai, D. A. Notterman et al., “Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays,” Proceedings of the National Academy of Sciences of the United States of America, vol. 96, no. 12, pp. 6745–6750, 1999.
T. R. Golub, D. K. Slonim, P. Tamayo et al., “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring,” Science, vol. 286, no. 5439, pp. 531–537, 1999.
M. A. Shipp, K. N. Ross, P. Tamayo et al., “Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning,” Nature Medicine, vol. 8, no. 1, pp. 68–74, 2002.
A. Ben-Dor, L. Bruhn, N. Friedman, I. Nachman, M. Schummer, and Z. Yakhini, “Tissue classification with gene expression profiles,” Journal of Computational Biology, vol. 7, no. 3-4, pp. 559–583, 2000.
C. Furlanello, M. Serafini, S. Merler, and G. Jurman, “Entropy-based gene ranking without selection bias for the predictive classification of microarray data,” BMC Bioinformatics, vol. 4, article 54, 2003.
W. Chu, Z. Ghahramani, F. Falciani, and D. L. Wild, “Biomarker discovery in microarray gene expression data with Gaussian processes,” Bioinformatics, vol. 21, no. 16, pp. 3385–3393, 2005.
T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler, “Support vector machine classification and validation of cancer tissue samples using microarray expression data,” Bioinformatics, vol. 16, no. 10, pp. 906–914, 2000.
M. Dettling and P. Bühlmann, “Boosting for tumor classification with gene expression data,” Bioinformatics, vol. 19, no. 9, pp. 1061–1069, 2003.
A. Antoniadis, S. Lambert-Lacroix, and F. Leblanc, “Effective dimension reduction methods for tumor classification using gene expression data,” Bioinformatics, vol. 19, no. 5, pp. 563–570, 2003.
B. Ding and R. Gentleman, “Classification using generalized partial least squares,” Bioconductor Project Working Papers, 2004, http://www.bepress.com/bioconductor/paper5.
S. Ma, X. Song, and J. Huang, “Supervised group Lasso with applications to microarray data analysis,” BMC Bioinformatics, vol. 8, article 60, 2007.
P. Maji and S. Paul, “Rough set based maximum relevance-maximum significance criterion and gene selection from microarray data,” International Journal of Approximate Reasoning, vol. 52, no. 3, pp. 408–426, 2011.
Y. Ji, K.-W. Tsui, and K. Kim, “A novel means of using gene clusters in a two-step empirical Bayes method for predicting classes of samples,” Bioinformatics, vol. 21, no. 7, pp. 1055–1061, 2005.
X. Zhao and L. W.-K. Cheung, “Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data,” BMC Bioinformatics, vol. 8, article 67, 2007.
O. Dagliyan, F. Uney-Yuksektepe, I. H. Kavakli, and M. Turkay, “Optimization based tumor classification from microarray gene expression data,” PLoS ONE, vol. 6, no. 2, Article ID e14579, 2011.
H. H. Zhang, J. Ahn, X. Lin, and C. Park, “Gene selection using support vector machines with nonconvex penalty,” Bioinformatics, vol. 22, no. 1, pp. 88–95, 2006.
G. M. Fung and O. L. Mangasarian, “A feature selection Newton method for support vector machine classification,” Computational Optimization and Applications, vol. 28, no. 2, pp. 185–202, 2004.
A. Krishna, H. D. Bondell, and S. K. Ghosh, “Bayesian variable selection using an adaptive powered correlation prior,” Journal of Statistical Planning and Inference, vol. 139, no. 8, pp. 2665–2674, 2009.
Copyright
Copyright © 2013 WenKuei Chien and Chuhsing Kate Hsiao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.