Computational and Mathematical Methods in Medicine
Volume 2017, Article ID 8520480, 10 pages
https://doi.org/10.1155/2017/8520480
Research Article

Node-Structured Integrative Gaussian Graphical Model Guided by Pathway Information

1Department of Statistics, Keimyung University, Daegu, Republic of Korea
2The Institute of Natural Science, Keimyung University, Daegu, Republic of Korea
3Department of Statistics, Korea University, Seoul, Republic of Korea
4Graduate School of Information Security, Korea University, Seoul, Republic of Korea
5School of Industrial Management Engineering, Korea University, Seoul, Republic of Korea

Correspondence should be addressed to ByungYong Lee; rom109101@gmail.com and SungWon Han; swhan@korea.ac.kr

Received 31 October 2016; Revised 20 February 2017; Accepted 6 March 2017; Published 12 April 2017

Academic Editor: Hongmei Zhang

Copyright © 2017 SungHwan Kim et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

To date, many biological pathways related to cancer have been extensively applied thanks to the output of burgeoning biomedical research. This leads to a new technical challenge of exploring and validating biological pathways that can characterize transcriptomic mechanisms across different disease subtypes. To accommodate multiple studies, the joint Gaussian graphical model was previously proposed to incorporate nonzero edge effects. However, this model inevitably depends on post hoc analysis to confirm biological significance. To circumvent this drawback, we attempt not only to combine transcriptomic data but also to embed pathway information, which is well-ascertained biological evidence, into the model. To this end, we propose a novel statistical framework for fitting a joint Gaussian graphical model simultaneously with informative pathways consistently expressed across multiple studies. In theory, structured nodes can be prespecified with multiple genes. The optimization rule employs the structured input-output lasso model to estimate a sparse precision matrix constructed by simultaneous effects of multiple studies and structured nodes. In an application to breast cancer data sets, we found that the proposed model is superior in efficiently capturing structures of biological evidence (e.g., pathways). An R software package, nsiGGM, is publicly available at the author’s webpage.

1. Introduction

Genomic data have been extensively applied to analyze disease mechanisms on the basis of predictive signatures from DNA alterations (e.g., genotyping and mutation), RNA transcription (e.g., gene or isoform expression and fusion transcripts), and gene regulation by epigenetic changes (e.g., methylation, protein-DNA interaction, and miRNA expression). In particular, gene regulation is a complicated system that builds on the interactions of tens of thousands of cellular components and on diverse activities across multiple layers. Biological networks are the most widely used data resource for sketching this interconnectivity of gene regulation. High-throughput genomic technologies are paving the way toward systematically characterizing diverse types of biological networks and suggesting underlying gene regulation mechanisms. Yet complete inference of network complexity has long been a concern in the field of systems biology.

To circumvent the shortcomings of single feature-based analysis, the activity of a gene or of a whole biological process in a disease can be assessed by sets of genes (an approach known as gene set enrichment analysis or pathway analysis). In doing so, a large number of pathways have been identified through many cancer-related studies [1]. Pathway information describes cellular functions and biological processes or represents a unique signature of deregulation of a given gene [2]. For example, the pathway or signature associated with the activity of a given oncogene is defined as the set of genes most differentially expressed upon perturbation of that oncogene [3–5]. Importantly, the use of pathway information is increasingly prevalent in biomedicine. For instance, targeting drugs to potential pathways is regarded as a practical way to overcome the limitations of traditional drug discovery, which usually adopts the one-drug-one-target approach. This strategy takes into account the fact that disease occurrence is usually the result of complex interactions of molecular events.

In recent years, large-scale genomic data generated from biological experiments or clinical hypotheses have grown rapidly, as high-throughput experimental technologies have markedly advanced [6]. Such genomic data are publicly available in data repositories (e.g., Gene Expression Omnibus and Sequence Read Archive). This abundance of biological experiments poses a new challenge of handling multiple data sets when exploring and validating biological signatures and pathways. More precisely, a question in network analysis often relates to how to characterize underlying transcriptomic patterns or molecular mechanisms across disease subtypes or between case-control groups, because biological signals are commonly not coherently present across studies. Generally, a single network [7–9] is found to accurately estimate underlying dependency with an adjustment for gene perturbation effects (e.g., polymorphic genotype alteration [10, 11]). Nonetheless, these methods hardly discover network patterns of subtle signals and dynamic features amid coupled networks under diverse conditions. Moreover, single networks potentially generate many false positive signals (edges) attributable to experimental biases and errors. To address this challenge, recent data analysis has turned to data integration, in which multiple data sets are used to achieve more accurate network inference. To this end, many authors have proposed methods that combine multiple networks in a unified model [12–14]. This approach is also known as integrative analysis and is analogous to traditional meta-analysis.

The joint Gaussian graphical model (JGGM; Danaher et al. [12]) focuses on incorporating nonzero edge effects (i.e., off-diagonal entries of the precision matrix) to combine multiple studies from the viewpoint of integrative analysis. This model, however, inevitably depends on post hoc analysis when validating biological significance: normally, we perform post hoc analysis to see whether the estimated gene networks are enriched for any pathways. It is therefore of interest to combine not only DNA and/or transcriptomic changes but also pathway information, which is well-ascertained biological evidence, and to estimate gene networks with an adjustment for pathway information. In practice, pathway information is rarely incorporated in spite of its biological significance. To the best of our knowledge, no method has been proposed that can accommodate overlapping node structures, which arise mainly from overlapping gene annotations of pathway gene sets. To tackle this problem, we propose a new graphical model called the “node-structured integrative Gaussian graphical model (nsiGGM),” which jointly leverages a priori knowledge of pathway information. This method allows for overlapping group lasso problems, making it possible to integrate overlapping genes of pathways. Letting biological pathways inform the network estimation is worthwhile for revealing the true gene regulatory network. The nsiGGM builds on prespecified structured nodes with multiple genes as building blocks when estimating a precision matrix. The implementation employs the lasso penalty of the structured input-output lasso model [15] to estimate a sparse precision matrix that accounts for simultaneous effects of multiple studies and structured nodes. In applications to simulated and breast cancer genomic data, the proposed model is found to be superior in efficiently capturing transcriptional modules predefined by a pathway database. A software package (nsiGGM) is publicly available at the author’s webpage (https://sites.google.com/site/sunghwanshome/).

This paper is outlined as follows. In Section 2, we review background knowledge of the standard and joint Gaussian graphical models. In addition, we propose the node-structured integrative Gaussian graphical model (nsiGGM). In Section 3, we describe an implementation strategy that is primarily based on the input-output lasso. In Section 4, we compare performance of our proposed methods with other methods using real breast cancer data (TCGA) and simulated data. In Section 5, conclusions and further studies are discussed.

2. Method

In this section, we briefly discuss methodological backgrounds on the Gaussian graphical models (GGM) aiming at constructing gene networks. In what follows, we propose the node-structured integrative Gaussian graphical model (nsiGGM) that can accommodate a priori biological knowledge (e.g., pathway data or targeted predictive genes of miRNA).

2.1. Gaussian Graphical Models for Gene Networks

A Gaussian graphical model describes the conditional dependency of multiple random variables, $X = (X_1, \ldots, X_p)$, with a graph $\mathcal{G} = (V, E)$, where $V$ is a set of nodes and $E$ is a set of edges indicating that nodes are linked and conditionally dependent. Let $X$ follow the multivariate Gaussian distribution $N(\mu, \Sigma)$, where $\Sigma$ is a covariance matrix. Let $\Theta = \Sigma^{-1}$ denote the inverse covariance matrix (also known as a precision matrix). More precisely, each nonzero off-diagonal element $\theta_{ij}$ implies conditional dependency between the $i$th and $j$th nodes given all the other variables, $X_{-(i,j)}$, whereas the covariance $\sigma_{ij}$ presents marginal dependencies without considering other variables. This model is also called a GGM [16]. The graphical lasso [9, 17] produces a sparse Gaussian graphical model whose edges correspond to the entries of $\Theta$ that are not shrunk to zero by the penalty. The graphical lasso minimizes the negative log-likelihood with the lasso penalty:

$$-\log\det\Theta + \operatorname{tr}(S\Theta) + \lambda\,\|\Theta\|_1, \tag{1}$$

where $\operatorname{tr}(A)$ is the trace of matrix $A$, $S$ is the sample covariance matrix, and $\lambda$ is the regularization parameter adjusting the degree of sparsity. The optimal value for $\lambda$ can be chosen by cross-validation or the Bayesian information criterion (BIC; Schwarz [18]; Yuan and Lin [8]).
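As a minimal sketch of the penalized negative log-likelihood that the graphical lasso minimizes, the following evaluates the objective for a candidate precision matrix using only the Python standard library. The function names (`cholesky_logdet`, `glasso_objective`) are hypothetical, and the choice to penalize only off-diagonal entries is an assumption of this sketch; some formulations also penalize the diagonal.

```python
import math

def cholesky_logdet(a):
    """Return log det(a) for a symmetric positive definite matrix (list of lists)."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    logdet = 0.0
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
                logdet += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return logdet

def glasso_objective(theta, s, lam):
    """-log det(theta) + tr(s theta) + lam * sum of |theta_ij| over i != j.

    Assumption of this sketch: only off-diagonal entries are penalized.
    """
    p = len(theta)
    trace = sum(s[i][j] * theta[j][i] for i in range(p) for j in range(p))
    l1 = sum(abs(theta[i][j]) for i in range(p) for j in range(p) if i != j)
    return -cholesky_logdet(theta) + trace + lam * l1

# Illustrative inputs (not from the paper).
sample_cov = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[2.0, 0.5], [0.5, 2.0]]
value = glasso_objective(candidate, sample_cov, lam=0.1)
```

Evaluating the objective over a grid of regularization values is one simple way to see how a larger penalty favors sparser candidates; an actual fit would require a dedicated solver.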

2.2. Joint Gaussian Graphical Models for Combining Multiple Studies

In this section, we revisit the joint Gaussian graphical models (JGGM) proposed by Danaher et al. [12]. Simply put, the JGGM combines multiple studies and constructs multiple networks in a unified model. Let $K$ denote the number of studies in our data and $\Theta^{(1)}, \ldots, \Theta^{(K)}$ the true precision matrices. Consider genomic data of $K$ studies, $Y^{(1)}, \ldots, Y^{(K)}$, each of which consists of $n_k$ samples with $p$ common features, where $k = 1, \ldots, K$. We assume that the $\sum_{k=1}^{K} n_k$ observations are independent and that those of each data set follow the multivariate normal distribution $N(\mu_k, \Sigma_k)$ for $k = 1, \ldots, K$. It is well known in meta-analysis that multiple data sets share common associations and genomic characteristics among features (e.g., genetic association intensity). It is therefore worth estimating the precision matrices across studies jointly rather than separately. To this end, we assume that the features within each data set are centered and take the form of a penalized log-likelihood with the group sparsity-inducing penalty, maximizing (2) with respect to $\{\Theta\} = \{\Theta^{(1)}, \ldots, \Theta^{(K)}\}$:

$$\sum_{k=1}^{K} n_k\left[\log\det\Theta^{(k)} - \operatorname{tr}\left(S^{(k)}\Theta^{(k)}\right)\right] - \lambda_1\sum_{k=1}^{K}\sum_{i\neq j}\left|\theta_{ij}^{(k)}\right| - \lambda_2\sum_{i\neq j}\sqrt{\sum_{k=1}^{K}\left(\theta_{ij}^{(k)}\right)^2}, \tag{2}$$

subject to $\Theta^{(k)}$ being positive definite, where $S^{(k)}$ is the sample covariance matrix of $Y^{(k)}$ and $\lambda_1$, $\lambda_2$ are nonnegative tuning parameters. It is interesting to note that the $\ell_2$-penalty captures similarity across the $K$ precision matrices. Due to this property, the penalty terms of (2) are also referred to as the joint graphical lasso (JGL). Moreover, the $\ell_1$ penalty induces estimated precision matrices to be sparse.
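The two penalties of the joint model can be illustrated with a short sketch: an entrywise lasso term that induces sparsity within each study, and a group term that ties the same off-diagonal entry together across the studies. The function name `jgl_penalty` and the toy matrices are hypothetical; the form follows the group penalty of Danaher et al. [12] as we understand it.

```python
import math

def jgl_penalty(thetas, lam1, lam2):
    """Sparse-group penalty over the off-diagonal entries of K precision matrices.

    thetas: list of K square matrices (lists of lists), all of the same size p.
    Returns lam1 * (sum of absolute off-diagonal entries over all studies)
          + lam2 * (sum over entries (i, j), i != j, of the Euclidean norm of
                    that entry taken across the K studies).
    """
    p = len(thetas[0])
    l1_term = 0.0
    group_term = 0.0
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            values = [theta[i][j] for theta in thetas]
            l1_term += sum(abs(v) for v in values)
            group_term += math.sqrt(sum(v * v for v in values))
    return lam1 * l1_term + lam2 * group_term

# Two toy studies sharing one edge with different strengths (illustrative only).
theta_a = [[1.0, 3.0], [3.0, 1.0]]
theta_b = [[1.0, 4.0], [4.0, 1.0]]
penalty = jgl_penalty([theta_a, theta_b], lam1=1.0, lam2=1.0)
```

Because the group norm is zero only when the entry vanishes in every study, the second term encourages an edge to be present or absent in all networks at once.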

2.3. Node-Structured Integrative Gaussian Graphical Model

In this section, we propose an integrative graphical model that can accommodate an a priori known structure of genomic features. When learning gene networks, the sparseness of the precision matrix can be guided to some extent by known feature modules (e.g., pathway information). Data integration typically allows us to picture the interplay of underlying biological factors. In this regard, it is worthwhile to accommodate known feature module information ascertained in previous experiments. To do so, we embed a priori feature module information across multiple networks via an additional group penalty. The following objective function is minimized:

$$-\sum_{k=1}^{K} n_k\left[\log\det\Theta^{(k)} - \operatorname{tr}\left(S^{(k)}\Theta^{(k)}\right)\right] + \lambda_1\sum_{k=1}^{K}\sum_{i\neq j}\left|\theta_{ij}^{(k)}\right| + \lambda_2\sum_{i\neq j}\sqrt{\sum_{k=1}^{K}\left(\theta_{ij}^{(k)}\right)^2} + \lambda_3\sum_{m=1}^{M}\sqrt{\sum_{k=1}^{K}\sum_{(i,j)\in G_m}\left(\theta_{ij}^{(k)}\right)^2}, \tag{3}$$

where $G_m$ is a subset of off-diagonal entry indices of $\Theta^{(k)}$ for $k = 1, \ldots, K$, $m = 1, \ldots, M$, $M$ is the number of a priori feature modules, and $G = \{G_1, \ldots, G_M\}$. Importantly, elements of $G$ can overlap (e.g., duplicated genes of two different pathways). The third penalty, adjusted by $\lambda_3$, pertains to structured feature modules (i.e., structured nodes in networks) on the basis of a priori known information. Here, care is needed to regularize each feature without bias, because overlapping features inevitably come into play.
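To make the module-level group penalty concrete, the sketch below evaluates one plausible form of it: for each a priori module, the Euclidean norm of all precision-matrix entries indexed by that module, pooled over the studies. This pooled form, the function name `module_penalty`, and the toy inputs are assumptions made for illustration rather than the package's implementation.

```python
import math

def module_penalty(thetas, modules, lam3):
    """lam3 * sum over modules of the Euclidean norm of the module's entries.

    thetas:  list of K square matrices (lists of lists).
    modules: list of index sets; each set holds (i, j) pairs with i != j.
             The sets are allowed to overlap.
    Assumption of this sketch: each group pools its entries over all K studies.
    """
    total = 0.0
    for index_set in modules:
        squared = sum(theta[i][j] ** 2 for theta in thetas for (i, j) in index_set)
        total += math.sqrt(squared)
    return lam3 * total

# One toy study with three genes (0-based indices); entry (0, 1) is in both modules.
theta = [[1.0, 3.0, 0.0],
         [3.0, 1.0, 4.0],
         [0.0, 4.0, 1.0]]
module_a = {(0, 1), (1, 0)}
module_b = {(0, 1), (1, 0), (1, 2), (2, 1)}
overlap_penalty = module_penalty([theta], [module_a, module_b], lam3=1.0)
```

In this toy case the shared entry contributes to both group norms, which is exactly the situation in which a naive group lasso solver would regularize overlapping entries unevenly.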

In what follows, we present a toy example to demonstrate how a priori information constructs feature modules in $G$. In Figure S1, in Supplementary Material available online at https://doi.org/10.1155/2017/8520480, we take an example of networks consisting of 5 common nodes (e.g., genomic features) across three studies. In Figure S1A, the second penalty with $\lambda_2$ captures matched common edges (e.g., the same entry $\theta_{ij}^{(k)}$ across $k = 1, 2, 3$), identical to the joint graphical lasso. In addition, the third group lasso penalty with $\lambda_3$ accommodates the six edges of the three features in a predefined module so that feature regulatory effects can be further modeled in the context of data integration (see Figure S1B). Note that this module structure (e.g., a pathway) is known a priori. This approach is in line with integrative clustering [19], which allows for cis-regulatory effects and target gene prediction for miRNAs. In the case of multiple modules in a network, suppose that we are given a set of five genes and a precision matrix $\Theta^{(k)}$ for $k = 1, \ldots, K$. Let a priori information generate two feature modules, Module 1 and Module 2; we can then enumerate the precision matrix indices of each module for all $k$ as $G_1$ and $G_2$. Of note, a component $\theta_{ij}^{(k)}$ may be simultaneously present in both $G_1$ and $G_2$, implying that a suitable implementation is required to regularize the overlapping component. To estimate solutions to (3), we apply the structured input-output lasso [15], which can handle overlapping features, making it possible to learn a model allowing for both single-node effects across studies and predefined node structures (e.g., pathway modules). Inspired by the integrative nature of this method, we call this graphical model the node-structured integrative Gaussian graphical model (nsiGGM). To tune the penalty parameters ($\lambda_1$, $\lambda_2$, and $\lambda_3$), the BIC is applied to determine the optimal sparseness of the networks’ edges.
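The enumeration of module indices can be sketched as follows. The gene memberships of the two modules are hypothetical (they are not the ones used in the toy example of Figure S1), and the helper name `module_index_set` is likewise an assumption; the point is only to show how two modules that share genes give rise to shared precision-matrix entries.

```python
from itertools import permutations

def module_index_set(genes):
    """All ordered off-diagonal index pairs (i, j), i != j, among the given genes."""
    return set(permutations(sorted(genes), 2))

# Hypothetical memberships among five genes labeled 1..5.
module_1 = {1, 2, 3}
module_2 = {2, 3, 4}  # shares genes 2 and 3 with module_1

g1 = module_index_set(module_1)  # six ordered pairs among three genes
g2 = module_index_set(module_2)
shared = sorted(g1 & g2)         # entries that belong to both modules
```

Under these assumed memberships, the entries linking genes 2 and 3 fall in both index sets, so any solver for the module penalty has to account for them being penalized twice.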

3. Implementation Strategy

3.1. Structured Alternating Directions Method of Multipliers Algorithm

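In this section, we delineate the implementation strategy for the nsiGGM. We solve problem (3) by using the structured alternating directions method of multipliers algorithm (sADMM). The alternating directions method of multipliers algorithm (ADMM) was previously applied to the JGL problem [12]. Similar to the JGL, the sADMM, proposed in the spirit of the ADMM, adopts the structured input-output lasso in order to embed node structures into the model. We first reformulate (3) with $\{\Theta\}$ and $\{Z\}$ as

$$\min_{\{\Theta\},\{Z\}}\; -\sum_{k=1}^{K} n_k\left[\log\det\Theta^{(k)} - \operatorname{tr}\left(S^{(k)}\Theta^{(k)}\right)\right] + P(\{Z\}) \quad \text{subject to } \Theta^{(k)} = Z^{(k)}, \tag{4}$$

where $\{Z\} = \{Z^{(1)}, \ldots, Z^{(K)}\}$ and $P(\{Z\})$ collects the three penalty terms of (3) evaluated at $\{Z\}$; the constraint holds for $k = 1, \ldots, K$, and $\Theta^{(k)}$ satisfies positive definiteness. Boyd et al. [20] proposed the scaled augmented Lagrangian to solve problem (4):

$$L_\rho(\{\Theta\},\{Z\},\{U\}) = -\sum_{k=1}^{K} n_k\left[\log\det\Theta^{(k)} - \operatorname{tr}\left(S^{(k)}\Theta^{(k)}\right)\right] + P(\{Z\}) + \frac{\rho}{2}\sum_{k=1}^{K}\left\|\Theta^{(k)} - Z^{(k)} + U^{(k)}\right\|_F^2, \tag{5}$$

where $\{U\} = \{U^{(1)}, \ldots, U^{(K)}\}$ are dual variables, $\rho > 0$ is the penalty parameter of the augmented Lagrangian, and $\|A\|_F$ denotes the Frobenius norm of matrix $A$ (i.e., $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$). The sADMM algorithm repeatedly solves the three-step optimization with respect to $\{\Theta\}$, $\{Z\}$, and $\{U\}$, starting with initial values of the related parameters: $\Theta^{(k)} = I$, $Z^{(k)} = 0$, and $U^{(k)} = 0$ for $k = 1, \ldots, K$. The iteration is repeated until convergence as follows.

In the $\Theta$-step, for $k = 1, \ldots, K$, update $\Theta^{(k)}$ as the minimizer of

$$-n_k\left[\log\det\Theta^{(k)} - \operatorname{tr}\left(S^{(k)}\Theta^{(k)}\right)\right] + \frac{\rho}{2}\left\|\Theta^{(k)} - Z^{(k)} + U^{(k)}\right\|_F^2. \tag{6}$$

In the $Z$-step, for $k = 1, \ldots, K$, update $\{Z\}$ as the minimizer of

$$\frac{\rho}{2}\sum_{k=1}^{K}\left\|Z^{(k)} - A^{(k)}\right\|_F^2 + P(\{Z\}), \tag{7}$$

where $A^{(k)} = \Theta^{(k)} + U^{(k)}$. To find the optimal solution of (7), we directly apply the structured input-output lasso [15] to (7), using a coordinate descent algorithm together with KKT conditions to speed up the computation. For more details, see [15].

In the $U$-step, for $k = 1, \ldots, K$, update $U^{(k)}$ as $U^{(k)} \leftarrow U^{(k)} + \Theta^{(k)} - Z^{(k)}$.

The three parameters are updated repeatedly until a stopping rule is satisfied. Putting these steps together, Algorithm 1 encapsulates the structured alternating directions method of multipliers algorithm.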
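The Z-step and the dual update can be illustrated in a simplified setting. The sketch below assumes that the module penalty is switched off, in which case, to our understanding, the Z-step reduces to the closed-form group graphical lasso update of Danaher et al. [12]: soft-threshold each entry, then shrink the resulting vector across studies as a group. It does not implement the structured input-output lasso needed for overlapping modules, and all function names are hypothetical.

```python
import math

def soft_threshold(x, t):
    """Scalar soft-thresholding: sign(x) * max(|x| - t, 0)."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def z_update_entry(a_values, lam1, lam2, rho):
    """Update one off-diagonal entry (i, j) jointly across the K studies.

    a_values: the entry (i, j) of Theta^(k) + U^(k) for k = 1..K.
    Assumption of this sketch: no module penalty (third tuning parameter is 0).
    """
    shrunk = [soft_threshold(a, lam1 / rho) for a in a_values]
    norm = math.sqrt(sum(s * s for s in shrunk))
    if norm == 0.0:
        return [0.0] * len(a_values)
    scale = max(1.0 - lam2 / (rho * norm), 0.0)
    return [scale * s for s in shrunk]

def u_update_entry(u_old, theta_new, z_new):
    """Scaled dual update for one entry: U <- U + Theta - Z."""
    return u_old + theta_new - z_new

# Illustrative values for a single entry observed in two studies.
z_entry = z_update_entry([3.5, -4.5], lam1=0.5, lam2=2.5, rho=1.0)
u_entry = u_update_entry(0.0, 1.0, 0.75)
```

With overlapping modules the group shrinkage no longer separates entry by entry, which is why the text resorts to coordinate descent with KKT checks for the full problem.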

Algorithm 1: The structured alternating directions method of multipliers algorithm.

4. Numerical Studies

4.1. Simulated Data

In this section, we carry out experimental studies to assess the performance of the nsiGGM. In brief, the following describes how we generate simulated data. The experimental scheme is largely motivated by Chun et al. [14]. Let $K$ be the total number of studies, each containing true signal genes for an a priori module (e.g., pathway genes) and sample size $n_k$, where $k = 1, \ldots, K$ ($K = 3$). Starting off with the edges of signal genes, we first generate network edges of 100 nodes subject to scale-free network structures, the most commonly observed structures in biology, simulated by applying the Barabási-Albert algorithm [21]. Subsequently, we randomly added four edges to impose random effects. Given the network structures, we simulate the precision matrices $\Theta^{(k)}$ by sampling the values of the off-diagonal entries and then setting the diagonal elements. The process is repeated until $\Theta^{(k)}$ becomes a positive definite matrix. For simulating the expression data $Y^{(k)}$, we first consider a scenario in which no covariate induces dependency among genes. This is an ideal experimental scenario in that no dependency conditional on covariates is taken into consideration in the model. We simulated $Y^{(k)}$, where each $i$th row of $Y^{(k)}$ was randomly sampled from $N(0, (\Theta^{(k)})^{-1})$. Simulations were repeated, and average values are presented in Table 1. To examine the performance of the nsiGGM, sensitivity, specificity, and the Youden index were benchmarked against the JGGM [12] and GGM [16]. The Youden index is defined as sensitivity + specificity − 1, ranging from −1 to 1. In principle, the higher the Youden index, the higher the prediction accuracy.
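The three performance measures can be computed from a true and an estimated edge set as in the sketch below. Treating edges as undirected pairs and counting every unordered node pair as a potential edge are assumptions of this sketch, and the function name `edge_recovery_metrics` is hypothetical.

```python
def edge_recovery_metrics(true_edges, estimated_edges, n_nodes):
    """Return (sensitivity, specificity, Youden index) for undirected edge sets.

    Edges are given as pairs of node labels; (a, b) and (b, a) are the same edge.
    Every unordered pair of distinct nodes counts as one potential edge.
    """
    true_set = {frozenset(edge) for edge in true_edges}
    est_set = {frozenset(edge) for edge in estimated_edges}
    n_pairs = n_nodes * (n_nodes - 1) // 2
    tp = len(true_set & est_set)   # true edges that were recovered
    fp = len(est_set - true_set)   # estimated edges that are not true
    fn = len(true_set - est_set)   # true edges that were missed
    tn = n_pairs - tp - fp - fn    # absent edges correctly left out
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity, sensitivity + specificity - 1.0

# Toy network on four nodes (illustrative only).
truth = [(0, 1), (1, 2), (2, 3)]
estimate = [(1, 0), (1, 2), (0, 3)]
sens, spec, youden = edge_recovery_metrics(truth, estimate, n_nodes=4)
```

A Youden index near zero indicates performance no better than chance, which is why it is a convenient single summary when sensitivity and specificity pull in opposite directions.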

Table 1: Performance comparisons of the nsiGGM with the JGGM and GGM using data simulated along with predefined module genes.
Table 2: Shown are the brief descriptions of the three data information pieces used in real genomic application.

In Table 1, the Youden index of the nsiGGM clearly declines as the number of noise edges increases and yet is consistently larger than that of the JGGM and GGM. This is mainly because the JGGM suffers from low specificity (0.8685–0.8733) compared to the nsiGGM (0.9433–0.9481). In contrast, when 30 and 40 noise edges are added, the JGGM slightly outperforms the nsiGGM in sensitivity at the expense of poor specificity. Taken together, these results indicate that the nsiGGM is superior to the JGGM and GGM in detecting the true underlying pathway sets.

4.2. Application to Genomic Data

In this section, we demonstrate applications to three mRNA expression profiles for breast cancer. We collected two microarray profiles from Desmedt et al. [22] and Wang et al. [23], together with TCGA data from TCGA’s web portal (https://cancergenome.nih.gov/), from which we retrieved mRNA data of breast carcinoma (BRCA). We matched features across all studies and filtered out probes by the rank sum of mean and standard deviation (SD 0.99; Wang et al. [24]), which selected 106 genes. Table 2 gives detailed information on the mRNA expression data. In what follows, we examine whether the nsiGGM improves accuracy in detecting pathway genes. We collected gene sets from the Molecular Signatures Database (MSigDB) v2.5 gene set collections [25], each consisting of at least 11 of the 106 genes; 53 distinct genes belong to the 11 pathways presented in Table 3. To evaluate the detection rate of pathway genes, we define an evaluation benchmark based on $V_k$, the set of gene indices whose genes form the $k$th network, and an indicator function $I(\cdot)$. Comparing with the JGGM, we examine whether the nsiGGM captures the existing pathway structures better in terms of connectivity and the proportion of identified pathway genes. At first glance, the nsiGGM represents modules well enriched with pathway genes in Figure 1, as compared to those of the JGGM in Figure 2. In support of this notion, we observed a detection rate of 0.573 for the nsiGGM and 0.521 for the JGGM, so the nsiGGM can facilitate constructing gene networks that are biologically more enriched for pathway gene sets than those of the JGGM. Table 3 enumerates the pathway genes discovered by the nsiGGM, each highlighted in bold and underlined characters (note: asterisks represent pathway genes identified only by the nsiGGM and not by the JGGM). Interestingly, many pathway genes are detected by the nsiGGM but not by the JGGM. Focusing on the cell signaling pathway, we particularly notice that EREG [26], SLC1A1 [27], STC2 [28], GAD1 [29], and TRH [30] are genes not selected by the JGGM but nonetheless previously reported in signaling pathways. Importantly, Hou et al. [28] showed that STC2 inhibited tumorigenesis and metastasis of breast cancer cells, indicating that STC2 may inhibit epithelial-mesenchymal transition (EMT) at least partially through PKC/Claudin-1-mediated signaling in human breast cancer cells. Therefore, STC2 can be considered a potential biomarker for metastasis and targeted therapy in human breast cancer. Besides, signaling through glutamate receptors in regard to SLC1A1 has been reported in human cancers [31]. In support of this evidence, increases in SLC1A1 expression driven by hypoxia-inducible factors (HIFs) possibly contribute to increased efflux of glutamate, by which glutamate transporters and receptors are regulated to activate key signal transduction pathways that promote cancer progression [27]. These findings suggest that the nsiGGM is well suited to detecting genes implicated in the functional processes of human cancers.

Table 3: The pathway sets from the Molecular Signatures Database (MSigDB) analyzed in the nsiGGM. (Note: asterisks represent pathway genes identified by only the nsiGGM not by JGGM.)
Figure 1: Three gene networks estimated by the nsiGGM. The detection rate of pathway genes is 0.573.
Figure 2: Three gene networks estimated by the JGGM. The detection rate of pathway genes is 0.521.

5. Conclusion and Discussion

In this article, we propose a new graphical model called the node-structured integrative Gaussian graphical model (nsiGGM), which jointly learns Gaussian graphical models with an emphasis on prior knowledge of pathway information. This method allows us to handle overlapping group lasso problems, making it possible to integrate overlapping pathway gene sets. In applications to simulated and real data, we verified the strong numerical performance of the nsiGGM and its analytical capability of yielding biologically meaningful results related to breast cancer. It might nevertheless be debated whether prior knowledge determines the network structures too strongly. Despite this concern about overly guided network structures, the use of a priori known information remains acceptable in that the nsiGGM selects tuning parameters on the basis of the likelihood-based BIC.

The proposed nsiGGM is computationally demanding, mainly due to the coordinate descent algorithm used to tackle the sparse overlapping group lasso. Since the sparse overlapping group lasso applied here deals with both study-specific effects and prior knowledge, the optimization inevitably becomes complicated. Our current package is implemented in R, and the routines can be further accelerated via C/C++ in the future. Currently, the prior knowledge of regulatory structure is accommodated in an undirected graphical model. It would also be interesting to impose the prior knowledge on directed networks instead, so that the presence or absence of directional edges among multiple features can be explicitly modeled. We leave these tasks for future research.

Disclosure

SungWon Han is the corresponding author, and ByungYong Lee is the cocorresponding author.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this manuscript.

Acknowledgments

This research is supported by the Korea University Grant (K1607901) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2017R1C1B5017528).

References

  1. R. Maglietta, A. Distaso, A. Piepoli et al., “On the reproducibility of results of pathway analysis in genome-wide expression studies of colorectal cancers,” Journal of Biomedical Informatics, vol. 43, no. 3, pp. 397–406, 2010.
  2. G. A. Viswanathan, J. Seto, S. Patil, G. Nudelman, and S. C. Sealfon, “Getting started in biological pathway construction and analysis,” PLoS Computational Biology, vol. 4, no. 2, article e16, 2008.
  3. A. H. Bild, G. Yao, J. T. Chang et al., “Oncogenic pathway signatures in human cancers as a guide to targeted therapies,” Nature, vol. 439, no. 7074, pp. 353–357, 2006.
  4. D. Hanahan and R. A. Weinberg, “Hallmarks of cancer: the next generation,” Cell, vol. 144, no. 5, pp. 646–674, 2011.
  5. T. Kessler, H. Hache, and C. Wierling, “Integrative analysis of cancer-related signaling pathways,” Frontiers in Physiology, vol. 4, article 124, 2013.
  6. S. Richardson, G. C. Tseng, and W. Sun, “Statistical methods in integrative genomics,” Annual Review of Statistics and Its Application, vol. 3, pp. 181–209, 2016.
  7. H. Li and J. Gui, “Gradient directed regularization for sparse Gaussian concentration graphs, with applications to inference of genetic networks,” Biostatistics, vol. 7, no. 2, pp. 302–317, 2006.
  8. M. Yuan and Y. Lin, “Model selection and estimation in the Gaussian graphical model,” Biometrika, vol. 94, no. 1, pp. 19–35, 2007.
  9. J. Friedman, T. Hastie, and R. Tibshirani, “Sparse inverse covariance estimation with the graphical lasso,” Biostatistics, vol. 9, no. 3, pp. 432–441, 2008.
  10. J. Yin and H. Li, “A sparse conditional Gaussian graphical model for analysis of genetical genomics data,” Annals of Applied Statistics, vol. 5, no. 4, pp. 2630–2650, 2011.
  11. L. Zhang and S. Kim, “Learning gene networks under SNP perturbations using eQTL datasets,” PLoS Computational Biology, vol. 10, no. 2, Article ID e1003420, 2014.
  12. P. Danaher, P. Wang, and D. M. Witten, “The joint graphical lasso for inverse covariance estimation across multiple classes,” Journal of the Royal Statistical Society. Series B: Statistical Methodology, vol. 76, no. 2, pp. 373–397, 2014.
  13. J. Guo, E. Levina, G. Michailidis, and J. Zhu, “Joint estimation of multiple graphical models,” Biometrika, vol. 98, no. 1, pp. 1–15, 2011.
  14. H. Chun, M. Chen, B. Li, and H. Zhao, “Joint conditional Gaussian graphical models with multiple sources of genomic data,” Frontiers in Genetics, vol. 4, article 294, 2013.
  15. S. Lee and E. P. Xing, “Leveraging input and output structures for joint mapping of epistatic and marginal eQTLs,” Bioinformatics, vol. 28, no. 12, pp. i137–i146, 2012.
  16. S. Lauritzen, Graphical Models, Oxford University Press, Oxford, UK, 1996.
  17. O. Banerjee, L. El Ghaoui, and A. D'Aspremont, “Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data,” Journal of Machine Learning Research, vol. 9, pp. 485–516, 2008.
  18. G. Schwarz, “Estimating the dimension of a model,” The Annals of Statistics, vol. 6, no. 2, pp. 461–464, 1978.
  19. S. Kim, S. Oesterreich, S. Kim, Y. Park, and G. C. Tseng, “Integrative clustering of multi-level omics data for disease subtype discovery using sequential double regularization,” Biostatistics, vol. 18, no. 1, pp. 165–179, 2017.
  20. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
  21. A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
  22. C. Desmedt, F. Piette, S. Loi et al., “Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series,” Clinical Cancer Research, vol. 13, no. 11, pp. 3207–3214, 2007.
  23. Y. Wang, J. G. M. Klijn, Y. Zhang et al., “Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer,” The Lancet, vol. 365, no. 9460, pp. 671–679, 2005.
  24. X. Wang, Y. Lin, C. Song, E. Sibille, and G. C. Tseng, “Detecting disease-associated genes with confounding variable adjustment and the impact on genomic meta-analysis: with application to major depressive disorder,” BMC Bioinformatics, vol. 13, no. 1, article 52, 2012.
  25. A. Subramanian, P. Tamayo, V. K. Mootha et al., “Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 43, pp. 15545–15550, 2005.
  26. M. Farooqui, L. R. Bohrer, N. J. Brady et al., “Epiregulin contributes to breast tumorigenesis through regulating matrix metalloproteinase 1 and promoting cell survival,” Molecular Cancer, vol. 14, no. 1, article 138, 2015.
  27. H. Hu, N. Takano, L. Xiang, D. M. Gilkes, W. Luo, and G. L. Semenza, “Hypoxia-inducible factors enhance glutamate signaling in cancer cells,” Oncotarget, vol. 5, no. 19, pp. 8853–8868, 2014.
  28. J. Hou, Z. Wang, H. Xu et al., “Stanniocalicin 2 suppresses breast cancer cell migration and invasion via the PKC/Claudin-1-mediated signaling,” PLoS ONE, vol. 10, no. 4, Article ID e0122179, 2015.
  29. R. Kimura, A. Kasamatsu, T. Koyama et al., “Glutamate acid decarboxylase 1 promotes metastasis of human oral cancer by β-catenin translocation and MMP7 activation,” BMC Cancer, vol. 13, article 555, 2013.
  30. M. Martínez-Armenta, S. Díaz de León-Guerrero, A. Catalán et al., “TGFβ2 regulates hypothalamic Trh expression through the TGFβ inducible early gene-1 (TIEG1) during fetal development,” Molecular and Cellular Endocrinology, vol. 400, pp. 129–139, 2015.
  31. M. Nedergaard, T. Takano, and A. J. Hansen, “Beyond the role of glutamate as a neurotransmitter,” Nature Reviews Neuroscience, vol. 3, no. 9, pp. 748–755, 2002.