Identification of Potential Transcriptomic Markers in Developing Ankylosing Spondylitis: A Meta-Analysis of Gene Expression Profiles
The goal of this study was to identify potential transcriptomic markers in developing ankylosing spondylitis by a meta-analysis of multiple public microarray datasets. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed (DE) genes in ankylosing spondylitis and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DE genes identified in the meta-analysis. Three microarray datasets (26 cases and 29 controls in total) were collected for meta-analysis. 905 consistently DE genes were identified in ankylosing spondylitis, among which 482 genes were upregulated and 423 genes were downregulated. The upregulated gene with the smallest combined rank product (RP) was GNG11 (combined ). The downregulated gene with the smallest combined RP was S100P (combined ). In the gene ontology (GO) analysis, the most significantly enriched GO term was “immune system process” (). The most significant pathway identified in the pathway analysis was antigen processing and presentation (). The consistently DE genes in ankylosing spondylitis and biological pathways associated with those DE genes identified provide valuable information for studying the pathophysiology of ankylosing spondylitis.
Ankylosing spondylitis (AS) represents a chronic inflammatory arthritis, which affects the axial joints such as spine and sacroiliac joints . It causes serious spinal mobility impairment and influences the quality of life . Ankylosing spondylitis is a complex and systemic rheumatic disease; hence systematic screening is required to improve the diagnosis and treatment of ankylosing spondylitis.
Rapid growth of high-throughput transcriptomic data largely enables gene expression profiling and diagnostic targets identification in disease nowadays. In the past decade, several studies have focused on the transcriptional profiling of ankylosing spondylitis using microarrays to identify candidate genes involved in ankylosing spondylitis [3, 4]. Analysis of multiple transcriptomic datasets has the likelihood of discovering robust candidates for diagnosis and treatment. Therefore, we investigated gene expression patterns between ankylosing spondylitis patients and healthy controls in a meta-analysis based on public microarray datasets. The differently expressed genes identified in the meta-analysis were further interpreted by gene ontology analysis and pathway analysis.
To carry out these studies, we used the INMEX (integrative meta-analysis of expression data) program . Careful data procession and annotation were done to insure that the data format and class labels were consistent across datasets. Due to the differences in study design and platform usage, heterogeneity exists among microarray datasets. To address this, we applied the combing rank orders algorithm based on the RankProd package , which is robust facing outliers and variations among studies, to carry out the meta-analysis.
2. Materials and Methods
2.1. Microarray Datasets Search and Selection
In this study, we searched public microarray study till March 18, 2014, according to the keywords “ankylosing spondylitis” in Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) . The studies obtained were further selected for the meta-analysis and our selection criteria were (a) case-control study; (b) study providing gene expression data; and (c) study with ankylosing spondylitis patients diagnosed based on the modified New York criteria . Animal studies and studies not about ankylosing spondylitis were excluded in this meta-analysis.
Two investigators independently collected data from each eligible study. The data were composed of GEO accession, sample size, sample source, platform, and gene expression data. Through checking between the two investigators, a final data collection was determined.
2.2. Meta-Analysis Methods
According to the data collected from each eligible microarray study, we performed an overall meta-analysis to identify differentially expressed (DE) genes in ankylosing spondylitis. In this study, we used the INMEX (integrative meta-analysis of expression data) program (http://www.inmex.ca/INMEX/)  to carry out the meta-analysis.
All eligible datasets were uploaded to INMEX, then processed, and annotated to insure that the data format and class labels were consistent across datasets. After data integrity check, we carried out a meta-analysis using combing rank orders algorithm, with 100 times of permutation tests. The combing rank order algorithm is based on the RankProd package  and is robust facing outliers and variations among studies.
2.3. Functional Interpretation Methods
Functional interpretation (gene ontology analysis and pathway analysis) of the DE genes identified in the meta-analysis was further performed using the INMEX program. In gene ontology (GO) analysis, a value threshold of 0.05 was used to identify significantly enriched GO terms . In pathway analysis, enrichment analysis was carried out using the hypergeometric test with a value threshold of 0.05 based on the KEGG database .
3.1. Studies and Data Included in This Meta-Analysis
Original search identified 8 studies in total. Then, 5 studies were excluded among which 4 were not about the DE genes between ankylosing spondylitis patients and healthy controls, and 1 was animal study. Through searching and selection, a final list of 3 microarray datasets [3, 4] was collected for meta-analysis. In total, the 3 eligible datasets consisted of 26 cases and 29 controls. All 3 datasets provided case-control data with various sample sources (1 dataset of synovial biopsies sample and 2 datasets of blood sample). The detailed information of these 3 datasets is presented in Table 1. Heat map of rescaled individual expression data for a subset of genes across the 3 datasets is shown in Figure 1, and the patterns of change for a gene among different datasets could be visualized.
3.2. Meta-Analysis Results
In this study we performed the meta-analysis based on combing rank orders, and the DE genes with value < 0.05 were selected. Overall, there were 743 gained genes and 167 lost genes in this meta-analysis (Figure 2, see Supplementary Table 1 in Supplementary Materials available online at http://dx.doi.org/10.1155/2014/826316). Gain genes are those identified to be differentially expressed uniquely in the meta-analysis. The expression profiles of gain genes are relatively weak but consistent across datasets. They benefit by larger sample size and hence are more reliable DE genes. Lost genes are those identified to be differentially expressed in individual analysis rather than in the meta-analysis. The expression profiles of lost genes vary largely across different datasets .
In total, according to the results of our meta-analysis, 905 genes were identified to be differentially expressed between ankylosing spondylitis patients and healthy controls across microarray datasets (Supplementary Table 2). Among the 905 DE genes, 482 genes were upregulated and 423 genes were downregulated. The top 10 most significantly upregulated genes and top 10 most significantly downregulated genes are shown in Table 2. The upregulated gene with the smallest combined rank product (RP) was GNG11 (combined RP = 299.64). GNG11, guanine nucleotide binding protein (G protein) gamma 11, is a member of the G protein gamma family which functions in the transmembrane signaling system and cellular senescence [11, 12]. The downregulated gene with the smallest combined RP was S100P (combined RP = 335.94). S100P (S100 calcium binding protein P) belongs to the S100 calcium-binding protein family and functions in the regulation of diverse cellular processes . However, neither GNG11 nor S100P have been reported to be associated with ankylosing spondylitis yet.
Many consistently DE genes across datasets identified are involved in immune regulation, such as COMMD6, C19orf59, CCR7, CX3CR1, CFD, and FGFBP2 (see Table 2). Although these results are suggestive rather than straightforward, those DE genes could be involved in the pathogenesis of ankylosing spondylitis and further research is required.
3.3. Advanced Analyses Results
Advanced analyses (GO analysis and pathway analysis) were carried out for further functional investigation of the DE genes. Figure 3 presented a summary of the GO analysis results. In the GO analysis, 715 GO terms were significantly enriched for the DE genes (see Supplementary Table 3), and the three most significantly enriched GO terms were “immune system process” ( = 3.46 × 10−26), “immune response” ( = 7.83 × 10−24), and “defense response” ( = 6.99 × 10−16) (Table 3). In the pathway analysis, 15 significant pathways were identified when we mapped the DE genes to the KEGG database (see supplementary Table 4), and the three most significant pathways were antigen processing and presentation ( = 8.40 × 10−5), measles ( = 3.30 × 10−3), and cell adhesion molecules (CAMs) ( = 5.66 × 10−3) (Table 4).
A number of genes have been reported to be upregulated or downregulated in ankylosing spondylitis patients [14–16]. Identification of the most important candidate genes and pathways involved in ankylosing spondylitis pathogenesis is a challenge currently. Growing high-throughput transcriptomic data enables meta-analysis of multiple datasets which has the likelihood of discovering robust candidates for diagnosis and treatment. Hence in this study, we performed a meta-analysis of multiple public microarray datasets to identify potential transcriptomic markers in developing ankylosing spondylitis.
In the meta-analysis, 905 consistently DE genes were identified in ankylosing spondylitis, among which 482 genes were upregulated and 423 genes were downregulated. The upregulated gene with the smallest combined RP was GNG11. GNG11 belongs to the G protein gamma family and plays a role in the transmembrane signaling system and cellular senescence [11, 12]. The downregulated gene with the smallest combined RP was S100P. S100P encodes a protein which is a member of the S100 calcium-binding protein family and functions in the regulation of diverse cellular processes . So far, S100P is reported to be involved in the development and progression of various cancers . Although the exact contributions of these DE genes to ankylosing spondylitis development are not clear yet, further research is necessary as those genes could be potential transcriptomic markers for ankylosing spondylitis.
Many consistently DE genes identified are involved in immune regulation, suggesting its major role in ankylosing spondylitis development. A set of DE genes without previous studies in ankylosing spondylitis were also identified in our analysis. Some of these DE genes may function in the pathogenesis of ankylosing spondylitis and could be potential biomarkers.
In the GO analysis, 715 significantly enriched GO terms were identified in total. The top three significantly enriched GO terms were “immune system process,” “immune response,” and “defense response.” Similar features were also shown in a previous microarray analysis of ankylosing spondylitis mouse model . The overlap of enriched GO terms between human study and mouse study suggests that immune response plays a major role in ankylosing spondylitis development and therefore deserves further studies. In the pathway analysis, 15 significant pathways were identified in all. The top three significant pathways were antigen processing and presentation, measles, and cell adhesion molecules (CAMs). Our result that pathways involved in immune response are related with the pathogenesis of ankylosing spondylitis is consistent with previous reports highlighting innate immune stimulation and the IL-23 pathway in ankylosing spondylitis pathogenesis [18, 19]. Our analysis also identified pathways not previously studied in ankylosing spondylitis, some of which may harbor interesting functions and deserve further investigation.
In addition, all the results of our meta-analysis should be considered prudently due to the existence of several limitations. One limitation is the insufficient sample size used in our meta-analysis. A second limitation is the lack of subgroup analyses based on potential influential factors, including age, sex, treatment, disease severity, and platform usage, as ankylosing spondylitis is reported to be more prevalent in men and often occur in the third decade of life . The third limitation is that biological knowledge base and pathway information are far from being complete at present and need further investigation. Hence, in order to achieve a more convincible conclusion, further analysis using larger sample size and more complete biological knowledge base and pathway information is required, and stratified analyses on different factors such as age, sex, disease severity, and platform usage are needed. In addition, experimental verification of the candidate DE genes identified should also be performed in the future, and functional studies need to be carried out as well to address the exact roles of those candidate DE genes in ankylosing spondylitis.
In conclusion, we identified consistently DE genes in ankylosing spondylitis that could potentially serve as transcriptomic markers. GO and pathway analyses revealed that those candidates strongly were associated with immune system process besides the underlying complex and multifactor-influenced molecular mechanism. These results provide novel insights into the pathogenesis of ankylosing spondylitis and promote the generation of diagnostic gene sets.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was supported by grants from key medical subjects of Jiangsu Province (Grant no. XK201120); Innovative Team of Jiangsu Province (Grant no. LJ201114); Special Clinical Medical Science and Technology of Jiangsu Province (Grant nos. BL2012050 and BL2013014); Key Laboratory of Suzhou (Grant nos. SZS201108 and SZS201307); and National Natural Science Foundation (Grant nos. 81100371, 81370627, 81300423, and 81272143).
Supplementary Table 1: The 743 gained genes and 167 lost genes in this meta-analysis.
Supplementary Table 2: The 905 genes identified to be differentially expressed between ankylosing spondylitis patients and healthy controls across microarray datasets in this meta-analysis.
Supplementary Table 3: The 715 significantly enriched GO terms for the DE genes in ankylosing spondylitis.
Supplementary Table 4: The 15 significant pathways identified when the DE genes in ankylosing spondylitis were mapped to the KEGG database.
F. M. Pimentel-Santos, D. Ligeiro, M. Matos et al., “Whole blood transcriptional profiling in ankylosing spondylitis identifies novel candidate genes that might contribute to the inflammatory and tissue-destructive disease aspects,” Arthritis Research and Therapy, vol. 13, Supplement, no. 2, article no. R57, 2011.View at: Publisher Site | Google Scholar
J. A. Smith, M. D. Barnes, D. Hong, M. L. DeLay, R. D. Inman, and R. A. Colbert, “Gene expression analysis of macrophages derived from ankylosing spondylitis patients reveals interferon-γ dysregulation,” Arthritis and Rheumatism, vol. 58, no. 6, pp. 1640–1649, 2008.View at: Publisher Site | Google Scholar
C. Romero-Sanchez, H.-K. Tsou, M.-S. Jan et al., “Serum monocyte chemotactic protein-1 concentrations distinguish patients with ankylosing spondylitis from patients with mechanical low back pain,” Journal of Spinal Disorders & Techniques, vol. 24, no. 3, pp. 202–207, 2011.View at: Publisher Site | Google Scholar