Abstract

Background. Bladder cancer (BCa) is a common urothelial malignancy. The Cancer Genome Atlas (TCGA) database allows for an opportunity to analyze the relationship between gene expression and clinical outcomes in bladder cancer patients. This study is aimed at identifying prognosis-related genes in the bladder cancer microenvironment. Methods. Immune scores and stromal scores were calculated by applying the ESTIMATE algorithm. We divided bladder cancer patients into high and low groups based on their immune/stromal scores. Then, differentially expressed genes (DEGs) were identified in bladder cancer patients based on the TCGA database. We evaluated the correlation between immune/stromal scores and clinical characteristics as well as prognosis. Finally, we validated identified genes associated with bladder cancer prognosis through a cohort study in the Gene Expression Omnibus (GEO) database. Results. A higher stromal score was associated with female (vs. male), (vs.), T3/4 (vs. T1/2,), N status, and pathological high grade (vs. low grade). By analyzing DEGs, there were 1125 genes commonly upregulated, and 209 genes were commonly downregulated. Protein-protein interaction networks further showed the important protein that may be involved in the biological behavior and prognosis of BCa, such as FN1, CXCL12, CD3E, LCK, and ZAP70. A total of 14 DEGs were found to be associated with overall survival of bladder cancer. After validation by a cohort of 165 BCa cases with detailed follow-up information from GSE13507, 10 immune-associated DEGs were demonstrated to be predictive of prognosis in BCa. Among them, 5 genes have not been reported previously associated with the prognosis of BCa, including BTBD16, OLFML2B, PRRX1, SPINK4, and SPON2. Conclusions. Our study elucidated tight associations between stromal score and clinical characteristics as well as prognosis in BCa. Moreover, we obtained a group of genes closely related to the prognosis of BCa in the tumor microenvironment.

1. Introduction

Bladder cancer (BCa) is the fourth most common cancer in men and the twelfth in women and the leading cause of cancer-related mortality [1, 2]. Epidemiology studies reported that men have a four-fold higher incidence of bladder cancer than women [3]. Urothelial carcinoma accounts for 90% of BCa cases, whereas the remaining 10% cases are squamous cell carcinoma (SCC) or adenocarcinomas [4]. Bladder cancer treatment may include surgery, chemotherapy, radiotherapy, immunotherapy, and targeted therapy. Despite significant improvement in cancer treatment, two-thirds of patients with UBC will have a recurrence or disease progression within 5 years, leading to poor prognosis [5].

The tumor microenvironment consists of immune cells, mesenchymal cells, endothelial cells, along with inflammatory mediators, and extracellular matrix molecules [6]. Immune cells and stromal cells are the two most important nontumor cells in the tumor microenvironment. Previous evidences indicate that the components in the tumor microenvironment have an important role in promoting the proliferation and invasion of BC [79]. High stromal tumor-infiltrating lymphocytes within the tumor immune microenvironment is indicative of an inflamed subtype with an 80% 5-year disease-specific survival, and a lack of immune infiltrates is associated with an uninflamed subtype with a survival rate of smaller than 25%. A separate immune evading phenotype with upregulated immune checkpoints is associated with poor prognosis in bladder cancer [5]. Tumor microenvironment may have protective functions in other cancers, such as head and neck squamous cell carcinoma and glioblastoma [10, 11]. However, there are few reports that systematically investigate the genes involved in the tumor microenvironment and their associations with prognosis in bladder cancer.

To better understand the impact of tumor genetic composition on clinical outcomes, the Cancer Genome Atlas (TCGA), a comprehensive genome-wide gene expression collection, has been established to discover genomic abnormalities around the world [12]. Tomczak et al. designed an algorithm called ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data), which calculated immune and stromal scores to predict the infiltration of nontumor cells by analyzing specific gene expression signature [13]. In this present study, we identified a list of tumor microenvironment-related genes, which are closely related to the prognosis in bladder cancer by using the TCGA database and ESTIMATE algorithm. Moreover, we validated prognosis-related genes in a different bladder cancer cohort available from the Gene Expression Omnibus (GEO) database.

2. Materials and Methods

2.1. Data Acquisition

Gene expression profile and clinical data including gender, age, clinical stage, overall survival for 412 patients with BCa, and 19 adjacent nontumor tissues were obtained from the TCGA data portal (https://tcga-data.nci.nih.gov/tcga/). The histological subtype of the bladder cancer is muscle-invasive urothelial carcinoma. Immune scores and stromal scores were calculated by the ESTIMATE algorithm [14]. For validation, the expression data and clinical information of 165 eligible samples were identified from the GEO database (GSE13507) (https://www.ncbi.nlm.nih.gov/gds). Only histologically verified transitional cell carcinoma samples were selected in the GEO dataset (GSE13507).

2.2. Differentially Expressed Gene Analysis

Differentially expressed genes (DEGs) were determined between 412 BCa, and 19 nontumor counterparts were performed using the package “limma” in R. Genes were considered as DEGs following the thresholds of and adjusted value < 0.05. Expression profiling of the DEGs was performed using the R package heatmap. The overlap of DEGs was determined using the R package Venn diagrams.

2.3. Functional Enrichment Analysis of DEGs

Functional enrichment analysis of DEGs was performed by R packages “clusterProfiler”, “http://org.Hs.eg.db”, “enrichplot”, and “ggplot2” to identify gene ontology (GO) categories, including biological processes (BP), cellular components (CC), or molecular functions (MF). The abovementioned packages were also used to perform KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis [15]. The protein-protein interaction (PPI) network was built by Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https://string-db.org); interaction was set as the cut off to screen for the key PPI nodes [16].

2.4. Overall Survival Analyses

The survival analysis was assessed by Kaplan-Meier methods. The patients with BCa were split into high-score group and low-score group according to the median immune/stromal scores. Overall survival curves were compared between the two groups using the Kaplan-Meier method. To investigate the associations of DEGs with overall survival, the Kaplan-Meier method was utilized to compare the survival rate difference between high and low groups stratified by the median gene expression levels. Log-rank tests were used to assess the statistical differences. Univariate and multivariate survival analyses were performed between overall survival, DEG expression levels, and clinical features, including age, gender, T clinical stage, N clinical stage, and M clinical stage and grade, using the Cox regression models of the R package survival. Hazard ratio (HR) and 95% confidence interval (CI) of HR were extracted from the Cox regression models. Prognosis-associated genes were classified into protective genes () and risk genes () according to their HR values. values < 0.05 were considered statistically significant.

2.5. Statistical Analysis

The correlations between clinicopathological parameters and immune/stromal scores were analyzed using a two-side student test. All of the statistical tests were performed in R version 3.3 (https://www.r-project.org/). values < 0.05 were considered as statistical significance.

3. Results

We downloaded gene expression profiles and clinicopathological parameters of 412 patients with BCa from the TCGA database. The total study population comprised 304 (73.8%) men and 108 (26.2%) women. Stromal scores ranged from -2537.32 to 2146.99, and immune scores ranged from -1591.88 to 3176.41.

3.1. The Association between Immune and Stromal Scores and Clinicopathological Parameters

Based on the latest AJCC staging, there was no statistical difference between immune score and the clinical stage () (Figure 1(a)). However, stromal cell scores were significantly associated with the clinical stage () (Figure 1(b)). We divided the 412 bladder cancer cases into high and low score groups. Kaplan-Meier survival curves (Figure 1(c)) showed that the low stromal scores group had a lower mortality rate than the high scores group (33.5% and 45.3%, ). Cases with low immune scores had a lower mortality rate compared to patients with high scores group (Figure 1(d), 38.9% and 39.4%, ), although no statistical differences were found.

To evaluate the correlations between clinicopathological parameters and immune/stromal scores, we compared and plotted the distribution of immune scores and stromal scores stratified by age, gender, T status, N status, M status, and Fuhrman grade. We found that higher stromal score was associated with female (vs. male, Figure 2(a)), (vs. , Figure 2(b),), T3/4 (vs. T1/2, Figure 2(c),), N status (Figure 2(d), ), and pathological high grade (vs. low grade, Figure 2(f)), but was not associated with M status (Figure 2(e)). However, high immune scores were only associated with pathological high grade (vs. low grade, Supplementary Figure 1(f), ), and no evidence supports the significant correlations between immune scores and age, gender, T status, N status, and M status (Supplementary Figure 1A-1E, ).

3.2. Characteristics in Gene Heatmaps with Immune Scores and Stromal Scores in BCa

Heatmaps showed the differential gene expression profiles based on the immune and stromal scores (Figures 3(a) and 3(b)). There were 1371 genes upregulated and 457 genes downregulated in the high immune scores group as compared to the low immune score group. A total of 1519 genes were upregulated, and 398 genes were downregulated in the high stromal score group as compared to the low stromal score group. Moreover, Venn diagrams (Figures 3(c) and 3(d)) showed that 1125 genes were commonly upregulated, and 209 genes were commonly downregulated.

3.3. Protein-Protein Interactions among Genes of Prognostic Value

We drew PPI networks by using the STRING (Figure 4). The number of protein nodes in the network diagram was shown in Figure 4. There were 13 genes in the network diagram with more than 20 interconnected nodes, such as FN1, CXCL10, CXCL12, IL10, ITGAM, CCL5, CCR5, CD4, CCR2, CXCL11, CXCL13, CXCL9, and LCK (Figure 5). It includes multiple genes that are closely related to immune response. Moreover, FN1, CXCL12, CD3E, LCK, and ZAP70 were key interconnected node genes in PPI networks and are also associated with overall survival in patients with bladder cancer (Figure 6).

3.4. GO and KEGG Enrichment Analysis of Genes

Top GO terms included the regulation of leukocyte activation and T cell activation in biological processes, collagen-containing extracellular matrix and extracellular matrix in cellular components, and receptor-ligand activity and receptor regular activity in molecular functions (Figure 7). In addition, the results of KEGG pathway enrichment analysis of differentially expressed proteins are shown in Figure 8, and cytokine-cytokine receptor interaction, PI3K-Akt signaling pathway, and chemokine signaling pathway are the top three enrichment pathway for differentially expressed proteins.

3.5. Correlation of Expression of Individual DEGs in Overall Survival and Validation in the GEO Database

We generated Kaplan-Meier survival curves to evaluate the role of individual DEGs in overall survival in bladder cancer. A total of 274 DEGs were found to be associated with the prognosis of bladder cancer (, selected genes are shown in Figure 9). To further validate, the prognosis-related genes found in the TCGA database are significant in other bladder cancer cases. We analyzed a cohort study of 165 bladder cancer cases from the GEO database. A total of 29 genes were validated (Figure 10) to be significantly associated with bladder cancer prognosis (Table 1).

Univariate and multivariate analyses confirmed that 9 genes were risk genes and 5 protective genes in the TCGA cohort ( for all cases, Supplementary Table 1-2). Of 14 prognosis-associated genes, 5 risk genes (CALD1, TNC, OLFML2B, PRRX1, and SPON2) and 5 protective genes (HOXB3, HOXB6, MOGAT2, BTBD16, and SPINK4) were validated by univariate survival analysis in the GEO cohort ( for all cases, Supplementary Table 1-2). Among them, 5 genes have not been reported previously associated with the prognosis of BCa, including BTBD16, OLFML2B, PRRX1, SPINK4, and SPON2.

4. Discussion

In our study, we found that immune scores and stromal scores were associated with BCa patients’ survival based on TCGA datasets, although no statistical differences were found in K-M survival analysis. Stromal scores were associated with multiple clinicopathological parameters, including AJCC stage, age, gender, T status, N status, and Fuhrman grade of BCa. However, immune scores were only associated with Fuhrman grade, and no significant correlation was found between immune scores and AJCC stage, age, gender, T status, N status, and M status. These results indicated that stromal cells in the tumor microenvironment may play a more important role in multiple biological behaviors, affecting the occurrence, development, and prognosis of bladder cancer, compared with immune cells.

We further performed heatmaps to show the differential gene expression profiles according to the immune and stromal scores. In addition, Venn diagrams showed that 1125 DEGs were commonly upregulated, and 209 DEGs were commonly downregulated. Moreover, protein-protein interaction networks were constructed based on commonly differentially expressed genes. There are many key proteins in the PPI networks involved in immune/inflammation response, such as IL10 and CD4. Luo reported that blocking IL-10 can enhance the effect of BCG immunotherapy for bladder cancer [17]. Previous research found that CD3D/CD4 ratio is an important marker for the prognosis of bladder cancer [18].

The results of GO enrichment analysis suggested that many of the enriched differential genes are related to immune cell activity (leukocyte activation and T cell activation) and stromal cell components (collagen-containing extracellular matrix and extracellular matrix), indicating immune cells and stromal cells, as key components in the tumor microenvironment, may be involved in the development of bladder cancers. He et al. reported that inflammatory response-related KEGG pathway was significantly enriched in shOIP5 bladder cancer cell lines, including cytokine-cytokine receptor interaction and chemokine signaling pathway, which was in accordance with our present study [19]. The PI3K-Akt signaling pathway is a classic pathway for cell proliferation, thus playing an important role in the pathogenesis of multiple cancers. Previous researches have shown many oncogenes promote proliferation via the PI3K-Akt signaling pathway in bladder cancer [20, 21]. All these suggest that our analysis results are helpful to clarify the pathogenesis and mechanism of bladder cancer and provide new research ideas for the treatment of bladder cancer.

In recent years, increasingly more mRNAs have the potential to be molecular biomarkers for evaluating and predicting the prognosis in various cancer types [2224]. However, there are few reports on the tumor microenvironment-related genes to predict cancer outcomes in bladder cancer. In our study, a total of 274 genes are significantly associated with overall survival in bladder cancer. Among the 274 genes, the proteins encoded by FN1, CXCL12, CD3E, LCK, and ZAP70 genes were also key proteins in PPI. FN1 has the largest number of interconnected nodes in the PPI, suggesting that it may be involved in multiple aspects of bladder cancer development. FN1 encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and itself is a potential urine biomarker for bladder cancer detection [20]. CXC chemokine ligand 12 (CXCL12) is an important member of the CXC subfamily of chemokines. Nazari et al. suggested that elevated protein and mRNA levels of CXCL12 are found in human bladder cancer, which plays a role during the genesis of BCa and its further development [25]. Punt et al. found that high CD3E expression was correlated with improved disease-specific survival in squamous cervical cancer [26]. Previous study lymphocyte-specific protein tyrosine kinase (Lck) was one of the key molecules regulating T-cell functions, which emerged as a novel druggable target molecule for the treatment of cancers [27]. Fu et al. demonstrated that ZAP70 (a tyrosine kinase of the Syk family) may play an important role in the T-cell receptor (TCR) signaling pathway, which facilitated PCa cell migration and invasion [28]. To our delight, the function of CD3E, Lck, and ZAP70 gene have not been reported in bladder cancer, indicating that functional studies on these genes may help us to more accurately understand the prognosis-related biological behavior of bladder cancers. Furthermore, they also provide a new idea for further exploring the molecular mechanism and targeted therapy of bladder cancer.

Importantly, we validated these OS-related genes in an independent cohort of 165 bladder cancer patients from the GEO database (GSE13507). A total of 10 genes were identified to be correlated with overall survival. Among these identified genes, 5 genes (CALD1, HOXB3, HOXB6, MOGAT2, and TNC) have been reported to be associated with the development or prognosis of bladder cancer, indicating that the results in our study are consistent with previous publications in bladder cancer [2931]. The remaining 5 genes have not been found to be associated with the prognosis of bladder cancer (BTBD16, OLFML2B, PRRX1, SPINK4, and SPON2). Further research that focuses on these potential prognostic-related genes may find new biomarkers of bladder cancer. Meanwhile, these genes may lead to gene-mediated molecular targeting therapy.

5. Conclusions

In summary, high stromal is associated with females, age above 65, clinical T stage 3/4, clinical N status, and pathological high grade. We found 1125 differentially expressed genes associated with tumor microenvironment in bladder cancer. Ten genes were closely related to prognosis, and they may have the potential to become prognostic biomarkers for bladder cancer.

Abbreviations

BCa:Bladder cancer
TCGA:The Cancer Genome Atlas
GEO:Gene Expression Omnibus
DEGs:Differentially expressed genes
GO:Gene ontology
PPI:Protein–protein interaction
KEGG:Kyoto Encyclopedia of Genes and Genomes
STRING:Search Tool for the Retrieval of Interacting Genes/Proteins.

Data Availability

The datasets supporting the conclusions of this article are available in the Cancer Genome Atlas (TCGA) database and Gene Expression Omnibus (GEO) with the accession numbers GSE13507. All of those studies previously were approved by their respective institutional review boards.

Conflicts of Interest

The authors declare that they have no competing interests.

Authors’ Contributions

The first author of this manuscript is XZ and YT. YL designed this research. XZ and HR managed the data. XZ and YT performed the statistical analyses. XZ, YT, and YL reviewed the results, interpreted the data, and wrote the manuscript. All authors read and approved the final version of the paper. Xin Zhao and Yu Tang contributed equally to this work.

Supplementary Materials

Supplementary 1. Supplementary Figure 1: the association between immune scores and clinicopathological parameters. (a) The scatter plot indicated that higher immune scores were associated with female sex (vs. male sex, ). (b) The scatter plot indicated that higher immune scores were associated with (vs. , ). (c) The scatter plot indicated that higher immune scores were associated with T3/4 (vs. T1/2, ). (d) The distribution of immune scores stratified by N status. The scatter plot indicated that higher immune scores were associated with higher N status (). (e) The scatter plot indicated no significant associations between immune scores and M1 (vs. M0, ). (f) The scatter plot indicated that higher immune scores were associated with high grade (vs. low grade, )

Supplementary 2. Supplementary Table 1: survival analyses between patients’ overall survival and 14 gene expression levels in the TCGA dataset. Supplementary Table 2: survival analyses between patients’ overall survival and 14 gene expression levels in the GEO dataset.