Abstract

Background. Epithelial-mesenchymal transition (EMT) plays a vital role in tumor metastasis and drug resistance. It has been reported that EMT is regulated by several long noncoding RNAs (lncRNAs). We aimed to identify EMT-related lncRNAs and develop an EMT-related lncRNA prognostic signature in kidney renal clear cell carcinoma (KIRC). Materials and Methods. In total, 530 ccRCC patients with 611 transcriptome profiles were included in this study. We first identified differentially expressed EMT-related lncRNAs. Then, all the samples with transcriptional data and clinical survival information were randomly split into training/test sets at a ratio of 1 : 1. Accordingly, we further developed a twelve differentially expressed EMT-related lncRNA prognostic signature in the training set. Following this, risk analysis, survival analysis, subgroup analysis, and the construction of the ROC curves were applied to verify the efficacy of the signature in the training set, test set, and all patients. Besides, we further investigated the differential immune infiltration, immune checkpoint expression, and immune-related functions between high-risk patients. Finally, we explored the different drug responses to targeted therapy (sunitinib and sorafenib) and immunotherapy (anti-PD1 and anti-CTLA4). Results. A twelve differentially expressed EMT-related lncRNA prognostic signature performed superior in predicting the overall survival of KIRC patients. High-risk patients were observed with a significantly higher immune checkpoint expression and showed better responses to the targeted therapy and immunotherapy. Conclusions. Our study demonstrates that the twelve differentially expressed EMT-related lncRNA prognostic signature could act as an efficient prognostic indicator for KIRC, which also contributes to the decision-making of the further treatment.

1. Introduction

As a fundamental and reversible biological process in physiological and pathological conditions, epithelial-mesenchymal transition (EMT) has been identified in three different types [1]. Type I is majorly involved in growth and development, especially in embryogenesis and organ development [2]. Type II is majorly involved in wound healing [2], and type III is regarded as one of the essential processes in tumorigenesis, progression, tumor metastasis, and drug resistance [3, 4]. The changing phenotype from epithelial cells to mesenchymal cells obtains great migratory and even invasion ability. Researches to date have reported that EMT plays an essential role in invasion-metastasis cascade in tumor migration, contributing to the diffusion of cancer cells to the surrounding stromal environment and followed intravasation that also means hematogenous metastasis and finally resulted in the metastasis to the distant organ [5, 6].

Furthermore, the EMT mechanism is also involved in mediating drug resistance [79]. For example, though tyrosine kinase inhibitors (TKIs) have been extensively applied in targeted therapy in several tumor types, the most significant limitation lies in cancer resistance. Notably, it has been reported that EMT induction through YAP/FOXM1 axis contributes to EGFR inhibitor resistance [10]. Besides, nerve growth factor (NGF) can stimulate the EMT mechanism in triggering erlotinib resistance [11]. Hence, the significant role of EMT in cancer drives researchers to explore the EMT-related genes, especially EMT regulator genes [12, 13]. Meanwhile, Zhao et al. collected all EMT genes with experimental verification and set up an EMT gene database named dbEMT [14, 15]. However, it is essential to continue exploring the EMT-related genes, which may be helpful to discover novel therapeutical targets or relieve the current crisis of drug resistance.

Kidney cancer is the 6th most common cancer in both sexes and the most common urogenital tumor [16], accounting for approximately 2-3% of all malignancies and 90% of all diagnosed renal parenchymal malignancies [16, 17], claiming 14,830 lives with 73,750 new confirmed cases in the USA in 2020 [16]. Kidney renal clear cell carcinoma (KIRC) is the predominant pathological subtype of all kidney cancer, accounting for approximately 85% of renal cancer [18, 19], also considered one of the most invasive diseases associated with a high mortality rate with the form of metastasis [20]. As KIRC is not sensitive to radiation, hormone, or cytotoxic therapy, tyrosine kinase inhibitors (TKIs) such as sunitinib and sorafenib targeting vascular endothelial growth factor (VEGF) pathway play an essential role in the current clinical treatment as the first-line targeted therapy [21, 22]. In recent years, immune checkpoint inhibitors (ICIS) that block PD-1/PD-L1 or CTLA-4/T-cell suppressor inhibitory have also shown excellent performance in the therapy of KIRC [23], especially with the combination of VEGF-directed therapy [24]. Also, immunotherapy-combined therapy has replaced TKI’s first-line targeted therapy as a first-line treatment in the latest European Association of Urology (EAU) guidelines for clear cell metastatic renal cell carcinoma (cc-mRCC) [25]. The overall survival (OS) rate for localized KIRC reaches 95%, but those KIRC patients with metastasis and drug resistance had a poor OS [26]. Thus, it is essential to discover precise prognostic biomarkers associated with drug resistance in KIRC.

The long noncoding RNA (lncRNA), defined as RNA transcripts with longer than 200 nucleotides and little protein-coding ability [27], have been found with robust prognostic value in KIRC [28]. Interestingly, lncRNAs also play a significant role in the regulation of EMT [29, 30]. Combined with the above, EMT-derived cancer metastasis and drug resistance resulted in poor prognosis of patients, and KIRC is a typical tumor type that its metastasis is usually challenging to detect and its resistance to TKIs is a thorny problem clinical treatment. Both these resulted in a poor prognosis. Thus, lncRNAs may play a critical role as a regulator factor in EMT in KIRC. In our study, we first identified differentially expressed EMT lncRNAs in KIRC, then constructed and verified the EMT-related lncRNA signature, and further explored and found that the prognostic signature was significantly associated with immune infiltration, immune functions, immune checkpoints, and, more importantly, drug response to targeted therapy and immunotherapy.

2. Materials and Methods

2.1. Data Sources

The transcriptional expression profiles and corresponding clinical information contained survival time, and clinicopathological characteristics for the KIRC were retrieved from TCGA-GDC (https://portal.gdc.cancer.gov/). Notably, the transcriptional profiles were downloaded in the fragments per kilobase of per million (FPKM) format and then merged as an eligible matrix. The corresponding clinical data was downloaded in the bcr xml format and then merged as a clinical information matrix. Besides, the gene annotation files were acquired from the Ensembl database (http://asia.ensembl.org/index.html) to transfer the Ensembl ID in the matrix to the gene symbol and annotate genes for its transcripts type, such as mRNA and lncRNA. More importantly, EMT-related gene list was downloaded from the dbEMT 2.0 database (http://dbemt.bioinfo-minzhao.org/). All the downloaded data was proceeded by R program version 4.1.0 [31].

2.2. Differentially Expressed EMT-Related mRNAs and EMT-Related lncRNAs

According to the tissue source, we sorted the transcriptional data: normal adjacent tumor tissue as “normal” and tumor tissue as “tumor.” Then, we divided the transcriptional atlas of KIRC patients into two matrices: one for mRNA expression atlas and another for lncRNA expression atlas. Following this, we exported the expression atlas of EMT-related mRNAs according to the gene list downloaded from dbEMT 2.0. We performed the Pearson correlation test between expression of EMT-related mRNAs and lncRNAs, set the , and adjusted value < 0.001 as the filter to identify EMT-related lncRNAs. Subsequently, we applied the Wilcoxon test between normal tissue and tumor tissue to discover significantly differentially expressed EMT-related mRNAs and EMT-related lncRNAs. GO enrichment analysis and KEGG enrichment analysis were employed to check the potential functions and pathways influenced by the differentially expressed EMT-related mRNAs.

2.3. Random Grouping and Signature Construction

Having identified and screened differentially expressed EMT-related lncRNAs, we merged it with its corresponding clinical survival data. All the samples with transcriptional data and clinical survival information were randomly split into training/test sets at a ratio of 1 : 1. Following this, we performed univariate Cox regression in the training group to identify prognostic EMT-related lncRNAs with a significance of . Subsequently, we applied LASSO regression to avoid overfitting and screen appropriate variables. Finally, a survival-predicting signature was constructed by the multivariate Cox proportional hazards model. Notably, a risk score formula was accordingly created based on the signature: where is the number of differentially expressed EMT-related lncRNAs in the multivariate Cox proportional hazards model; is the expression value normalized by FPKM of the th lncRNA in the signature, and is the estimated regression coefficient of it. Besides, all samples in both the training set and test set obtained a risk score calculated by the predict function in the R program, and we set the medium value of the risk score in the training set as the cut-off value to stratify patients with KIRC that the higher risk score represented a high risk, and the lower risk score was grouped into low risk.

2.4. Signature Validation

We first described the distribution of the risk scores and the proportion of risk stratification in the training set, the test set, and all samples and visualized the survival status of each patient with different risk scores through risk analysis, and then, we performed the Kaplan-Meier method to plot the survival curves for the patients in the training set, test set, and all patients. The log-rank test was carried out to examine the survival-predicting availability of this signature at the same time. Following this, we sort the clinicopathologic information of each patient, considering age, gender, stage, grade, and risk score as the alternative prognostic factors, and then compared the predicting capability of these prognostic factors by the 5-year multivariate ROC curves; simultaneously, the area under the curve (AUC) for each prognostic factors including age, gender, stage, grade, and risk score was calculated and compared with each other in the training set, test set, and all patients.

2.5. EMT-Related mRNA-EMT-Related lncRNA Interaction Network

We were interested in the correlation between these EMT-related lncRNAs included in the signature and their correlated EMT-related mRNAs. Thus, we extracted the correlation between these two from the results of the Pearson correlation test and visualized it as an interaction network by using Cytoscape (version 3.8.0) [32].

2.6. Clinical Correlation and Subgroup Analysis

Further investigating the differences of these clinicopathologic characteristics between high- and low-risk groups might contribute to learning the potential correlation between the risk score and clinical characteristics. All patients were divided into two groups as a high- or low-risk group; and then, the test was applied to examine the difference of clinicopathologic status between these two groups. Moreover, all patients were divided into subgroups according to each clinicopathologic characteristic, including age (), gender (male and female), stage (stage I-II and stage III-IV), and grade (G1-2 and G3-4). Then, survival analysis was employed to test the efficacy of this signature in all different subgroups.

2.7. Immune Infiltration, Immune Functions, Immune Checkpoint Expression, and Drug Response

Having verified the efficacy of this signature, we were interested in the immune infiltration, immune-related functions, and immune checkpoint expression between high- and low-risk patients. Thus, we separately applied the Wilcoxon test to compare the immune infiltration and immune checkpoint expression status. Immune-related functions were investigated by single set gene set enrichment analysis (ssGSEA) [33]. Also, though high-risk patients showed a poor prognosis, we wondered whether high-risk patients had other alternative therapeutic choices, such as targeted therapy and immunotherapy. Thus, the drug response to sunitinib and sorafenib for each patient was predicted by R package “Prophetic,” and drug response to immunotherapy was predicted by the submap algorithm [34]. Finally, each drug response was compared between high- and low-risk patients.

3. Results

3.1. Basic Characteristics

The study flow was shown in the graphic abstract. A total of 611 transcriptome profiles containing 72 normal tissues and 539 tumor tissues from 530 KIRC patients were downloaded and sorted. For those samples sequenced multiple times, we took the average of them as their transcriptional data. Besides, all samples were randomly split into training and test set at a ratio as 1 : 1, and the characteristics of the samples in the training set, test set, and all samples are shown as Table 1; the test or Fisher’s exact test was applied to explore the heterogeneity between the training set and test set. It seemed that there was no significant difference between the training set and the test set.

3.2. Differentially Expressed EMT-Related mRNAs and EMT-Related lncRNAs

A total of 1184 EMT-related genes were downloaded from the dbEMT 2.0 database. Then, we extracted the transcriptional expression atlas of 1135 EMT-related mRNAs and identified 2380 EMT-related lncRNAs with and . Following this, 358 differentially expressed EMT-related mRNAs (Figure 1(a)) and 1339 differentially expressed EMT-related lncRNAs (Figure 1(b)) with and were screened by the Wilcoxon test for further analysis. GO enrichment analysis showed that the most significantly enriched function for these differentially expressed EMT-related mRNAs is the EMT (Figure 1(c)). KEGG analysis revealed that these EMT-related mRNAs play an important role in several tumor-related pathways (Figure 1(d)).

4. Signature Construction and Signature Verification

In the training set, we discovered 265 prognostic differentially expressed EMT-related lncRNAs through univariate cox regression with a value < 0.05. The univariate Cox regression of the lncRNAs in the final signature is shown in Figure 2(a). Then, the LASSO regression was carried out to avoid overfitting and screened 26 appropriate EMT-related lncRNAs to conduct further multivariate Cox proportional hazards model (Figures 2(b) and 2(c)). Finally, we developed a twelve-EMT-related lncRNA survival-predicting signature (Figure 2(d)), and their detailed information is shown in Table 2. The risk score calculating formula was established as mentioned in the Materials and Methods. Then, each patient acquired a risk score and corresponding risk stratification according to the medium value of risk score in the training set. The distribution status of the risk score in the training, test, and all patients is shown in Figures 3(a)3(c). The survival status of each patient with different risk levels is shown in Figures 3(d)3(f). Following this, the survival analysis was carried out as shown in Figures 3(g)3(i). The 1-year, 3-year, and 5-year ROC curves of the risk score in the training set, the test set, and all patients are shown in Figures 3(j)3(l). More importantly, the multivariate 5-year ROC curves in different sets were assembled to compare the efficacy of the risk score to other common-used clinicopathological characteristics, and the risk score showed the best performance with the most extensive area under curves (AUC) in the training set (Figure 3(m)), test set (Figure 3(n)), and all patients (Figure 3(o)).

5. Interaction Network and Clinical Correlation

Having constructed the prognostic signature, we established the EMT-related mRNA-EMT-related lncRNA interaction network according to the results of the initial Pearson correlation test. The network is shown in Figure 4(a). The blue nodes represented the twelve EMT-related lncRNAs, while the left red nodes represented EMT-related mRNAs coexpressed with these lncRNAs. The clinical correlation checked by the tests and the expression status of these twelve EMT-related lncRNAs were visualized in a heat map as shown in Figure 4(b). Notably, the risk level is associated with the distribution of gender, stage, grade, and survival status of these KIRC patients (Figure 4(b)).

5.1. Subgroup Analysis

Because the risk stratification by this signature showed a significant correlation with gender, grade, and stage (Figure 4(b)), we divided all patients into several subgroups by their clinical characteristics to explore the universality of this signature, including age (Figures 5(a) and 5(b)), gender (Figures 5(c) and 5(d)), stage (Figures 5(e) and 5(f)), grade (Figures 5(g) and 5(h)), T stage (Figures 5(i) and 5(j)), N stage (Figures 5(k) and 5(l)), and M stage (Figures 5(m) and 5(n)). Interestingly, the risk stratification by this signature was generally verified effective in all subgroups that high-risk patients keep a significantly worse overall survival, suggesting the great universality of this prognostic signature.

5.2. Immune Infiltration, Immune Checkpoint Expression, Immune-Related Functions, and Drug Response

Differential immune infiltration between high- and low-risk patients calculated by several different methods is shown in Figure 6(a), and the differential expression of immune checkpoint between high risk and low-risk patients is shown in Figure 6(b). Interestingly, almost all the immune checkpoints were a higher expression in high-risk patients, which might account for the poor OS for high-risk patients. The results of ssGSEA revealed differential activated immune-related functions between high- and low-risk patients (Figures 7(a)-7(d)). More importantly, the high-risk patients showed a more sensitive drug response to either sunitinib or sorafenib. Finally, considering the differential immune infiltration and immune-related functions, we explored and found high-risk patients more suitable for anti-PD1 immunotherapy and well responded to immunotherapy.

6. Discussion

It has been reported that the five-year cancer-specific survival (CSS) rate of ccRCC was 91%, 74%, 67%, and 32% separately for AJCC stages I, II, III, and IV, respectively [35]. The OS of differential individuals is heterogeneous, and it is challenging to predict the prognosis of patients with KIRC accurately using the AJCC stage, grade, and pathological TNM stage. With the rapid development of molecular sciences of ccRCC biology and high-throughput sequencing, numerous fresh biomarkers for predicting the prognosis or therapeutic targets have been designed and raised in recent years [28, 36]. lncRNAs, as a complement to genes or microRNAs, have been shown to regulate many cell functions, such as proliferation, apoptosis, invasion, and metastasis [37]. lncRNAs are involved in major oncogenic events in genitourinary malignancies, including invasiveness and recurrence in ccRCC [38]. Therefore, it is essential to establish a potential reliable lncRNA signature to predict the prognosis of ccRCC patients.

Epithelial-mesenchymal transition (EMT), first introduced by Newgreen et al. in 1979 [39], was discovered to play an essential role in tumor metastasis [3]. Notably, nearly 80% to 90% of the tumors come from epithelial cells, which are closely connected and cannot move, so it is challenging to metastasize. However, the activation of epithelial-mesenchymal transformation can provide tumor cells with the ability of migration, infiltration, and invasion [4]. Once they reach distant organs, these mesenchymal cells will return to the characteristics of epithelial cells through mesenchymal-epithelial transition (MET) and restore the ability of proliferation and form secondary foci in the distance [4]. Therefore, primary tumors need to conduct EMT, to obtain the ability to migrate before they can metastasize. Besides, the researchers also found that the invasion and metastasis of drug-resistant cancer cells were significantly enhanced, suggesting that EMT played an essential role in tumor drug resistance [4042]. Ashrafizadeh et al. summarized the vital role of epithelial-mesenchymal cells in paclitaxel and docetaxel resistance [1].

Furthermore, researchers focused on the cell plasticity with the stemness of tumor cells [8]. They believed that once driven by EMT, tumor cells obtain more heterogeneity and plasticity. Therefore, conventional therapy (including radiotherapy, chemotherapy, and targeted therapy) is difficult to eradicate tumor cells that enter the cancer stem cells (CSC) state by activating EMT. Thus, it is quite challenging to prevent drug resistance and recurrence or progression after treatment.

In the present study, we focused on the EMT-related lncRNAs in KIRC and identified and developed a twelve-EMT-related lncRNA prognostic signature in KIRC. Following this, we verified the prognostic signature in the test set by log-rank test, risk analysis, subgroup analysis, and time-dependent multivariate ROC curves. Interesting, in the subgroup analysis, this prognostic signature showed an outstanding performance in almost all clinicopathological subgroups, which indicate the universality of our prognostic signature. Besides, in the comparison for the prognostic signature with common-used clinicopathological characteristics, including age, gender, grade, and stage, this prognostic signature showed a superior AUC. All these verified the accuracy and sensitivity of this EMT-related lncRNA signature. More importantly, though high-risk patients stratified by this signature had a worse OS, we found that the high-risk patients showed a more sensitive drug response to both sunitinib and sorafenib, which might contribute to the clinical decisions.

Furthermore, the differential immune infiltration, immune checkpoint expression, and immune-related functions were engaging, consistent with the current opinion. Zhou et al. reported that the activation of EMT was associated with the upregulation of the immune checkpoint expression (especially PD-L1) [7], which mean EMT induced immune escape to promote cancer progression. Also, Nilsson et al. summarized the association between EMT, MET, and the tumor microenvironments that the changes of tumor plasticity resulted in differential immune infiltrations and differential immune response [10]. These well accounted for the better response to the immunotherapy in high-risk patients than low-risk patients.

Besides, we constructed an EMT-related lncRNA-EMT gene network, which may provide potential regulatory mechanisms of EMT. Further experimental examination is needed to explore the interaction network in the future. Notably, in these twelve EMT-related lncRNAs, AC084876.1 has been identified as glycolysis-related lncRNA with prognostic value in KIRC by Cao et al. [43]. IL10RB-DT has been reported associated with immune infiltration in KIRC [44]. LINC02154 was discovered significantly differentially expressed between normal and tumor tissues in laryngeal cancer [45, 46]. PSORS1C3 is a significant lncRNA in immune-mediated diseases that have been identified regulated by glucocorticoids (GC) and showed great prognostic values in bladder cancer [47, 48]. SUCLG2-AS1 has been reported with prognostic value in clear cell carcinoma, gastric cancer, and triple-negative breast cancer [4951]. AC068338.3 has been reported as a significant prognostic factor associated with immune infiltration in lung adenocarcinoma [52, 53]. However, our knowledge about these lncRNAs is far from enough, and further investigation is needed in the future.

There are some limitations in our study. Firstly, there lacks an independent validation set. Besides, we revealed a twelve differentially expressed-EMT-related-lncRNA prognostic signature for ccRCC with sufficient bioinformatical analysis and statistical-method based verification, which lacks relevant experimental verification. Further experimental verification is required to explore the molecular mechanism behind the association between this signature with immune infiltration, immune checkpoint expression, drug response to targeted therapy and immunotherapy.

7. Conclusions

In summary, we successfully developed a twelve differentially expressed EMT-related lncRNA prognostic signature, which could robustly predict the overall survival and prognosis of KIRC patients. The prognostic signature performed best compared to other frequently used prognostic clinicopathologic factors in predicting the overall survival. However, large-scale, multicenter prospective researches need to be carried out to confirm our results in the future.

Abbreviations

EMT:Epithelial-mesenchymal transition
KIRC:Kidney renal clear cell carcinoma
TKIs:Tyrosine kinase inhibitors
NGF:Nerve growth factor
VEGF:Vascular endothelial growth factor
ICIS:Immune checkpoint inhibitors
EAU:European Association of Urology
OS:Overall survival
CSS:Cancer-specific survival
MET:Mesenchymal-epithelial transition
CSC:Cancer stem cells
GC:Glucocorticoids
lncRNA:Long noncoding RNA.

Data Availability

The source data of this study has been uploaded to the Jianguo Yun in following website link: https://www.jianguoyun.com/p/DZwbosUQh5LQCRiR3f0D.

Disclosure

Yu-Chao Lu, Deng He, and Shao-Gang are the co-first authors.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

XQD, ZY, LLS, LYC, HD, and WSG designed, analyzed, and interpreted the data, drafted the manuscript, and critically revised the manuscript; XQD and LJL were responsible for the statistical analysis; XQD, LLS, LJL, XY, SJX, LCQ, XJZ, LYC, and WSG were responsible for the methodology; LYC, HD, and WSG were responsible for the project administration; XQD and HD wrote the original draft; all authors wrote, reviewed, and edited the manuscript. Qi-Dong Xia, Yuan Zhang, Li-Sha Li, Yu-Chao Lu, Deng He, and Shao-Gang Wang contributed equally to this work.

Acknowledgments

We all thank the R programming package developer. The present study was supported by the grants from the National Natural Science Foundation of China (NSFC, 81770704).