Abstract
Background. Soft tissue sarcoma (STS) is a highly malignant tumor of mesenchymal origin with a poor prognosis. Long noncoding RNAs (lncRNAs) are involved in various biological and pathological processes. They regulate gene expression at the posttranscriptional level by affecting the processing, splicing, transport, degradation, and translation of mRNA, and thereby contribute to the occurrence, invasion, and metastasis of tumors. They are therefore of interest for early diagnosis and as prognostic indicators. Objective. This study aimed to identify immune microenvironment-related lncRNAs that can be used to predict the prognosis of soft tissue sarcoma. Methods. Clinical and follow-up data were obtained from the cBioPortal database, and the RNA sequencing data used for model construction were obtained from The Cancer Genome Atlas (TCGA) database. LncRNAs were screened by differential expression analysis and coexpression analysis. Cox regression and Kaplan–Meier analysis were used to study the association between immune microenvironment-related lncRNAs and soft tissue sarcoma prognosis. Unsupervised cluster analysis was then performed to assess the impact of the screened lncRNAs on the disease. An mRNA-lncRNA network was constructed with Cytoscape. Finally, qRT-PCR was used to verify differences in lncRNA expression between normal cells and sarcoma cells. Results. Unsupervised cluster analysis revealed that the 210 screened lncRNAs were strongly correlated with the tumor immune microenvironment. Two signatures containing seven and five lncRNAs related to the tumor microenvironment were constructed and used to predict overall survival (OS) and disease-free survival (DFS), respectively.
The Kaplan–Meier (K-M) survival curves showed that prognosis differed significantly between the high-risk and low-risk groups, with the low-risk group having the better prognosis. Two nomograms with predictive capability were established. qRT-PCR showed that the expression of AC108134.3 and AL031717.1 differed significantly between normal and sarcoma cells. Conclusion. LncRNAs associated with the immune microenvironment were related to soft tissue sarcoma, which may provide new ideas for immunotherapy of soft tissue sarcoma.
1. Introduction
Soft tissue sarcoma is a heterogeneous malignant mesenchymal tumor [1]. It accounts for more than 20% of solid malignant tumors in children and less than 1% of solid malignant tumors in adults [2]. The incidence of the disease is relatively low, but it is highly malignant in most patients and is associated with a poor prognosis [3]. Therefore, prognostic indicators of the disease and early diagnosis are vitally important.
Previous studies have shown that the tumor immune microenvironment plays an important part in the occurrence and development of tumors [4–6]. The tumor microenvironment (TME) can affect the biological characteristics of tumor cells by regulating the expression of long noncoding RNAs (lncRNAs) [7], and lncRNAs can in turn regulate the TME [8–10]. For example, stimulation by interleukin 6 (IL-6) can cause liver cancer cells to spread, mainly by promoting lncTCF7 expression through the signal transducer and activator of transcription (STAT) signaling pathway [11]. In addition, the abnormal regulation of a variety of oncogenes and tumor suppressor genes can lead to tumorigenesis [12], and lncRNAs can participate in malignant transformation and tumorigenesis by regulating important oncogenes or tumor suppressor genes [13]. For example, lncRNA RUSC1-AS1 plays an important role in the occurrence of liver cancer, mainly by regulating the PI3K/AKT signaling pathway [14], and lncRNA KCNQ1OT1 can promote the growth of osteosarcoma by enhancing aerobic glycolysis [15]. Therefore, lncRNAs related to the tumor immune microenvironment are candidate prognostic indicators. Moreover, research into such markers can provide a theoretical basis for the development of new therapeutic targets and strategies and can guide first-line treatment [16].
In the present study, RNA sequencing data and clinical data were collected and curated, and the immune score of each sample was quantified with the ESTIMATE algorithm [17]. Differential expression analysis and coexpression analysis with immune-related mRNAs were used to identify lncRNAs associated with the TME. Finally, a series of bioinformatic methods were used to determine the prognostic value of the identified lncRNAs.
2. Materials and Methods
2.1. Data Collection and Pretreatment
Clinical data and follow-up data were downloaded from the cBioPortal database (http://www.cbioportal.org/) [18]. RNA sequencing data were obtained from the TCGA data portal (https://cancergenome.nih.gov/) [19]. Only primary tumor specimens were retained (259 cases). All data used in this study are publicly available.
2.2. Differences in Tumor Microenvironmental Immune Score and Prognosis
ESTIMATE, an algorithm that infers tumor purity, stromal score, and immune cell admixture from expression data, was run in R to compute the stromal score and immune score of each sample by single-sample gene set enrichment analysis (ssGSEA) [17, 20]. The scores were sorted, and X-tile software [21] was used to divide the samples into high-score and low-score groups at the median score. Prognostic differences (OS and DFS) between the two groups were then compared using K-M survival curves.
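The grouping step can be illustrated with a minimal Python sketch (the study itself used R and X-tile). The sample IDs and scores are invented for illustration, and the tie rule is an assumption: samples strictly above the median are treated as high-score, which for an odd number of samples places the median sample in the low-score group.

```python
from statistics import median

def split_by_median(scores):
    """Split samples into high- and low-score groups at the median.

    Assumption: scores strictly above the median are "high"; a score equal
    to the median falls into the "low" group.
    """
    cutoff = median(scores.values())
    high = sorted(s for s, v in scores.items() if v > cutoff)
    low = sorted(s for s, v in scores.items() if v <= cutoff)
    return cutoff, high, low

# Hypothetical immune scores (not taken from the study).
immune_scores = {"S1": 1520.4, "S2": -310.7, "S3": 842.0, "S4": 95.3, "S5": 2210.9}
cutoff, high_group, low_group = split_by_median(immune_scores)
print(cutoff, high_group, low_group)
```

Under this tie rule, an odd-sized cohort yields a low-score group that is one sample larger than the high-score group.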
2.3. Identification of Immune Microenvironment-Related lncRNAs in Soft Tissue Sarcomas
To understand the reasons for the differences between the high-score and low-score groups, we compared lncRNA expression between the two groups. After obtaining the lncRNA expression data, the “limma” package for the R programming language was used to perform differential expression analysis between the high-score and low-score groups [22]. LncRNAs that met the predefined differential expression thresholds were considered significantly differentially expressed between the two groups. Immune-related mRNA data were downloaded from the ImmPort database (https://www.immport.org/) [23], and immune-related lncRNAs were identified by Pearson correlation analysis using predefined thresholds on the correlation coefficient and P value. Finally, the results of the two methods were intersected to identify the immune microenvironment-related lncRNAs of soft tissue sarcomas.
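The coexpression and intersection steps can be sketched in Python as follows. This is an illustration only: the expression vectors, gene names, and the correlation cutoff of 0.5 are hypothetical, the P value filter is omitted for brevity, and the set of differentially expressed lncRNAs is assumed to come from a separate analysis.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpressed_lncrnas(lnc_expr, mrna_expr, r_cutoff):
    """Return lncRNAs whose |r| with at least one immune mRNA exceeds r_cutoff."""
    hits = set()
    for lnc, lnc_values in lnc_expr.items():
        if any(abs(pearson_r(lnc_values, m)) > r_cutoff for m in mrna_expr.values()):
            hits.add(lnc)
    return hits

# Hypothetical expression values across five samples.
lnc_expr = {
    "lncA": [1, 2, 3, 4, 5],
    "lncB": [2, 1, 4, 3, 5],
    "lncC": [5, 1, 4, 2, 3],
}
mrna_expr = {"geneX": [2, 4, 6, 8, 10]}

immune_related = coexpressed_lncrnas(lnc_expr, mrna_expr, r_cutoff=0.5)

# Hypothetical output of the differential expression step.
differential = {"lncB", "lncC", "lncD"}

# lncRNAs supported by both analyses.
immune_microenv_lncrnas = immune_related & differential
print(sorted(immune_microenv_lncrnas))
```

Taking the set intersection keeps only lncRNAs supported by both lines of evidence, which is why the final list is smaller than either input list.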
2.4. Unsupervised Cluster Analysis
To determine the correlation between the screened lncRNAs and immunity, the “ConsensusClusterPlus” software package was used to perform unsupervised cluster analysis [24]. After the samples were divided into subgroups, K-M survival curves were plotted, and the log-rank test was used to determine the differences in OS and DFS between the groups. The differences in the microenvironment scores between the groups were then compared.
2.5. Independent Prognostic Analysis and Clinical Correlation Analysis
First, we performed a single-factor Cox analysis (P value < 0.05) to identify the lncRNAs related to prognosis. LASSO regression analysis was then performed to avoid overfitting [25], followed by multifactor Cox analysis. The most appropriate differentially expressed lncRNAs related to OS or DFS and associated with the immune microenvironment were selected. The lncRNA-derived risk score for each patient with soft tissue sarcoma was then calculated using the following formula: $\text{risk score} = \sum_{i=1}^{n} \beta_i \times \text{Exp}_i$, where $\beta_i$ is the regression coefficient of the $i$th lncRNA and $\text{Exp}_i$ is its expression level.
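As a minimal sketch, a lncRNA-derived risk score of this kind is a weighted sum of expression values, with the Cox regression coefficients as weights. The lncRNA names, coefficients, and expression values below are hypothetical placeholders, not values from the study.

```python
def risk_score(coefficients, expression):
    """Weighted sum of expression values, one term per signature lncRNA."""
    return sum(beta * expression[name] for name, beta in coefficients.items())

# Hypothetical Cox coefficients for a two-lncRNA signature.
coefficients = {"lncRNA_1": 0.5, "lncRNA_2": -0.25}

# Hypothetical expression levels for three patients.
patients = {
    "P1": {"lncRNA_1": 2.0, "lncRNA_2": 4.0},
    "P2": {"lncRNA_1": 4.0, "lncRNA_2": 2.0},
    "P3": {"lncRNA_1": 1.0, "lncRNA_2": 8.0},
}

scores = {pid: risk_score(coefficients, expr) for pid, expr in patients.items()}
print(scores)
```

A positive coefficient raises the score as expression increases, and a negative coefficient lowers it, so the sign of each coefficient indicates whether the lncRNA behaves as a risk or a protective factor in the model.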
The patients were then divided into high-risk and low-risk groups. K-M survival analysis was used to compare the prognostic differences between the two groups. Receiver operating characteristic (ROC) curves at 3, 5, and 7 years were used to verify the predictive performance of the signatures [20, 26, 27], and the area under the curve (AUC) was used to quantify their discriminative ability. Finally, the risk scores were combined with the clinical data, and single- and multifactor Cox analyses were performed to determine whether the lncRNA-derived risk score was an independent prognostic predictor.
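The K-M comparison relies on the product-limit estimator, which can be sketched in plain Python as below. The follow-up times and event indicators are hypothetical, and the log-rank test used to compare groups is not implemented here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up (years) for one risk group.
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients contribute to the risk set until their last follow-up but never reduce the survival estimate themselves, which is what distinguishes this estimator from a simple proportion of survivors.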
2.6. Construction of the Nomogram
We developed nomograms to predict the OS and DFS of patients based on lncRNAs related to the tumor immune microenvironment. First, univariate Cox analysis was performed to filter prognostic variables, which were then included in the multivariate Cox analysis. Second, based on the independent prognostic variables, two nomograms were established to predict OS and DFS, respectively. Time-dependent ROC curves were used to evaluate the nomograms [26, 27], and calibration curves for 3-, 5-, and 7-year survival were used to compare the predicted and observed outcomes.
2.7. Construction of the mRNA-lncRNA Network
A regulatory mRNA-lncRNA network was constructed using Cytoscape (version 3.7.2), and the interaction between mRNA and lncRNA was analyzed using the Pearson test with predefined thresholds on the correlation coefficient and P value [28].
2.8. Cell Culture
Normal human dermal fibroblast cells (HDF-a) and human fibrosarcoma cells (HT1080) were purchased from the Cell Storage Center of otwo. All cells were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) containing 10% FBS and 1% streptomycin/penicillin in an incubator at 37°C with 5% CO2. The medium was changed once a day. RNA was extracted when the cells reached 80% confluence.
2.9. Quantitative Real-Time PCR (qRT-PCR)
Total cellular RNA was extracted with TRIzol (ThermoFisher Scientific, USA) and reverse transcribed into cDNA following the protocol of the PrimeScript reverse transcription kit (Takara, Japan). The PCR reaction system was prepared and analyzed according to the SYBR Premix Ex Taq (Takara, Japan) instructions. Relative expression was calculated as 2^(−ΔΔCt). Each experiment was repeated three times independently for each sample (Table 1).
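The relative expression calculation can be illustrated with a short Python sketch. The Ct values are hypothetical, and the reference (housekeeping) gene is an assumption, since the text does not name one.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return 2 ** (-delta_delta_ct)

# Hypothetical Ct values: tumor cells as sample, normal cells as control.
fold_change = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold_change)
```

A value below 1 means lower expression in the sample than in the control, and a value above 1 means higher expression.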
2.10. Statistical Analyses
All analyses were performed using R version 4.0.5. Unless otherwise noted, statistical significance was set at P < 0.05.
3. Results
3.1. Relationship between TME Immune Score and Patient Prognosis
Excluding nonprimary tumor specimens, we included 259 cases in the analysis. Of these, there were 129 cases in the high immune score group and 130 cases in the low immune score group. K-M survival analysis revealed that OS differed significantly whereas DFS did not (Figure 1).

3.2. Overview of lncRNAs Related to the Immune Microenvironment
To identify the lncRNAs that were differentially expressed between the high and low immune score groups, we first screened 1153 differentially expressed lncRNAs according to the conditions and methods described above (Figure 2(a)). The volcano plot shows that most of these lncRNAs were upregulated (red), suggesting that most of the immune-related lncRNAs promote the occurrence and development of tumors (Figure 2(b)). Coexpression analysis between the immune-related mRNAs and all the lncRNAs then identified 1326 immune-related lncRNAs. Finally, the intersection of the lncRNAs obtained by the two methods yielded 210 immune-related lncRNAs (Figure 2(c)).

3.3. LncRNA-Based Clusters Significantly Associated with Prognosis and Immune Microenvironment Scores
Based on the consensus matrix heat map, the 258 samples were clearly divided into three clusters (Figure 3(a)). The choice of three clusters was also supported by the relative change in the area under the cumulative distribution function (CDF) curve (Figures 3(b) and 3(c)). K-M survival analysis subsequently revealed that OS differed significantly among the clusters but DFS did not (Figures 3(d) and 3(e)). Finally, the scores of the three clusters were compared, and both the stromal and immune scores of the third cluster were significantly higher than those of the first two clusters (Figures 3(f) and 3(g)).

3.4. Construction of lncRNA Prognostic Model Related to the Immune Microenvironment
First, single-factor Cox regression analysis identified 32 and 12 lncRNAs correlated with OS (Figure 4(a)) and DFS (Figure 5(a)), respectively. LASSO regression analysis was used to reduce overfitting (Figures 4(b) and 4(c) and Figures 5(b) and 5(c)), and 20 and 12 lncRNAs were found to be most relevant to OS and DFS, respectively. Multifactor Cox regression analysis then identified 12 and 8 lncRNAs as independent prognostic factors for OS (Table 2) and DFS (Table 3), respectively. K-M survival analysis revealed significant differences between the low-risk and high-risk groups (Figures 4(d) and 5(d)). The AUC of the ROC curves exceeded 0.7 for both models, indicating acceptable predictive accuracy (Figures 4(e) and 5(e)). When analyzed together with the clinical indicators, the lncRNA-derived risk score remained an independent prognostic factor in both the OS and DFS models (Tables 4–7). Finally, two nomograms were established based on the independent prognostic predictors. The calibration curves of the 3-, 5-, and 7-year survival rates revealed good agreement between the predicted and actual results (Figure 6).

3.5. Regulatory Network of mRNA and lncRNA
The mRNA-lncRNA coexpression network helps to clarify the regulatory relationship between mRNAs and lncRNAs (Supplementary document 1). For the final OS signature, 7 lncRNAs and 97 mRNAs were included, and 102 associations were used to generate the network (Figure 7(a)). For the final DFS signature, 5 lncRNAs and 65 mRNAs were included, and 65 associations were used to generate another network (Figure 7(b)). Two lncRNAs (AC108134.3 and AL031717.1) were included in both the OS and DFS signatures. Each had five RNAs as targets.

3.6. Expression of AC108134.3 and AL031717.1 in Fibroblast and Fibrosarcoma Cells
qRT-PCR analysis of the two lncRNAs showed that the expression of AC108134.3 was significantly higher in normal fibroblasts than in fibrosarcoma cells (Figure 8(a)), whereas the expression of AL031717.1 showed the opposite pattern (Figure 8(b)).

4. Discussion
In recent years, many studies have examined the relationship between lncRNAs and tumors [29–31]. However, research on lncRNAs in soft tissue sarcomas remains insufficient. In this study, lncRNAs related to the immune microenvironment were screened by combining two criteria: differential expression between the immune score groups and coexpression with immune-related mRNAs. The prediction model based on the screened lncRNAs and clinicopathological data performed well.
In the present study, the samples were divided into high-score and low-score groups according to their immune microenvironment scores, and lncRNA expression was compared between the groups. The results are presented as a volcano plot. Most of the differentially expressed lncRNAs were upregulated, suggesting that most immune-related lncRNAs in the immune microenvironment promote the formation and development of tumors. We then downloaded immune-related mRNA data from the ImmPort database and performed coexpression analysis with all the lncRNAs. The two results described above were then intersected to obtain 210 immune-related lncRNAs. Subsequently, an unsupervised cluster analysis based on these lncRNAs divided the samples into three clusters. The results showed that OS differed significantly among the three clusters. The survival rates of the first and third clusters were higher than that of the second cluster. Furthermore, the stromal and immune scores of the third cluster were both the highest, indicating that these lncRNAs are closely related to immunity. There were also differences in the levels of immune cells between the third cluster and the other two clusters, indicating that lncRNAs related to the immune microenvironment may influence the prognosis of patients.
In-depth studies show that lncRNAs have roles in epigenetic modification and in transcriptional and posttranscriptional regulation. Different lncRNAs are related to the occurrence and development of tumors, and they are usually expressed abnormally in cancers. Some lncRNAs participate in tumor formation, whereas others inhibit the occurrence and development of tumors. Studies have shown that the expression of lncRNAs can be used as a biomarker for cancer diagnosis [32], may be related to the prognosis of tumors, and can serve as a potential prognostic biomarker [33]. Among the lncRNAs in the regulatory network constructed in the present study, SFTA1P has been reported to downregulate miR-4766-5p through the PI3K/AKT/mTOR signaling pathway to promote liver cancer growth [34]. Our research identified two previously unreported lncRNAs, AC108134.3 and AL031717.1, that were included in both signatures. In addition, we performed qRT-PCR analysis on these two lncRNAs and found that their expression differed significantly between normal cells and tumor cells, which supports our results. This study provides a theoretical basis for further study of these two lncRNAs as prognostic biomarkers. In addition to the lncRNAs, the corresponding target mRNAs are also involved in immune regulation. For example, previous studies have shown that SHC3 is functionally relevant to TRIP13-mediated tumor growth and metastasis [35, 36].
This study has several limitations. First, the data were downloaded from the public TCGA database, so a certain degree of selection bias cannot be ruled out, and the clinical data were not comprehensive. Second, our results were based mainly on bioinformatic analysis, and further basic experiments are needed to verify the differences and specific mechanisms of these lncRNAs.
5. Conclusion
The results showed that lncRNAs associated with the immune microenvironment were related to soft tissue sarcoma (STS), which may provide new ideas for immunotherapy of STS.
Abbreviations
TME: Tumor microenvironment
TCGA: The Cancer Genome Atlas
lncRNAs: Long noncoding RNAs
ROC: Receiver operating characteristic
OS: Overall survival
DFS: Disease-free survival
K-M: Kaplan–Meier
CDF: Cumulative distribution function
DMEM: Dulbecco’s Modified Eagle’s Medium
qRT-PCR: Quantitative real-time PCR
Data Availability
This study was carried out using publicly available data from the cBioPortal database at http://www.cbioportal.org/ and the TCGA data portal at https://cancergenome.nih.gov/. The analysis code is available from the authors upon request.
Consent
We consent to publish our data.
Disclosure
A preprint version of this manuscript is available on Research Square [37] at https://www.researchsquare.com/article/rs-752805/v1
Conflicts of Interest
The authors declare that they have no competing interests.
Authors’ Contributions
Wang-Ying Dai and Bin Wang have contributed equally to this work. Zong-Ping Luo, Wang-Ying Dai, Bin Wang, and Jian-Yi Li conceived of and designed the study. Wang-Ying Dai performed literature search. Wang-Ying Dai and Bin Wang generated the figures and tables. Bin Wang and Jian-Yi Li analyzed the data. Wang-Ying Dai wrote the manuscript, and Zong-Ping Luo and Bin Wang critically reviewed the manuscript. All authors have read and approved the manuscript.
Acknowledgments
We thank Jun-Cheng Zhu for participating in the literature search. This study was supported by the National Natural Science Foundation of China (32071307) and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Supplementary Materials
Supplementary 1. Supplementary document 1: raw data of mRNA-lncRNA coexpression network.
Supplementary 2. Supplementary document 2: raw data of lncRNA heat map.
Supplementary 3. Supplementary document 3-1: raw data of univariate Cox analysis of overall survival- (OS-) related variables.
Supplementary 4. Supplementary document 3-2: raw data of univariate Cox analysis of OS-related variables.