Abstract

Objective. We identified differentially expressed microRNAs (DEMs) between esophageal carcinoma (ESCA) tissues and normal esophageal tissues. We then constructed a novel three-miRNA signature to predict the prognosis of ESCA patients using bioinformatics analysis. Materials and Methods. We combined two microarray profiling datasets from the Gene Expression Omnibus (GEO) database and RNA-seq datasets from the Cancer Genome Atlas (TCGA) database to analyze DEMs in ESCA. The clinical data from 168 ESCA patients were selected from the TCGA database to assess the prognostic role of the DEMs. The TargetScan, miRDB, miRWalk, and DIANA websites were used to predict the miRNA target genes. Functional enrichment analysis was conducted using the Database for Annotation, Visualization, and Integrated Discovery (David), and protein-protein interaction (PPI) networks were obtained using the Search Tool for the Retrieval of Interacting Genes database (STRING). Results. With cut-off criteria of and |log2FC| > 1.0, 33 overlapping DEMs, including 27 upregulated and 6 downregulated miRNAs, were identified from GEO microarray datasets and TCGA RNA-seq count datasets. The Kaplan–Meier survival analysis indicated that a three-miRNA signature (miR-1301-3p, miR-431-5p, and miR-769-5p) was significantly associated with the overall survival of ESCA patients. The results of univariate and multivariate Cox regression analysis showed that the three-miRNA signature was a potential prognostic factor in ESCA. Furthermore, the gene functional enrichment analysis revealed that the target genes of the three miRNAs participate in various cancer-related pathways, including viral carcinogenesis, forkhead box O (FoxO), vascular endothelial growth factor (VEGF), human epidermal growth factor receptor 2 (ErbB2), and mammalian target of rapamycin (mTOR) signaling pathways. In the PPI network, three target genes (MAPK1, RB1, and CLTC) with a high degree of connectivity were selected as hub genes. Conclusions. Our results revealed that a three-miRNA signature (miR-1301-3p, miR-431-5p, and miR-769-5p) is a potential novel prognostic biomarker for ESCA.

1. Introduction

Esophageal carcinoma (ESCA), including esophageal adenocarcinoma and esophageal squamous cell carcinoma, is characterized by aggressive malignant tumor formation [1]. It is the eighth most common type of cancer and the sixth most frequent cause of cancer-related death globally [2]. There were an estimated 572,000 new ESCA diagnoses worldwide in 2018, which accounted for 3.2% of all malignancies [3]. The overall 5-year survival rate was only 15%–25% [4]. Thus, there is an urgent need to discover novel effective prognostic biomarkers for assisting in early detection and improved treatment of ESCA.

MicroRNA (miRNA) is one type of noncoding RNA, approximately 19–25 nucleotides in length, which regulates cellular proliferation, migration, and invasion by modifying gene expression [5]. MiRNAs can regulate target genes by binding to the 3′-untranslated region to repress translation or regulate degradation [6]. It has been reported that dysregulated miRNAs could be a potential biomarker for tumor diagnosis [7]. Therefore, significantly differentially expressed microRNAs (DEMs) in ESCA could be effective biomarkers for early diagnosis, therapeutic strategy selection, and prognosis of ESCA.

Previous studies have reported that a number of DEMs are associated with ESCA prognosis [810]. However, these studies had limitations including clinical heterogeneity, sample insufficiency, and differences in various data processing methods. A large sample size with detailed clinical features is important for reliable survival prediction in patients with ESCA. The Gene Expression Omnibus (GEO) database provides an invaluable resource for gene expression data and other functional genomics data [11]. The Cancer Genome Atlas (TCGA) project contains sequencing data on more than 11,000 miRNAs as well as clinical information from cancer patients [12].

In this study, we detected DEMs between ESCA tissues and normal tissues through the integration of two microarray profiling datasets from the GEO database and RNA-seq datasets from the TCGA database. The prognostic value of the identified DEMs was evaluated by using clinical features downloaded from the TCGA database, and a novel three-miRNA signature was built to determine the prognosis of ESCA patients. Functional enrichment analysis was performed and protein-protein interaction (PPI) networks were obtained to elucidate interactions and identify characteristics of the potential common target gene of the three miRNAs.

2. Materials and Methods

2.1. Data Mining of GEO and Identification of DEMs

The microarray profiling datasets of GSE43732 (including 119 ESC samples and 119 adjacent noncancerous samples) and GSE55856 (including 108 ESC samples and 108 adjacent noncancerous samples) with large sample sizes were obtained from the GEO database. In addition, we downloaded ESCA miRNA-Seq with associated clinical data from the TCGA database. A total of 181 samples consisting of 168 ESCA and 13 normal esophageal samples were analyzed to identify DEMs. After normalization and log2 transformation of the data from the original GEO datasets, the DEMs between ESCA tissue samples and normal esophageal tissue samples were analyzed by the limma package of R software [13]. The EdgeR package of R was used to explore DEMs for the RNA-seq count datasets from the TCGA database [14]. All miRNAs meeting the criteria of and |log2FoldChange (log2FC)| > 1.0 were considered as DEMs. Volcano plots and hierarchical clustering heatmaps (the top 30 DEMs) of the microarray profiling datasets from GEO and the RNA-seq count datasets from the TCGA database were drawn using R software.

2.2. DEMs and Overall Survival (OS) of Patients with ESCA

The RNA-seq data and corresponding clinical information for ESCA were downloaded from the TCGA database. The following inclusion criteria were used: (1) samples with necessary clinical data for analysis, (2) patients without preoperative radiotherapy and chemotherapy, and (3) OS time exceeding two weeks. A total of 181 samples were selected in this study, including 13 normal tissues and 168 ESCA tissues. The prognostic value of the DEMs was examined by the Kaplan–Meier method and the log-rank test. Subsequently, the statistically significant prognosis-related miRNAs were ranked by median expression level, and each ESCA patient was scored in accordance with a high or low level of expression, and a risk grade was defined by the total scores. Then, ESCA patients were sorted into a high-risk group or low-risk group according to the risk-score rank. The differences in patients’ survival between the two groups were evaluated by using the Kaplan–Meier survival method. Finally, a prognostic three-miRNA signature was established by integrating the expression profiles of three miRNAs and corresponding estimated regression coefficient. Furthermore, the expression levels of the three DEMs were analyzed with RNA sequencing expression data of normal control tissue samples and ESCA tissue samples in TCGA database.

2.3. Functional Enrichment Analysis

The common target genes of the three miRNAs were predicted using the miRDB (http://www.mirdb.org/), TargetScan (http://www.targetscan.org/), miRWalk (http://zmf.umm.uniheidelberg.de/apps/zmf/mirwalk2/), and DIANA (http://www.microrna.gr/microT-CDS) online bioinformatics tools. Venn diagrams were used to obtain overlapping target genes identified by these four bioinformatics tools. GO annotation (http://www.geneontology.org) and KEGG signaling pathway were set up using the Database for Annotation, Visualization, and Integrated Discovery (David) (https://david.ncifcrf.gov/) bioinformatics tool [15].

2.4. Hub Nodes in the PPI Network

The Search Tool for the Retrieval of Interacting Genes (STRING) database (http://www.string-db.org) and Cytoscape (version 3.4.0) were used to construct a PPI network of common target genes. The hub nodes in the PPI network were investigated topologically, and PPI pairs with a combined score of more than 0.4 were selected to construct the PPI network. The crucial genes were identified by eigenvector centrality (EGC), degree centrality (DC), and closeness centrality (CC), which were calculated using the Cytoscape plugin CytoNCA [16].

2.5. Statistical Analyses

All statistical analyses were performed using SPSS 20.0 (SPSS Inc., Chicago, IL, USA) and Prism 7.0 (GraphPad, Inc., La Jolla, CA, USA) software. The comparisons of DEM expression and clinical characteristics between the two groups were performed using the chi-square test and Student’s t-tests. Univariate and multivariate Cox analyses were performed to detect the relationship between prognostic features and DEMs. values < 0.05 were considered statistically significant.

3. Results

3.1. Identification of DEMs in ESCA

Using the cut-off criteria of and |log2FC| > 1.0, we obtained 188 DEMs (including 158 upregulated and 30 downregulated miRNAs) from GSE43732, 208 DEMs (including 175 upregulated and 33 downregulated miRNAs) from GSE55856, and 169 DEMs (including 117 upregulated and 52 downregulated miRNAs) from TCGA RNA-seq datasets. The downregulated miRNAs and upregulated miRNAs in microarray profiling datasets and the RNA-seq count dataset are displayed in the volcano plots in Figure 1(a). Moreover, a total of 33 DEMs were found to overlap in the GEO microarray datasets and TCGA RNA-seq count datasets, of which 27 were significantly upregulated and 6 were downregulated, as shown in a Venn diagram (Supplementary Table 1, Figure 1(b)). The hierarchical cluster heatmaps (the top 30 DEMs) of each GEO microarray dataset and TCGA dataset are shown in Figure 2.

3.2. Association between DEMs and OS of ESCA Patients

To explore the prognostic roles of each DEM in ESCA patients, data from 168 ESCA tumor samples and 13 normal tissue samples were downloaded from the TCGA database. Associated clinical characteristics included sex, age, pathologic stage, T stage, lymph node status, metastasis, and histological stage (Table 1). Subsequently, the Kaplan–Meier survival analysis was performed to identify the relationship between each DEM and OS of ESCA patients (Supplementary Figure 1). We found that three DEMs (miR-1301-3p, miR-431-5p, and miR-769-5p) had a negative association with OS (Figures 3(a)3(c)). Then, clinical and pathological analyses indicated that miR-431-5p was significantly associated with the pathologic stage (), miR-1301-3p was significantly associated with the T stage (), and miR-769-5p and miR-1301-3p were both significantly associated with histologic grade () (Table 2).

3.3. The Expression and Prognostic Value of the Three DMEs

Furthermore, TCGA data were used to validate the expression of the three DEMs. We found that miR-1301-3p, miR-431-5p, and miR-769-5p were upregulated in TCGA databases (Figure 3(e)). Based on the risk grade results, all 168 ESCA patients were classified into a high- or low-risk group. The Kaplan–Meier survival analysis indicated that lower expression of the three-miRNA signature was correlated with a higher survival rate of ESCA patients () (Figure 3(d)). In addition, we performed univariate and multivariate Cox regression analyses to examine the prognostic role of the three-miRNA signature (high-risk vs. low-risk) according to clinical features. The results of univariate analysis indicated that the pathological stage (hazard ratio , ), T stage (, ), lymph node status (, ), distant metastasis (, ), and expression of the three DEMs (, ) were significantly associated with the prognosis of ESCA patients. In a multivariate analysis, pathologic stage (, ) and the three-miRNA signature (, ) were shown to be potential independent factors in predicting the prognosis of ESCA patients (Table 3).

3.4. Functional Enrichment Analysis

The target genes of miR-1301-3p, miR-431-5p, and miR-769-5p were analyzed using four bioinformatics tools: miRDB, TargetScan, miRWalk, and DIANA. We then identified the overlapping target genes for each miRNA obtained from the four online tools using a Venn diagram (Figure 4). As a result, a total of 150 consensus target genes were identified. Moreover, the biological process analysis indicated that, among these genes, there was functional enrichment in chromosome modification, histone remodeling, cell cycle, signal transduction regulation, and transcription regulation. The KEGG pathway results indicated that the genes were significantly enriched in cancer-related pathways, including viral carcinogenesis, forkhead box O (FoxO), vascular endothelial growth factor (VEGF), human epidermal growth factor receptor 2 (ErbB2), and mammalian target of rapamycin (mTOR) signaling pathways (Figure 5).

3.5. PPI Network Construction

A PPI network among the target genes of the three miRNAs was established using the STRING database and visualized using Cytoscape (Figure 6). We used “degree more than 6” as the cut-off criterion to identify edge counts of every gene in the PPI network. Of the 150 target genes, 51 were selected to establish the PPI, containing 51 nodes and 84 edges. The three most important hub genes, MAPK1, RB1, and CLTC, were obtained using the Cytoscape plugin CytoNCA.

4. Discussion

Due to improvements in screening programs and advancements in endoscopy, the mortality of ESCA patients has decreased over the last several decades. Nevertheless, globally, there were still approximately 509,000 esophageal cancer-related deaths in 2018 [17]. Thus, it is critical to understand the molecular mechanisms of ESCA and explore reliable prognostic biomarkers. A large number of studies have reported that miRNAs could play a crucial role in cancer by functioning as oncogenes or tumor suppressor genes [18]. Moreover, it has been demonstrated that the miRNA profile of tissues exhibits characteristic changes in certain types of tumors [19, 20]. A number of miRNAs have been shown to participate in tumor progression, invasion, and metastasis, including miR-134 [21], miR-145 [22], miR-625 [23], and miR1294 [24]. However, due to clinical heterogeneity, sample insufficiency, and use of various data processing methods, there is inconsistency in the results of these studies.

In this study, we combined two microarray profiling datasets from the GEO database and RNA-seq datasets from the TCGA database to analyze common DEMs in ESCA. A total of 240 normal control tissue samples and 395 ESCA tissue samples were available for the DEM screen, and a total of 33 overlapping DEMs, of which 27 were significantly upregulated and 6 significantly downregulated, were identified. We then determined the prognostic value of each DEM by screening the clinical features downloaded from the TCGA database. Three DEMs, miR-1301-3p, miR-431-5p, and miR-769-5p, were shown to be associated with the OS of ESCA patients. Differential expression analyses of the three DEMs in TCGA validated that they were significantly upregulated in ESCA. It is worth noting that the functions of the three miRNAs identified in our study differ depending on cancer type, and they may act as oncogenic miRNAs or anticancer miRNAs. MiR-431-5p, formerly reported as a particular nervous system miRNA, has also been identified as a novel tumor-related miRNA [25]. It has been shown to be a prognostic biomarker for colorectal cancer (CRC) patients, as it plays an oncogenic role in CRC [26]. However, Rapa et al. have reported that miR-431-5p can inhibit lung carcinoma development, and miR-431-5p is also significantly downregulated in carcinoids that metastasize to the lymph nodes [27]. In addition, miR-769-5p has been involved in the oncogenesis and development of melanoma and can promote the proliferation of this malignancy by suppressing GSK3B [28]. However, Zhao et al. demonstrated that miR-769-5p is significantly downregulated in non-small-cell lung carcinoma [29]. As for miR-1301-3p, Song et al. have shown that it promotes prostate cancer by activating the Wnt pathway [30]. Conversely, Chao et al. reported that miR-1301-3p inhibits the migration, invasion, and angiogenesis of hepatocellular carcinoma by targeting BCL9 via Wnt/β-catenin signaling [31]. The function of these three miRNAs in ESCA remains poorly understood. Our findings demonstrate that they are upregulated and associated with the prognosis of ESCA.

In order to gain further insight into the molecular function of the three miRNAs, we analyzed their potential target genes, performed functional enrichment analysis, and established the PPI network. The results of GO annotation and KEGG enrichment analysis revealed that the three miRNAs are involved in cancer-related signaling pathways, including the viral carcinogenesis, FoxO, ErbB2, VEGF, and mTOR signaling pathways. It has been reported that FoxO transcription factors regulate tumorigenesis and drug resistance. Maxim et al. showed that the ErbB2/PI3K signaling pathway modifies drug sensitivity in patients with ESCA [32]. VEGF signaling has been demonstrated to play a critical role in the growth and metastasis of multiple cancers [33]. Furthermore, the PI3K/Akt/mTOR signaling pathway can promote cell apoptosis and decrease cell proliferation by reducing Plk1 and could be a potential therapeutic target [34]. In the PPI network, we found three high-degree proteins encoded by the target genes MAPK1, RB1, and CLTC, which play key roles in a variety of tumors. A previous study demonstrated that abnormal expression of MAPK1 contributes to the occurrence of ESCA [35]. It has been reported that RB1 expression was downregulated in esophageal small-cell carcinoma [36]. CLTC is associated with endocytosis and mitosis in cancer cells [37]. Further functional verification is needed to confirm these predictions, which could provide potential therapeutic targets in the treatment of ESCA.

Our study has several limitations. First, although the identified DEMs were associated with ESCA, no additional experiments were conducted to validate these findings. Second, although we added TCGA data validation to increase our sample size and confirm our findings, the majority of the ESCA samples were from esophageal squamous cell carcinoma. Therefore, further experimental exploration is needed to verify the ability of DEMs to determine the prognosis of ESCA.

5. Conclusions

We identified a novel three-miRNA signature (miR-1301-3p, miR-431-5p, and miR-769-5p) as a potential prognostic biomarker for ESCA patients. Further experimental studies with larger sample sizes are needed to validate our findings and elucidate the mechanisms of the three DEMs in esophageal cancer.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors thank Editage (https://www.editage.com/) for editing this manuscript. This research was supported in part by the Japan China Sasakawa Medical Fellowship (2017816) and the China Scholarship Council (201908050148).

Supplementary Materials

Supplementary Table 1: the names of the common differentially expressed miRNAs (DEMs) among the three datasets. Supplementary Figure 1: Kaplan–Meier survival curves of other 30 differentially expressed miRNA based on TCGA. (Supplementary Materials)