Background. Colorectal cancer (CRC) is one of the leading causes of cancer death worldwide. Successful treatment of CRC relies on accurate early diagnosis, which remains challenging because of the complexity of the disease and its patient-specific pathologies. Thus, novel molecular biomarkers are needed for early CRC detection. Methods. Gene and microRNA microarray profiles of CRC tissues and miRNA-seq data were analyzed. Candidate microRNA biomarkers were predicted using both a CRC-specific network and the miRNA-BD tool. Validation analyses were carried out to interrogate the identified candidate CRC biomarkers. Results. We identified miR-451a as a potential early CRC biomarker circulating in patients’ serum. Dysregulation of miR-451a was observed both in primary tumors and in patients’ sera. Downstream analysis supported the tumor suppressor role of miR-451a and its high sensitivity in CRC patients, further supporting its potential role as a circulating CRC biomarker. Conclusion. miR-451a is a potential circulating biomarker for early CRC diagnosis.

1. Introduction

Colorectal cancer (CRC) is the second leading cause of cancer death among men and women in the United States, and it is becoming one of the leading causes of cancer death worldwide [1]. Previous clinical studies showed that almost 90% of CRC patients who are diagnosed at an early stage have an extended survival of 5 to 10 years, whereas only 12% of patients survive when diagnosed at a late stage [2, 3]. This highlights the importance of early diagnosis of CRC. Several clinical approaches have been developed for early CRC detection, including fecal occult-blood testing (FOBT), computed tomography (CT or CAT) scan, colonoscopy, and molecular tumor markers [4]. In particular, numerous clinical markers, including carcinoembryonic antigen (CEA), cyclooxygenase-2 (COX-2), and thymidylate synthase (TS), have been identified in the last several decades [5, 6]. Biomarkers are becoming increasingly important for improving early diagnosis and prognosis of CRC. However, developing more accurate, fast, and specific diagnostic and prognostic biomarkers remains a challenge, given that CRC is a complex disease with inherently personalized pathologies. Thus, novel molecular biomarkers for CRC diagnosis and prognosis are still in high demand.

Noncoding RNAs, which include microRNAs (miRNAs), circular RNAs (circRNAs), and long noncoding RNAs (lncRNAs), are being investigated for their potential application in disease diagnosis. miRNAs, the most studied noncoding RNAs, are a set of small endogenous noncoding RNAs that act as upstream regulators of many biomolecules and pathways. Because they are easy to obtain from body fluids and are relatively stable under extreme physiological conditions, including extreme changes in pH and temperature [7], they are excellent candidates as circulating biomarkers for complex diseases [8, 9]. Emerging evidence has revealed that dysregulation of circulating miRNAs might correspond to tumorigenesis and tumor development [10, 11]. For example, miR-181a was found to promote angiogenesis in CRC tissues by regulating SRCIN1, thereby promoting the SRC/VEGF signaling pathway [12]. Jin Y et al. observed that miR-32 plays a key role in cell proliferation and migration, as well as in suppressing apoptosis in colon cancer [13]. These studies underline the essential roles of microRNAs in cancer and the need to identify novel CRC biomarkers.

In this study, we performed an integrated bioinformatics analysis based on multiple microarray and miRNA-seq datasets to discover novel circulating miRNA biomarkers for CRC diagnosis (Figure 1). The downstream regulation and function of the candidate biomarkers were also explored. Our findings could provide new research strategies for CRC biomarker discovery and new insights for CRC diagnosis.

2. Materials and Methods

2.1. Microarray Data Collection and Processing

CRC-relevant microarray data were identified by searching the GEO DataSets with the following query: “colon[Title] AND (cancer [Title] OR carcinoma [Title] OR tumor [Title]) AND “Homo sapiens”[porgn: txid9606].” Furthermore, filters were set to “Expression profiling by array” and “Non-coding RNA profiling by array.” Three publicly available microarray datasets on CRC versus normal colon tissues and serum were downloaded: GSE41258 contains gene expression array data, and GSE112264 and GSE113486 contain microRNA expression array data. All data were downloaded in raw format. Detailed information on the three datasets is shown in Table 1.

For GSE41258, probe sequences were mapped using miRBase to obtain unified names [14]. For genes with multiple probe IDs, the expression level was replaced by the average of their probe values. Probes with blank gene names or with multiple gene names were removed.
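The probe-collapsing step described above can be sketched as follows. This is a minimal illustration only, not the authors' actual pipeline: the probe IDs, the expression values, and the use of `///` as a multi-gene separator are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

def collapse_probes(probe_values, probe_to_gene):
    """Average the expression of probes that map to the same single gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe, "").strip()
        # Skip probes with blank names or ambiguous multi-gene annotations
        # (the "///" separator is an assumption of this sketch).
        if not gene or "///" in gene:
            continue
        grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}

# Toy values; the probe IDs are hypothetical.
probe_values = {"p1": 8.0, "p2": 10.0, "p3": 5.0, "p4": 7.0, "p5": 6.0}
probe_to_gene = {"p1": "CA1", "p2": "CA1", "p3": "", "p4": "KLF4 /// MYC", "p5": "KLF4"}
collapsed = collapse_probes(probe_values, probe_to_gene)
```

Here the two CA1 probes are averaged, while the blank and multi-gene probes are dropped before averaging.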

2.2. Differentially Expressed Genes and microRNA Extraction

In this study, the Limma package was used to extract differentially expressed genes (DE-genes) from GSE41258 [19]. All data were first background-corrected using the “normexp” method, with an offset value of 0. The background-subtracted data were then normalized using the “quantile” algorithm. All data were then processed by calculating the average for each miRNA for further statistical analysis. The Student t-test was applied to calculate the significance of differences (P value). Fold changes were calculated from the expression level of each gene, comparing the cancer and control groups. Final DE-genes were determined using two cutoff criteria: P value < 0.05 and a fold-change threshold.
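To make the idea behind quantile normalization concrete, a simplified pure-Python sketch is given below. It is not the Limma implementation, which also handles ties and missing values; this version assumes equal-length sample columns and breaks ties by position.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (ties broken by position)."""
    n = len(columns[0])
    # For each sample, the indices of its values ordered from smallest to largest.
    orders = [sorted(range(n), key=lambda i: col[i]) for col in columns]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(col[order[r]] for col, order in zip(columns, orders)) / len(columns)
        for r in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        # Replace each value by the reference value at its rank.
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Two toy samples measured on three features.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(samples)
```

After normalization every sample shares the same set of values, so differences between arrays reflect rank changes rather than array-wide intensity shifts.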

To obtain the differentially expressed microRNAs (DE-miRNAs), the two miRNA datasets were processed using the Limma package with the same screening criteria as in the DE-gene analysis. We then intersected the two DE-miRNA sets, and the miRNAs that were differentially expressed in both datasets were selected for further validation.
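The screening and overlap steps can be illustrated with a short sketch. The statistics below are made up, and the log2 fold-change cutoff of 1.0 is a placeholder, since the fold-change threshold is not given in this text.

```python
def select_de(stats, p_cutoff=0.05, lfc_cutoff=1.0):
    """Return names whose P value and absolute log2 fold change pass both cutoffs.

    lfc_cutoff is a hypothetical placeholder, not the study's value.
    """
    return {
        name
        for name, (p_value, log_fc) in stats.items()
        if p_value < p_cutoff and abs(log_fc) >= lfc_cutoff
    }

# Made-up statistics: name -> (P value, log2 fold change).
dataset_a = {"miR-A": (0.001, 2.1), "miR-B": (0.20, 3.0), "miR-C": (0.01, -1.5)}
dataset_b = {"miR-A": (0.03, 1.2), "miR-B": (0.001, 2.5), "miR-C": (0.04, -0.4)}

# Keep only miRNAs that pass the screen in both datasets.
shared = select_de(dataset_a) & select_de(dataset_b)
```

Only `miR-A` passes both cutoffs in both toy datasets, so it is the only member of `shared`.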

2.3. TCGA miRNA-Seq Data Analysis

The microRNA sequencing (miRNA-seq) data were acquired from The Cancer Genome Atlas (TCGA) and filtered using the following strategy: primary site: colon; project: TCGA-COAD; disease type: adenomas and adenocarcinomas; experimental strategy: miRNA-seq; and data type: miRNA expression quantification. A total of 388 miRNA-seq files from 371 cases were included in our study. One recurrent tumor sample and one metastatic sample were removed. Samples with a treatment history, or without detailed treatment information, were also excluded based on their annotation files. Hence, 329 tumor miRNA-seq datasets and 6 healthy datasets were finally obtained to further investigate DE-miRNAs. edgeR was employed for expression analysis [20]. The DE-miRNA cutoff criteria were P value < 0.05 and a fold-change threshold.

2.4. Biomarker Prediction

To identify potential candidate microRNA biomarkers, we used a public microRNA biomarker discovery tool named miRNA-BD, with the DE-genes acquired in Section 2.2 as input [21]. The human miRNA-mRNA interaction network was left at its default setting. Two built-in feature parameters, namely the number of single-line regulation (NSR) and transcription factor percentage (TFP), were used for biomarker identification. The thresholds of NSR and TFP were both set at 2, and the P value cutoff for these two parameters was set at 0.05. To improve our ability to discover more convincing CRC biomarkers, we also defined a CRC-specific parameter, the CRC biomarker percentage (CBP). Relevant studies showed that microRNAs with more target genes that are disease-associated or disease biomarkers are more likely to serve as disease-specific biomarkers [8]. The CBP is calculated as follows:

$$\mathrm{CBP} = \frac{N_{\mathrm{CRC}}}{N_{\mathrm{total}}}$$

Here, $N_{\mathrm{CRC}}$ stands for the number of target CRC-associated genes or biomarkers and $N_{\mathrm{total}}$ is the total number of targets of the microRNA. As we had already acquired DE-miRNAs from the miRNA microarray and miRNA-seq analyses, a three-way overlap was made to identify more robust miRNA biomarkers.
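As a worked illustration of the CBP definition, the ratio of CRC-associated targets to all targets can be computed as in the sketch below. The gene lists are toy values, `GENE_X` and `GENE_Y` are hypothetical placeholders, and the function is not part of miRNA-BD.

```python
def crc_biomarker_percentage(targets, crc_genes):
    """Fraction of a miRNA's targets that are CRC-associated genes or biomarkers."""
    targets = set(targets)
    if not targets:
        # A miRNA without targets has no meaningful ratio; return 0 by convention.
        return 0.0
    return len(targets & set(crc_genes)) / len(targets)

# Toy CRC-associated gene set and a toy target list.
crc_genes = {"MAPK1", "MYC", "AKT1"}
cbp = crc_biomarker_percentage(["MAPK1", "MYC", "GENE_X", "GENE_Y"], crc_genes)
```

With two of four targets in the CRC-associated set, the toy miRNA has a CBP of 0.5.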

2.5. Identification of Target Genes of miR-451a

The target genes of miR-451a were derived from two online platforms: miRTarBase and miRWalk 2.0 [22, 23]. First, we searched for hsa-miR-451a in miRTarBase, which provides information about experimentally validated miRNA-mRNA interactions. The set of target genes was then expanded using miRWalk 2.0, where we examined both experimentally validated and computationally predicted miRNA-mRNA interactions. Four algorithms, namely miRWalk, miRanda [24], RNA22 [25], and TargetScan [26], were used for prediction. Each algorithm that predicted a gene to interact with miR-451a contributed a score of 1 out of 4 to that gene. A threshold of 2/4 was used to screen the predicted target genes.
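The 2-out-of-4 voting rule can be sketched in a few lines. The assignment of genes to algorithms below is invented purely for illustration, and `GENE_X` and `GENE_Y` are hypothetical placeholders.

```python
from collections import Counter

def vote_targets(predictions, min_votes=2):
    """Keep genes predicted by at least `min_votes` algorithms."""
    # Each algorithm contributes at most one vote per gene.
    votes = Counter(gene for genes in predictions.values() for gene in set(genes))
    return {gene for gene, count in votes.items() if count >= min_votes}

# Made-up predictions per algorithm.
predictions = {
    "miRWalk": ["MAPK1", "ROR2", "GENE_X"],
    "miRanda": ["MAPK1", "GENE_Y"],
    "RNA22": ["ROR2"],
    "TargetScan": ["MAPK1"],
}
targets = vote_targets(predictions)
```

Genes supported by a single algorithm are discarded, which trades some sensitivity for fewer false-positive predictions.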

2.6. Identification of Target Long Noncoding RNAs of miR-451a

DIANA-LncBase v2 was used for miRNA-lncRNA interaction validation and prediction [27]. Experimentally validated lncRNAs regulated by miR-451a were selected. A score threshold of 0.7 was applied to select lncRNAs predicted to be regulated by miR-451a in colon tissue. To identify CRC-associated lncRNAs, the Lnc2Cancer 2.0 database was mined [28], and a final overlap was made between the CRC-associated lncRNAs and the target lncRNAs of miR-451a.

2.7. Functional Enrichment Analysis

To further investigate the role of miR-451a, two enrichment analyses, namely gene ontology and pathway analysis, were performed to validate the association between the target genes of miR-451a and CRC. The Search Tool for the Retrieval of Interacting Genes (STRING) was used for gene ontology annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment [29, 30]. Terms and pathways with P value < 0.05 were considered significantly enriched. The top 20 most significantly enriched pathways and the top 10 most enriched terms were selected, and the association between these items and CRC was further validated through literature mining.
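Over-representation of a term among target genes is commonly assessed with a one-sided hypergeometric test. The sketch below illustrates that general idea with toy numbers; it is an assumption of this example that such a test applies here, since the exact statistics used by STRING are not described in this article.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K are annotated to the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Toy numbers: 3 of 5 target genes fall in a term covering 10 of 100 background genes.
p = hypergeom_pvalue(k=3, n=5, K=10, N=100)
```

A term would then be called enriched when its P value, usually after multiple-testing correction, falls below the chosen cutoff.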

2.8. Protein-Protein Interaction Network Analysis

The STRING online tool was applied to examine the protein-protein interaction (PPI) patterns of the target genes of miR-451a. All target genes, including experimentally validated and computationally predicted genes, were submitted to STRING for analysis. A combined score of >0.4 was set as the cutoff criterion. The PPI network data generated by STRING were then loaded into Cytoscape for network analysis. CytoHubba, a Cytoscape plug-in, was used to calculate network degree, betweenness, and closeness [31]. The top 10 gene nodes were selected as significant hub genes. Functional modules were predicted using Molecular Complex Detection (MCODE).
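The degree-based hub selection can be illustrated with a small sketch. It only mimics the degree ranking, not CytoHubba itself, and the edge list and combined scores are made up for the example.

```python
from collections import defaultdict

def hub_nodes(edges, min_score=0.4, top_n=10):
    """Rank nodes by degree after keeping edges with combined score > min_score."""
    neighbors = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Sort by descending degree, then by name so that ties are deterministic.
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:top_n]]

# Made-up interactions: (protein A, protein B, combined score).
edges = [
    ("AKT1", "MYC", 0.9), ("AKT1", "MAPK1", 0.8), ("AKT1", "IL6", 0.7),
    ("MYC", "MAPK1", 0.6), ("IL6", "CCND1", 0.3),
]
hubs = hub_nodes(edges, top_n=2)
```

The IL6-CCND1 edge is dropped because its score does not exceed 0.4, so CCND1 never enters the toy network.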

3. Results

3.1. Identification of Differentially Expressed Genes

To obtain the differentially expressed genes in CRC, we performed a microarray analysis on GSE41258 obtained from GEO DataSets. Data were first normalized using the Limma package (Figure 2(a)), and the eBayes algorithm, which is integrated into the Limma package, was applied for DE-gene detection. A total of 707 differentially expressed genes were detected using two thresholds: P value < 0.05 and a fold-change threshold. Among these genes, 447 were downregulated in primary tissues, while 260 were upregulated. The expression pattern of the 707 genes and the top 10 most significant DE-genes are shown in Figures 2(b)–2(d).

Furthermore, literature validation was conducted for the top 10 most significant DE-genes. Most of the relevant studies revealed potential mechanisms through which the DE-genes are involved in colon tissue mutation and tumorigenesis. In particular, carbonic anhydrase 1 (CA1) has been implicated as a marker for colon epithelium differentiation [32]. Ghaleb et al. observed that deletion of Klf4 leads to downregulation of CA1, which is highly expressed in colorectal cancer cells [33, 34]. In our analysis, we found a significant upregulation of Klf4 (P value = 4.45E-42). Our data support the relationship between Klf4 and CA1, indicating that the two are most likely coexpressed in colorectal cancer cells. Other genes, such as ADH1B [35], GUCA2A [36], SCNN1B, and CHP2 [37, 38], were also reported to play a role in CRC cell differentiation and tumorigenesis. Taken together, our analysis identified DE-genes that are involved in different stages of CRC. These results reflect the demands of cancer cells for rapid proliferation, tissue invasion, and metastasis. Hierarchical clustering of DE-genes clearly separated primary tumor tissues from healthy colon tissues, suggesting that features for CRC diagnosis can be selected from these genes.

3.2. Differentially Expressed miRNA Detection and Biomarker Selection

To build a CRC-specific miRNA-mRNA interaction network, we obtained miRNA expression microarray data and miRNA-seq data from GEO DataSets and TCGA, respectively. Using the Limma analysis, 730 DE-miRNAs (456 downregulated and 331 upregulated) were detected from GSE112264 and 1073 DE-miRNAs (605 downregulated and 274 upregulated) were detected from GSE113486. In the miRNA-seq analysis, 40 DE-miRNAs were obtained, of which 11 were downregulated and 29 were upregulated. The detailed expression information is shown in Figure 3.

Meanwhile, using the DE-genes obtained from the Limma DE-gene analysis, a CRC-specific miRNA-mRNA regulatory network was constructed. We then performed miRNA biomarker prediction with miRNA-BD, including the CBP index for more precise prediction. In total, 41 candidate miRNA biomarkers were produced. Among them, 30 miRNAs (73%) were reported to be involved in CRC genesis and metastasis. For example, Chai et al. found that miR-223-3p was upregulated in colon cancer [39] and that its enhanced expression could suppress cell apoptosis. Kim et al. found that the expression of miR-590-5p is significantly higher than that in their matched primary CRC [40]. In particular, according to the Colorectal Biomarker Database [5], 8 microRNAs (miR-371a-5p, miR-218-5p, miR-21-5p, miR-22-3p, miR-96-5p, miR-150-5p, and miR-200c-3p) have been reported as useful biomarkers for CRC diagnosis, treatment, and prognosis. These results not only suggest potential miRNA biomarkers for further investigation but also support the accuracy and robustness of the miRNA-BD model, which makes our results more convincing.

After overlapping the DE-miRNAs with the candidate miRNA biomarkers from miRNA-BD, only one miRNA, miR-451a, was identified as the candidate biomarker. Interestingly, miR-451a is downregulated in CRC tissues (P value = 0.0288) but upregulated in human sera (P values = 1.78E-10 and 4.10E-13), indicating changes in the gene expression levels regulated by miR-451a between primary tumor tissue and serum. Relevant studies confirmed the expression pattern of miR-451a in primary tumor tissue. Mamoori et al. noticed that overexpression of miR-451a in colon cancer cells has a negative influence on cell proliferation and may increase cell apoptosis [41]. They found that miR-451a results in decreased expression of Oct-4, Snail, and Sox-2 in CRC tissues. Oct-4 and Sox-2 are stem cell markers that are also involved in CRC development [42, 43], and Snail is a marker of epithelial-mesenchymal transition (EMT). Meanwhile, Li et al. demonstrated that miR-451a may increase the expression of FoxO3, leading to the downregulation of Ywhaz protein and further inhibition of CRC growth [44]. These findings revealed a tumor suppressor role of miR-451a in CRC primary tumor tissue. In summary, although the regulatory mechanism of miR-451a in human sera is still unclear, miR-451a has a potential role as a circulating biomarker for CRC.

3.3. Downstream Target Validation

To investigate the functional role of miR-451a in CRC, we examined its downstream targets, including mRNAs and lncRNAs. For mRNA target identification, we applied miRWalk 2.0 for both experimentally validated and computationally predicted targets, as well as miRTarBase for experimentally validated targets. A total of 24 and 31 validated targets were found from miRWalk and miRTarBase, respectively. In total, 31 experimentally validated targets were obtained after combining these two results. The regulatory network of miR-451a and these mRNA targets is shown in Figure 4(a). Among these targets, 20 (65%) were reported to be involved in CRC tumorigenesis, development, and metastasis. For example, ROR2, a well-known gene involved in both canonical and noncanonical signaling pathways such as the Wnt signaling pathway, was reported to be associated with CRC [45]. The ROR2 protein is a transmembrane receptor for noncanonical Wnt pathway activation. A recent study revealed that the noncanonical Wnt target genes are dependent on ROR2 [46]. Lara et al. also demonstrated that ROR2 is repressed by aberrant promoter hypermethylation in CRC tissues [47]. Another well-studied gene, MAPK1, which is regulated by miR-451a, was recently reported to be upregulated with the inhibition of miR-145 in CRC tissues [48]. Upregulation of MAPK1 is associated with the promotion of cancer cell proliferation and differentiation [49]. These validated data further support the close relationship between miR-451a and CRC.

miRWalk was also used to predict potential mRNA targets regulated by miR-451a. Using the threshold of 2 described in the Methods, a total of 1132 candidate mRNAs were obtained. Most of the predicted mRNAs were previously reported to be involved in CRC, including DISC1, EREG, PPARA, and SYNJ2 [50–53].

The regulatory network of miR-451a and its lncRNA targets was built based on the data acquired from DIANA-LncBase. We obtained 38 lncRNAs that were validated by immunoprecipitation assay. In addition, 3 lncRNAs were detected through the prediction module of this tool. The miRNA-lncRNA regulatory network of miR-451a is presented in Figure 4(b). We also retrieved CRC-associated lncRNAs from the Lnc2Cancer database, from which 208 lncRNAs were selected. By overlapping the CRC-associated lncRNAs and the lncRNA targets, 5 targets were finally obtained: SLC25A25-AS1, SNHG15, LOC283070, MALAT1, and NEAT1. Among these lncRNAs, MALAT1 has been suggested to play an important role in oxymatrine resistance in CRC and has the potential to be a therapeutic target and prognostic biomarker for CRC patients [54, 55]. NEAT1 was also reported as a promising circulating and prognostic biomarker for CRC [56, 57]. In addition, Li et al. found that decreased expression of SLC25A25-AS1 promotes cell proliferation and chemoresistance in CRC [58]. Other evidence showed the oncogenic potential of SNHG15 and LOC283070 [59, 60].

Taken together, our analysis of the downstream targets of miR-451a suggests that it plays multiple roles in different stages of CRC, strengthening the rationale for using it as a biomarker for CRC diagnosis.

3.4. KEGG Pathway and Gene Ontology Enrichments

To investigate the mechanisms of miR-451a, we performed functional enrichment analysis using two databases: the KEGG pathway database and the Gene Ontology (GO) database. The enrichment analysis was conducted using STRING. The top 20 significantly enriched pathways and the top 10 significantly enriched ontology items at each level were selected, as shown in Figure 5. The enriched GO items in the biological process (BP) category included positive regulation of cellular process, developmental process, and organ development, suggesting that genes regulated by miR-451a may positively regulate cell development. This result is consistent with the suppressor role of miR-451a in CRC tissues. The regulatory activity of miR-451a may take place in the cytosol, cytoplasm, and nucleus, as supported by the GO items in the cellular component (CC) category. The GO items in the molecular function (MF) category, such as protein kinase activity, phosphotransferase activity, and kinase binding, indicated that genes regulated by miR-451a are strongly associated with protein activation; several studies support these results [61–63]. In the KEGG pathway analysis, we observed that most of the top 20 pathways (about 70%) are related to CRC occurrence and development, including the cAMP signaling pathway [64], the FoxO signaling pathway [44], the MAPK signaling pathway, and signaling pathways regulating pluripotency of stem cells [49, 65]. Notably, the colorectal cancer pathway ranked fourth in this enrichment, which increased our confidence in the analysis. Our enrichment study revealed the regulatory roles of miR-451a and the mechanisms in which it may be involved. Thus, these data help explain why miR-451a could serve as a promising biomarker for CRC diagnosis.

3.5. PPI Network Construction and Detection of Hub Nodes

Target mRNAs from both the experimentally validated group and the miRWalk-predicted group (1163 mRNAs) were loaded into STRING for PPI investigation. A total of 438 mRNAs were selected through this process for network construction. The results retrieved from STRING were processed, and a PPI network for the target genes of miR-451a was visualized using Cytoscape, as shown in Figure 6(a). The degree of a node is the number of its connections; the higher the degree of a node, the more indispensable it is for the stability of the network. Thus, the degrees of these mRNAs were calculated and the top 10 hub nodes were screened: AKT1, MYC, IL6, MAPK1, CCND1, RPL6, RPS8, RPS4X, RPL13A, and RPL8. The degree distribution of the nodes in the network and the degrees of the top 10 hub nodes are presented in Figures 6(b) and 6(c). Interestingly, these top 10 hub mRNAs could be roughly divided into four functional groups: AKT1 and MAPK1 are protein kinase coding genes; MYC and CCND1 are genes involved in the cell cycle; RPL6, RPS8, RPS4X, and RPL13A are ribosomal protein coding genes; and IL6 is involved in T cell activation and tumor immune microenvironment modifications. In our gene expression analysis, we found that the expression of MYC and CCND1 was upregulated, with logFC of 1.3602 and 1.2455, respectively, and that the expression of ribosomal proteins, such as RPL6, RPS8, and RPS4X, was slightly higher in CRC patients. However, the expression of IL6 showed no significant change. These results further indicate that miR-451a plays notable roles in CRC growth, inflammation, and differentiation.

Submodules were also detected using MCODE. The most significant module is presented in Figure 6(d). Most of the genes in this module are ribosomal protein coding genes. Emerging evidence has shown the potential involvement of these proteins in CRC carcinogenesis and their potential as drug targets [66–68]. These studies may give new insight into the function of miR-451a through its links with ribosomal proteins.

3.6. Diagnosis Potential of miR-451a

Finally, we generated ROC curves and survival curves to test the discriminatory performance of miR-451a. These two graphs are shown in Figure 7. For the ROC curves, a total of 4 datasets were used for validation. For GSE112264 and GSE113486, which we used for biomarker discovery, the values of the area under the curve (AUC) are 91.24% (CI: 0.857–0.968) and 89.45% (CI: 0.838–0.951), respectively. In addition, we downloaded 2 new datasets from GEO DataSets, GSE113740 and GSE124158, to further confirm the reliability of miR-451a as a biomarker. Their AUC values are 79.60% (CI: 0.614–0.978) and 92.2% (CI: 0.886–0.958), respectively. These consistent results indicate the favorable performance of miR-451a in distinguishing CRC patient sera from normal sera. For the survival curves, we obtained patient information from TCGA, and data from a total of 300 patients were used. Patients were divided equally into a low miR-451a expression group and a high miR-451a expression group, and Kaplan–Meier analysis was used for survival analysis. The results suggest that patients with high miR-451a expression have a significantly higher survival rate than patients with low miR-451a expression (P value = 0.0486). These data support the tumor suppressor role of miR-451a in CRC patients, which is consistent with several previous findings on the role of miR-451a [69, 70]. Together, these results support the ability of miR-451a to discriminate between CRC sera and normal sera, adding to previous evidence that points to miR-451a as a diagnostic biomarker for CRC.
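For readers less familiar with the two measures, the sketch below shows how an AUC and a Kaplan–Meier curve can be computed from scratch. It is an illustration under simplifying assumptions, not the software used in this study, and all expression values, follow-up times, and event flags are made up.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a case scores higher than a control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve

# Made-up serum expression values for cases and controls.
crc_sera = [7.1, 6.4, 8.0, 5.9]
normal_sera = [5.0, 6.4, 4.8, 5.5]
auc = roc_auc(crc_sera, normal_sera)

# Made-up follow-up times; event flag 1 = death, 0 = censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, the survival curves of the high and low expression groups would be estimated separately and compared with a log-rank test, which this sketch does not implement.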

4. Discussion

Accumulating evidence has revealed various roles of miRNAs in the mechanisms of cancer. In clinical diagnosis, many studies have demonstrated their strong performance in cancer detection and treatment owing to their favorable biomarker characteristics [11]. However, previous studies mainly focused on dysregulated miRNAs in primary tumors. Such studies have greatly expanded our understanding of the mechanisms of cancer, but they provide limited knowledge for the clinical diagnosis of cancer, especially early diagnosis [71]. Hence, improvements in current strategies for tumor screening are urgently required. In recent years, an increasing number of studies have suggested that components of tumors are shed into the blood circulation and can be detected in body fluids. These findings improve many aspects of tumor screening and management and give researchers new insights into methods for early detection of cancer.

In this study, we applied an integrative bioinformatics analysis strategy to publicly available microarray and miRNA-seq data from GEO DataSets and the TCGA data portal to identify novel circulating miRNA biomarkers for CRC diagnosis. We considered miRNAs as candidate circulating biomarkers if they met at least two essential criteria: first, they are dysregulated in CRC sera compared with normal sera; second, they are sufficiently powerful to indicate the status of health and disease. To address these two criteria, we first selected DE-genes. Then, a specific tool named miRNA-BD was used to generate candidate miRNA biomarkers from the DE-genes. This tool is based on the Pipeline of Outlier MicroRNA Analysis (POMA) algorithm, which has been validated in many other complex diseases, such as pediatric acute myeloid leukemia and autism [8, 72]. At the same time, we retrieved serum miRNA expression microarray data and primary tumor miRNA-seq data to check the DE-miRNAs in both CRC sera and primary tumors. By overlapping these results, miR-451a was finally selected as the candidate CRC biomarker. An interesting difference was noticed in the expression analysis: miR-451a is upregulated in CRC sera but downregulated in primary tumors, suggesting that the relevant pathway signaling and gene expression change between primary tumors and human sera.

To further investigate the functional roles of miR-451a, we performed various downstream analyses, including target identification, functional enrichment, and PPI network analysis. In target identification, we noticed that 20 out of 31 validated gene targets were reported to be involved in CRC occurrence and development, most of which are associated with cancer cell proliferation and differentiation [45, 48, 63]. Meanwhile, most of the gene targets predicted through miRWalk are also relevant to CRC. We rechecked the expression pattern in our microarray analysis and found upregulation of these genes, consistent with a tumor suppressor role of miR-451a in CRC primary tumors. We also examined the lncRNA targets of miR-451a, obtaining 41 lncRNA targets after combining the experimentally validated and computationally predicted targets. Here, we observed that 5 of the targets were previously shown to contribute to CRC drug resistance and prognosis. The results of the pathway and GO enrichment analyses provided additional evidence for the functions of these targets. These target identification results illustrate the general functional pattern of miR-451a.

We also performed PPI network analysis to reveal the correlations among the target genes of miR-451a. Through PPI network construction, a series of hub genes were detected. The most significant hub genes are mainly related to protein kinases, ribosomal proteins, cell cycle regulation, and tumor immune microenvironment modifications. We also identified submodules of this network, and the module with the highest MCODE score showed a potential role of miR-451a in regulating ribosomal protein coding genes, although the expression of these target genes in our microarray analysis is not significantly discriminative.

Finally, we assessed the robustness of miR-451a as a biomarker through ROC curve analysis and survival analysis. The ROC curves showed high sensitivity of miR-451a, with AUC values of 91.24% and 89.45% in the two serum microarray datasets. The survival analysis also showed a significant difference between the high miR-451a expression group and the low miR-451a expression group.

Taken together, these results illustrate the functional pattern of miR-451a at a systematic level and identify it as a potential circulating biomarker for CRC. However, the dynamics and complexity of CRC require further confirmation of miR-451a through clinical trials, specifically to elucidate the change in expression levels of miR-451a from the primary tumor to serum. Detailed research into the mechanism of this change should also be conducted in the future.

5. Conclusions

In conclusion, we identified the significance of miR-451a in CRC. Using integrative data mining and bioinformatics analysis, we showed why miR-451a is a promising circulating biomarker for early CRC diagnosis. Studies with large biological and clinical datasets, or with detailed biological experiments, should be carried out to confirm the critical role of miR-451a in CRC.

Data Availability

The figure and table data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

The authors would like to express their gratitude to EditSprings (https://www.editsprings.com/) for the expert linguistic services provided. This work was supported by the Health Commission of Henan Province (Grant Number: LHGJ20190665).