Identification of Key Genes and Pathways Involved in Circulating Tumor Cells in Colorectal Cancer
Background. Characterizing the features of circulating tumor cells (CTCs) is of major interest for predicting the clinical outcome of colorectal cancer (CRC) patients. However, the molecular features of CTCs remain largely unclear. Methods. To identify key genes and pathways, GSE31023, which contains CTCs from six metastatic CRC patients and three controls, was retrieved for differentially expressed gene (DEG) analysis. Protein-protein interaction networks of the DEGs were constructed. Hub genes from the network were analyzed for their prognostic value and for their association with tumor-infiltrating immune cells. Results. A total of 1353 DEGs were identified between the CTC and control groups, with 403 genes upregulated and 950 downregulated. Thirty-two pathways were significantly enriched in KEGG, with the ribosome pathway ranked first. The top 10 hub genes were eukaryotic translation elongation factor 2 (EEF2), ribosomal protein S2 (RPS2), ribosomal protein S5 (RPS5), ribosomal protein L3 (RPL3), ribosomal protein S3 (RPS3), ribosomal protein S14 (RPS14), ribosomal protein SA (RPSA), eukaryotic translation elongation factor 1 alpha 1 (EEF1A1), ribosomal protein S15a (RPS15A), and ribosomal protein L4 (RPL4). The correlation between CD4+ T cells and RPS14 () was the highest in colon cancer, while that between CD8+ T cells and RPS2 () was the highest in rectal cancer. Conclusion. This study identified a potential role of the ribosome pathway in CTCs, providing further therapeutic targets and biomarkers for CRC.
Colorectal cancer (CRC) is one of the major digestive malignancies in the world. During tumor progression, tumor cells disseminate hematogenously and initiate the metastatic cascade of CRC. Circulating tumor cells (CTCs) exist in the peripheral blood of patients with various solid tumors, including colorectal cancer, and may lead to tumor metastasis. With the development of liquid biopsy, CTCs have been shown to play an important role in detecting the early development of metastasis and monitoring the curative effect of adjuvant therapy. Therefore, molecular characterization of CTCs has been of major interest for predicting the clinical outcome of patients.
Due to the low concentration of CTCs in blood, their detection requires highly sensitive and specific methods, including separation (concentration) and identification (detection). At present, CTCs and peripheral hematopoietic cells are generally distinguished according to their biological characteristics (expression and activity of cell surface proteins) and physical characteristics (size, density, charge, and deformability). Compared with blood cells (diameter of 8 μm), tumor cells are larger and less deformable. Based on these characteristics, many membrane filtration devices have been developed for CTC enrichment, including the microelectromechanical system- (MEMS-) optic-based microfilter, isolation by size of epithelial tumor cells (ISET), CellSieve™, ScreenCell®, and CellOptics®. However, the morphological method of distinguishing tumor cells from blood cells lacks specificity, and some smaller CTCs may be lost. Thus, immunocytochemistry and nucleic acid technology, which are highly sensitive and specific, have been commonly used to identify CTCs by detecting differentially expressed surface biomarkers. Epithelial cell adhesion molecule (EpCAM), the most widely used antigen in CTCs, has been shown to be one of the key molecules associated with the Wnt signaling pathway and cellular adhesion [5, 6]. During the initiation of spread, tumor cells with altered profiles increase in the bloodstream, raising the risk of secondary tumor formation. At the origin of metastasis, EpCAM expression is lost in some cells owing to the epithelial-to-mesenchymal transition (EMT) and re-emerges with activation of the mesenchymal-to-epithelial transition (MET) once metastatic lesions have formed [7–9]. CTCs can undergo EMT and MET, giving rise to a wide spectrum of CTC phenotypes in the bloodstream. Thus, isolating CTCs based solely on the measurement of EpCAM expression remains challenging. More markers are needed for a higher yield of CTCs [10, 11].
Epidermal growth factor receptor (EGFR), a transmembrane receptor involved in multiple biological processes, has also been regarded as a specific marker of CTCs. Analysis of EGFR status in CTCs collected prior to treatment could help select an appropriate targeted therapy for patients. It has been reported that examining mutations at the CTC level in non-small-cell lung cancer (NSCLC) may help detect heterogeneous mutations in EGFR. However, the use of EGFR in CTCs remains limited because of the limited benefits of targeted therapy.
Collectively, a single biomarker cannot delineate the whole picture of CTCs, whose molecular features have yet to be fully characterized. Given the increasing clinical use and prognostic value of CTCs, this study employed GSE31023, which contains six CTC samples from metastatic CRC patients and three normal controls, to identify potential key genes and pathways associated with CTCs of CRC.
2. Materials and Methods
2.1.1. Gene Expression Profile GSE31023 for Analysis
GSE31023 is an array-based gene expression profile, and all corresponding data were downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). This profile contains CTCs from six metastatic CRC patients and three healthy donors as controls. The CTCs were isolated from 7.5 mL of peripheral blood by immunomagnetic separation using anti-EpCAM-coated magnetic beads (). Briefly, RNA in each sample was extracted and amplified using a whole transcriptome amplification system. GPL13497 (Agilent-026652 Whole Human Genome Microarray) was the platform for GSE31023.
2.1.2. Functional Annotation of Differentially Expressed Genes (DEGs)
The DEGs between the CTCs and normal cells were identified using the web tool GEO2R, with predefined cutoffs of P value < 0.05 and . Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed on the selected DEGs using the Database for Annotation, Visualization, and Integrated Discovery platform (DAVID, http://david.abcc.ncifcrf.gov/) [15–18]. The top 10 terms in each category, namely, biological process (BP), cellular component (CC), and molecular function (MF), were displayed if more than 10 terms were significant (P value < 0.05).
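The DEG selection step can be illustrated with a minimal Python sketch. It assumes a GEO2R-style result table reduced to a gene symbol, a log2 fold change, and a P value per gene. The fold-change cutoff is not recoverable from the text above, so `min_abs_log_fc` is a placeholder parameter, and the gene names and numbers are made up purely to exercise the filter.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    symbol: str
    log_fc: float   # log2 fold change, CTC vs. control
    p_value: float


def split_degs(rows, p_cutoff=0.05, min_abs_log_fc=1.0):
    """Return (upregulated, downregulated) symbols passing both cutoffs.

    p_cutoff follows the 0.05 threshold stated in the text;
    min_abs_log_fc is an assumed placeholder, not the study's value.
    """
    up, down = [], []
    for row in rows:
        if row.p_value >= p_cutoff or abs(row.log_fc) < min_abs_log_fc:
            continue  # not significant, or change too small
        (up if row.log_fc > 0 else down).append(row.symbol)
    return up, down


# Toy rows with invented values (hypothetical genes).
toy_rows = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.8, 0.010),
    GeneStat("GENE_C", 0.4, 0.020),   # fails the fold-change cutoff
    GeneStat("GENE_D", -2.5, 0.200),  # fails the P value cutoff
]
up_genes, down_genes = split_degs(toy_rows)
print("up:", up_genes, "down:", down_genes)
```

Keeping both thresholds as parameters makes it easy to check how sensitive the up/down counts are to the chosen cutoffs.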
2.1.3. Construction of Protein-Protein Interaction (PPI) Networks
PPI networks of the DEGs were constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, http://www.string-db.org/) and visualized with the Cytoscape software (version 3.6.0) [19, 20]. Moreover, the Molecular Complex Detection (MCODE) program was used to identify subclusters of the PPI network. The BiNGO program was used for GO presentation in the network analysis. Hub genes were defined as the ten genes with the highest degree in the PPI network.
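The degree-based hub definition can be sketched in a few lines of Python. This is only an illustration of ranking nodes by degree in an undirected edge list; it is not the Cytoscape workflow itself. The node names are hypothetical, and breaking ties alphabetically is an assumption made here for reproducibility.

```python
from collections import Counter


def node_degrees(edges):
    """Count each node's degree, ignoring self-loops and duplicate edges."""
    seen = set()
    degree = Counter()
    for a, b in edges:
        if a == b:
            continue  # self-loop: not an interaction partner
        key = frozenset((a, b))
        if key in seen:
            continue  # same undirected edge listed twice
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return degree


def top_hubs(edges, k=10):
    """Return the k highest-degree nodes as (node, degree) pairs.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    degree = node_degrees(edges)
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy undirected network with hypothetical node names.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("C", "A"),  # ("C", "A") duplicates ("A", "C")
    ("D", "E"),
]
print(top_hubs(toy_edges, k=2))
```

With an exported STRING edge list, `top_hubs(edges, k=10)` would give a degree ranking comparable in spirit to the hub definition above.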
2.1.4. Expression of Hub Genes in The Cancer Genome Atlas (TCGA)
The mRNA expression boxplots of the hub genes in TCGA (colon cancer, COAD, and rectal cancer, READ) were retrieved from the gene expression profiling interactive analysis platform (GEPIA, http://gepia.cancer-pku.cn).
2.2. Correlation of Tumor-Infiltrating Immune Cells (TIICs) and Hub Genes
Tumor Immune Estimation Resource (TIMER, https://cistrome.shinyapps.io/timer/) is a platform for analyzing the abundance of tumor-infiltrating immune cells (CD8+ T cells, CD4+ T cells, dendritic cells, macrophages, neutrophils, and B cells) in malignant tumors, built for online comparison based on TCGA data. The correlation between the hub genes and each of these immune cell types was therefore explored via TIMER. The correlation values were adjusted for tumor purity.
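One common way to adjust a correlation for a covariate such as tumor purity is a partial rank correlation. The Python sketch below shows that idea under stated assumptions: it computes a partial Spearman correlation between gene expression and immune cell abundance, controlling for purity. It is not presented as TIMER's exact procedure. The function names are introduced here for illustration and all values are invented.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented values for five hypothetical tumor samples.
gene_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
immune_abundance = [0.2, 0.1, 0.4, 0.3, 0.5]
tumor_purity = [0.10, 0.30, 0.20, 0.50, 0.40]
adjusted = partial_spearman(gene_expression, immune_abundance, tumor_purity)
print(f"purity-adjusted correlation: {adjusted:.3f}")
```

The formula is undefined when either variable is perfectly rank-correlated with purity, so a real analysis would need to guard against that case.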
2.3. Prognostic Values of Hub Gene Signature Defined Risk Groups
The prognostic values of the hub gene signature-defined risk groups for both overall survival (OS) and disease-free survival (DFS) were explored via the SurvExpress platform (http://bioinformatica.mty.itesm.mx:8080/Biomatec/SurvivaX.jsp). High- and low-risk groups were determined based on the risk score algorithm.
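A signature-based risk grouping can be illustrated with a short Python sketch. It assumes the risk score is a linear combination of expression values weighted by per-gene coefficients (for example, Cox regression coefficients) and that samples are split at the median score. Both are assumptions of this sketch, not confirmed details of the platform's algorithm. The coefficients and expression values are invented.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Weighted sum of expression values per sample.

    expression:   {sample: {gene: expression value}}
    coefficients: {gene: weight}, e.g. Cox betas (hypothetical here)
    """
    return {
        sample: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for sample, values in expression.items()
    }


def split_by_median(scores):
    """Label samples 'high' if above the median score, else 'low'."""
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Invented coefficients and expression values for two hub genes.
toy_coefficients = {"RPS2": 0.5, "RPL3": -0.25}
toy_expression = {
    "sample_1": {"RPS2": 2.0, "RPL3": 4.0},
    "sample_2": {"RPS2": 4.0, "RPL3": 2.0},
    "sample_3": {"RPS2": 1.0, "RPL3": 8.0},
    "sample_4": {"RPS2": 6.0, "RPL3": 4.0},
}
scores = risk_scores(toy_expression, toy_coefficients)
groups = split_by_median(scores)
print(groups)
```

The resulting group labels are what a survival comparison between high- and low-risk groups would then be run on.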
3.1. Identification and Functional Enrichment Analysis of DEGs
A total of 1353 DEGs were identified between the CTC and control groups, with 403 genes upregulated and 950 downregulated (Figures 1(a) and 1(b)). A total of 547 BP terms were significantly enriched. The three most enriched BP terms were SRP-dependent cotranslational protein targeting to membrane, cotranslational protein targeting to membrane, and protein targeting to ER. A total of 142 terms were significantly enriched in CC. The three most enriched CC terms were cytosolic ribosome, ribosomal subunit, and ribosome. A total of 100 terms were significantly enriched in MF. The three most enriched MF terms were structural constituent of ribosome, poly(A) RNA binding, and RNA binding (Figure 2(a)). Notably, a total of 32 pathways were significantly enriched in KEGG. The top three were ribosome (false discovery rate, ), systemic lupus erythematosus (FDR = 2.39E-04), and intestinal immune network for IgA production () (Figure 2(b)).
3.2. PPI Network Establishment of DEGs
Next, we explored the PPI network of all DEGs. A total of 496 nodes and 4283 edges were identified within the PPI network (Figure 3). The functional enrichment network was also displayed (Figures 4(a)–4(c)). The top 10 hub genes included eukaryotic translation elongation factor 2 (EEF2), ribosomal protein S2 (RPS2), ribosomal protein S5 (RPS5), ribosomal protein L3 (RPL3), ribosomal protein S3 (RPS3), ribosomal protein S14 (RPS14), ribosomal protein SA (RPSA), eukaryotic translation elongation factor 1 alpha 1 (EEF1A1), ribosomal protein S15a (RPS15A), and ribosomal protein L4 (RPL4) (Table 1). The top three scored modules were determined by MCODE and functionally enriched, which also highlighted the role of the ribosome (Table 2). Notably, all the hub genes were downregulated in CTCs.
3.3. Expression of Hub Genes
Among the tumor-versus-normal expression comparisons of the hub genes, only RPS2 was upregulated in tumor compared with normal tissue, in READ (Figure 5). The expression of RPS3, RPS5, RPS14, and RPSA was significantly stage-specific (Figure 6).
3.4. The Correlation between Hub Genes and TIICs
Furthermore, the correlation between hub genes and TIICs was analyzed via the TIMER platform. In colon cancer, the highest correlations were found between CD4+ T cells and RPS14 () and between CD4+ T cells and RPS15A (), as well as between dendritic cells and RPS3 (). In rectal cancer, the highest correlations were found between CD8+ T cells and RPS2 () and between macrophages and RPS2 () (Figure 7).
3.5. Prognostic Values of Hub Gene Signature
Given the increasing focus on the prognostic roles of gene signatures, this study further explored the prognostic value of the hub gene signature via the SurvExpress platform. In the OS analysis, a significant difference was found between the high-risk and low-risk groups (, 95% confidence interval: 1.38-2.87, and ) (Figure 8(a)). The expression comparison between the two groups was also illustrated (Figures 8(b) and 8(c)). In the DFS analysis, a significant difference was also found between the high-risk and low-risk groups (, 95% confidence interval: 1.2-2.46, and ), with expression comparison (Figure 9).
Commonly, standard patterns for the detection of CTCs in CRC are closely associated with genomic features. In fact, the intrinsic genomic features of metastatic lesions may not be identical to those of primary lesions. During metastatic progression, tumor cells show reduced adhesion markers, gradually detach from the primary lesion, and enter the circulation. However, not all CTCs successfully colonize distant organs. Only a small proportion of tumor cells survive immune eradication and undergo profile changes at the secondary lesion. Meanwhile, normal epithelial cells can also enter the circulation, guided by inflammation-triggered cytokines. Thus, molecular characterization of CTCs is needed. However, reculturing isolated CTCs remains technically difficult. Zhang et al. reported that a population of CTCs from 3 patients with breast cancer could be used to form adherent cell lines, with a limited survival period and proliferation. Guan et al. analyzed 7 GEO datasets (GSE99394, GSE31023, GSE82198, GSE65505, GSE67982, GSE76250, and GSE50746) and found that the changes in CTCs mainly involve epithelial-mesenchymal transition (EMT), cell adhesion, and apoptosis. Building on that study, we further identified the key genes and pathways mainly involved in CTCs in CRC and revealed more promising biomarkers for CRC prognosis and immunotherapy.
Notably, the ribosome pathway was highlighted in this study by the enrichment analysis of DEGs between CTCs and controls. Interestingly, most of the hub genes were closely associated with the ribosome pathway, and all were downregulated in CTCs compared with controls. Consistently, expression profiling of breast cancer also highlighted ribosome-related pathways and terms among genes downregulated in CTCs compared with controls. Reduced levels of immune signals and apoptotic pathways were also enriched in CTCs of breast cancer. Moreover, the mammalian target of rapamycin pathway, constitutively activated by the upstream AKT and PI3K pathways, was one of the key targets for persistent/recurrent epithelial ovarian cancer and is closely associated with ribosomal proteins and eukaryotic translation initiation factors. This study highlighted a potential role of the ribosome in CTCs of CRC, and the analysis of hub genes raises a new question regarding the therapeutic value of the ribosome in CTCs.
For the 10 hub genes, remarkable correlations with TIICs and prognostic value were recognized in this study. However, solid validation in another independent CTC cohort, rather than in conventional tissue-based genomic results, is still needed. Furthermore, only RPS2 was upregulated in tumor compared with normal tissue in rectal cancer in TCGA, which may be due to the different molecular expression characteristics of CTCs and solid tumor cells. Therefore, it is reasonable to further validate the results in an independent CTC cohort study.
Our study has the following strengths. First, we further identified the differentially expressed genes and pathways involved in CTCs in CRC. Second, several external datasets were used to verify that these hub genes may be related to the prognosis and immunotherapy of CRC patients. The study also has some limitations. First, the databases draw on data from studies that were conducted in different ways. Second, the direct relationship between these hub genes in CTCs and clinical characteristics has not been further verified.
This study identified a potential role of the ribosome pathway in CTCs, providing further therapeutic targets for CRC. Moreover, the association between hub genes and CTCs may provide new perspectives for the development of new markers.
CTCs: Circulating tumor cells
EMT: Epithelial-to-mesenchymal transition
ISET: Isolation by size of epithelial tumor cells
EpCAM: Epithelial cell adhesion molecule
EGFR: Epidermal growth factor receptor
GEO: Gene Expression Omnibus
KEGG: Kyoto Encyclopedia of Genes and Genomes
DAVID: Database for Annotation, Visualization, and Integrated Discovery platform
STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
MCODE: Molecular Complex Detection
TCGA: The Cancer Genome Atlas
GEPIA: Gene expression profiling interactive analysis platform
TIICs: Tumor-infiltrating immune cells
TIMER: Tumor Immune Estimation Resource
The datasets supporting the conclusions of this article are included within the article.
No consent was necessary.
Conflicts of Interest
All authors declare no conflict of interest in this study.
Ruijun Pan, Chaoran Yu, and Yanfei Shao contributed equally to this work.
This work was supported by the National Natural Science Foundation of China (Nos. 81371598, 81572973, 81402423, 81572818, and 81871984) and Shanghai Municipal Commission of Health and Family Planning (2017YQ062), as well as Shanghai Science and Technology Committee (18695841400).
[1] L. Cabel, C. Proudhon, H. Gortais et al., “Circulating tumor cells: clinical validity and utility,” International Journal of Clinical Oncology, vol. 22, pp. 421–430, 2017.
[2] M. Y. Huang, H. L. Tsai, J. J. Huang, and J. Y. Wang, “Clinical implications and future perspectives of circulating tumor cells and biomarkers in clinical outcomes of colorectal cancer,” Translational Oncology, vol. 9, pp. 340–347, 2016.
[3] D. Lin, L. Shen, M. Luo et al., “Circulating tumor cells: biology and clinical significance,” Signal Transduction and Targeted Therapy, vol. 6, no. 1, p. 404, 2021.
[4] A. Toss, Z. Mu, S. Fernandez, and M. Cristofanilli, “CTC enumeration and characterization: moving toward personalized medicine,” Annals of Translational Medicine, vol. 2, no. 11, p. 108, 2014.
[5] A. Seeber, G. Untergasser, G. Spizzo et al., “Predominant expression of truncated EpCAM is associated with a more aggressive phenotype and predicts poor overall survival in colorectal cancer,” International Journal of Cancer, vol. 139, pp. 657–663, 2016.
[6] Y. Meng, B. Q. Xu, Z. G. Fu et al., “Cytoplasmic EpCAM over-expression is associated with favorable clinical outcomes in pancreatic cancer patients with hepatitis B virus negative infection,” International Journal of Clinical and Experimental Medicine, vol. 8, pp. 22204–22216, 2015.
[7] D. Fong, P. Moser, A. Kasal et al., “Loss of membranous expression of the intracellular domain of EpCAM is a frequent event and predicts poor survival in patients with pancreatic cancer,” Histopathology, vol. 64, pp. 683–692, 2014.
[8] M. Bulfoni, M. Turetta, F. Del Ben, C. Di Loreto, A. P. Beltrami, and D. Cesselli, “Dissecting the heterogeneity of circulating tumor cells in metastatic breast cancer: going far beyond the needle in the haystack,” International Journal of Molecular Sciences, vol. 17, p. E1775, 2016.
[9] S. H. Lim, T. M. Becker, W. Chua, W. L. Ng, P. de Souza, and K. J. Spring, “Circulating tumour cells and the epithelial mesenchymal transition in colorectal cancer,” Journal of Clinical Pathology, vol. 67, pp. 848–853, 2014.
[10] M. T. Gabriel, L. R. Calleja, A. Chalopin, B. Ory, and D. Heymann, “Circulating tumor cells: a review of non-EpCAM-based approaches for cell enrichment and isolation,” Clinical Chemistry, vol. 62, pp. 571–581, 2016.
[11] P. K. Grover, A. G. Cummins, T. J. Price, I. C. Roberts-Thomson, and J. E. Hardingham, “Circulating tumour cells: the evolving concept and the inadequacy of their enrichment by EpCAM-based methodology for basic and clinical cancer research,” Annals of Oncology, vol. 25, pp. 1506–1516, 2014.
[12] Q. Zhang, J. Nong, J. Wang et al., “Isolation of circulating tumor cells and detection of EGFR mutations in patients with non-small-cell lung cancer,” Oncology Letters, vol. 17, no. 4, pp. 3799–3807, 2019.
[13] A. Toss, Z. Mu, S. Fernandez, and M. Cristofanilli, “Molecular characterization of circulating tumor cells in human metastatic colorectal cancer,” PLoS One, vol. 7, no. 7, article e40476, 2012.
[14] R. Edgar, M. Domrachev, and A. E. Lash, “Gene Expression Omnibus: NCBI gene expression and hybridization array data repository,” Nucleic Acids Research, vol. 30, no. 1, pp. 207–210, 2002.
[15] S. Davis and P. S. Meltzer, “GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor,” Bioinformatics, vol. 23, no. 14, pp. 1846-1847, 2007.
[16] D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources,” Nature Protocols, vol. 4, no. 1, pp. 44–57, 2009.
[17] M. Ashburner, C. A. Ball, J. A. Blake et al., “Gene Ontology: tool for the unification of biology,” Nature Genetics, vol. 25, no. 1, pp. 25–29, 2000.
[18] M. Kanehisa and S. Goto, “KEGG: Kyoto Encyclopedia of Genes and Genomes,” Nucleic Acids Research, vol. 28, no. 1, pp. 27–30, 2000.
[19] D. Szklarczyk, A. Franceschini, S. Wyder et al., “STRING v10: protein–protein interaction networks, integrated over the tree of life,” Nucleic Acids Research, vol. 43, no. D1, pp. D447–D452, 2014.
[20] P. Shannon, A. Markiel, O. Ozier et al., “Cytoscape: a software environment for integrated models of biomolecular interaction networks,” Genome Research, vol. 13, no. 11, pp. 2498–2504, 2003.
[21] G. D. Bader and C. W. Hogue, “An automated method for finding molecular complexes in large protein interaction networks,” BMC Bioinformatics, vol. 4, p. 2, 2003.
[22] S. Maere, K. Heymans, and M. Kuiper, “BiNGO: a Cytoscape plugin to assess overrepresentation of Gene Ontology categories in biological networks,” Bioinformatics, vol. 21, pp. 3448-3449, 2005.
[23] Z. Tang, C. Li, B. Kang, G. Gao, C. Li, and Z. Zhang, “GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses,” Nucleic Acids Research, vol. 45, no. W1, pp. W98–W102, 2017.
[24] T. Li, J. Fan, B. Wang et al., “TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells,” Cancer Research, vol. 77, no. 21, pp. e108–e110, 2017.
[25] R. Aguirre-Gamboa, H. Gomez-Rueda, E. Martínez-Ledesma et al., “SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis,” PLoS One, vol. 8, no. 9, article e74250, 2013.
[26] N. C. Bird, D. Mangnall, and A. W. Majeed, “Biology of colorectal liver metastases: a review,” Journal of Surgical Oncology, vol. 94, pp. 68–80, 2006.
[27] K. Pantel, E. Denève, D. Nocca et al., “Circulating epithelial cells in patients with benign colon diseases,” Clinical Chemistry, vol. 58, pp. 936–940, 2012.
[28] L. Zhang, L. D. Ridgway, M. D. Wetzel et al., “The identification and characterization of breast cancer CTCs competent for brain metastasis,” Science Translational Medicine, vol. 5, p. 180, 2013.
[29] Y. Guan, F. Xu, Y. Wang et al., “Identification of key genes and functions of circulating tumor cells in multiple cancers through bioinformatic analysis,” BMC Medical Genomics, vol. 13, no. 1, p. 140, 2020.
[30] J. E. Lang, J. H. Scott, D. M. Wolf et al., “Expression profiling of circulating tumor cells in metastatic breast cancer,” Breast Cancer Research and Treatment, vol. 149, no. 1, pp. 121–131, 2015.
[31] K. Behbakht, M. W. Sill, K. M. Darcy et al., “Phase II trial of the mTOR inhibitor, temsirolimus and evaluation of circulating tumor cells and tumor biomarkers in persistent and recurrent epithelial ovarian and primary peritoneal malignancies: a Gynecologic Oncology Group study,” Gynecologic Oncology, vol. 123, no. 1, pp. 19–26, 2011.