Background and Aim. With regard to patients with intermediate-stage, irresectable hepatocellular carcinoma (HCC), transcatheter arterial chemoembolization (TACE) is the mainstay of treatment. There is an urgent clinical requirement to identify reliable biomarkers to predict the response of HCC patients to TACE treatment. We aimed to identify a gene signature for predicting TACE response in HCC patients based on bioinformatics analysis. Methods. We downloaded the gene expression profile GSE104580 based on 147 tumor samples from 81 responders to TACE and 66 nonresponders from the Gene Expression Omnibus (GEO) database. Then, we randomly divided the 147 tumor samples into a training set () and a validation set () and screened differentially expressed genes (DEGs) in the training set. Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to annotate functions of the DEGs. The DEGs were mapped into the STRING website for constructing protein-protein interaction (PPI). The predictive value of the candidate genes by receiver-operating characteristic (ROC) curves was further verified in the validation set. Results. We totally found 158 DEGs (92 upregulated genes and 66 downregulated genes) in the training set. The GO enrichment analysis revealed that DEGs were significantly enriched in metabolic and catabolic processes, such as drug metabolic process, fatty acid metabolic process, and small molecule catabolic process. The KEGG pathway analysis revealed that the DEGs were mainly concentrated in drug metabolism-cytochrome P450, metabolism of xenobiotics by cytochrome P450, and chemical carcinogenesis. We identified 6 candidate genes (CXCL8, AFP, CYP1A1, MMP9, CYP3A4, and SERPINC1) based on the PPI network of the DEGs, which had high predictive value in HCC response to TACE with an area under the curve (AUC) value of 0.875 and 0.897 for the training set and validation set, respectively. Conclusion. We identified a six-gene signature which might be biomarkers for predicting HCC response to TACE by a comprehensive bioinformatics analysis. However, the actual functions of these genes required verification.

1. Introduction

Hepatocellular carcinoma (HCC) represents a malignant tumor predominantly arising in the setting of cirrhosis and will be responsible for an estimated one million death on a global scale in 2030 [1]. As per the Global Cancer Statistics in 2018, the mortality rate of HCC is only second to lung cancer in males by gender stratification [2]. Most of HCC cases at their initial diagnosis have reached the advanced stage, which results in an unsatisfactory overall survival [3]. Transcatheter arterial chemoembolization (TACE) induces ischemia-hypoxia and local chemotherapy-induced cytotoxicity which destroys cancerous cells, which is the first-line treatment for unresectable or intermediate stage HCC [4]. However, as a palliative therapy, TACE also has several limitations. Firstly, the response rate to TACE may vary widely among HCC patients. Although TACE did offer a survival advantage over supportive treatment alone in some patients, there were still up to 60% of HCC patients who did not benefit from it in spite of multiple treatments. Secondly, TACE has multiple side effects (AEs). A recent literature reported a total of 21,461 AEs in 15,351 patients treated with TACE [5]. Abnormal liver enzyme was the most common AE, with an incidence of 52%, followed by postembolism syndrome such as intestinal obstruction, abdominal pain, and fever, with an incidence of 47.7% [5]. In addition, fever, nausea, and fatigue were also the typical postembolization syndrome. Some serious complications including liver failure, gastroduodenal ulceration, and death may also occur [46]. Thirdly, TACE treatment is costly. Therefore, early identifying which patients benefit most from TACE may be of great clinical relevance and is the key goal of modern personalized medicine.

The response to TACE in HCC patients is conventionally assessed by radiologically enhanced criteria which focus on tumor size [7]. However, the use of radiologic criteria to measure response has some limitations, such as interobserver subjectivity, high interobserver variability, increased patient’s radiation exposure, and misjudgment resulting from dysplastic or regenerative nodules, or perfusion abnormalities [8]. Recently, more and more researchers investigate novel biomarkers to predict patient response to TACE in HCC. For example, Mao et al. found that ASPP2 expression in cancer tissue following TACE is an independent risk factor for HCC recurrence as well as overall survival [9]. Ma et al. demonstrated serum STIP1 is a promising biomarker for outcome evaluation, therapeutic response assessment, and microvascular invasion prediction in HCC following TACE [10]. Given this, screening reliable and novel biomarkers for predicting HCC response to TACE treatment is an urgent clinical need. Due to the complicated molecular and cellular heterogeneity in HCC, we speculated that the difference in tumor gene expression may be related to the TACE response. With the development of biomedicine, high-throughput gene chip technology has begun to be popularized in the exploration of mechanism and identifying potential biomarkers.

Therefore, in this study, we downloaded and initially analyzed the GSE104580 dataset from the Gene Expression Omnibus (GEO) database to explore the DEGs associated with TACE response between TACE responders and nonresponders. We then annotated the functions of these DEGs by using the GO term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Finally, we performed Protein-Protein Interaction (PPI) network analysis to discover candidate genes. Based on the bioinformatics analysis, novel biomarkers for predicting HCC response may be provided.

2. Methods

2.1. Microarray Data and Preprocessing

We downloaded the gene expression dataset of GSE104580 which contains microarray data from 147 tumor samples from 81 TACE responders and 66 TACE nonresponders from the GEO database and then randomly divided the 147 samples into a training set () and a validation set (). The training set was then used to screen the DEGs associated with TACE response.

2.2. Differential Expression Analysis

We used the limma package in R software to analyze the DEGs ( and the ) associated with TACE response by comparing HCC tissues between TACE responders and nonresponders in the training set.

2.3. GO and KEGG Pathway Analysis

To explore the significantly enriched functions of the DEGs and better understand the important pathways of the DEGs participation, we used clusterProfile packages for GO and KEGG analysis. The Fisher’s exact test was applied to evaluate the significance of GO and KEGG enrichments, and a value < 0.05 indicated a statistically significant difference.

2.4. Construction of the PPI Network

The Search Tool for the Retrieval of Interacting Genes (STRING) database was applied to construct the PPI network of DEGs, and a was set as statistically significant. Then, the Cytoscape software (version 3.6.3) was employed to visualize the obtained PPI network. Moreover, to find out the most important nodes in the PPI network, the CentiScaPe 2.2 plug-in was employed to calculate the degree, closeness, and betweenness of the network. We operationally defined genes with degree value in the top 10% as hub genes, while the betweenness value in the top 10% as bottleneck genes. The Venn diagram was then applied to screen the genes in the intersection of these two gene sets as candidate genes for predicting HCC response to TACE treatment.

2.5. Prediction and Evaluation of Predictive Biomarkers for HCC Response to TACE Treatment

To explore predictive biomarkers for HCC response to TACE treatment, we used the above hub genes as candidates to find their predictive value based on ROC curve analysis. In brief, 89 samples (, ) were randomly distributed as the training set, which was used to build a ROC curve. The remaining set () was used as the validation set. The area under the curve (AUC) was computed to estimate the predictive accuracy of the classifier.

3. Results

3.1. Identification of DEGs between Responders to TACE and Nonresponders for HCC Patients

After differential expression analysis, we identified a total of 158 DEGs according to the thresholds of and , including 92 upregulated genes and 66 downregulated genes (Table 1). The volcano plot is presented in Figure 1(a). In addition, Figure 1(b) shows the heat map of expression diversity of DEGs, from which we can see that tumor samples from the TACE responders were distinguished clearly from that from the TACE nonresponders by the identified DEGs.

3.2. Integrative Bioinformatics Analysis of DEGs

Figure 2 shows the results of Go enrichment, from which we found that the DEGs were the strongest enrichment in metabolic and catabolic processes, such as drug metabolic process, fatty acid metabolic process, and small molecule catabolic process (Figure 2). In addition, the KEGG pathway analysis showed the DEGs mainly concentrated in drug metabolism-cytochrome P450, metabolism of xenobiotics by cytochrome P450, and chemical carcinogenesis (Figure 3).

3.3. Construction of the PPI Network and Module Analysis

We performed PPI network to further investigate the interrelationship of the DEGs, and the results showed that the PPI network consisted of 151 nodes and 336 edges after hiding nodes which could not interact with other nodes (Figure 4). By using the CentiScape plug-in, 10 hub genes (CXCL8, CYP3A4, MMP9, UGT2B15, AFP, CYP2C9, CYP1A1, HRG, SERPINC1, and AGXT) and 8 bottleneck genes (CXCL8, AFP, CA9, CYP1A1, MMP9, CYP3A4, SERPINC1, and PCK1) were obtained. Subsequently, 6 candidate genes in the intersection of the above two gene sets were obtained by using Venn diagram analysis, including cysteine X chemokine ligand 8 (CXCL8), Alpha-fetoprotein (AFP), cytochrome P450 1A1(CYP1A1), matrix metalloproteinase-9 (MMP9), cytochrome P450 3A4 (CYP3A4), and serpin peptidase inhibitor, clade C (antithrombin), and member 1(SERPINC1). Among them, CYP1A1, CYP3A4, and SERPINC1were upregulated genes in TACE responders compared with TACE nonresponders, while CXCL8, AFP, and MMP9 were downregulated gene (Figure 5).

3.4. The Value of Candidate Genes in Predicting HCC Response to TACE Treatment

At last, we examined the predictive value of CXCL8, AFP, CYP1A1, MMP9, CYP3A4, and SERPINC1 in predicting HCC response to TACE treatment in the training set and confirmed in the validation set. The ROC curves revealed that the 6-gene signature had good performance in predicting HCC response to TACE treatment with AUC of 0.875 for the training set (Figure 6(a)) and 0.897 for the validation set (Figure 6(b)). These results suggested that the 6 differentially expressed hub genes can be used as potential biomarkers for predicting HCC response to TACE.

4. Discussion

Response to TACE treatment varies greatly among patients with HCC, and over 40% of patients do not respond to it. At present, we cannot accurately predict the efficacy of TACE because there is no effective predictor of markers. Therefore, it is urgent to determine the key genes that affect the effect of TACE treatment and actively invest in the research of multigene signatures for predicting the response of TACE treatment. In the current study, we screened out 158 significant DEGs between HCC samples from TACE responders and nonresponders. These DEGs might play decisive roles in TACE response through metabolic and catabolic processes. Also, these DEGs were significantly concentrated in pathways like drug metabolism-cytochrome P450 pathway, and chemical carcinogenesis, which may be the underlying mechanism by which the DEGs affect the TACE response. Furthermore, by constructing the PPI network, we established a six-gene signature (including CYP1A1, CYP3A4, SERPINC1, CXCL8, AFP, and MMP9) for predicting TACE response. The 6-gene signature had good performance in predicting TACE response, demonstrating that the 6-gene signature provided new insights into the biological behavior of HCC and a basis for individualized management of HCC.

The cytochrome P450 (CYP) enzyme family mainly participate in the metabolism of environmental toxicants, cancer-promoting substances, and antitumor drugs. Its function strongly correlates with the appearance and progression of tumors and the response to anticancer drugs [11, 12]. Cytochrome P450 1A1 (CYP1A1) is a subtype of the cytochrome P 450 family, which is highly expressed in human liver tissues [13, 14]. Numerous studies have confirmed the relationship between CYP1A1 gene polymorphisms and HCC risk. Moreover, it has been reported that CYP1A1 is an important regulator of the conversion of tobacco-derived polycyclic aromatic hydrocarbons (PAHs) to carcinogenic metabolites, and its polymorphism may exert an effect on raising the susceptibility to smoking-related HCC [15].

Cytochrome P450 3A4 (CYP3A4) is also a member of the CYP 450 superfamily. It is expressed mainly in human liver as well as extrahepatic tissues including lung, kidney, and intestine. The expression of CYP3A4 is induced by glucocorticoids and mainly participates in drug metabolism and lipid composition synthesis. CYP3A4 has a bidirectional function of activating some drugs and inactivating some drugs. Previous studies showed that cytochrome P450 enzymes metabolized approximately 70-80% of the drugs used in clinic, and CYP3A4 played a role in the metabolism of half of these drugs, including many cancer chemotherapeutic agents [16, 17]. Recently, several studies revealed that CYP3A4 was significantly downregulated in HCC tissues, which was associated with a poor prognosis of HCC [18, 19]. Moreover, another investigation found that CYP3A4 may be a new tumor suppressor gene, which is related to the prognosis of HCC [20].

SERPINC1 is a serine protease inhibitor. It controls the blood coagulation process of the body. Previous study found that the expression levels of SERPINC1 in the serum of HCC patients and healthy subjects were significantly different [21]. A major event in tumor growth and progression is the tumor angiogenesis. Cancer cells can secrete not only angiogenesis stimulating factors but also antiangiogenesis factors. The cleavage conformation of SERPINC1 inhibits the proliferation and migration of cancer cells through antiangiogenesis and anti-inflammatory [22, 23]. Its expression decreased in HCC [24, 25]. In the current study, upregulated expression of SERPINC1was associated with a favorable response to TACE. We therefore speculated that the high expression of SERPINC1 may contribute to the host’s anticancer defense.

The chemokine CXCL8 has tumorigenic and angiogenic effects. Both inflammatory cells and tumor cells can secrete it. Studies have shown that CXCL8 played a role in the angiogenesis and metastasis of many types of tumors and is related to the poor prognosis of these tumors [26].

Specifically, CXCL8 and its receptors CXCR1 and CXCR2 may be involved in the occurrence and progression of HCC [27]. Like alpha-fetoprotein (AFP), CXCL8 also has a high predictive value in identifying HCC, and it could also independently predict the survival prognosis of patients with early HCC [28]. Interestingly, the size of HCC was positively linked to CXCL8 [27]. CXCL8 overexpressed in HCC tissues, and it may be related to the pathological staging, microvascular infiltration, and metastasis of HCC, indicating that CXCL8 may play a role in the progression and metastasis of HCC by participating in tumor proliferation and angiogenesis [29].

About 70% of HCCs secreted AFP, and it is a commonly used biomarker for diagnosing HCC in the clinic [30]. Furthermore, AFP has been confirmed to be the most common prognostic indicators of HCC in clinic. Several prognostics scores for HCC patients treated with TACE have included AFP as an important scoring factor [6, 31, 32]. Serum levels of AFP before TACE treatment and changes in AFP levels after TACE treatment were good predictors of response rate to TACE treatment and overall survival for HCC patients, respectively. The serum levels of AFP were negatively correlated with the response rate to TACE, and HCC patients with would have no response to TACE. Consistent with previous studies, we found that the downregulated expression of AFP was associated with a good clinical response to TACE treatment. These results may be explained by the fact that higher levels of AFP are usually caused by late tumor stage, large tumor burden, and portal vein tumor thrombosis [6, 30].

The matrix metalloproteinase (MMP) family may degrade the extracellular matrix, destroy the normal tissue structure of cell adhesion molecules, and combine with other related enzymes to degrade the matric around the blood vessel, thereby facilitating the infiltration and metastasis of liver cancer cells. MMP 9 is a member of the MMP family and seems to exert a major effect in tumor angiogenesis through its key intervention in regulating growth plate angiogenesis and recruiting endothelial stem cells. M2 macrophages could promote the invasion and migration of HCC cells by means of changing miR-149-5p and therefore increasing the expression of MMP9 in HCC. Furthermore, the overexpression of MMP 9 in liver cancer leads to a higher TNM staging by increasing lymph node invasion and promoting metastasis, resulting in poor differentiation and an overall prognosis.

The current study still had several limitations need to be noted. Firstly, the six-gene signature lacked external validation. In future studies, external validation of the six-gene signature in more independent cohorts is necessary. Secondly, the 6 genes were obtained only by bioinformatics analysis; therefore, further experiments need to validate them and to clarify the underlying mechanism of them.

In conclusion, we identified a six-gene signature that could be used to predict the response of HCC to TACE treatment. Hopefully, this 6-gene signature may become a favorable tool to guide the precision treatment of HCC in the future.

Data Availability

The profiles of GSE104580 can be downloaded from GEO datasets.

Conflicts of Interest

The authors declare there is no conflict of interest.


We thank the contributors of profiles GSE104580 in GEO datasets.