Liver cancer is the fourth leading cause of cancer death worldwide, and hepatocellular carcinoma (HCC) accounts for the greatest proportion of these deaths. Baicalein, a flavonoid isolated from the root of Scutellariae radix, is considered a potential candidate to treat HCC. However, the underlying molecular mechanisms remain poorly understood. In the present study, a network pharmacological approach was combined with microarray data (GSE95504) acquired from the Gene Expression Omnibus database to reveal the therapeutic mechanisms of action of baicalein at a systemic level. We identified 38 baicalein targets and 76 differently expressed genes (DEGs) following treatment with baicalein, including 55 upregulated and 21 downregulated genes. The DEGs were significantly enriched in the biological functions of apoptosis, endoplasmic reticulum stress, and PERK-mediated unfolded protein response. Protein-protein interaction (PPI) network construction and topological screening revealed a core module of PPIs including two baicalein targets, TP53 and CDK1, and two downregulated DEGs, HSPA1A and HSPA1B. Expression and survival data for these genes in the module derived from Gene Expression Profiling Interactive Analysis (GEPIA) were subjected to Kaplan–Meier analysis of overall survival and disease-free survival. Overexpression of CDK1, BRCA1, TUBB, HSPA1A, HSPA1B, and HSPA4 was associated with significantly worse overall survival, while overexpression of CDK1, CLU7, BRCA1, and TUBB was associated with significantly worse disease-free survival. These data suggest that baicalein exerts therapeutic effects against HCC via a PPI network involving TP53, CDK1, HSPA1A, and HSPA1B.

1. Introduction

Liver cancer is the fourth leading cause of cancer death worldwide according to global cancer statistics for 2018, with about 782,000 deaths annually [1]. Hepatocellular carcinoma (HCC), comprising 75%–85% of cases, is the most common type of primary liver cancer and is strongly associated with chronic infection with hepatitis B virus (HBV) or hepatitis C virus (HCV), consumption of aflatoxin-contaminated foodstuffs, and heavy alcohol intake [2]. Although improvements in patient outcomes were observed in clinical trials with first- or second-line drugs, the lack of improvements in overall survival in some patients and the lack of curative treatment options must be urgently addressed [3]. Therefore, elucidation of the molecular mechanisms underlying HCC and development of alternative therapies with lower toxicity is critical for achieving more favorable clinical outcomes and reducing treatment morbidity.

The use of traditional Chinese medicine to treat cancer is built on a foundation of more than 2,500 years of Chinese medical practice, including Chinese herbal medicine (CHM), acupuncture, and dietary therapy. A recently published meta-analysis of 20 randomized controlled trials showed that add-on therapy with CHM improved overall survival in HCC patients and reduced adverse events related to conventional treatments [4]. To date, ingredients derived from CHM have been found to exert suppressive effects on the promotion and proliferation of cancer cells, as well as inhibiting angiogenesis in cancer tissues [5]. Baicalein, a compound originally isolated from Scutellariae radix, is reported to exert anticancer activity by blocking colony formation, activating apoptosis, inducing autophagy, and modulating molecular signaling pathways such as mTOR and Wnt/β-catenin in several human liver cancer cell lines [68]. Multitarget therapies elicit complex changes in multiple biological networks and mechanisms and are therefore considered an effective approach for treating HCC [9].

In contrast to the traditional “one drug, one target” principle of drug design, network pharmacology aims to investigate the influence or intervention of drugs on diseases as a whole, based on the synergism of multitargeted drugs and through a holistic perspective at a systemic level [10]. This approach encompasses systems biology, network analysis, connectivity, redundancy, and pleiotropy [11]. Recently, a network pharmacological platform was successfully applied for screening effective ingredients and revealing the pharmacological mechanisms of CHM. Whether by studying classic formulae, patented modern Chinese medicines, or bioactive ingredients derived from CHM, the network pharmacological approach provides new insights into the systemic connection between CHM, therapeutic targets, and a disease as a whole and provides a powerful and promising tool for the elucidation of disease mechanisms at a systemic level and the discovery of potential bioactive ingredients [1214].

In the present study, in order to recognize the mechanisms underlying the anti-HCC effects of baicalein, we used a network pharmacological approach, including the evaluation of its absorption, distribution, metabolism, and excretion (ADME) properties; identification of differentially expressed genes (DEGs) related to baicalein treatment; prediction of baicalein-related targets; and recognition of core functions and modules via the protein-protein interaction (PPI) network approach. We identified a core modulatory network including two important baicalein targets, TP53 and CDK1 that was significantly related to the regulation of cell death and modulation of the apoptotic signaling pathway. To our knowledge, this study is the first to use a network pharmacological approach to study the efficacy and mechanism of baicalein, providing a powerful and promising tool for the development and application of CHM to HCC therapy.

2. Materials and Methods

2.1. Public Data Collection

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo) is a public functional genomics data repository of high-throughput gene expression, chip, and microarray data. We downloaded the data series GSE95504 from the GEO database (species: Homo sapiens), which contains samples of the HCC cell line Bel-7402 treated with DMSO and 40 μM baicalein. All samples were processed on the GPL17586 Affymetrix Human Transcriptome Array 2.0 platform. The probes were represented using the corresponding gene symbol according to the annotation information in the platform.

2.2. Identification of DEGs

The microarray data were preprocessed using RMA with the oligo and limma package in Bioconductor (v3.7; http://www.bioconductor.org/) [15]. Background correction, normalization, and calculation of expression were all included in the preprocessing stage. Microarray data probe IDs were transformed into gene symbols with the Bioconductor Annotation Data software packages. When several probes mapped to the same gene symbol, the mean value was set as the final expression value of this gene. P values were acquired using the unpaired Student’s t-test provided by the limma package and were adjusted using the Benjamin–Hochberg (BH) method. The thresholds for identifying DEGs were set as adjusted (adj.) P < 0.05 and fold change > 1. Hierarchical clustering analysis of the DEGs was then performed and visualized using g-plots as implemented in the R package.

2.3. Evaluation of Drug-Likeness (DL)

The TCMSP server (http://lsp.nwu.edu.cn/tcmsp.php) is a systems-level pharmacological database for traditional Chinese medicine that can also be used for calculating the ADME-related properties of naturally occurring compounds of interest [16]. It provides an in silico ADME-systems evaluation model created by Wang et al., which integrates DL, oral bioavailability (OB), Caco-2 permeability, and other features.

2.4. Prediction of Baicalein Targets

We used the drug targeting in silico prediction models developed by Wang and others to identify potential targets for baicalein [17]. In brief, the in silico prediction model efficiently integrates chemical, genomic, and pharmacological information for drug targeting on a large scale, based on two powerful methods: random forest (RF) and support vector machines (SVM). In cases in which drug targets are identified, proteins with an output expectation value (E-value) for SVM > 0.7 or RF > 0.8 are listed as potential targets.

2.5. PPI Network Construction

We used BisoGenet [18], a Cytoscape plugin, to construct a PPI network using six currently available PPI databases, including the Biological General Repository for Interaction Datasets (BioGRID), Biomolecular Interaction Network Database (BIND), Molecular INTeraction Database (MINT), Human Protein Reference Database (HPRD), and Database of Interacting Proteins (DIP). After interactive networks for putative baicalein targets and DEGs were constructed using Cytoscape [19], a merged network was constructed based on the intersection data of the two networks.

2.6. Definition of Topological Feature Set for the Network

We used CytoNCA [20], a Cytoscape plugin, to analyze the topological properties of every node in the interaction network in order to calculate two topological properties: betweenness centrality (BC) and degree centrality (DC). More important nodes received higher quantitative values within the network.

2.7. Clusters of Core PPI Networks

We used MCODE, a plugin of Cytoscape, to obtain clusters of core PPI networks by analyzing the corresponding networks [21, 22]. Based on network theory, connected regions in large PPI networks may represent molecular complexes and together disrupt biological functions, resulting in a particular disease phenotype. As the topological module and functional module have the same meaning in the network, the functional module can be recognized by network properties.

2.8. Gene Expression Data for the Core Cluster for HCC

Data were obtained from the Gene Expression Profiling Interactive Analysis (GEPIA) online database (http://gepia.cancer-pku.cn/), a web server that provides customizable functions [23]. Tumors and normal samples in the GEPIA database were derived from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) projects. Correlations of disease-free survival and overall survival rates with the expression levels of CDK1, CUL7, BRCA1, TUBB, HSPA1A, HSPA1B, and HSPA4 in HCC patients were also computed using the GEPIA database.

2.9. Gene Ontology and Pathway Enrichment Analysis

We used the Database for Annotation, Visualization, and Integrated Discovery (DAVID; http://david.abcc.ncifcrf.gov/) [24], an online program that provides comprehensive data for high-throughput gene functional analysis for elucidation of biological characteristics, to obtain Gene Ontology (GO) terms belonging to the biological process (BP), cellular component (CC), and molecular function (MF) categories. Additionally, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway functional enrichment analyses were performed for the above DEGs. Results with values of P < 0.05 were considered to be statistically significant. We performed KEGG signaling pathway enrichment analysis of the screened candidate targets of baicalein using ClueGO, a Cytoscape plugin, to visualize nonredundant biological terms for large clusters of genes in a functionally grouped network [25]. The ClueGO network was created with kappa statistics and reflects the relationship between the terms based on the similarity of their associated genes.

3. Results

3.1. Identification of DEGs

Gene expression dataset GSE95504 was downloaded from the GEO database. Statistical analysis software R was used for preprocessing and gene differential expression analysis of microarray data. A total of 76 DEGs were identified between HCC BEL-7402 cells treated with and without baicalein, including 55 that were upregulated and 21 that were downregulated upon baicalein treatment, for subsequent bioinformatics analysis (Table 1, Figure 1).

3.2. GO Enrichment Analyses of DEGs

To analyze the biological classifications of DEGs, functional enrichment analyses were performed using the DAVID server. As shown in Figure 2, GO analysis results indicated that, in the BP category, DEGs were significantly enriched in the intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress, PERK-mediated unfolded protein response, tRNA aminoacylation for protein translation, cellular response to oxidative stress, endoplasmic reticulum unfolded protein response, regulation of protein ubiquitination, glutamine metabolic process, cellular response to insulin stimulus, positive regulation of interleukin-8 production, and cellular response to glucose starvation. In the CC category, DEGs were mostly enriched in the cytosol, nucleosome, extracellular exosome, GATOR2 complex, inclusion body, nucleoplasm, and cytoplasm. In the MF category, DEGs were mainly enriched in protein binding involved in protein folding, tRNA binding, C3HC4-type RING finger domain binding, virus receptor activity, ATP binding, protein heterodimerization activity, ATPase activity, transcription regulatory region DNA binding, Notch binding, and unfolded protein binding.

3.3. Baicalein ADME Properties, Target Prediction, and PPI Network Construction

As shown in Table 2, the DL value of baicalin was 0.21, OB value of baicalin was 33.52%, and Caco-2 permeability of baicalin was 0.63, indicating the potential applicability of baicalin as an oral drug agent. We further predicted 38 potential targets of baicalein including TP53 and CDK1 (Table 3). PPI networks were constructed of baicalein targets and DEGs. We further merged the intersection of the two networks and obtained a core network (92 nodes, 2173 edges) according to the DC and BC (Figures 3(a)3(c)).

3.4. KEGG Signaling Pathway Enrichment and Main Modules of Core PPI Network Recognition

KEGG signaling pathway enrichment using ClueGO indicated that proteins in the core PPI network were statistically enriched for the following pathways: Epstein-Barr virus infection, bladder cancer, antigen processing and presentation, cell cycle, alcoholism, ubiquitin-mediated proteolysis, and transcriptional misregulation in cancer. Other enriched pathways included the p53 signaling pathway, breast cancer, colorectal cancer, hepatitis B, and microRNAs in cancer (Figure 3(d)). A Cytoscape plugin, MCODE, was used to carry out module analysis, resulting in two main modules of the core PPI network (Figure 4). The first module, containing two baicalein targets, was functionally enriched in beta-catenin-TCF complex assembly, response to unfolded protein, and the ERBB2 signaling pathway. The second module, containing two baicalein targets and two DEGs, was enriched in regulation of signal transduction by p53, regulation of cell death, and the intrinsic apoptotic signaling pathway.

3.5. Analysis of HCC Survival and Expression of Proteins in the Module of Interest

We used the GEPIA online database to perform overall survival and disease-free survival analyses for proteins in module 2 using a Kaplan–Meier curve. HCC patients with higher expression of CDK1, BRCA1, TUBB, HSPA1A, HSPA1B, and HSPA4 showed significantly worse overall survival (Figure 5), while HCC patients with higher expression of CDK1, CLU7, BRCA1, and TUBB showed significantly worse disease-free survival (Figure 6).

4. Discussion

S. baicalensis is a skullcap plant native to several East Asian countries and Russia, and it is cultivated in many European countries. This plant has been widely used in traditional Chinese medicine since about 200–250 AD for the clinical treatment of hypertension, atherosclerosis, dysentery, common colds, inflammatory disorders, and tumors [2628]. Baicalein is a major bioactive flavone derived from the dry roots of S. baicalensis. It has been shown that treatment with baicalein inhibits cell proliferation and enhances apoptosis and autophagy of many types of cancer, including HCC [29], bladder cancer [30], and colorectal cancer [31]. Previously, studies have also demonstrated that baicalein exerts multiple effects against HCC, mainly involving the inhibition of cell proliferation by arresting the cell cycle, suppression of metastasis, induction of apoptosis, as well as induction of autophagy via the modulation of associated molecules and signaling pathways [32]. However, these studies were designed based on the traditional research idea of “one drug, one target”, without considering a systemic picture. Therefore, in the present study, we used a network pharmacological strategy to explore the molecular mechanisms by which baicalein exerts therapeutic effects on HCC from a systemic perspective, and we sought to establish connections between basic and clinical research.

We used in silico approaches to determine whether baicalein is a good candidate for drug discovery for HCC. According to Lipinski’s “rule of five”, which predicts the DL of a chemical compound with a certain biological activity designed for administration via the oral route [33], the properties of baicalein meet the requirements, namely, molecular weight < 500 Da, calculated < 5, number of hydrogen-bond donors < 5, and number of hydrogen-bond acceptors < 10, indicating that this compound is a suitable potential candidate for drug discovery for HCC. Based on data from the TCMSP database, the DL of baicalein was calculated to be 0.21, which is above the average DL value of DrugBank compounds (0.18); these data additionally indicate the potential of baicalein as a promising candidate for HCC treatment.

To understand the anti-HCC effect of baicalein at a systemic level, we predicted targets of baicalein and analyzed microarray data from baicalein-treated HCC cell lines derived from the GEO database. Thirty-eight baicalein targets and 76 DEGs were identified and selected for further analysis. GO analysis of DEGs indicated that apoptosis, endoplasmic reticulum stress, and oxidative stress were statistically related to the anti-HCC effect of baicalein. Through PPI network construction and topological screening, we obtained a core PPI network with 92 nodes and 2173 edges. Following KEGG analysis of proteins from the core PPI network using ClueGO, we identified the Epstein-Barr virus infection, cell cycle, alcoholism, hepatitis B, and p53 signaling pathways, as well as some cancer pathways. An increasing number of studies support the idea that baicalein induces the apoptosis of HCC cells and that the underlying biological mechanisms are strongly related to endoplasmic reticulum stress, reactive oxygen species generation, and the p53 signaling pathway [32, 34, 35]. Baicalein has also been shown to arrest the cell cycle in different HCC cell lines at all three phases: G0/G1 [36], S [37], and G2/M [38]. These findings are consistent with the results from the present in silico network pharmacological analysis.

Using a topological approach, we identified an important module containing 14 genes, including TP53 (baicalein target), CDK1 (baicalein target), HSPA1B (downregulated DEG), HSPA1B (downregulated DEG), and other genes in the PPI network; these genes play important roles in the regulation of signal transduction by p53, regulation of cell death, and the intrinsic apoptotic signaling pathway. We further assessed the expression of these genes in relation to overall and disease-free survival. High expression of CDK1, BRCA1, and TUBB was significantly associated with reductions in both overall and disease-free survival.

Baicalein has been identified as a selective inhibitor of CDK1 by FRET analysis and structure-activity relationship analysis in the Hep G2 and SMMC-7721 cell lines [39, 40]. A549 and H1299 non-small-cell lung cancer cells and primary cultured heart endothelial cells treated with baicalein have additionally been reported to exhibit downregulated expression of CDK1 [41]. CDK1 is a member of the Ser/Thr protein kinase family that is frequently overexpressed in HCC and associated with tumor progression [42, 43]. Depletion or inhibition of CDK1 with microRNAs or small molecular compounds has been shown to lead to reduced clonogenicity by arresting cells in the S–G2 and G2–M phases of the cell cycle and inducing apoptosis in HCC cell lines [4447]. TP53, another baicalein target, is a classic tumor suppressor involved in tumor cell apoptosis, genomic stability, and inhibition of angiogenesis [48]. Previous studies have found that baicalein induces cell cycle arrest and apoptosis in HepG2 cells via a TP53-dependent pathway [49]. Similar results were observed in human lung [50], colorectal [51], and prostate [52] cancer cells.

Heat shock proteins (HSPs) and molecular chaperones are considered to play important roles in protein homeostasis, cell physiology, and protection against stressors [53]. In the present study, HSPA1A and HSPA1B represented two downregulated DEGs present in the PPI module network of baicalein. As members of the heat shock protein 70 (Hsp70) family, they were similarly expressed in both tumor and nontumor tissues of HCC patients, with higher expression levels in HCC tumor tissues [54]. Previous studies have shown that the knockdown of HSP70 inhibits the proliferation of two HCC cell lines, namely SMMC-7721 and Hep3B cells [55]. In vitro evidence showed that high levels of HSP70 promote cell migratory ability and suppress apoptosis [56]. Of note, it has been reported that HSP70 plays significant roles in endothelial cell migration and lumen formation and is considered an important molecule in angiogenesis and the immune response in HCC [57].

To further understand the roles of the ten other proteins in the potential module, based on our current knowledge, we propose that EEF1A1, MDM2, CUL7, and BRCA1 are strongly related to HCC. These proteins are considered to play important roles in the pathophysiology of tumors and participate in cell growth, cell cycle regulation, and the maintenance of cell survival [5861].

In conclusion, our in silico study showed that the anti-HCC effect of baicalein was related to endoplasmic reticulum stress, apoptosis, oxidative stress, and the p53 signaling pathway. We identified a PPI network module containing 14 proteins, and we speculate that baicalein targets CDK1 and TP53 to downregulate the expression of HSP70, thereby exerting preventive effects against HCC via this module. Biological experiments are needed to verify these in silico results in the future.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the submission of this manuscript.

Authors’ Contributions

Chongyang Ma and Tian Xu contributed equally. Xiaoguang Sun and Shuang Zhang are responsible for data curation. Shuling Liu contributed to formal analysis. Qingguo Wang, Xueqian Wang, and Fafeng Cheng contributed to funding acquisition. Shuning Fan, Changming Zhai, and Changxiang Li contributed to investigation. Fafeng Cheng and Xueqian Wang contributed to methodology. Feifei Tang and Chaofang Lei contributed to project administration. Chongyang Ma is responsible for software. Juan Luo is responsible for validation. Tian Xu is responsible for visualization. Chongyang Ma and Tian Xu contributed to writing the original draft. Fafeng Cheng and Wei Wei contributed to review and editing of the paper.


This article was supported by the National Natural Science Foundation of China (Grants nos. 81430102, 81774122, 81774030, and 81874448) and the Beijing University of Traditional Chinese Medicine Independent Subject Selection Project (Grant no. 2017-JYB-XS-014).