Abstract
Background. The present study aimed to identify the differentially expressed genes (DEGs) and the relevant biological processes and pathways associated with epicardial adipose tissue (EAT) from patients with coronary artery disease (CAD). We also explored potential biomarkers using two machine-learning algorithms and calculated the immune cell infiltration in EAT. Materials and Methods. Three datasets (GSE120774, GSE64554, and GSE24425) were obtained from the Gene Expression Omnibus (GEO) database. The GSE120774 dataset was used to evaluate DEGs between EAT of CAD patients and the control group. Functional enrichment analyses were conducted to study associated biological functions and mechanisms using the Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Gene Set Enrichment Analysis (GSEA). The least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE) were then used to identify the feature genes related to CAD. The expression levels of the feature genes were validated in GSE64554 and GSE24425. Finally, we calculated the immune cell infiltration using CIBERSORT and evaluated the correlation between the feature genes and immune cells. Results. We identified a total of 130 upregulated and 107 downregulated genes in GSE120774. Functional enrichment analysis revealed that the DEGs are associated with several pathways, including the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes. TCF21, CDH19, XG, and NNAT were identified as feature genes and validated in the GSE64554 and GSE24425 datasets. Immune cell infiltration analysis showed that plasma cells were significantly more numerous in EAT than in the control group (), whereas M0 macrophages () and resting mast cells () were significantly less numerous. 
TCF21, CDH19, XG, and NNAT were correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells. Conclusion. TCF21, CDH19, XG, and NNAT might serve as feature genes for CAD, providing new insights for future research on the pathogenesis of cardiovascular diseases.
1. Introduction
Coronary artery disease (CAD) is one of the leading causes of death worldwide, and atherosclerosis is its most basic associated pathophysiological change [1]. Obesity represents a significant risk factor for cardiovascular disease, and the expansion of ectopic and visceral fat is strongly involved in the pathogenesis of CAD [2]. Recent evidence has revealed a promising role of epicardial adipose tissue (EAT) in the occurrence, development, and prognosis of CAD [3]. EAT is recognized as a unique adipose depot, supplied by the branches of the coronary artery and directly adjacent to the myocardium. It is mainly composed of adipocytes, stroma-vascular cells, fibroblasts, nerves, and various immune cells. Besides providing energy storage, the EAT serves as an endocrine and immune organ [4, 5]. Under physiological conditions, the EAT contributes to cardiac metabolism, prevents cardiac lipotoxicity, mechanically protects the coronary arteries, and provides immunological support for the heart [6]. The link between EAT inflammation and CAD has increasingly attracted research focus. In recent years, the EAT has been proposed as a biomarker for acute coronary syndrome (ACS), major adverse cardiac events (MACE), and atrial fibrillation (AF) [7–9]. Moreover, several large-scale cohort studies demonstrated that the EAT volume is positively associated with the occurrence, development, and prognosis of CAD [10–12]. Specifically, it is currently accepted that some cytokines secreted by the EAT either protect or negatively affect cardiomyocyte function and the coronary arteries through paracrine or vasocrine mechanisms [13, 14]. Cytokines secreted by the EAT might diffuse through the interstitial fluid into the coronary wall layers. Alternatively, they could be released directly into the vasa vasorum of the coronary arteries [15, 16]. 
In pathological conditions, the proinflammatory or proatherogenic factors secreted by the EAT, including IL-6, IL-8, monocyte chemoattractant protein 1, leptin, resistin, and tumor necrosis factor α [15], exert their pathophysiological effects through direct diffusion, enhancing the potential to induce atherogenic changes in monocytes and endothelial cells [17]. Leptin, for example, is regarded as an independent risk factor for atherosclerosis that exerts a variety of atherogenic effects, such as increasing endothelial dysfunction, promoting inflammatory responses, oxidative stress induction, platelet aggregation and migration, and the proliferation of vascular smooth muscle cells [3, 18].
Although numerous studies have confirmed the involvement of the EAT in the development and progression of coronary atherosclerosis through adipokines, the exact mechanisms through which the EAT participates in CAD remain unclear [3, 5, 19–21]. A considerable limitation of these studies is that they recruited only patients who underwent cardiac surgery. Furthermore, it is difficult to collect the EAT from healthy subjects due to ethical concerns, so the subcutaneous adipose tissue (SAT) is usually used as the control across studies [22–25]. Bioinformatics analysis has been extensively applied to the identification of differentially expressed genes (DEGs) at the genome-wide level and constitutes a useful strategy for exploring the potential biomarkers and molecular mechanisms associated with the EAT and CAD. Here, we screened three microarray datasets from the Gene Expression Omnibus (GEO) database for DEGs between the EAT and the SAT. We explored the underlying biological functions using enrichment analysis and identified the best feature genes by employing machine-learning algorithms. In addition, we used CIBERSORT to estimate the proportions of immune cells present in the EAT [26, 27] and studied the relationship between the feature genes and infiltrating immune cells to provide a basis for further research.
2. Materials and Methods
2.1. Microarray Data
The GSE120774, GSE64554, and GSE24425 datasets were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The GSE120774 dataset was used as the discovery cohort, and the GSE64554 and GSE24425 datasets were used as the validation cohorts. We analyzed a total of 9 EAT and 8 SAT samples from patients with CAD in GSE120774, which was based on the GPL6244 Affymetrix Human Gene 1.0 ST Array. In addition, there were 13 EAT and 13 SAT samples from patients with CAD in GSE64554, which was based on the GPL6947 Illumina HumanHT-12 V3.0 expression beadchip. Furthermore, 6 EAT and 6 SAT samples from patients with CAD in GSE24425, which was based on the GPL6884 Illumina HumanWG-6 V3.0 expression beadchip, were also analyzed. We used the limma package in R to normalize the expression data and ensure a similar distribution among these datasets.
2.2. Identification of Differentially Expressed Genes
The DEGs were identified using the limma package in R. A volcano plot was used to visualize the DEGs, and the cutoff was set as (adjusted P value < 0.05).
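The thresholding step can be illustrated with a minimal Python sketch. It is not the authors' limma code: it assumes that per-gene log2 fold changes and raw P values are already available, and it applies the Benjamini-Hochberg adjustment (the default adjustment in limma's topTable, although the text does not state which method was used). The function names, the optional fold-change threshold, and the gene labels in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values never increase as the raw P values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    stats: Dict[str, Tuple[float, float]],
    alpha: float = 0.05,
    min_abs_log2fc: float = 0.0,  # assumption: no fold-change cutoff by default
) -> Tuple[List[str], List[str]]:
    """stats maps gene -> (log2FC, raw P). Returns (upregulated, downregulated)."""
    genes = list(stats)
    adjusted = benjamini_hochberg([stats[g][1] for g in genes])
    up, down = [], []
    for gene, q in zip(genes, adjusted):
        log2fc = stats[gene][0]
        if q < alpha and abs(log2fc) >= min_abs_log2fc:
            (up if log2fc > 0 else down).append(gene)
    return up, down
```

With made-up input such as {"GENE_A": (1.8, 0.001), "GENE_B": (-1.2, 0.004), "GENE_C": (0.3, 0.5)}, select_degs returns GENE_A as upregulated and GENE_B as downregulated, while GENE_C is filtered out.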
2.3. Functional Annotation for Differentially Expressed Genes
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses were conducted using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). GO terms comprised biological process (BP), cellular component (CC), and molecular function (MF). The R package ggplot2 was used to visualize the results. Functional enrichment analysis on all expression data was performed by Gene Set Enrichment Analysis (GSEA). The R packages clusterProfiler and org.Hs.eg.db were used to conduct GSEA. The GSEA cutoff point was set as a P value < 0.05 and .
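The idea behind over-representation analysis of GO terms and KEGG pathways can be sketched as a one-sided hypergeometric test. The Python example below is illustrative only: it computes the plain hypergeometric tail probability, which is related to but not identical to the modified Fisher exact (EASE) score that DAVID reports, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    n_universe: int,  # genes in the background set
    n_term: int,      # background genes annotated to the term or pathway
    n_degs: int,      # genes in the DEG list
    n_overlap: int,   # DEGs annotated to the term or pathway
) -> float:
    """Probability of seeing at least n_overlap annotated genes by chance."""
    upper = min(n_term, n_degs)
    total = comb(n_universe, n_degs)
    # Sum the probabilities of every overlap from n_overlap up to the maximum.
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For a toy background of 10 genes with 4 annotated to a term, drawing 3 DEGs of which 2 are annotated gives a tail probability of 40/120, that is, one third. In practice the resulting P values for all tested terms would still need a multiple-testing correction.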
2.4. Feature Genes Identification
We used two machine-learning algorithms to screen for the most significant candidate biomarkers between SAT and EAT. The least absolute shrinkage and selection operator (LASSO) is a regression-based algorithm that is suitable for both linear and nonlinear cases. We used the glmnet package in R to perform LASSO. Support vector machine (SVM) is another machine-learning algorithm that is used for regression or classification. To avoid overfitting, SVM-recursive feature elimination (RFE) was used to screen for feature genes from the selected genes. We selected the top 20 genes for the SVM-RFE algorithm according to |log2 fold change (FC)| and then took the intersection of the genes obtained by the two algorithms. SVM-RFE was performed using the e1071 and mlbench R packages. To further evaluate the diagnostic ability of the candidate biomarkers, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
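Two of the steps above, intersecting the gene lists from the two algorithms and scoring a single gene by its ROC AUC, can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' R workflow: the LASSO and SVM-RFE fits themselves are not reproduced, the AUC is computed from its rank interpretation rather than with an ROC package, and all function names and gene labels are hypothetical.

```python
from typing import Iterable, List, Sequence


def shared_features(*gene_lists: Iterable[str]) -> List[str]:
    """Genes selected by every algorithm, returned in sorted order."""
    return sorted(set.intersection(*(set(genes) for genes in gene_lists)))


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the probability that a positive sample outscores a negative one.

    Ties count as half a win, which matches the trapezoidal ROC area.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

Here scores_pos would hold the expression values of one gene in the EAT samples and scores_neg its values in the SAT samples. For a gene that is lower in EAT, the AUC falls below 0.5; negating the expression values gives the complementary value, so the discriminative ability is the same in either direction.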
2.5. Immune Cell Infiltration Analysis
We used CIBERSORT (https://cibersortx.stanford.edu/) to analyze immune cell infiltration in GSE120774 and estimated the proportions of 22 types of immune cells. The cutoff point was set as a P value < 0.05. The vioplot package in R was used to visualize the different immune cells in the SAT and the EAT. We also built a bar plot in R to show the percentage of immune cells present in each sample.
2.6. Correlation Analysis between Biomarkers and Infiltrating Immune Cells
The relationship between feature genes and immune cells was evaluated using Spearman’s rank correlation analysis in R. The ggplot2 package was used to visualize the results.
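Spearman's rank correlation is the Pearson correlation of the ranked values, with tied observations sharing their average rank. The Python sketch below illustrates that definition; it is not the authors' R code, it does not compute a P value, and the function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho between, e.g., gene expression and a cell-type fraction."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ranks enter the calculation, the coefficient is insensitive to outliers and to any monotonic transformation of the expression values or the estimated cell fractions, which suits the small sample sizes involved here.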
2.7. Statistical Analysis
R software (version 4.2.0) was used for all statistical analyses. Continuous variables are expressed as the , and group comparisons were performed using Student's t-test for normally distributed variables and the Mann–Whitney U test for non-normally distributed variables. A P value < 0.05 was considered statistically significant.
3. Results
3.1. Identification of DEGs
The GSE120774, GSE64554, and GSE24425 datasets were normalized before analysis (Figure 1 and Supplemental File-Figure 1 show both the nonnormalized and normalized data). We identified a total of 130 upregulated and 107 downregulated genes. Genes with the most significant logFC in EAT compared with SAT in CAD patients are shown in the volcano plot of Figure 2.

3.2. Functional Enrichment Analysis of DEGs
We subsequently conducted functional enrichment analyses, including GO, KEGG, and GSEA, to explore the biological function and pathways associated with the DEGs. GO enrichment analysis revealed that negative regulation of transcription from RNA polymerase II promoter, negative regulation of cell proliferation, cell adhesion, angiogenesis, and response to lipopolysaccharide are enriched terms in BP (Figure 3(a)); plasma membrane, extracellular space, extracellular region, and extracellular exosome are enriched terms in CC (Figure 3(b)); and RNA polymerase II transcription factor activity, calcium ion binding, integrin binding, and DNA-binding activities are enriched in MF (Figure 3(c)). In addition, KEGG pathway analysis revealed that DEGs are mainly involved in the complement and coagulation cascades, fluid shear stress and atherosclerosis, and TNF signaling pathway (Figure 3(d)). In the GSEA, we identified several enriched pathways, including the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes (Figures 4(a)–4(f)).

3.3. Identification and Validation of Feature Genes
The LASSO regression algorithm was used to narrow down the number of DEGs, which identified 10 genes (Figures 5(a) and 5(b)). Moreover, 12 genes were obtained using the SVM-RFE algorithm (Figure 5(c)), of which 4 were also identified by LASSO (Figure 5(d)): TCF21, CDH19, XG, and NNAT. The GSE64554 and GSE24425 datasets confirmed that TCF21 and CDH19 were upregulated in EAT compared with SAT in CAD patients, whereas XG and NNAT were downregulated (Figures 6(a)–6(h)). We then performed ROC analysis to evaluate the diagnostic ability of these four genes in the GSE64554 dataset and found that all four feature genes discriminate EAT from SAT samples with high accuracy, with an AUC of 0.923 (95% ) for TCF21, 0.941 (95% ) for CDH19, 0.953 (95% ) for XG, and 0.970 (95% ) for NNAT (Figures 7(a)–7(d)).

3.4. Immune Cell Infiltration
Functional enrichment analysis revealed that the DEGs might be involved in the immune response; we therefore used the CIBERSORT algorithm to compare immune cell infiltration between EAT and SAT in CAD patients. The composition of immune cells in EAT vs. SAT samples in CAD patients is shown in Figure 8(a). The proportion of plasma cells is notably higher in the EAT than in the SAT (). In contrast, the proportions of M0 macrophages () and resting mast cells () are notably lower in the EAT than in the SAT (Figure 8(b)).

3.5. Correlation Analysis between the Four Feature Genes and Immune Cells
We found that TCF21 is positively correlated with plasma cells (, ), but negatively correlated with M0 macrophages (, ), while CDH19 is positively correlated with plasma cells (, ), and negatively correlated with resting mast cells (, ). In addition, XG is positively correlated with M0 macrophages (, ) and resting mast cells (, ), and negatively correlated with plasma cells (, ). Finally, NNAT is positively correlated with M0 macrophages (, ) and resting mast cells (, ) and negatively correlated with plasma cells (, ) (Figures 9(a)–9(d)). Overall, we found that the four feature genes are highly correlated with immune cells.

4. Discussion
The EAT participates in the pathological process of atherosclerosis through the endocrine and paracrine pathways, although the specific mechanisms remain unknown [14]. Here, we found 130 upregulated and 107 downregulated genes from a microarray analysis. Functional enrichment analysis indicated that these DEGs are involved in various pathophysiological processes and that four feature genes (TCF21, CDH19, XG, and NNAT) identified via LASSO regression and the SVM-RFE algorithm are correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells, as shown by infiltration analysis.
Previous studies have revealed that adipokines secreted by the EAT might affect myocardial cells and coronary arteries [3, 19]. Hypoxia and dysfunction of the EAT might lead to lipolysis and inflammatory activity through the dysregulated secretion of vasoactive and inflammatory factors, which are involved in the processes of atherosclerosis, including vascular remodeling, endothelial dysfunction, the proliferation and migration of smooth muscle cells (SMCs), foam cell formation, and plaque destabilization [28]. Intelectin 1 (ITLN1), which in our analysis showed the largest expression difference between the EAT and the SAT (Figure 2, Supplemental File-Figure 2A), is abundantly expressed in visceral adipose tissue and is known to regulate obesity-related cardiometabolic disorders through its anti-inflammatory activity [29]. Leptin is regarded as an independent risk factor for atherosclerosis that exerts a variety of atherogenic effects. However, the expression level of leptin was not significantly higher in EAT compared with SAT in our analysis (Supplemental File-Figure 2B), and we hypothesize two possible reasons: (1) the sample size is insufficient to show a significant difference; (2) leptin in EAT might be mainly derived from the circulation. In contrast, chemerin, which can bind to the G protein-coupled receptor (CMKLR1), is associated with the immune response and the metabolism of glucose and lipids [30], and its expression levels are reportedly positively associated with coronary atherosclerosis [21].
Our study identified four feature genes (TCF21, CDH19, XG, and NNAT) associated with CAD using two machine-learning algorithms. TCF21 is involved in cardiac fibrosis and plays a critical role in the fate of smooth muscle cells [31], promoting SMC dedifferentiation by inhibiting the serum response factor-myocardin axis (SRF-MYOCD) [32]. The specific effects of TCF21 on atherosclerosis are complex. On the one hand, TCF21 suppresses the progression of atherosclerosis by regulating the transition from SMCs to fibromyocytes and promoting the formation of antiatherosclerotic fibrous caps on the lesions [33]. On the other hand, when compared with the control, transfection with TCF21 siRNA (siTCF21) notably decreases the levels of reactive oxygen species (ROS) and the apoptosis-related protein Bax and increases the expression of the antiapoptotic protein Bcl-2 in human umbilical vein endothelial cells (HUVECs) [34]. This suggests that TCF21 might promote atherosclerosis by increasing the apoptosis rate and ROS accumulation. Cadherin 19 (CDH19) is a gene encoding a calcium-dependent cell adhesion protein that is involved in vascular remodeling and plays a critical role in the structural integrity of blood vessels [35]. Recent studies have demonstrated the involvement of classic cadherins in many complex processes, such as angiogenesis, morphogenesis, cellular communication, and cellular proliferation [36–38]. Niu et al. [39] revealed that the knockdown of CDH12 and CDH19 markedly inhibits monocyte chemotactic protein-1-induced protein (MCPIP) and suppresses capillary-like tube formation in HUVECs. Moreover, CDH19 might serve as a new target in tumorigenesis and drug development for glioblastoma stem-like cells (GSC) and can be considered an independent prognostic biomarker of lung adenocarcinoma (LUAD) and breast cancer (BC) [36, 40, 41]. 
XG is a blood group gene located at the pseudoautosomal boundary on the short arm of the X chromosome, with two X-borne alleles, Xga and Xg [42]. Studies of the gene's biological functions have largely been limited to its association with red blood cells (RBCs). Meynet et al. showed that high XG protein expression in Ewing's sarcoma (EWS) is associated with a worse prognosis. Furthermore, the overexpression of XG increased the proliferation and migration of EWS cells in vitro, while knockdown of the gene with short hairpin RNA had the opposite effect [43]. However, the role played by XG in atherosclerosis remains uncharacterized. Finally, NNAT is a paternally imprinted gene that is expressed in the developing brain, pituitary, pancreas, and adipose tissue and plays an important role in appetite behavior, energy balance, adipogenesis, and the inflammatory responses associated with insulin resistance [44–46]. Gene set enrichment analysis indicated a significant negative correlation between NNAT and energy metabolism but a positive correlation with inflammation [46]. It has been reported that NNAT inhibits oxidative stress and inflammation and promotes adipocyte differentiation by mediating the NF-κB signaling pathway [45]. NNAT expression levels are also closely associated with endothelial dysfunction and EAT secretion [45, 47]. Furthermore, increased NNAT expression levels have been associated with poor prognosis in myxoid liposarcoma, lung cancer, and breast cancer [48–50]. However, very few studies have clarified the association of this gene with atherosclerosis.
We calculated immune cell infiltration and estimated the correlation between the four genes and immune cells. We found that the four feature genes are correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells. To our knowledge, this is the first study to calculate the infiltration of immune cells in EAT vs. SAT. Adipocytes not only serve as an energy storage depot but also play a critical role in endocrine and immune function. Adipokines, such as leptin and adiponectin, are critical for B-cell development, activation, and antibody production [51]. Hence, adipocytes play a crucial role in the adaptive immunity mediated by B cells.
Despite the associations described above, few studies investigating the molecular mechanisms linking these four genes and immune cells have been published to date, so further experiments are required to explore their role in pathogenesis. Our study has several limitations: (1) the SAT, rather than the EAT of healthy individuals, was used as the control (due to ethical restrictions), so the difference between the EAT and the SAT in healthy groups remains unknown; (2) the three datasets have limited sample sizes; (3) the association between the feature genes and CAD and their interaction with immune cells need further investigation in larger samples to confirm our observations.
5. Conclusions
In this study, we identified the DEGs between the EAT and the SAT in patients with CAD and explored the potential biological processes and pathways involved. The identified DEGs are mainly associated with the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes. In addition, the four genes identified (TCF21, CDH19, XG, and NNAT) might serve as feature genes for CAD, bringing new insights into the pathogenesis of cardiovascular diseases.
Data Availability
All the analyses in this study were based on the publicly available datasets (GSE120774, GSE64554, and GSE24425). Original data are available in the GEO database (https://www.ncbi.nlm.nih.gov/).
Conflicts of Interest
The authors declare no conflict of interest.
Authors’ Contributions
Jianyan Wen and Peng Liu designed, guided, and funded the study. Yisen Deng, Xuming Wang, and Zhan Liu drafted the original manuscript. Yisen Deng, Xuming Wang, and Jianyan Wen analyzed the data. Xiaoshuo Lv, Bo Ma, Qiangqiang Nie, Xueqiang Fan, Yuguang Yang, and Zhidong Ye critically revised the manuscript. Jianyan Wen and Peng Liu edited and revised the manuscript. All authors contributed to the article and approved the submitted version. Yisen Deng, Xuming Wang, and Zhan Liu contributed equally to this work.
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (nos. 81670275, 81670443, and 82170066) and the International S&T Cooperation Program (2013DFA31900).
Supplementary Materials
Supplemental File Figure 1: box plot of datasets before and after normalization. GSE24425 expression profile before (A) and after (B) normalization. Supplemental File Figure 2: the expression levels of ITLN1 and leptin in GSE120774. (A) ITLN1; (B) leptin.