Abstract

Objective. The present study identified methylation patterns of schizophrenia- (SCZ-) related genes in different brain regions and used them to construct a novel DNA methylation-based SCZ diagnostic model. Methods. Four DNA methylation datasets representing different brain regions were downloaded from the Gene Expression Omnibus. The common differentially methylated genes (CDMGs) in all datasets were identified to perform functional enrichment analysis. The differential methylation sites of 10 CDMGs involved in the largest numbers of neurological or psychiatric-related biological processes were used to construct a DNA methylation-based diagnostic model for SCZ in the respective datasets. Results. A total of 849 CDMGs were identified in the four datasets, but the methylation sites as well as degree of methylation differed across the brain regions. Functional enrichment analysis showed CDMGs were significantly involved in biological processes associated with neuronal axon development, intercellular adhesion, and cell morphology changes and, specifically, in PI3K-Akt, AMPK, and MAPK signaling pathways. Four DNA methylation-based classifiers for diagnosing SCZ were constructed in the four datasets, respectively. The sample recognition efficiency of the classifiers showed an area under the receiver operating characteristic curve of 1.00 in three datasets and >0.9 in one dataset. Conclusion. DNA methylation patterns in SCZ vary across different brain regions, which may be a useful epigenetic characteristic for diagnosing SCZ. Our novel model based on SCZ-gene methylation shows promising diagnostic power.

1. Introduction

Schizophrenia (SCZ) is a serious mental illness [1]. The World Health Organization estimates the global lifetime prevalence of SCZ at 3.8–8.4‰ [2]. SCZ is a severe psychosis induced by multiple factors, and it manifests as a clinical syndrome with many symptoms. The course of the disease can include repeated relapses that aggravate the disease and reduce quality of life. Some patients suffer from depression or mental disability [3]. Currently, clinical diagnosis of SCZ is based on the diagnostic scale of International Mental Disorders Classification [4]. The heterogeneous nature of SCZ pathogenesis has so far made it impossible to identify a single, reliable diagnostic biomarker or model.

Previous studies have shown that epigenetic changes may be related to the pathology of SCZ [5]. DNA methylation is the most stable epigenetic modification, and it can lead to changes in phenotype although the DNA sequence remains unchanged [6, 7]. DNA methylation may affect neuronal activity, transcriptional output, and synaptic function. Thus, methylation may be important in the pathology of SCZ [8]. Abnormal methylation of some genes, such as DRD2 [9], DLGAP2 [10], or COMT [11], may be associated with the occurrence and development of SCZ. In addition, a few DNA methylation-based classifiers for SCZ diagnosis have been reported [12, 13]. However, these studies were limited because they relied more on statistical associations without in-depth analysis of biological function.

In the present study, we identified common differentially methylated genes (CDMGs) shared across different brain regions in SCZ patients. We used the 10 CDMGs involved in the greatest number of neurological or psychiatric-related biological processes to construct a DNA methylation-based classifier for SCZ diagnosis. This method may be more biologically relevant than previous ones and may provide new insights to guide future research in SCZ.

2. Materials and Methods

2.1. DNA Methylation in SCZ

In the present study, four SCZ methylation datasets (GSE89702, GSE89703, GSE89705, and GSE89706) were downloaded from the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). DNA was isolated from postmortem brain samples. GSE89702 was derived from cerebellum from the Douglas Bell-Canada Brain Bank, and it included 16 SCZ samples and 17 normal controls. GSE89703 was derived from hippocampus, and it included 14 SCZ samples and 13 normal controls. GSE89705 was derived from striatum, and it included 16 SCZ samples and 17 normal controls. GSE89706 was derived from striatum from the London Brain Bank for Neurodegenerative Disorders, and it included 21 SCZ samples and 28 normal controls. The platform was GPL13534, whose annotation lists each probe, its position on the chromosome, and the corresponding gene name. The normalized methylation data matrix consisted of beta values (ranging from 0 to 1), with probe IDs in rows and sample IDs in columns. The workflow of the present study is shown in Figure 1.

2.2. Differential Methylation Analysis

Although GSE89705 and GSE89706 were both taken from the striatum, they were not combined, for two reasons: we did not know whether sample processing differed between the two brain banks, and a published study indicates that combining datasets may result in some residual inflation [14]. Thus, differential methylation analysis was performed separately in the four datasets using the limma package [15] in R software. The sample size in each dataset was relatively small, and thus the significance of differential methylation sites may be relatively low. With rigorous P value filtering, genes with potential biological function might be filtered out. Therefore, in the present study, differences with a P value <0.05 were considered significant.
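The per-site comparison can be illustrated with a small Python sketch. It is not the limma analysis used in the study: limma fits moderated linear models in R, whereas this sketch swaps in a plain two-sided permutation test on the difference in mean beta value, purely to show the idea of testing each probe and keeping those below the 0.05 threshold. The probe IDs, beta values, and function names are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(case, control, n_perm=2000, seed=0):
    """Two-sided permutation P value for the difference in mean beta value."""
    rng = random.Random(seed)
    observed = abs(mean(case) - mean(control))
    pooled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_case]) - mean(pooled[n_case:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

def differential_sites(betas, is_case, alpha=0.05):
    """betas: {probe_id: [beta per sample]}; is_case: [bool per sample]."""
    results = {}
    for probe, values in betas.items():
        case = [v for v, c in zip(values, is_case) if c]
        control = [v for v, c in zip(values, is_case) if not c]
        p = permutation_p_value(case, control)
        if p < alpha:
            direction = "hyper" if mean(case) > mean(control) else "hypo"
            results[probe] = (direction, p)
    return results

# Toy data (made up): 6 SCZ samples followed by 6 controls.
is_case = [True] * 6 + [False] * 6
betas = {
    "probe_A": [0.80, 0.82, 0.79, 0.81, 0.83, 0.80,
                0.20, 0.22, 0.19, 0.21, 0.23, 0.20],
    "probe_B": [0.50, 0.52, 0.48, 0.51, 0.49, 0.50,
                0.50, 0.52, 0.48, 0.51, 0.49, 0.50],
}
sites = differential_sites(betas, is_case)
```

In this toy input, `probe_A` is flagged as hypermethylated and `probe_B` is not flagged. No multiple-testing correction is applied, mirroring the unadjusted threshold described above.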

2.3. Enrichment Analysis

To explore the biological functions of CDMGs that may be related to SCZ, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses were performed using the clusterProfiler package [16] in R. GO terms and KEGG pathways with P values <0.05 were considered significant.
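Over-representation analysis of this kind is commonly based on a hypergeometric test. The Python sketch below shows that calculation on toy data; it is an illustration under stated assumptions, not the clusterProfiler workflow itself. The term names, gene identifiers, and the `enrich` helper are hypothetical, and no multiple-testing adjustment is applied.

```python
from math import comb

def hypergeom_tail(k, M, n, N):
    """P(X >= k): N genes drawn from a background of M, n of which carry the term."""
    total = comb(M, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i)
                     for i in range(k, min(n, N) + 1))
    return favourable / total

def enrich(gene_list, term_to_genes, background, alpha=0.05):
    """Return (term, overlap, P value) for terms over-represented in gene_list."""
    universe = set(background)
    genes = set(gene_list) & universe
    hits = []
    for term, members in term_to_genes.items():
        members = set(members) & universe
        overlap = len(genes & members)
        p = hypergeom_tail(overlap, len(universe), len(members), len(genes))
        if p < alpha:
            hits.append((term, overlap, p))
    return sorted(hits, key=lambda h: h[2])

# Toy data (made up): 100 background genes, two terms of 10 genes each,
# and a 20-gene list that contains 8 members of "term_axon".
background = [f"g{i}" for i in range(100)]
term_to_genes = {
    "term_axon": [f"g{i}" for i in range(10)],
    "term_other": [f"g{i}" for i in range(50, 60)],
}
gene_list = [f"g{i}" for i in range(8)] + [f"g{i}" for i in range(20, 32)]
significant = enrich(gene_list, term_to_genes, background)
```

Only `term_axon` passes the threshold here, because 8 of its 10 members appear in the 20-gene list, while `term_other` has no overlap.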

2.4. Variable Selection and LASSO Classifier

Differentially methylated genes involved in larger numbers of neurological or psychiatric-related biological processes are thought to be more likely associated with SCZ, through their methylation levels, than genes involved in fewer such processes. Differentially methylated genes differ across brain regions, and the differential methylation sites of the same gene may also differ across brain regions. We therefore built brain region-specific methylation-based classifiers for SCZ. The differential methylation sites of the ten CDMGs involved in the most neurological or psychiatric-related biological processes were used to construct a diagnostic model of SCZ. The samples of each of the four datasets were randomly assigned to a training set (75%) and a test set (25%). The four training sets were used to select variables (differential methylation sites) for establishing DNA methylation-based classifiers, and the corresponding test sets were used to validate the four classifiers. The least absolute shrinkage and selection operator (LASSO) [18], as implemented in the glmnet package [17] in R, was used to select variables and construct each DNA methylation-based diagnostic classifier.
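The split-then-select procedure can be sketched in Python as follows. This is a minimal illustration, not the glmnet implementation: it fits an L1-penalised logistic regression by proximal gradient descent, whereas glmnet uses coordinate descent over a path of penalty values. The penalty `lam`, the step size, the iteration count, the toy beta values, and all function names are assumptions made for the example.

```python
import math
import random

def split_indices(n, train_fraction=0.75, seed=0):
    """Randomly assign sample indices to training and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * n)
    return idx[:cut], idx[cut:]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=2000):
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    Features are centred on their training means; the intercept is not
    penalised. Returns (weights, intercept) on the original feature scale.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(Xc, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b - sum(wj * mj for wj, mj in zip(w, means))

def predict_proba(x, w, b):
    """Predicted probability that a sample is SCZ."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Toy data (made up): 10 SCZ (label 1) and 10 controls (label 0).
# Site 0 separates the groups; sites 1 and 2 are uninformative noise.
rng = random.Random(1)
y = [1] * 10 + [0] * 10
X = [[(0.8 if label else 0.2) + rng.uniform(-0.02, 0.02),
      rng.uniform(0.45, 0.55),
      rng.uniform(0.45, 0.55)] for label in y]

train_idx, test_idx = split_indices(len(y))
w, b = fit_lasso_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
selected = [j for j, wj in enumerate(w) if wj != 0.0]
test_pred = [int(predict_proba(X[i], w, b) >= 0.5) for i in test_idx]
```

On this toy input the penalty drives the two noise sites to exactly zero and keeps only site 0, which is the variable-selection behaviour that motivates using LASSO when candidate sites outnumber informative ones.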

2.5. Evaluation of Methylation-Based SCZ Diagnostic Model

The diagnostic performance of the DNA methylation-based SCZ diagnostic model was evaluated by accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) as analyzed in the pROC package [19] in R software.
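For reference, the listed metrics can be computed as in the Python sketch below. It is an illustration on made-up labels and scores, not the pROC implementation; the AUC is obtained from the equivalent rank (Mann-Whitney) formulation, and the 0.5 decision threshold is an assumption.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV, and NPV for binary labels (1 = SCZ).

    Assumes each denominator is non-zero, i.e. both classes are present
    and both classes are predicted at least once.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def auc(y_true, scores):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made up): 4 cases followed by 4 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]

metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Here one case and one control are misclassified at the 0.5 threshold, so all five confusion-matrix metrics equal 0.75, while the AUC is 15/16 = 0.9375 because only one case-control pair is ranked in the wrong order.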

3. Results

3.1. SCZ-Related CDMGs in Different Brain Regions

We assessed DNA methylation characteristics from the four datasets taken from different regions of the brain. Compared with control groups from the respective datasets, we identified 7887 hypermethylated positions and 7484 hypomethylated positions in GSE89702 (Figure 2(a)), 5087 hypermethylated positions and 4227 hypomethylated positions in GSE89703 (Figure 2(b)), 5116 hypermethylated positions and 4275 hypomethylated positions in GSE89705 (Figure 2(c)), and 5569 hypermethylated positions and 4801 hypomethylated positions in GSE89706 (Figure 2(d)). A total of 849 genes were identified as CDMGs shared by all four datasets (Figure 2(e) and Table S1). However, these CDMGs were methylated at different sites and to different degrees in different brain regions.
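The identification of CDMGs amounts to mapping each dataset's differential sites to genes and intersecting the resulting gene sets. A minimal Python sketch, with hypothetical probe IDs, gene names, and site lists, is:

```python
def common_genes(sites_by_dataset, probe_to_gene):
    """Genes with at least one differential site in every dataset."""
    gene_sets = [
        {probe_to_gene[p] for p in sites if p in probe_to_gene}
        for sites in sites_by_dataset.values()
    ]
    return set.intersection(*gene_sets)

# Toy annotation and site lists (made up for illustration).
probe_to_gene = {
    "p1": "GENE_A", "p2": "GENE_A",
    "p3": "GENE_B", "p5": "GENE_B",
    "p4": "GENE_C",
}
sites_by_dataset = {
    "GSE89702": ["p1", "p3"],
    "GSE89703": ["p2", "p4"],
    "GSE89705": ["p1", "p5"],
    "GSE89706": ["p2", "p3", "p4"],
}
cdmgs = common_genes(sites_by_dataset, probe_to_gene)
```

In this toy input only `GENE_A` is shared by all four datasets, and it is reached through different probes (`p1` or `p2`) depending on the dataset, which mirrors the observation that a common gene can be methylated at different sites in different brain regions.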

3.2. CDMGs Involved in Multiple Neurological or Psychiatric-Related Biological Processes and Pathways

GO analysis of the SCZ-related CDMGs revealed these genes were involved in 244 biological processes, 43 cellular components, and 31 molecular functions (Figures 3(a)–3(d)). KEGG pathway analysis showed that the CDMGs were involved in 80 signaling pathways, most significantly in PI3K-Akt, AMPK, and mitogen-activated protein kinase/extracellular signal-regulated kinase (MAPK/ERK), as well as several pathways related to neurological or psychiatric-related biological processes (Figure 3(e)). Notably, the CDMGs were involved in multiple neuronal axon-related biological processes (Table S2); this suggests that the methylation level of CDMGs in neuronal axons may be associated with SCZ.

3.3. DNA Methylation-Based Diagnostic Model for SCZ

The following CDMGs were involved in the greatest numbers of neurological or psychiatric-related biological processes: SHANK3, WNT5A, NLGN1, GLI3, PTPRS, DISC1, SHH, BAIAP2, GLI2, and PAX6. The beta values of the differential methylation sites corresponding to these ten genes were used to construct a methylation template for diagnosing SCZ. Since our four datasets were taken from different regions of the brain, we found that these genes were differentially methylated depending on location, both the brain region and the methylation position on the DNA (Tables 1–4). Therefore, the beta values of the differential methylation sites corresponding to each gene were collected, and the LASSO method was used for variable selection and construction of the DNA methylation-based classifier. The number of differential methylation sites selected by LASSO varied across brain regions (Figure 4). The diagnostic ability of the same CDMG also varied across brain regions, which indicates that the epigenetic dysregulation of SCZ is complicated.
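The gene-ranking step at the start of this subsection, choosing the genes that appear in the most relevant biological processes, can be sketched in Python as below. The term names, gene names, and the alphabetical tie-break are assumptions made for the example; the study does not state how ties were handled.

```python
from collections import Counter

def top_genes(term_to_genes, k=10):
    """Rank genes by the number of terms they belong to and return the top k.

    Ties are broken alphabetically (an assumption made for reproducibility).
    """
    counts = Counter(g for genes in term_to_genes.values() for g in set(genes))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ranked[:k]]

# Toy term membership (made up for illustration).
term_to_genes = {
    "term_1": ["GENE_A", "GENE_B"],
    "term_2": ["GENE_A", "GENE_C"],
    "term_3": ["GENE_A", "GENE_B"],
}
selected_genes = top_genes(term_to_genes, k=2)
```

With this input, `GENE_A` appears in three terms and `GENE_B` in two, so they are the two genes retained.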

3.4. Diagnostic Efficiency of the Composite Model in Training and Validation

To evaluate the diagnostic efficiency of the DNA methylation-based classifiers, receiver operating characteristic curves were analyzed (Figure 5). In GSE89702, GSE89703, and GSE89705, the accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC in the training and test sets were all 1 (Tables 5–7). In GSE89706, the accuracy was 0.950 in the training set and 0.920 in the test set, while the AUC was 0.994 in the training set and 0.943 in the test set (Table 8). The results suggest the DNA methylation-based classifier is a potential biomarker for diagnosing SCZ.

4. Discussion

In recent years, the morbidity and mortality rates of SCZ have increased, and the many health- and social-related problems for patients with SCZ are cause for much concern. However, the need for effective molecular diagnostic methods remains unmet. In particular, diagnostic models that take into account both statistical and biological significance at the molecular level have not received much attention.

In the present study, a total of 849 CDMGs were identified in different brain regions. Functional enrichment analysis indicated the CDMGs were involved in various neurological or psychiatric-related biological processes and pathways, specifically the PI3K-Akt, AMPK, and MAPK signaling pathways. The methylation levels of CDMGs may affect these biological processes and pathways. Our study identified biological processes with confirmed roles in mental diseases, including SCZ [20, 21]. The MAPK/ERK and PI3K/Akt signaling pathways can regulate protuberant growth and protein synthesis related to neural plasticity, and they can assist in the normal development of nerve cells, which may protect against SCZ [22]. The initiation of nerve axon regeneration is regulated by the MAPK pathway, and this initiates a neuronal response [23]. Further studies should explore directly whether the CDMGs in the present study are associated with SCZ.

The beta values of the differential methylation sites corresponding to the 10 CDMGs most often involved in neurological or psychiatric-related biological processes (SHANK3, WNT5A, NLGN1, GLI3, PTPRS, DISC1, SHH, BAIAP2, GLI2, and PAX6) were used to construct a DNA methylation-based, brain region-specific classifier for SCZ. The methylation sites within genes and the degree of methylation varied in different brain regions, suggesting that the methylation patterns of SCZ-related genes are extremely complex. The DISC1 protein regulates the development, maturation, and migration of brain neurons and synaptic signal transmission [24–27], and its disruption can lead to various mental diseases, including SCZ [28, 29]. The process of nerve development and synaptic transmission regulated by DISC1 can be affected by its degree of methylation [30]. SHANK3 knockout may affect neuronal development and induce SCZ [31]. SHANK3 and NLGN1 are also related to the progression of SCZ [32, 33]. However, few reports exist on the association of the genes WNT5A, GLI3, PTPRS, SHH, BAIAP2, GLI2, or PAX6 with SCZ. Our results suggest that the methylation level of these genes may be related to the disease. Indeed, our DNA methylation-based classifier showed strong diagnostic potential based on AUC analysis. It is worth noting that, owing to the characteristics of the LASSO method, the more variables are included, the better the classifier tends to perform. However, from an economic perspective, the more variables (differential methylation sites) are included, the higher the cost. Taking into account both the effectiveness of the model and its cost, we therefore performed feature selection and classifier construction starting from 10 genes rather than fewer or more. In the present results, when 10 genes were included, the LASSO method selected differential methylation sites from 7-8 genes and obtained the best AUC value (close to 1). We may therefore foresee that including fewer genes could greatly reduce classification efficiency, while including more genes may be unnecessary because it would increase costs without improving performance.

Some critical limitations exist in the present study. Due to the small sample size, our DNA methylation-based SCZ diagnostic model needs to be further validated and improved in larger, independent datasets. The potential roles of CDMGs in SCZ need to be explored experimentally.

Despite these limitations, our findings suggest that gene methylation patterns are significantly associated with SCZ and may be a promising diagnostic method. Methylation levels and sites in CDMGs varied widely across different brain regions, and future studies should explore the potential relevance of this variation for SCZ onset and progression.

Data Availability

The data used to support the findings of this study are included within Supplementary Table S1.

Conflicts of Interest

The authors report no conflicts of interest in this work.

Authors’ Contributions

Donghua Zou and Yufen Qiu contributed equally to this work.

Acknowledgments

This study was supported by the Guangxi Natural Science Foundation (2016GXNSFCA380012), the Project of Nanning Scientific Research and Technology Development Plan (20193093), the High-Level Medical Expert Training Program of Guangxi “139” Plan Funding (G201903049) and sponsored by Nanning Excellent Young Scientist Program (RC20190103).

Supplementary Materials

Table S1: common differential methylated genes in four datasets. Table S2: common differential methylated genes involved in neurological or psychiatric-related biological processes.