Background. Pregnancy in systemic lupus erythematosus (SLE) patients is a challenge due to the potential maternal and fetal complications. Therefore, a multidisciplinary assessment of disease risk before and during pregnancy is essential to improve pregnancy outcomes. Objectives. Our purpose was to (i) define clusters of patients with similar history and laboratory features and determine the associated maternal and perinatal outcomes and (ii) evaluate the risk spectrum of maternal and perinatal outcomes of pregnancy in SLE patients, represented by our established risk-assessment chart. Methods. Medical records of 119 patients in China were analyzed retrospectively. Significant variables were selected. A self-organizing map was used to cluster the data based on historical background and laboratory features. Results. Clustering was conducted using 21 maternal and perinatal features. Five clusters were recognized, and their prominent maternal manifestations were as follows: cluster 1 (including 27.73% of all patients): preeclampsia and lupus nephritis; cluster 2 (22.69%): oligohydramnios, uterus scar, and femoral head necrosis; cluster 3 (13.45%): upper respiratory tract infection; cluster 4 (15.97%): premature rupture of membranes; and cluster 5 (20.17%): no problem. Conclusion. Pregnancy outcomes in SLE women fell into three categories, namely high risk, moderate risk, and low risk. Present manifestations, together with the medical records, are a potential means of assessment for better management of pregnant SLE patients.

1. Introduction

Systemic lupus erythematosus (SLE) is an autoimmune disease that is principally found in women, especially those of reproductive age. Pregnancy is considered high risk due to a combination of maternal risks (lupus flare, diabetes, and preeclampsia) and fetal risks (miscarriage, intrauterine fetal demise, preterm birth, intrauterine growth restriction, and congenital heart block) [1–3]. Because both mother and fetus are involved, the patient and the physician always need to know the disease status and its changes and to evaluate pregnancy risks, either before conception during the six-month control period or throughout the pregnancy. This indicates the importance of a planned pregnancy, which can lead to the most favorable maternal-fetal outcomes [4] and is a protective approach against undesired pregnancy complications. For this purpose, counseling and risk assessment of SLE patients should be considered before conception to evaluate the risk of poor pregnancy outcome, discuss contraindications of pregnancy, evaluate organ function, and appropriately modify medication. During and after pregnancy, the obstetrician and rheumatologist should continue to cooperate as a team in the planned visits. The best pregnancy outcome is achieved in patients who are in remission 6 months before conception [2]. Nevertheless, in patients with new-onset disease or disease first diagnosed during pregnancy, it is rather difficult to achieve this goal. Patients at high risk due to a disease flare should postpone pregnancy until the disease is well controlled. With this in mind, the need for risk stratification has always been a major concern of researchers and physicians [5–10]. The unmet medical need that the present study aimed to address is risk stratification for pregnancy outcomes that considers both maternal and perinatal outcomes, which remains a major rheumatological and obstetric challenge.

Researchers have tried different techniques to investigate risk factors for adverse pregnancy outcomes [11] and have examined the values of angiogenic factors in early gestation [12]. Logistic regression has been used for estimating fetal loss in SLE women [13], and the Chi-square test has been adopted to evaluate maternal and fetal outcomes [14]. A meta-analysis of the most recent studies (2017–2019) has investigated maternal and fetal complications associated with SLE to update our knowledge of the present situation [15]. Artificial intelligence has also been implemented in this field to establish clinical decision support systems (CDSS) [9, 16, 17] and also to predict the probability of stillbirth or live birth [9]. Deep learning has been implemented for predicting lupus hospitalization and lupus hospital readmission, estimating extreme preterm birth, and evaluating pathological observations in glomerular lupus nephritis [18–21]. Moreover, clustering has been used for evaluating the spectrum of serum autoantibodies in pediatric-onset SLE [22], investigating the relationship between damage and mortality rate in juvenile-onset lupus [23], characterizing lupus patients' profiles [24, 25], and also studying patients with antiphospholipid syndrome [26–29]. However, the outcome risk of pregnancy in SLE patients has not been studied using neural network clustering. The artificial intelligence approach has proved more powerful than [30–32] or as powerful as statistical methods [33] in dealing with complex medical problems.

In the present study, we also addressed the issue of mixed-type data in clinical studies. Clinical data can be expressed as categorical (nominal and ordinal) and quantitative (continuous or discrete) values [34]. For each of these kinds of data, there are well-established statistical analysis methods, which are convenient for straightforward usage by physicians. However, using two or more of these kinds of data together is not simple. The data in the present study included both categorical (low, normal, and high) and continuous kinds. Therefore, we employed the concepts of “cooccurrence” and “similarity of phenomena” through a particular algorithm to find the continuous equivalent of a categorical item. The two kinds of values can then be compared and appropriately used by any statistical or mathematical model for such tasks as dimension reduction, clustering, and classification. For the reader's convenience, a step-by-step guide to this algorithm is provided in the supplementary materials.

Therefore, in this study, the primary objective was to detect clusters of SLE women based on their historical background and laboratory features. Then, we determined the association of these clusters with the risk of maternal and perinatal outcomes.

2. Patients and Method

This was a single-center retrospective study with cross-sectional analysis. We clustered the patients based on their historical background and laboratory features. After obtaining clusters, the corresponding maternal manifestations and perinatal outcomes of each cluster were identified, and the clusters were characterized. Then, the obtained clusters were divided into three groups, namely the high-, moderate-, and low-risk groups. Finally, to display the disease status, a risk-assessment chart was designed.

2.1. Patients

Medical records of 119 pregnant SLE patients from 2015 to 2019 in Qilu Hospital of Shandong University in Jinan city, Shandong Province of China, were reviewed. The present study was exempt from Institutional Review Board approval (IRB) as it was merely a clinical analysis of pregnant SLE women. It did not include any human experimentation or application of any new drugs or experiments on the tissues of the patients. The patients were all women with a primary or secondary diagnosis of SLE approved by at least two rheumatologists. Herein, 16% of the patients were diagnosed with SLE during pregnancy, and the rest were all diagnosed before pregnancy. The selection criteria of the patients included (i) women with approved pregnancy, (ii) no limitation for the age and disease duration, and (iii) met the American College of Rheumatology (ACR) 1997 revised criteria for SLE [35, 36]. We had no specific criteria for the exclusion of the patients. Information about the neonatal APGAR and neonatal weight as well as the type of delivery was obtained from the delivery room. The pregnant SLE patients included both multipara and nullipara. Although we had no age limit in the selection criteria, the patients were 19 to 43 years old, for whom the gestational age ranged from 35 to 290 days. The average maternal age and gestational age were, respectively, 29.23 years old and 237.65 days. We identified 22 potential variables in this regard; however, one of these variables, CH 50, was not appropriately recorded for all of the patients. Hence, we had to leave it out.

2.2. Data

Three researchers (the first, third, and fourth authors) extracted the variables’ data from the medical records at Qilu Hospital of Shandong University. In the next step, a professional monitor with experience in rheumatologic studies reviewed the extracted data and identified inconsistencies, which were corrected after reconsideration and discussion with other team members. The research team included four experts in the field of Rheumatology and Immunology, two experts in the field of Gynecology and Obstetrics, and two experts in the field of data science and corresponding data analysis of the present study. With regard to pregnancy outcomes, we, respectively, had 15 maternal outcomes and four perinatal outcomes. The maternal outcomes were items - (represented in the next section), and the perinatal outcomes were live birth/fetal loss, term/preterm, newborn’s weight, and first-minute Apgar. The adopted variables included disease duration, mother’s age at pregnancy, gestational age at delivery, type of delivery, and obstetric historical background of the mother (GPAL). Our serological features included antinuclear antibodies (ANA), anti-double-stranded DNA (dsDNA) antibodies, antiphospholipid antibodies (APS), anti-Ro/SSA, and anti-La/SSB. Complement C3 and C4 were also measured. Baseline routine laboratory tests included WBC, RBC, PLT, HB, random urine test, erythrocyte sedimentation rate (ESR), and C-reactive protein (CRP). Our recorded data included both the numerical and categorical types (mixed-type data). While disease duration, mother’s age, urinary protein, gestational age at delivery, newborn weight, and GPAL were numerical data, type of delivery, HB, PLT, RBC, WBC, C3, C4, ESR, ANA, CRP, SSB Ab, SSA Ab, dsDNA, and APS were categorical data. 
The categories were labeled “low,” “normal,” and “high.” Moreover, urinary protein variations fell in the ranges of negative, trace (less than 10 mg/dl), 1+ (30 mg/dl), 2+ (100 mg/dl), 3+ (300 mg/dl), and 4+ (1000 mg/dl) [37, 38]. Besides the new onset of proteinuria, the kidney activity of the patients had been confirmed by the presence of decreasing complement levels, rising anti-double-stranded DNA, and the new onset of hypertension [39]. Preterm labor was defined as “born before 37 complete weeks of gestation” [40], while term labor was defined as “born at a gestational age between 37 and 42 weeks.” The Apgar score was used to assess neonatal vitality, including the items of appearance, pulse, grimace, activity, and respiration, with total scores ranging from 0 to 10. The definition of stillbirth was fetal death at or after 20 weeks’ gestation [41], and miscarriage was defined as loss of pregnancy before 20 weeks of gestation. It should be noted that we encountered some missing values in the extracted data, which is to be expected in such a retrospective study. The missing data were for SSA and SSB (four patients each), APS (seven patients), and dsDNA and ANA (five patients each). For the consistency and quality of the employed data, we completed these data before clustering. Missing values were replaced with the mode of each class for categorical variables and with the mean of each class for numerical variables. This approach is well established and has been reported in previous research studies as well [9, 42, 43].
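
The mode/mean replacement described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the text does not specify what a “class” is, so the sketch assumes it is a grouping label stored in a hypothetical field named `group`, and the records and field names are invented for the example.

```python
from collections import Counter
from statistics import mean

def impute(records, categorical, numerical, group_key=None):
    """Return copies of the records with None values filled in.

    Categorical fields receive the mode and numerical fields the mean,
    computed from the observed values of records in the same group
    (or from all records when group_key is None).
    """
    groups = {}
    for r in records:
        key = r.get(group_key) if group_key else None
        groups.setdefault(key, []).append(r)

    filled = [dict(r) for r in records]  # never mutate the input
    for r in filled:
        members = groups[r.get(group_key) if group_key else None]
        for field in categorical + numerical:
            if r[field] is not None:
                continue
            observed = [m[field] for m in members if m[field] is not None]
            if not observed:
                continue  # nothing to learn from, so the gap stays
            if field in categorical:
                r[field] = Counter(observed).most_common(1)[0][0]
            else:
                r[field] = mean(observed)
    return filled

# Hypothetical records: "group" is an assumed class label, not a study variable.
records = [
    {"group": "A", "dsDNA": "high", "age_years": 30},
    {"group": "A", "dsDNA": "high", "age_years": None},
    {"group": "A", "dsDNA": None, "age_years": 26},
    {"group": "B", "dsDNA": "normal", "age_years": 35},
    {"group": "B", "dsDNA": None, "age_years": None},
]
completed = impute(records, ["dsDNA"], ["age_years"], group_key="group")
```

Computing the statistics from the original records, and not from values already filled in, keeps one imputed value from influencing another.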

2.3. Joint Application of Numerical and Categorical Values

Considering that our data were mixed-type, we implemented, based on the concept of cooccurrence, the algorithm of the “two-step method for clustering mixed numerical and categorical data (TMCM)” [44]. TMCM starts by selecting a particular characteristic (the base attribute) in the categorical data and comparing it with the other characteristics (nonbase attributes). The cooccurrences (similarity) of the base and nonbase attributes were calculated using Equation (1), while the within-group sum of squares ($SS_W$) and the within-group variance ($\sigma_W^2$) were, respectively, calculated using Equations (2) and (3):

$$D_{xy} = \frac{m(x, y)}{m(x) + m(y) - m(x, y)}, \qquad (1)$$

where $D_{xy}$ represents the cooccurrence or similarity between the nonbase item $x$ and the base item $y$, $m(x, y)$ is the number of records in which $x$ and $y$ occur together, and $m(x)$ and $m(y)$, respectively, represent the total number of occurrences of the nonbase and base items.

$$SS_W = \sum_{j=1}^{k} \sum_{i=1}^{n} \left( X_{ij} - \bar{X}_j \right)^2, \qquad (2)$$

$$\sigma_W^2 = \frac{SS_W}{k(n - 1)}, \qquad (3)$$

where $X_1$ to $X_9$ indicate the nine numerical attributes, $X_{ij}$ is the value of attribute $X_j$ for observation $i$, $\bar{X}_j$ is the mean of attribute $X_j$, $n$ stands for the number of observations, i.e., patients, and $k$ is the number of numerical attributes. It was possible to quantify the nonbase categorical items using the following equation:

$$F(x) = \sum_{i=1}^{d} a_i v_i, \qquad (4)$$

where $x$ represents the categorical nonbase item, $F(x)$ is its continuous equivalent, $a_i$ is the similarity between the nonbase item $x$ and the base item $i$, $v_i$ is the quantified value of base item $i$, and $d$ is the number of base items. Table 1 indicates the observed similarity between the base and nonbase variables of the current research. The categorical labels for the different variables are represented in Table 2. The adopted methodology is depicted in Figure 1.
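
The cooccurrence-based quantification of a categorical item can be illustrated with a short sketch. It assumes the Jaccard-style similarity and the similarity-weighted sum described for TMCM in [44]. The choice of ESR as base attribute, maternal age as the mapping numerical attribute, and the four toy records are assumptions made only for this example.

```python
from statistics import mean

def similarity(records, attr_x, item_x, attr_y, item_y):
    """Cooccurrence similarity between item_x of attr_x and item_y of attr_y."""
    m_x = sum(1 for r in records if r[attr_x] == item_x)
    m_y = sum(1 for r in records if r[attr_y] == item_y)
    m_xy = sum(1 for r in records
               if r[attr_x] == item_x and r[attr_y] == item_y)
    union = m_x + m_y - m_xy
    return m_xy / union if union else 0.0

def quantify(records, attr_x, item_x, base_attr, numeric_attr):
    """Continuous equivalent of a nonbase item.

    Each base item is first quantified as the mean of numeric_attr over the
    records carrying it; the nonbase item is then the similarity-weighted
    sum of those base-item values.
    """
    total = 0.0
    for base_item in sorted({r[base_attr] for r in records}):
        values = [r[numeric_attr] for r in records if r[base_attr] == base_item]
        total += similarity(records, attr_x, item_x, base_attr, base_item) * mean(values)
    return total

# Toy data (hypothetical): ESR is the assumed base attribute.
records = [
    {"ESR": "high", "C3": "low", "age_years": 30},
    {"ESR": "high", "C3": "low", "age_years": 34},
    {"ESR": "normal", "C3": "normal", "age_years": 26},
    {"ESR": "normal", "C3": "low", "age_years": 28},
]
c3_low = quantify(records, "C3", "low", "ESR", "age_years")
c3_normal = quantify(records, "C3", "normal", "ESR", "age_years")
```

In this toy data, “C3 = low” cooccurs with “ESR = high” in two of the three records that contain either item, so its similarity to that base item is 2/3. After quantification, the categorical items sit on the same numerical scale as the continuous variables and can enter a clustering model together with them.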

2.4. Adopted Self-Organizing Network

Neural networks’ superior performance in healthcare studies has been reported by many research works [9, 45, 46]. The advent of deep learning in recent years has further highlighted the significance and prospective applications of neural networks in various directions of medical studies. Deep learning and feature engineering techniques have enabled deeper insight into the data that was not easily obtainable earlier [47, 48]. These networks are a simple mathematical formulation of the fast and highly complex learning process that occurs in the human brain [49, 50]. The human information processing system is composed of neurons that switch about a million times more slowly than the logic gates of a computer [51]. The knowledge in an artificial neural network is not stored in a single neuron but across all the neurons and their meaningful (weighted) connections with the neighboring neurons [52]. The current study adopted the clustering capability of neural networks through self-organizing maps (SOM). Clustering is the most typical form of unsupervised learning and is a crucial application of machine learning. However, its potential has not been fully explored in medical studies. Self-organizing maps are used to reduce the dimensionality of the data so that they can be better understood. SOM networks are one of the most important components of artificial neural networks and, like all other machine learning algorithms, are very sensitive to the input values. Therefore, in the present study, appropriate preprocessing of the input variables was required. We first inspected the data and removed the potential outliers. Since the values of different variables had different min-max ranges, we normalized all input variables to prevent these ranges from affecting the results. The normalization was applied as:
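
As an illustration of this preprocessing step, the sketch below applies min-max scaling, one common way to remove the effect of differing ranges. The [0, 1] target range is an assumption, since the scaling range used in the study is not stated here. The endpoint values come from the reported maternal age (19 to 43 years) and gestational age (35 to 290 days), while the middle values are invented for the example.

```python
def min_max_normalize(columns):
    """Rescale each named column of numbers to the interval [0, 1].

    A constant column has no range to rescale, so it is mapped to 0.0.
    """
    scaled = {}
    for name, values in columns.items():
        low, high = min(values), max(values)
        span = high - low
        scaled[name] = [(v - low) / span if span else 0.0 for v in values]
    return scaled

# Illustrative inputs on very different scales (years versus days).
columns = {
    "maternal_age_years": [19, 29, 43],
    "gestational_age_days": [35, 238, 290],
}
normalized = min_max_normalize(columns)
```

After scaling, a difference of a few days in gestational age no longer outweighs a difference of several years in maternal age when distances between patients are computed.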

In the established self-organizing maps of the present study, the optimal network structure and the number of neurons in the hidden layer were selected by trial and error and were confirmed by the researchers’ experience in the field. This is a well-known approach in the application of machine learning algorithms [9, 17, 53, 54]. Herein, different hidden layers using 5, 9, 12, and 15 neurons, respectively, were examined (Figures 2(a)–2(d)). Eventually, the best-performing network had a 1-1-1 structure with one input layer, one hidden layer, and one output layer. It was observed that 12 neurons could satisfactorily capture the information among the input characteristics of the patients. Our SOM was trained in batch mode using the “trainbu” function (learnsomb). In batch training, the network’s weights and biases are updated at the end of an entire pass through the input data [55]. The network performance function was the mean square error. By selecting 12 neurons in the hidden layer, a topology consisting of miniclusters was constructed in which each neuron acted as an independent minicluster center. Then, those miniclusters that were close enough to each other were merged to form a high-level cluster. These final clusters are depicted in Figure 3(a). The idea of subclusters and high-level clusters is explained in detail in the research work of Vesanto and Alhoniemi [56].
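
To make the batch update concrete, the following is a plain-Python sketch of batch SOM training with 12 neurons. It is not the “trainbu”/“learnsomb” implementation used in the study. The 3 × 4 rectangular grid, the Gaussian neighborhood, the shrinking schedule for its width, and the synthetic two-group data are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to sample x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def train_batch_som(data, rows=3, cols=4, epochs=30, sigma0=1.5, seed=0):
    """Batch SOM: weights change only after a full pass through the data."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in grid]  # start from samples
    for epoch in range(epochs):
        # Neighborhood width shrinks over time, with a floor of 0.3.
        sigma = max(sigma0 * (1 - epoch / epochs), 0.3)
        bmus = [best_matching_unit(weights, x) for x in data]  # full pass
        for j, (rj, cj) in enumerate(grid):
            numerator, denominator = [0.0] * dim, 0.0
            for x, b in zip(data, bmus):
                d2 = (grid[b][0] - rj) ** 2 + (grid[b][1] - cj) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                denominator += h
                for k in range(dim):
                    numerator[k] += h * x[k]
            if denominator > 0:
                weights[j] = [v / denominator for v in numerator]
    return weights, grid

# Synthetic, already-normalized data: two well-separated groups of "patients".
data_rng = random.Random(1)
data = [[c + data_rng.uniform(-0.05, 0.05) for _ in range(2)]
        for c in (0.2, 0.8) for _ in range(20)]
weights, grid = train_batch_som(data)
unit_a = best_matching_unit(weights, [0.2, 0.2])
unit_b = best_matching_unit(weights, [0.8, 0.8])
```

Each neuron ends up as the neighborhood-weighted mean of the samples mapped near it, which is why every neuron can be read as the center of a minicluster. Samples from the two synthetic groups are mapped to different neurons.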

3. Results and Cluster Analyses

Our patients were divided into five clusters. Figure 3(a) shows these clusters on the self-organizing map, and Figure 3(b) represents the distance between the adjacent neurons of the SOM.
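
The distance between adjacent neurons can be computed directly from the trained weight vectors, as sketched below. The sketch assumes a rectangular grid with 4-connected neighbors and Euclidean distance. The layout behind Figure 3(b) may differ, and the three weight vectors are invented so that the arithmetic is easy to follow.

```python
import math

def neighbor_distances(weights, rows, cols):
    """Mean Euclidean distance from each neuron to its grid neighbors.

    weights is a flat list in row-major order (neuron r*cols + c).
    Large values mark borders between clusters; small values mark
    neurons that belong to the same high-level cluster.
    """
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    result = []
    for r in range(rows):
        for c in range(cols):
            here = weights[r * cols + c]
            neighbors = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
            total = sum(dist(here, weights[nr * cols + nc])
                        for nr, nc in neighbors)
            result.append(total / len(neighbors))
    return result

# A 1 x 3 map: the first two neurons are close, the third is far away.
example_weights = [[0.0, 0.0], [0.0, 1.0], [0.0, 5.0]]
distances = neighbor_distances(example_weights, rows=1, cols=3)
```

In this toy map, the third neuron is far from its only neighbor, which is the kind of gap that separates one high-level cluster from the next.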

Cluster 1 included 33 patients for whom the maternal manifestations were mainly preeclampsia/eclampsia and lupus nephritis (LN). We had 21 patients with preeclampsia/eclampsia, among whom 8 had eclampsia. The perinatal outcomes included 6 miscarriages (5.8%), 3 of which were with LN, 1 with preeclampsia, and 2 spontaneous abortions without any manifestation. A total of 15 fetal deaths (11.76%) included 4 with preeclampsia, 3 with gestational hypertension, 3 with lupus nephritis, 1 with AP antibody syndrome, 2 with encephalopathy, and 2 with thrombocytopenia. The cause of encephalopathy in the patients was intracranial hypertension (IH). IH is a poorly understood disorder characterized by increased intracranial pressure, usually idiopathic, in the absence of an identifiable lesion. It is a rare manifestation of lupus in pregnancy [57]. The average APGAR was <7, and the average newborn weight was 950 g. Patients in this cluster were at the highest risk. In our proposed risk-assessment chart for predicting the disease status and changes (Figure 4), this cluster is shown in red. It is worth noting that 3 patients had antiphospholipid syndrome (anticardiolipin antibodies); they were in clusters 1 (2 patients) and 2 (1 patient).

Clusters 2 and 3, with 27 and 16 patients, respectively, had better pregnancy outcomes, with only a few cases of the maternal manifestations seen in cluster 1. Oligohydramnios and uterus scar were the prominent maternal manifestations found in these two clusters. Upper respiratory tract infection, placental abruption, encephalopathy, and femoral head necrosis were also observed in a few cases. We identified the risk of these two clusters as “moderate.” The average APGAR was between 9 and 10, and the average newborn weights were 2800 g and 2400 g, respectively. This group is shown in yellow in the risk-prediction chart.

Clusters 4 and 5 were the “low-risk pregnancy” clusters, which included 19 and 24 patients, respectively. Cluster 4 was identified with premature rupture of membranes (PROM) as the prominent maternal manifestation. Cluster 5 was mainly characterized by no damage. Anemia and hypothyroidism were the common manifestations of these two clusters. The average APGAR was almost 10 (9.94 and 9.91, respectively), and the average newborn weights were 2850 g and 2950 g, respectively. This group is shown in green in the risk-prediction chart.
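
The shares of the five clusters and of the three risk groups follow directly from the cluster sizes reported above. The short calculation below reproduces them; only the variable names are introduced here.

```python
# Cluster sizes as reported in the text (119 patients in total).
cluster_sizes = {1: 33, 2: 27, 3: 16, 4: 19, 5: 24}
risk_groups = {"high": [1], "moderate": [2, 3], "low": [4, 5]}

total = sum(cluster_sizes.values())

# Share of all patients in each cluster, in percent.
cluster_pct = {c: round(100 * n / total, 2) for c, n in cluster_sizes.items()}

# Share of all patients in each risk group, in percent.
risk_pct = {
    name: round(100 * sum(cluster_sizes[c] for c in members) / total, 2)
    for name, members in risk_groups.items()
}
```

The moderate-risk and low-risk groups both contain 43 patients (27 + 16 and 19 + 24), which is why they share the same percentage.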

Tables 3 and 4 denote the detailed analysis of the resultant clusters.

It should be noted that Figure 3(b) shows the internal structure of the SOM topology. In this figure, each node contains the data (patients) with the most similar characteristics. The variables of the patients could not be individually interpreted from this figure but the overall differences between any two patients could be inferred. In the self-organizing maps, the neurons are in a competitive structure, which learns to recognize groups of similar input vectors in such a way that neurons physically near each other in the neuron layer respond to similar inputs [55]. This means that for a clustering task using SOMs, one could not use cutpoints to categorize the continuous variables and assign each individual patient to a particular cluster. This is because we are dealing with a high number of variables (more than three) where the human mind is incapable of analyzing all the cutpoints and the entangled relationships of those variables. This is the most important reason for using the dimension reduction technique as in the present study. In SOMs, while a group of similar and dissimilar variables coexists, differentiating between two similar ones is not feasible but differentiating between a similar and dissimilar one is quite accessible. In the context of this study, this means that our model could not discriminate, e.g., between lupus nephritis and preeclampsia, because both manifestations indicate a high-risk pregnancy outcome. This issue is an unmet medical need as well. Meanwhile, our method differentiates among the patients in different clusters based on their input characteristics for risk stratification. If two patients are different in their input data and pregnancy outcomes and fall into two different clusters, then our methods could differentiate between lupus activity and infection; but if two different manifestations are assigned to the same cluster, such a differentiation is not feasible.

4. Discussion

In this retrospective research on pregnant SLE patients, we observed several patterns in the history and laboratory features and analyzed their association with different pregnancy outcomes. Pregnancy outcome in SLE women is often influenced not by a single factor but by several factors to different degrees. Hence, we applied a clustering technique; to the best of our knowledge, this is the first study to conduct such an analysis in pregnant SLE patients. The advantage of identifying these clusters, compared to the identification of poor or good prognostic factors in daily clinical practice, lies in the fact that the clustering method enables us to analyze numerous factors simultaneously through dimensionality reduction. Since these factors have different degrees of influence on pregnancy outcome risk, self-organizing maps could capture the relationship between these factors. Usually, there is no obvious linear relationship between the influencing factors and the pregnancy outcome, which adds to the complexity of the problem [58–62]. Previous studies on pregnant SLE patients have focused only on a very limited number of these factors (e.g., [41, 63, 64]).

Five distinct clusters were identified based on the history and laboratory features, which were then explored for their association with maternal manifestation and perinatal outcomes. We found considerable differences among obtained clusters not only in the proportion of associative maternal manifestations but also in the perinatal outcomes, both of which determine the risk of pregnancy outcome. Tables 5 and 6 represent the detailed analysis of the mother’s and infant’s features in the obtained clusters.

Based on the obtained information, a risk-assessment chart was proposed for evaluating the disease risk (Figure 4). This chart, which is specially designed for pregnancy, includes most of the possible maternal manifestations in pregnant SLE patients and demonstrates their corresponding probable perinatal outcomes. Moreover, to clarify when to use the methodology introduced in the present study, a management protocol is introduced (Figure 5). The detailed analysis of the obtained clusters revealed that cluster 1 was the largest and the only one to include stillbirth and abortion (21 cases, 17.6%), while all the patients in the other clusters had a live birth. In the studied patients, no maternal mortality was observed. In clusters 2 and 3, the most observed fetal outcome was preterm birth (19 cases, 15.9%). Clusters 4 and 5 were the low-risk clusters, and the babies were mostly born at full term (32 cases, 26.8%). All five clusters differed significantly in their predominant maternal manifestations. Patients with preeclampsia/eclampsia and LN mainly represented cluster 1, while those with PROM were mainly in cluster 4. Cluster 5 was the one that included the highest number of no-damage patients. When considering all clusters together, we also observed that pregnant patients without problems were the most common, i.e., 22 patients (18.48%); preeclampsia was the second most common manifestation with 21 patients (17.64%); and lupus nephritis and premature rupture of membranes, each with 17 patients (14.28%), were the third most common manifestations. There was good agreement between the severity of each cluster’s manifestations and the number of live births and fetal deaths, the Apgar score, and the newborn weight. This observation supported the reliability of the clustering results. 
Besides, based on the perinatal outcomes (live birth/fetal death, term/preterm birth, newborn weight, and first-minute APGAR) and maternal manifestations, we found that pregnant SLE patients could have three levels of disease risk, i.e., high risk, moderate risk, and low risk. In our research, 27.73%, 36.13%, and 36.13% of the patients were, respectively, at high risk, moderate risk, and low risk for pregnancy. In total, 39 labors (32.7%) were preterm, and 59 labors (49.5%) were term. Preterm deliveries in the high-risk, moderate-risk, and low-risk groups accounted for 7.5%, 15.9%, and 9.2% of all patients, respectively. The obtained results of the present study were in agreement with those of other researchers, which supports the reliability of the adopted approach. As also follows from these studies, the patients in the high-risk category, with predominantly preeclampsia/eclampsia and LN, are associated with an increased risk of stillbirth and abortion and of cesarean section. Moreover, low birth weight and a low Apgar score within 1 minute were significantly associated with SLE [6, 15, 53]. Therefore, cluster 1, which belonged to the high-risk category, was the best predictor for fetal death. This cluster was also the best predictor for maternal complications, which in detail were gestational hypertension, preeclampsia/eclampsia, lupus nephritis, thrombocytopenia, AP antibody syndrome, encephalopathy, and placental abruption. Patients in the moderate-risk category had a higher chance of preterm delivery, and for the low-risk category, which included no damage and PROM, the expected outcome was a full-term baby. Surprisingly, PROM, which is usually considered an emergency that needs a quick decision, fell in the low-risk category in the present study. The reasons for this phenomenon included the following: (i) We had 17 patients with PROM, of whom only five (29%) had preterm PROM, so the remaining 12 (71%) had term PROM. 
Most of the fetuses born due to PROM were mature at 37–40 weeks without complications. Hence, most of the newborns were term births without fetal death, with an average 1-minute APGAR of 9.82, and the mean newborn weights were 3145 g for PROM and 2570 g for preterm PROM. The existing studies have also indicated that among women with lupus, spontaneous preterm labor has not proved to drive the high preterm birth rate [54]. (ii) In our target study center, induction of labor is favored over expectant management, which may lead to a lower risk of maternal infection. Moreover, previous studies [2] and a single-center case series have reported that preterm PROM is common in SLE pregnancy but is not associated with disease activity.

According to our results, rheumatologists and obstetricians should bear in mind the high-risk group, which shows a high fetal mortality rate in pregnant SLE patients. Previous studies have recommended frequent antenatal visits before 20 gestational weeks in high-risk SLE patients [22]. Moreover, our proposed protocol (Figure 5), which includes screening lupus patients before conception and follow-up visits during pregnancy, would be useful for the proper management of SLE patients. Herein, the information about disease status and changes could be obtained as follows. Physicians could follow the protocol in their visits at the designated times. Then, they could implement our introduced methodology to determine the associated maternal and fetal outcomes and track changes in the disease condition by marking and comparing the patient's present and previous locations on the proposed risk-assessment chart (Figure 4).

In this research, we noticed that cluster 1, which was identified as the high-risk cluster, included two patients with no problems. Moreover, cluster 4, which was recognized as a low-risk cluster, had two preeclampsia and two lupus nephritis patients. This implies that considering only manifestations and analyzing the patients based on them, as conducted in previous research studies, could not produce reliable results. However, using the intelligent technique can reveal hidden aspects of the patients’ data.

5. Conclusion

In conclusion, we put forth a new approach for identifying different patterns of history and laboratory features within pregnant SLE patients. We observed that the patients’ pregnancy outcomes could be high risk, moderate risk, or low risk. Our findings rest on neither manifestations nor laboratory results alone, because neither is individually sufficient to represent the patient’s condition reliably. It is better to incorporate different medical records and optimize management procedures. Therefore, a well-planned pregnancy, risk stratification 6 months before pregnancy, and continuous multidisciplinary assessments during pregnancy are necessary for pregnant SLE patients.

The application of neural networks, especially deep learning and function fitting tools, for predicting the level of SLE disease activity would be suggested as a topic for future studies.

6. Limitations and Future Research

The most notable limitations of the present study are as follows: (a) The adopted methods of imputation and substitution for making up for the missing data introduce bias into the original data and could cause misinterpretation of the results. Therefore, the ratio of the missing data should be considered carefully. (b) Several adopted variables of this retrospective study (such as HB, RBC, WBC, ESR, C3, and C4) had been recorded as categorical items. However, continuous values are preferable because they cover a wider range and are more precise than categorical values. (c) SLE disease activity should be measured and validated by disease activity instruments, such as SLEDAI and BILAG. The methodology introduced in the present study is not capable of producing such results. (d) The authors believe that a greater number of patients would lead to more reliable results.

Data Availability

The developed codes are included in the Appendix (Supplementary materials) of the present manuscript. Other data will be provided upon request.

Ethical Approval

The study did not require ethical approval. This article does not contain any studies with human participants or animals performed by any of the authors. This was a retrospective study in which the data were extracted from the medical records. The current study was exempt from Institutional Review Board approval (IRB) as it was the clinical analysis of the existing data. It did not include any human experimentation or application of any new drugs or experiments on the tissues of the patients.

Conflicts of Interest

The authors declare no conflict of interest.

Authors’ Contributions

Arezou Bikdeli and Dongxia Liu conceptualized the research. Daqing Li supervised the application of the methodology. The technical review and amendment of the methodology were carried out by Hongsheng Sun, Qingrui Yang, and Naser Golsanami. Arezou Bikdeli, Minati Malide, and Meysam Nouri carried out the investigations, interpreted the results, and wrote the original manuscript. The manuscript was reviewed and revised by Naser Golsanami and Dongxia Liu. All authors have read and agreed to the published version of the manuscript.


This research was supported by the National Natural Science Foundation of China (Grant No. 21576206) and the National Natural Science Foundation of Shandong Province of China (Grant No. ZR2022QD080).

Supplementary Materials

S1. Quantifying categorical items. Table S1: cooccurrence matrix of numerical and categorical variables. Figure S1: guideline chart on sorting the variables in the cooccurrence matrix. S2. Further details of the adopted neural network.