Abstract
Esophageal squamous cell carcinoma (ESCC) is among the cancers with the highest incidence and mortality in the world. An effective survival prediction model can help improve patients' survival. Therefore, a parameter-optimized deep belief network (DBN) based on an improved Archimedes optimization algorithm is proposed in this paper for the survival prediction of patients with ESCC. Firstly, a combination of features significantly associated with patient survival is found by the minimum redundancy and maximum relevancy (MRMR) algorithm. Secondly, a DBN is introduced to predict patient survival. Because the DBN model is sensitive to its parameters during construction, this paper uses the Archimedes optimization algorithm (AOA) to optimize the learning rate and batch size of the DBN. To overcome the tendency of AOA to fall into local optima and its low search accuracy, an improved Archimedes optimization algorithm (IAOA) is proposed. On this basis, a survival prediction model for patients with ESCC is constructed. Finally, accuracy comparison tests are carried out on the IAOA-DBN, AOA-DBN, SSA-DBN, PSO-DBN, BES-DBN, IAOA-SVM, and IAOA-BPNN models. The results show that the IAOA-DBN model can effectively predict the five-year survival rate of patients and provide a reference for the clinical judgment of patients with ESCC.
1. Introduction
Cancer is the second leading cause of death in the world and poses a great danger to human health [1, 2]. Approximately 19.29 million new cancer cases and 9.95 million cancer deaths were estimated worldwide for 2021 [3]. Esophageal cancer, which includes esophageal squamous cell carcinoma and esophageal adenocarcinoma, is the sixth most common cause of cancer-related death worldwide [4]. More than 90% of esophageal cancers are esophageal squamous cell carcinoma (ESCC). The pathology of ESCC is complex, and it is often found at an advanced stage, which places a huge burden on the patient’s family [5, 6]. In recent years, the incidence of ESCC has been increasing [7], and the mortality rate is still high [8, 9].
One of the most fundamental difficulties in the treatment of ESCC is the lack of effective methods for predicting survival risk [10, 11]. Currently, with more in-depth research on ESCC and the continuous development of medical technology [12], the use of various types of intelligent systems in esophageal cancer diagnosis is increasing [13]. The treatment methods and treatment concepts for patients with ESCC have continued to improve [14]. However, as with other malignancies, the incidence of ESCC is increasing. Even for professional doctors, it is difficult to judge a patient’s ultimate survival risk [15].
Generally, blood indicators, age, and TNM stage information are considered related to the survival rate of cancer patients, and they are often used to predict the survival status of patients [16–18]. In recent years, with the continuous progress of machine learning technology, more and more intelligent algorithms have been proposed and applied in multiple fields [19–21]. In the medical field, the survival risk of cancer patients has become a popular research topic [22]. A reasonable survival prediction model can help improve the survival of cancer patients. Essentially, cancer patient survival prediction is a classification problem [23], which involves screening the dataset and analyzing the connections between the data. So far, many data mining methods have been proposed in the literature to predict the survival status of esophageal cancer patients [24, 25]. In [26], 90 breast cancer risk miRNAs are predicted based on the proposed DMTN by using the SVM classifier, which obtained an AUC of 0.9633. In [27], a backpropagation artificial neural network is adopted to predict whether postoperative fatigue occurred in patients undergoing gastrointestinal tumor surgery, and the accuracy reached 0.872.
The above approaches based on shallow architectures achieve good performance in cancer prediction problems. However, since the classification accuracy of shallow learning depends largely on the quality of the extracted features, it may cause problems when dealing with more complex applications [28]. In fact, for high-dimensional and complex cancer patient data, a simple traditional shallow architecture is not sufficient [29]. In contrast, a deep learning model has multiple nonlinear network layers, which enable it to extract the features of the original data through the hidden layers step by step and improve the classification and prediction accuracy of the model [30, 31]. Therefore, a network structure with deeper layers is preferred.
Deep learning is a new direction in the field of machine learning that models high-level abstractions in input data with hierarchies and multiple layers [32, 33]. By building an artificial neural network with a layered hierarchy, multiple layers gradually extract higher-level features from the original input for learning. Different types of deep neural networks for classification prediction have been used in the literature [34–36]. DBN is a probabilistic generative network, which is considered more suitable for cancer classification prediction with high feature similarity and complexity [37]. However, in the process of building a DBN, improper parameter settings lead to model instability and poor classification accuracy. Often, the selection of parameters still relies on manual tuning based on expert experience. To address these problems, a cancer patient survival prediction model that uses the improved Archimedes optimization algorithm (IAOA) to optimize DBN parameters is proposed.
In this paper, seventeen blood indicators, age, and TNM staging information of 298 patients with ESCC are studied. Firstly, the clinical data of cancer patients are selected by the minimum redundancy and maximum relevancy algorithm, and the feature indexes are sorted according to their importance. A combination of eleven indicators that is significantly associated with patient survival is selected, which is verified by the Cox regression method in the SPSS software. Secondly, the IAOA is introduced to optimize the parameters in the DBN training process to improve the stability and classification accuracy of the DBN model. Finally, a survival prediction model of patients with ESCC based on IAOA-DBN is established. The above eleven related indicators are used as inputs, and the five-year survival rate of the patient is used as output. The prediction accuracy of IAOA-DBN is better than that of the existing AOA-DBN, SSA-DBN, PSO-DBN, BES-DBN, IAOA-SVM, and IAOA-BPNN. Therefore, the method for survival diagnosis of patients with ESCC proposed in this paper can accurately predict the survival level of patients. The main contributions of this article can be summarized as follows:
(1) A combination of eleven indicators is found based on minimum redundancy and maximum relevancy feature selection, which is verified to be significantly associated with survival in patients with ESCC.
(2) The proposed method uses IAOA to optimize the parameters of the DBN, which effectively improves the stability and classification accuracy of the DBN. The tendency of AOA to fall into local optima and its low convergence accuracy are effectively mitigated by the IAOA. Through the establishment of the IAOA-DBN model, the five-year survival rate of patients with ESCC is effectively predicted.
The rest of this paper is organized as follows. In Section 2, the original data are analyzed, and a combination of multiple indicators that is significantly related to patient survival is found based on the minimum redundancy and maximum relevancy algorithm. An improved Archimedes algorithm is proposed in Section 3, which can effectively improve the optimization accuracy and stability of AOA. In Section 4, a survival prediction model based on IAOA-DBN is proposed, which can effectively predict the five-year survival rate of patients with ESCC. In Section 5, the conclusions of this article are presented.
2. Dataset Analysis
2.1. Data Introduction
The clinical data of 298 patients with ESCC used in this article are from patients who were treated in the First Affiliated Hospital of Zhengzhou University from January 2007 to December 2018. The clinical information includes seventeen blood indicators, age, and TNM staging information. The seventeen blood indicators are basophil count (BASO), eosinophil count (EO), fibrinogen (FIB), platelet count (PLT), albumin (ALB), hemoglobin concentration (HGB), white blood cell count (WBC), monocyte count (MONO), activated partial thromboplastin time (APTT), globulin (GLOB), red blood cell count (RBC), prothrombin time (PT), lymphocyte count (LYMPH), neutrophil count (NEUT), total protein (TP), international normalized ratio (INR), and thrombin time (TT). The population proportion information of the dataset is shown in Table 1. Information on the seventeen blood indicators is shown in Table 2.
Among all patients, 147 survived more than five years and 151 survived less than five years, so the two classes are nearly balanced. Patient ages ranged from 38 to 82 years; 190 patients were male and 108 were female. In addition, the selected patients were required to have complete treatment records and to have been followed up for more than six months.
2.2. Minimum Redundancy and Maximum Relevancy Algorithm
The minimum redundancy and maximum relevancy (MRMR) algorithm [38] is a typical feature selection method. The purpose of MRMR is to select the features with the minimal redundancy among themselves and the maximal relevance to the class label. The relevance between features and class labels is represented by mutual information. The mutual information is calculated as Equation (1).

$$I(x; y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy \tag{1}$$

where $x$ and $y$ are two given random variables, $p(x, y)$ is the joint probability density function of $x$ and $y$, and $p(x)$ and $p(y)$ are the probability density functions of $x$ and $y$, respectively. The maximum relevancy and minimum redundancy are calculated as follows, respectively.

$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c) \tag{2}$$

$$\min R(S), \quad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j) \tag{3}$$

where $S$ and $|S|$ are the feature subset and the number of features contained therein, respectively, $c$ is the class label, $I(x_i; c)$ is the mutual information between feature $x_i$ and class label $c$, $I(x_i; x_j)$ is the mutual information between feature $x_i$ and feature $x_j$, $D$ is the mean mutual information between each feature in the feature set $S$ and the class label $c$, indicating the relevance between the feature set and the corresponding class label, and $R$ is the size of the mutual information between the features in the feature set $S$, which represents the redundancy between the features.

The goal of the MRMR algorithm is to maximize the classification performance of the selected feature subset while minimizing the feature dimension. Therefore, it is required that the relevance between the feature subset and the label is the largest, and the redundancy between the features is the least. The minimum redundancy and maximum relevancy criterion is constructed as follows.

$$\max \Phi(D, R), \quad \Phi = D - R \tag{4}$$
The main process of minimum redundancy and maximum relevancy (MRMR) algorithm is as follows.
2.2.1. Step 1: The First Feature Is Selected
The mutual information between all candidate variables and target variables in the clinical data of esophageal cancer patients is calculated. The feature variable with the largest mutual information is the first feature variable selected.
2.2.2. Step 2: The Second Feature Is Selected
The redundancy between the selected first feature and the other features is calculated. The feature variable with the least redundancy is the second feature variable.
2.2.3. Step 3: Sequential Selection of Other Features
Based on the selected two feature variables, the selection of the next feature variable is required to make the selected feature subset have the largest relevance with the target variable and the least redundancy with the selected feature. Therefore, it is necessary to satisfy the minimum redundancy and maximum relevancy criterion of Equation (4). Repeat the calculation of the criteria shown in Equation (4), and add the variables that meet the requirements to the selected feature subset in turn. When the number of selected features meets the requirements, the algorithm ends.
In order to clearly express the MRMR process, the framework of MRMR is shown in Algorithm 1.
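The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this paper: it assumes the features have already been discretized, it uses the difference form (relevance minus mean redundancy) of the criterion at every step after the first, and the names `mutual_information` and `mrmr_rank` as well as the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_rank(features, labels, k):
    """Greedily rank k features by relevance minus mean redundancy.

    `features` maps a feature name to its list of discretized values.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    # Step 1: start from the feature most relevant to the label.
    selected = [max(relevance, key=relevance.get)]
    # Steps 2 and 3: repeatedly add the feature with the best trade-off.
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            redundancy = sum(mutual_information(col, features[s])
                             for s in selected) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data: "b" duplicates "a"; "c" is less relevant but adds new information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy = {
    "a": [0, 0, 0, 0, 1, 1, 1, 0],
    "b": [0, 0, 0, 0, 1, 1, 1, 0],
    "c": [0, 0, 0, 1, 1, 1, 0, 1],
}
ranking = mrmr_rank(toy, labels, k=3)
print(ranking)  # ['a', 'c', 'b']
```

In the toy example the duplicate feature is ranked last even though it is as relevant as the first one, which is the behavior the redundancy term is meant to produce.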

2.3. Selection of Optimal Subset Combinations
The patients’ 17 blood indicators, age, and TNM staging information are used as input and the five-year survival status as output. The patients’ indicators are reordered according to their importance by the MRMR method. The reordered dataset is fed into a BP neural network [39], and the classification accuracy of each feature combination is verified by tenfold cross-validation. Among the combinations that achieve the highest classification accuracy, the one with the smallest number of features is the optimal feature combination. The result is shown in Figure 1. The highest classification accuracy is achieved when the number of features is eleven. Therefore, the features selected in this paper are the first eleven features: TNM stage, BASO, Age, PT, FIB, LYMPH, RBC, TT, PLT, T stage, and GLOB.
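The selection rule described above (highest accuracy first, then fewest features) can be written as a short helper. The function name and the accuracy values below are placeholders chosen for illustration only; they are not results from this paper.

```python
def smallest_best_subset_size(cv_accuracy_by_size):
    """Return the smallest k whose top-k feature subset attains the best accuracy.

    cv_accuracy_by_size[k - 1] holds the cross-validated accuracy obtained
    with the first k features of the MRMR ranking.
    """
    best = max(cv_accuracy_by_size)
    # list.index returns the first match, i.e. the smallest subset size.
    return cv_accuracy_by_size.index(best) + 1


# Placeholder accuracies for k = 1..6 (made-up numbers, not results from the paper).
placeholder = [0.60, 0.71, 0.78, 0.83, 0.83, 0.80]
k_star = smallest_best_subset_size(placeholder)
print(k_star)  # 4
```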
2.4. Cox Regression Analysis to Verify the Correlation of Indicators
Cox regression models [40] are widely used in the medical field to analyze the effects of multiple variables on survival status and survival time. In this section, Cox regression models are used to further validate the correlation of the selected features with the 5-year survival status and survival time of patients with ESCC. The SPSS 26.0 statistical software is used to build the Cox model. The survival time and survival outcome of patients with ESCC are used as dependent variables. The above eleven indicators are independent variables. The survival function at the mean of the covariates is shown in Figure 2. The results show that the P value of the overall score of the eleven indicators is much less than 0.05. The combination of these eleven indicators is therefore significantly related to the survival rate of patients.
3. Improving the Archimedes Optimization Algorithm
3.1. Basic Archimedes Optimization Algorithm
The Archimedes optimization algorithm (AOA) [41] is a metaheuristic algorithm proposed in 2020. In this algorithm, the population individuals are objects immersed in a fluid, and the population position is updated by adjusting the density, volume, and acceleration of the objects. According to whether the objects collide in the liquid, AOA is divided into a global exploration stage and a local exploitation stage. If the objects collide, the global exploration phase is performed. Otherwise, the local exploitation phase is performed.
3.1.1. Initial Stage
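In the initialization phase, AOA randomly initializes the density ($den_i$), volume ($vol_i$), and acceleration ($acc_i$) of the individuals in the population. The current optimal individual ($x_{best}$), optimal density ($den_{best}$), optimal volume ($vol_{best}$), and optimal acceleration ($acc_{best}$) are selected. In the AOA, the individual density, volume, and transfer factor TF are calculated as Equations (5)–(7), respectively.

$$den_i^{t+1} = den_i^{t} + rand \times \left(den_{best} - den_i^{t}\right) \tag{5}$$

$$vol_i^{t+1} = vol_i^{t} + rand \times \left(vol_{best} - vol_i^{t}\right) \tag{6}$$

where $rand$ is a random number between (0,1). $den_i^{t}$ and $den_i^{t+1}$ are the densities of the individual $i$ for the generation $t$ and the generation $t+1$, respectively. $vol_i^{t}$ and $vol_i^{t+1}$ are the volumes of the individual $i$ in the generation $t$ and the generation $t+1$, respectively.

$$TF = \exp\left(\frac{t - t_{max}}{t_{max}}\right) \tag{7}$$

where $t$ is the current iteration number and $t_{max}$ is the maximum iteration number.

When $TF \le 0.5$, AOA performs a global search, and the update of the individual acceleration is calculated as follows.

$$acc_i^{t+1} = \frac{den_{mr} + vol_{mr} \times acc_{mr}}{den_i^{t+1} \times vol_i^{t+1}} \tag{8}$$

where $den_{mr}$, $vol_{mr}$, and $acc_{mr}$ are the density, volume, and acceleration of a randomly selected individual.

When $TF > 0.5$, AOA performs local exploitation, and the individual acceleration is updated as follows:

$$acc_i^{t+1} = \frac{den_{best} + vol_{best} \times acc_{best}}{den_i^{t+1} \times vol_i^{t+1}} \tag{9}$$

The acceleration of the individual is normalized to obtain Equation (10).

$$acc_{i,norm}^{t+1} = u \times \frac{acc_i^{t+1} - \min(acc)}{\max(acc) - \min(acc)} + l \tag{10}$$

where $acc_{i,norm}^{t+1}$ is the normalized acceleration of the individual $i$ in the generation $t+1$, and $u$ and $l$ are the parameters for adjusting the normalization range.

During the global search phase, the individual positions are updated by Equation (11).

$$x_i^{t+1} = x_i^{t} + C_1 \times rand \times acc_{i,norm}^{t+1} \times d \times \left(x_{rand} - x_i^{t}\right) \tag{11}$$

where $x_i^{t}$ and $x_i^{t+1}$ are the positions of individual $i$ in the generations $t$ and $t+1$, and $x_{rand}$ is the position of a random individual in the generation $t$. $rand$ is a random number. $C_1$ is a fixed constant. $d$ is the density factor, which is calculated as follows.

$$d = \exp\left(\frac{t_{max} - t}{t_{max}}\right) - \frac{t}{t_{max}} \tag{12}$$

During the local exploitation stage, the individual position is updated by Equation (13).

$$x_i^{t+1} = x_{best}^{t} + F \times C_2 \times rand \times acc_{i,norm}^{t+1} \times d \times \left(T \times x_{best} - x_i^{t}\right) \tag{13}$$

where $C_2$ is a fixed constant and $F$ is the direction factor that determines the update direction of the individual position, which is constructed as follows:

$$F = \begin{cases} +1, & P \le 0.5 \\ -1, & P > 0.5 \end{cases} \tag{14}$$

where $T = C_3 \times TF$ and $C_3$ is a fixed constant, $P = 2 \times rand - C_4$, and $C_4$ is a fixed constant.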
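The update rules of the basic AOA described in this subsection can be illustrated with the following Python sketch. It is a simplified reading of the algorithm, not the code used in the experiments. Several points are assumptions of the sketch: the constant values (`c1=2`, `c2=6`, `c3=2`, `c4=0.5`, `u=0.9`, `l=0.1`), the clipping of candidates to the search box, the greedy acceptance of a candidate only when it improves the individual, and the function name `aoa_minimize`.

```python
import math
import random


def aoa_minimize(f, lb, ub, dim, n=20, t_max=200,
                 c1=2.0, c2=6.0, c3=2.0, c4=0.5, u=0.9, l=0.1, seed=0):
    """Minimize f over the box [lb, ub]^dim; returns (best_x, best_f, history)."""
    rng = random.Random(seed)

    def rand_vec(lo, hi):
        return [lo + rng.random() * (hi - lo) for _ in range(dim)]

    x = [rand_vec(lb, ub) for _ in range(n)]        # positions
    den = [rand_vec(0.0, 1.0) for _ in range(n)]    # densities
    vol = [rand_vec(0.0, 1.0) for _ in range(n)]    # volumes
    acc = [rand_vec(lb, ub) for _ in range(n)]      # accelerations
    fit = [f(xi) for xi in x]
    b = min(range(n), key=lambda i: fit[i])         # index of the best individual
    history = [fit[b]]

    for t in range(1, t_max + 1):
        tf = math.exp((t - t_max) / t_max)               # transfer factor TF
        d = math.exp((t_max - t) / t_max) - t / t_max    # density factor
        explore = tf <= 0.5
        acc_new = []
        for i in range(n):
            for j in range(dim):                         # density and volume updates
                den[i][j] += rng.random() * (den[b][j] - den[i][j])
                vol[i][j] += rng.random() * (vol[b][j] - vol[i][j])
            s = rng.randrange(n) if explore else b       # random or best source
            acc_new.append([(den[s][j] + vol[s][j] * acc[s][j])
                            / (den[i][j] * vol[i][j] + 1e-12) for j in range(dim)])
        flat = [a for row in acc_new for a in row]
        lo, span = min(flat), (max(flat) - min(flat)) or 1.0
        for i in range(n):
            r = rng.randrange(n)
            rnd = rng.random()
            flag = 1.0 if 2.0 * rng.random() - c4 <= 0.5 else -1.0  # direction F
            cand = []
            for j in range(dim):
                a = u * (acc_new[i][j] - lo) / span + l  # normalized acceleration
                if explore:                              # global search step
                    v = x[i][j] + c1 * rnd * a * d * (x[r][j] - x[i][j])
                else:                                    # local exploitation step
                    v = x[b][j] + flag * c2 * rnd * a * d * (c3 * tf * x[b][j] - x[i][j])
                cand.append(min(max(v, lb), ub))         # keep inside the box
            fc = f(cand)
            if fc < fit[i]:                              # greedy acceptance (assumption)
                x[i], fit[i], acc[i] = cand, fc, acc_new[i]
        b = min(range(n), key=lambda i: fit[i])
        history.append(fit[b])
    return x[b][:], fit[b], history


def sphere(v):
    return sum(c * c for c in v)


best_x, best_f, history = aoa_minimize(sphere, -10.0, 10.0, dim=3, n=10, t_max=50, seed=1)
print(best_f)
```

Because a candidate replaces an individual only when it is better, the recorded best fitness in `history` never increases from one iteration to the next.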
3.2. Improved Archimedean Optimization Algorithm
In the basic AOA, the update of the optimal individual of the population depends on the update of the population in each iteration. After each iteration, the optimal individual is replaced by the individual with the best fitness, and the algorithm does not actively perturb the optimal individual. When the optimal individual of the population falls into a local extremum, the algorithm falls into the local optimum, and premature convergence occurs [42]. Therefore, this paper introduces corresponding improvement strategies to address the defects of the basic AOA. Firstly, Sine chaos mapping and reverse learning strategies are used to initialize the population, which can enhance the population diversity and improve the solving efficiency. Secondly, Gaussian mutation and superior selection strategies are used to perturb the position of the optimal individual, which can enhance the global search ability and help the population to jump out of the local optimum. In this paper, the improved AOA is called IAOA. The specific strategies are as follows.
3.2.1. Sine Chaos Reverse Learning Initialization Strategy
The population of AOA is initialized by random generation. This leads to an uneven distribution of individuals in the initial population, which affects the later iterative optimization. The Sine chaotic model [43] is a chaotic model with good randomness and ergodicity and with an infinite number of map foldings. Reverse learning [44] obtains the reverse solution corresponding to the current solution. A better initial solution can be obtained by comparing the two and selecting the better one. In this paper, the Sine chaotic strategy is first used to generate an initial population with better diversity. Second, the reverse population is generated according to reverse learning. Finally, the fitness of the obtained populations is calculated, and the solutions with lower (better) fitness are selected as the initial population to improve the probability of obtaining the optimal initial solution. The 1-dimensional mapping expression of Sine chaos is calculated as follows.

$$x_{k+1} = \sin\left(\frac{2}{x_k}\right), \quad k = 0, 1, \ldots$$

where $-1 \le x_k \le 1$ and $x_k \ne 0$.

The population $X = \{X_i, i = 1, 2, \ldots, N\}$, $X_i = \{X_{ij}, j = 1, 2, \ldots, D\}$ is obtained by mapping the Sine chaos into the solution space. The population individuals are represented as follows.

$$X_i = \left[X_{i1}, X_{i2}, \ldots, X_{iD}\right]$$

where $X_{ij}$ is the $j$-th dimensional value of the $i$-th individual of the population.

The reverse population can be represented as $OX = \{OX_i, i = 1, 2, \ldots, N\}$, $OX_i = \{OX_{ij}, j = 1, 2, \ldots, D\}$. The reverse population individual can be calculated by the following.

$$OX_{ij} = X_{\min, j} + X_{\max, j} - X_{ij}$$

where $[X_{\min, j}, X_{\max, j}]$ is the dynamic boundary of the population search.

The new population is formed by the Sine chaotic population $X$ and the reverse population $OX$. The fitness values of the new population are ranked, and the $N$ individuals with the best fitness values are selected to form the initial population.
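The initialization strategy of this subsection can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the Sine map is taken in the form x_{k+1} = sin(2 / x_k), chaotic values in [-1, 1] are mapped linearly onto the search interval, the fixed search bounds are used in place of a dynamic boundary, and a lower fitness value is treated as better. The function names are hypothetical.

```python
import math
import random


def sine_chaos_sequence(length, x0):
    """Iterate x_{k+1} = sin(2 / x_k); x0 must be nonzero and in [-1, 1]."""
    seq, x = [], x0
    for _ in range(length):
        x = math.sin(2.0 / x)
        if abs(x) < 1e-9:      # the map is undefined at 0, so nudge away from it
            x = 1e-9
        seq.append(x)
    return seq


def init_population(f, lb, ub, n, seed=0):
    """Build n chaotic individuals, add their reverse solutions, keep the n best."""
    rng = random.Random(seed)
    dim = len(lb)
    chaotic = []
    for _ in range(n):
        seq = sine_chaos_sequence(dim, rng.uniform(0.1, 1.0))
        # Linear map from [-1, 1] to [lb_j, ub_j] (mapping rule is an assumption).
        chaotic.append([lb[j] + (seq[j] + 1.0) / 2.0 * (ub[j] - lb[j])
                        for j in range(dim)])
    # Reverse solution: reflect each coordinate inside its interval.
    reverse = [[lb[j] + ub[j] - xi[j] for j in range(dim)] for xi in chaotic]
    merged = chaotic + reverse
    merged.sort(key=f)         # ascending fitness, minimization assumed
    return merged[:n]


def sphere(v):
    return sum(c * c for c in v)


population = init_population(sphere, [-5.0, -5.0], [5.0, 5.0], n=6, seed=3)
print(len(population))  # 6
```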
3.2.2. Gaussian Operator and Superior Selection Strategy
The Gaussian operator [45, 46] is introduced in this paper in order to prevent AOA from falling into a local optimum and to maintain the diversity of individuals in the population. The current optimal solution is subjected to Gaussian mutation with a certain probability $p$, and a superior selection strategy is applied. The expression of the Gaussian mutation operator is calculated as follows:

$$mutation(x) = x \left(1 + N(0, 1)\right)$$

where $mutation(x)$ denotes the individual position after mutation and $N(0, 1)$ is a random variable satisfying a Gaussian distribution. The global optimal solution position is updated as follows.

$$x_{best} = \begin{cases} mutation(x), & rand < p \text{ and } f\left(mutation(x)\right) < f\left(x_{best}\right) \\ x_{best}, & \text{otherwise} \end{cases}$$

where $rand$ is a random variable between $[0, 1]$, $p$ is the probability of superior selection, and $f$ is the individual fitness value. Therefore, mutation operations on the global optimal solution can prevent the algorithm from falling into a local optimum and effectively improve the search efficiency.
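A minimal Python sketch of this perturbation step is given below. It assumes a minimization problem, draws an independent Gaussian value for each dimension, and keeps the mutated position only when it has a strictly better fitness; the function name and the default probability `p=0.5` are illustrative choices, not values taken from this paper.

```python
import random


def gaussian_mutate_best(f, best, p=0.5, rng=random):
    """Perturb the best position with probability p and keep it only if it improves."""
    if rng.random() >= p:
        return best                     # no mutation this time
    # Multiplicative Gaussian perturbation: x * (1 + N(0, 1)) per dimension.
    mutant = [v * (1.0 + rng.gauss(0.0, 1.0)) for v in best]
    return mutant if f(mutant) < f(best) else best


def sphere(v):
    return sum(c * c for c in v)


rng = random.Random(0)
current_best = [1.0, -2.0]
updated = gaussian_mutate_best(sphere, current_best, p=1.0, rng=rng)
print(sphere(updated) <= sphere(current_best))  # True
```

Since the mutated position is accepted only when it improves the fitness, the step can never make the stored optimum worse.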
In order to clearly express the IAOA process, the framework of IAOA is shown in Algorithm 2.

3.2.3. IAOA Validation and Comparison
In order to fully verify the effectiveness of the IAOA proposed in this paper, the improved Archimedes optimization algorithm, the Archimedes optimization algorithm, the sparrow search algorithm [47], and the bald eagle search algorithm [48] are compared on thirteen benchmark functions under the same conditions. The selected benchmark functions are classified into three categories. The first category is the single-peak benchmark functions, shown as F1–F5 in Table 3. The second category is the multipeak benchmark functions, shown as F6–F10 in Table 3. The third category is the multimodal benchmark functions with fixed dimension, shown as F11–F13 in Table 3. The basic parameters of the algorithms are as follows: the population size is 30, and the maximum number of iterations is 500. The other parameters of the algorithms are shown in Table 4. The experimental results are presented in Tables 5 and 6. The optimization ability of an algorithm is reflected by the optimal value and the average value, and the stability of an algorithm is reflected by the standard deviation. Firstly, for the five single-peak functions, IAOA has higher convergence accuracy and stability than the other algorithms. Secondly, for the multipeak functions, the theoretical optimum is reached on F6 and F8. For the other multipeak functions, IAOA has the best search accuracy and stability. For the fixed-dimension functions, IAOA is also better than the other algorithms. Therefore, the improvement strategy proposed in this paper improves the performance of the algorithm to some extent.
4. Survival Prediction Model of Patients with ESCC Based on IAOA-DBN
4.1. An Overview of DBN
The deep belief network (DBN) is a probabilistic generative network. It is composed of a stack of restricted Boltzmann machines (RBMs) and a backpropagation (BP) neural network [49]. The learning process of DBN can be divided into pretraining and fine-tuning. During the pretraining process, each RBM is trained individually in turn by an unsupervised learning algorithm, and the network parameters of each layer are gradually adjusted. In the fine-tuning process, the classification labels are used as the output layer of the DBN. The BP neural network is trained from top to bottom, and the training error is propagated back to the RBMs to fine-tune the parameters of all layers and approach the globally optimal parameters of the DBN. The structure of DBN is shown in Figure 3.
4.1.1. Pretraining of RBM
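The restricted Boltzmann machine (RBM) is a neural perceptron consisting of a visible layer ($v$) and a hidden layer ($h$). Its structure is shown in Figure 4. There are bidirectional connections between the visible and hidden layers, while there is no connection between units in the same layer. In RBM, there is a weight $w_{ij}$ between any two connected neurons in the visible layer and the hidden layer to represent the connection strength. Each neuron has a bias coefficient $a_i$ (for the neurons in the visible layer) or $b_j$ (for the neurons in the hidden layer) to represent its own weight. Therefore, the energy function contained in each RBM is calculated as follows:

$$E(v, h \mid \theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i w_{ij} h_j$$

where $\theta = \{w_{ij}, a_i, b_j\}$ represents the parameter set of RBM, $v_i$ and $a_i$ are the state and bias of the visible layer, $h_j$ and $b_j$ are the state and bias of the hidden layer, $w_{ij}$ is the connection weight between the visible layer node $i$ and the hidden layer node $j$, and $n$ and $m$ represent the number of neurons in the visible layer and the hidden layer, respectively.

According to the energy function of the RBM, the joint distribution of the visible layer and the hidden layer is calculated as follows.

$$P(v, h \mid \theta) = \frac{e^{-E(v, h \mid \theta)}}{Z(\theta)}$$

where $Z(\theta) = \sum_{v, h} e^{-E(v, h \mid \theta)}$, called the normalization factor.

The independent probability distribution of the visible layer is calculated as follows.

$$P(v \mid \theta) = \frac{1}{Z(\theta)} \sum_{h} e^{-E(v, h \mid \theta)}$$

There is no connection between nodes in the same layer in the RBM, so the conditional probability distribution of each neuron in the visible layer and the hidden layer is as follows:

$$P(h_j = 1 \mid v, \theta) = \sigma\left(b_j + \sum_{i=1}^{n} v_i w_{ij}\right)$$

$$P(v_i = 1 \mid h, \theta) = \sigma\left(a_i + \sum_{j=1}^{m} w_{ij} h_j\right)$$

where $\sigma(x) = 1 / (1 + e^{-x})$ is the sigmoid function.

The goal of RBM training is to make the Gibbs distribution represented by the RBM network as close as possible to the distribution of the original data so that $P(v \mid \theta)$ is maximized.

The network structure parameters of the RBM can be obtained using the maximum likelihood estimation method, and the parameter set can be updated by the contrastive divergence method, as expressed by the following.

$$\frac{\partial \ln P(v \mid \theta)}{\partial \theta} = \left\langle \frac{\partial \left(-E(v, h \mid \theta)\right)}{\partial \theta} \right\rangle_{P(h \mid v, \theta)} - \left\langle \frac{\partial \left(-E(v, h \mid \theta)\right)}{\partial \theta} \right\rangle_{P(v, h \mid \theta)}$$

where $\langle \cdot \rangle_{P}$ represents the expected value of the partial derivative under the distribution of $P$.

The model parameter update method is as follows:

$$\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{recon} \right)$$

$$\Delta a_i = \varepsilon \left( \langle v_i \rangle_{data} - \langle v_i \rangle_{recon} \right)$$

$$\Delta b_j = \varepsilon \left( \langle h_j \rangle_{data} - \langle h_j \rangle_{recon} \right)$$

where $\langle \cdot \rangle_{data}$ represents the expectation defined by the current RBM model, $\langle \cdot \rangle_{recon}$ represents the expectation defined by the reconstructed RBM model, $\varepsilon$ represents the learning rate, and the expectations are averaged over mini-batches whose size is the batch size.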
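To make the roles of the learning rate and the batch size concrete, the following Python sketch performs one contrastive divergence step with a single Gibbs step (CD-1) on a mini-batch of binary visible vectors. It is an illustration under stated assumptions rather than the training code of this paper: hidden and reconstructed probabilities are used in the statistics, the gradient is averaged over the mini-batch, and the function name `cd1_update` and the zero initialization are chosen only for the demonstration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cd1_update(batch, w, a, b, lr, rng):
    """One CD-1 step on a mini-batch; updates weights w and biases a, b in place."""
    n, m = len(a), len(b)
    dw = [[0.0] * m for _ in range(n)]
    da, db = [0.0] * n, [0.0] * m
    for v0 in batch:
        # Positive phase: hidden probabilities given the data vector.
        ph0 = [sigmoid(b[j] + sum(v0[i] * w[i][j] for i in range(n))) for j in range(m)]
        h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]
        # Negative phase: reconstruct the visible layer, then the hidden layer.
        pv1 = [sigmoid(a[i] + sum(w[i][j] * h0[j] for j in range(m))) for i in range(n)]
        ph1 = [sigmoid(b[j] + sum(pv1[i] * w[i][j] for i in range(n))) for j in range(m)]
        for i in range(n):
            da[i] += v0[i] - pv1[i]
            for j in range(m):
                dw[i][j] += v0[i] * ph0[j] - pv1[i] * ph1[j]
        for j in range(m):
            db[j] += ph0[j] - ph1[j]
    scale = lr / len(batch)             # learning rate averaged over the batch
    for i in range(n):
        a[i] += scale * da[i]
        for j in range(m):
            w[i][j] += scale * dw[i][j]
    for j in range(m):
        b[j] += scale * db[j]


rng = random.Random(0)
n_visible, n_hidden = 3, 2
w = [[0.0] * n_hidden for _ in range(n_visible)]   # zero start, for the demo only
a, b = [0.0] * n_visible, [0.0] * n_hidden
cd1_update([[1, 1, 1]] * 4, w, a, b, lr=0.1, rng=rng)
print(a[0], w[0][0])
```

With all parameters starting at zero every probability equals 0.5, so after this single step each visible bias becomes 0.05 and each weight becomes 0.025, while the hidden biases stay at zero.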
The pretraining of DBN starts from the bottom layer. After the first RBM is trained, the current hidden layer is transformed into the visible layer of the next RBM. The network is trained layer by layer from bottom to top to avoid falling into local optimum.
4.1.2. Fine-Tuning of RBM
In the fine-tuning stage, a BP neural network is constructed using the hidden layer of the last RBM and the output layer of the DBN for supervised training. The parameters of each layer are optimized from the top to the bottom to obtain the final model parameters.
4.2. The Proposed Parameter Optimization of DBN Based on IAOA
During the construction of DBN, the choice of hyperparameters such as the learning rate and batch size has an important impact on the training results of DBN. However, the selection of hyperparameters in traditional DBNs mainly relies on subjective experience, which causes insufficient training efficiency and also leads to a decrease in the classification accuracy and model stability of DBNs. In this paper, in order to reduce the influence of human interference and improve the classification accuracy of DBN, the IAOA is proposed to optimize the learning rate and batch size of DBN. The classification error rate of DBN is used as the objective function of IAOA optimization. The larger the fitness value, the higher the classification effect of DBN. In order to clearly express the IAOA-DBN process, the framework of IAOA-DBN is shown in Figure 5.
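The coupling between the optimizer and the DBN can be sketched as an objective function that decodes a two-dimensional search position into a learning rate and a batch size and returns the resulting classification error. In the sketch below, the search ranges, the names `decode` and `make_objective`, and the stand-in error function `toy_error` are all hypothetical; in practice the stand-in would be replaced by training the DBN and measuring its cross-validated error rate, which the optimizer then minimizes.

```python
def decode(position, lr_bounds=(1e-4, 1.0), batch_bounds=(8, 128)):
    """Map a position in [0, 1]^2 to (learning_rate, batch_size)."""
    u_lr = min(max(position[0], 0.0), 1.0)
    u_bs = min(max(position[1], 0.0), 1.0)
    lr = lr_bounds[0] + u_lr * (lr_bounds[1] - lr_bounds[0])
    # Batch size must be an integer, so round the scaled value.
    batch = int(round(batch_bounds[0] + u_bs * (batch_bounds[1] - batch_bounds[0])))
    return lr, batch


def make_objective(evaluate_error):
    """Wrap a function error(lr, batch_size) as an objective over positions."""
    def objective(position):
        lr, batch = decode(position)
        return evaluate_error(lr, batch)
    return objective


def toy_error(lr, batch):
    # Stand-in for "train the DBN and return its error rate" (not a real model).
    return abs(lr - 0.1) + abs(batch - 32) / 128.0


objective = make_objective(toy_error)
print(decode([0.5, 0.5])[1])  # 68
```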
4.3. Survival Prediction Model of Patients with ESCC
In this paper, eleven indicators significantly related to the survival rate of patients with ESCC are obtained through the MRMR algorithm: TNM stage, BASO, age, PT, FIB, LYMPH, RBC, TT, PLT, T stage, and GLOB. The eleven indicators and all indicators of the patients are used as inputs to the IAOA-DBN model, respectively, and the five-year survival rate of the patients is used as the output. A survival prediction model for esophageal cancer patients is established. The established survival prediction model for patients with ESCC is shown in Figure 6. To verify the validity of this model, the Archimedes optimization algorithm-deep belief network (AOA-DBN), sparrow search algorithm-deep belief network (SSA-DBN), particle swarm optimization-deep belief network (PSO-DBN) [50], bald eagle search-deep belief network (BES-DBN), improved Archimedean optimization algorithm-support vector machines (IAOA-SVM), and improved Archimedean optimization algorithm-backpropagation neural networks (IAOA-BPNN) are used for comparison. The initial population of AOA, SSA, PSO, and BES is uniformly set to 20, and the maximum number of iterations is 500. The dataset is divided into ten parts, and the tenfold cross-validation method is used to verify the classification accuracy of the model. The prediction results of the DBN optimized by the five optimization algorithms and of the IAOA-SVM and IAOA-BPNN models are shown in Table 7.
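The tenfold cross-validation protocol can be illustrated with the following Python sketch. The splitting scheme (a seeded shuffle followed by interleaved folds), the pooling of correct predictions over all folds, and the names `k_fold_indices` and `cross_val_accuracy` are assumptions made for the illustration; the classifier is passed in as a hypothetical `fit_predict` callable.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit_predict, X, y, k=10, seed=0):
    """Accuracy pooled over k folds; fit_predict(X_tr, y_tr, X_te) returns labels."""
    correct = 0
    for test in k_fold_indices(len(X), k, seed):
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)


def majority_classifier(X_train, y_train, X_test):
    # Trivial baseline: always predict the most frequent training label.
    label = max(set(y_train), key=y_train.count)
    return [label] * len(X_test)


folds = k_fold_indices(298, k=10)
print(sorted(len(fold) for fold in folds))  # [29, 29, 30, 30, 30, 30, 30, 30, 30, 30]
```

For a dataset of 298 samples, as used in this paper, the split yields eight folds of 30 samples and two folds of 29 samples.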
When the eleven patient indicators are used as input, Table 7 shows that the prediction accuracies of IAOA-DBN, AOA-DBN, SSA-DBN, PSO-DBN, BES-DBN, IAOA-SVM, and IAOA-BPNN are 89.66%, 87.46%, 88.14%, 86.78%, 87.29%, 86.27%, and 86.61%, respectively. When all patient indicators are used as input, Table 7 shows that the prediction accuracies of IAOA-DBN, AOA-DBN, SSA-DBN, PSO-DBN, BES-DBN, IAOA-SVM, and IAOA-BPNN are 88.13%, 86.24%, 86.93%, 85.46%, 86.12%, 85.19%, and 85.32%, respectively. The comparison shows that IAOA-DBN has the highest accuracy and can accurately predict the five-year survival rate of ESCC patients. In addition, when the input to the model is the eleven indicators, the prediction results are better than when all indicators are used. Therefore, the MRMR-IAOA-DBN model proposed in this paper can better predict the five-year survival of patients with ESCC.
To better demonstrate the effectiveness of the proposed model, the Wisconsin Diagnostic Breast Cancer (WBCD) dataset is used for testing. In the WBCD dataset, 30 indexes of patients are used as input, and the benign or malignant status of the tumor is used as output. The dataset is divided into ten parts, and the tenfold cross-validation method is used to verify the performance of the model. The test results are shown in Table 8. The test results show that IAOA-DBN has higher prediction accuracy than the other models. Therefore, the survival prediction model proposed in this paper can effectively predict the prognosis of cancer patients.
5. Conclusions
A novel survival prediction model for patients with ESCC is presented in this paper. Firstly, a minimum redundancy and maximum relevancy algorithm is used to screen out indicators significantly correlated with patient survival, which is validated by Cox regression analysis. Secondly, an IAOA-DBN model is proposed. The model uses IAOA to optimize the parameters in the DBN training process, which improves the stability and classification accuracy of the DBN model. Finally, the model is applied to the survival prediction of patients with ESCC. The results of comparison with four methods verify the validity and superiority of the model. The key conclusions are as follows.
(1) The patients’ clinical indicators are ranked by importance using the minimum redundancy and maximum relevancy algorithm, and a new subset of features is selected. The experimental results show that the new feature subset gives better prediction results than the all-feature set.
(2) To address the poor convergence accuracy of AOA and its tendency to fall into local optima, an improved AOA (IAOA) is proposed in this paper. The experimental results show that the improvement strategy proposed in this paper improves the performance of AOA to a certain extent.
(3) The learning rate and batch size of DBN are optimized using IAOA to obtain the optimal parameters, which improves the classification prediction accuracy and stability of the DBN model. The comparison with AOA-DBN, SSA-DBN, PSO-DBN, and BES-DBN verifies the effectiveness and superiority of the IAOA-DBN model.
Data Availability
The datasets presented in this article are not readily available because the data used in the study are private and confidential. Requests to access the datasets should be directed to Junwei Sun.
Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This work was supported in part by the Joint Funds of the National Natural Science Foundation of China under grant U1804262, in part by the Foundation of Young Key Teachers from University of Henan Province under grant 2018GGJS092, in part by the Youth Talent Lifting Project of Henan Province under grant 2018HYTP016, in part by the Henan Province University Science and Technology Innovation Talent Support Plan under grant 20HASTIT027, in part by the Zhongyuan Thousand Talents Program under grant 204200510003, and in part by the Open Fund of State Key Laboratory of Esophageal Cancer Prevention and Treatment under grant K20200010 and grant K20200011.