Abstract

Dry weight is the normal weight of hemodialysis patients after hemodialysis. If too much water is removed during dialysis, the patient will experience hypotension and shock symptoms. Therefore, correct assessment of the patient’s dry weight is clinically important. Existing assessment methods all rely on professional instruments and technicians and are time-consuming and labor-intensive. To avoid this limitation, we apply machine learning methods to patient data. This study collected demographic and anthropometric data of 476 hemodialysis patients, including age, gender, blood pressure (BP), body mass index (BMI), years of dialysis (YD), and heart rate (HR). Building on previous work, we propose a Sparse Laplacian regularized Random Vector Functional Link (SLapRVFL) neural network model. We compare the prediction performance of SLapRVFL with that of the Body Composition Monitor (BCM) instrument and of other models. The Root Mean Square Error (RMSE) of SLapRVFL is 1.3136, which is lower than that of the other methods. The SLapRVFL neural network model could be a viable alternative for dry weight assessment.

1. Introduction

Fluid overload in patients with chronic renal failure is closely related to poor cardiovascular outcomes [1, 2]. Maintenance hemodialysis (HD) is the main treatment for patients with renal failure [3]. However, the accurate assessment of body water volume is still a concern [4]. At present, dry weight is used as an important indicator of fluid homeostasis in hemodialysis patients. Medical staff use the patient’s dry weight to estimate the amount of water to be removed during hemodialysis. The conventional clinical dry weight assessment method is time-consuming and labor-intensive [1]. Instrument-based methods for determining dry weight also exist, including bioelectrical impedance analysis (BIA) [5], as used by the body composition monitor (BCM) [6], and lung ultrasound (LUS). However, all of these methods require special instruments and professional technicians. Alternatively, clinical data can be used to build predictive models [7] that assess dry weight. Machine learning (ML) and deep learning have already been applied to many common clinical problems in medicine, such as brain diseases [8–10], cancer analysis, and diabetes.

Some scholars have used artificial neural networks (ANN) to predict the total water volume of hemodialysis patients and have obtained better results than conventional clinical calculation equations [11]. In addition, deep learning methods are emerging in clinical diagnosis, including pixel-based convolutional neural networks for diagnosing skin cancer [12]. In the biological field, machine learning has been applied to microbiology analysis [13], circRNAs [14], microRNA and cancer association prediction [15–17], lncRNA-miRNA association prediction, O-GlcNAcylation site prediction [18], DNA methylation site prediction [19–21], protein remote homology detection [22], protein function prediction [23–29], electron transport proteins [30], breast cancer [31], cell-specific replication [32], osteoporosis diagnosis [33], and drug complex network analysis [34–38].

In our previous research, a Multiple Kernel Support Vector Regression (MKSVR) [39] predictor was proposed to assess dry weight and obtained good predictive performance. Inspired by that work and by the baseline Random Vector Functional Link (RVFL) network [40], we propose a new dry weight assessment model, called the Sparse Laplacian regularized RVFL neural network with L2,1-norm (SLapRVFL), which takes into account the topological relationship between samples and uses sparser connections between the input layer and the hidden layer.

2. Materials and Methods

2.1. Materials

This work collects demographic and anthropometric data and bioimpedance spectroscopy (BIS) measurements from historical records (September 2018 to September 2019) of Wuxi People’s Hospital and Northern Jiangsu People’s Hospital. This study has been approved by the ethics committees of the hospitals (Nos. KYLLKS201813 and 2018KY-001). The collected patient data meet the following requirements: age greater than 18 years; end-stage renal disease (ESRD) for more than three months and maintenance hemodialysis [41]; no heart failure, metal implants, pregnancy, disability, infection, edema, or other diseases; and hemodialysis treatment 3 times a week, 4 hours each time. Finally, we obtain a data set of 476 hemodialysis patients. Dry weight (DW) is the normal body weight after clinical dialysis. DW is obtained by a clinician under strict clinical supervision using a clinical scoring system (a trial-and-error method) [42, 43].

We choose seven features to build our predictive model: age, gender (binary feature), systolic blood pressure (SBP), diastolic blood pressure (DBP), body mass index (BMI), heart rate (HR), and years of dialysis (YD). Table 1 summarizes the data set. BMI is measured before hemodialysis treatment.

2.2. Methods

The baseline RVFL was proposed for regression or classification. The schematic diagram of RVFL is shown in Figure 1. The basic information of the patient is put into the RVFL neural network model for processing, and the predicted dry weight is the output.

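Suppose there are $N$ training samples $\{(\mathbf{x}_i, \mathbf{y}_i)\}_{i=1}^{N}$ with $\mathbf{x}_i \in \mathbb{R}^{d}$. The output value is $\mathbf{y}_i$ and the input data is $\mathbf{x}_i$; $d$ denotes the dimension of $\mathbf{x}_i$. As shown in Figure 1, RVFL randomly initializes all weights and biases between the input layer and the hidden layer. These parameters are fixed during the training process and do not need to be tuned. The output layer of RVFL is connected to both the input layer and the hidden layer, so that the model captures both the linear and the nonlinear relationships between the input and the output. Only these output weights are obtained by training RVFL. The RVFL network with $K$ hidden nodes is formulated as

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{Y}, \tag{1}$$

where $\boldsymbol{\beta} \in \mathbb{R}^{(d+K) \times m}$ denotes the output weight matrix; $\mathbf{H} \in \mathbb{R}^{N \times (d+K)}$ is the concatenated matrix, which combines the input layer and the output of the hidden layer; and $\mathbf{Y} \in \mathbb{R}^{N \times m}$ denotes the label matrix. $\mathbf{H}$ and $\mathbf{Y}$ can be represented as

$$\mathbf{H} = [\mathbf{H}_1, \mathbf{H}_2], \tag{2}$$

$$\mathbf{H}_1 = [\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N]^{T}, \tag{3}$$

$$\mathbf{H}_2 = \begin{bmatrix} g(\mathbf{w}_1^{T}\mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_K^{T}\mathbf{x}_1 + b_K) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1^{T}\mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_K^{T}\mathbf{x}_N + b_K) \end{bmatrix}, \tag{4}$$

$$\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_N]^{T}. \tag{5}$$

In Equation (4), $\mathbf{w}_j$ and $b_j$ are the weights and bias between the input layer and the $j$th hidden node. $m$ and $K$ are the numbers of output and hidden-layer nodes. In general, the activation function $g(\cdot)$ is a Gaussian function. The activation function provides a nonlinear approximation. To capture the potential linear relationship between the input data and the output value, RVFL adds a direct connection weight between the input layer and the output layer. Therefore, RVFL is a model that contains both linear and nonlinear approximations to improve prediction performance. To find the optimal $\boldsymbol{\beta}$, the RVFL can be formulated as a regularized least-squares problem:

$$\min_{\boldsymbol{\beta}} \; \lVert \mathbf{H}\boldsymbol{\beta} - \mathbf{Y} \rVert_F^2 + \lambda \lVert \boldsymbol{\beta} \rVert_F^2, \tag{6}$$

where $\lambda$ is the parameter of the regularization term. The solution of Equation (6) can be found by setting its gradient to 0:

$$\boldsymbol{\beta} = (\mathbf{H}^{T}\mathbf{H} + \lambda \mathbf{I})^{-1}\mathbf{H}^{T}\mathbf{Y}, \tag{7}$$

where $\mathbf{I}$ denotes the identity matrix. However, the RVFL network does not consider the topological relationship between samples. In addition, every output node must be connected to all nodes of both the input layer and the hidden layer, so the output weights are dense.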
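As a rough illustration of the baseline RVFL described above, the following NumPy sketch draws the input-to-hidden weights once at random, concatenates the raw inputs with the hidden-layer outputs, and solves the regularized least-squares problem in closed form. The function names, the uniform initialization range, the `exp(-z**2)` form of the Gaussian activation, and the default value of `lam` are illustrative assumptions, not details taken from this study.

```python
import numpy as np

def rvfl_features(X, W, b):
    """Concatenate the raw inputs (direct link) with the hidden-layer outputs."""
    Z = X @ W + b                    # pre-activations, shape (n_samples, n_hidden)
    hidden = np.exp(-Z ** 2)         # Gaussian activation (assumed form)
    return np.hstack([X, hidden])    # shape (n_samples, n_features + n_hidden)

def rvfl_fit(X, y, n_hidden=100, lam=1e-2, seed=0):
    """Fit a baseline RVFL regressor and return (W, b, beta)."""
    rng = np.random.default_rng(seed)
    # Input-to-hidden weights and biases are random and never trained.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = rvfl_features(X, W, b)
    # Closed-form ridge solution: beta = (H^T H + lam * I)^-1 H^T y
    A = H.T @ H + lam * np.eye(H.shape[1])
    beta = np.linalg.solve(A, H.T @ y)
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    """Predict with the trained output weights."""
    return rvfl_features(X, W, b) @ beta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 7))                  # seven synthetic features
    y = X @ np.arange(1.0, 8.0) + np.sin(X[:, 0])  # synthetic target
    W, b, beta = rvfl_fit(X, y, n_hidden=50)
    rmse = np.sqrt(np.mean((rvfl_predict(X, W, b, beta) - y) ** 2))
    print(f"training RMSE on synthetic data: {rmse:.4f}")
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit matrix inverse; the synthetic data only stand in for the patient features, which are not reproduced here.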

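To further improve the robustness of RVFL, we propose the Sparse Laplacian regularized RVFL neural network with L2,1-norm (SLapRVFL). With $\mathbf{H}$, $\boldsymbol{\beta}$, and $\mathbf{Y}$ as in Equation (1), the objective function is

$$\min_{\boldsymbol{\beta}} \; \lVert \mathbf{H}\boldsymbol{\beta} - \mathbf{Y} \rVert_F^2 + \lambda_1 \,\mathrm{tr}(\boldsymbol{\beta}^{T}\mathbf{H}^{T}\mathbf{L}\mathbf{H}\boldsymbol{\beta}) + \lambda_2 \lVert \boldsymbol{\beta} \rVert_{2,1}, \tag{8}$$

where $\mathbf{L}$ denotes the Laplacian matrix, and $\lambda_1$ and $\lambda_2$ are the coefficients of the Laplacian regularization term and the L2,1-norm term, respectively. Laplacian regularization is used to represent the potential manifold between samples. It can better describe the topological association between samples and thus improve the generalization ability of the model. Since the third term of Equation (8) is not differentiable, we introduce a diagonal matrix $\mathbf{G}$ whose $i$th diagonal element is

$$g_{ii} = \frac{1}{2 \lVert \boldsymbol{\beta}_i \rVert_2}, \tag{9}$$

where $\boldsymbol{\beta}_i$ is the $i$th row of $\boldsymbol{\beta}$, and convert Equation (8) to

$$\min_{\boldsymbol{\beta}} \; J(\boldsymbol{\beta}) = \lVert \mathbf{H}\boldsymbol{\beta} - \mathbf{Y} \rVert_F^2 + \lambda_1 \,\mathrm{tr}(\boldsymbol{\beta}^{T}\mathbf{H}^{T}\mathbf{L}\mathbf{H}\boldsymbol{\beta}) + \lambda_2 \,\mathrm{tr}(\boldsymbol{\beta}^{T}\mathbf{G}\boldsymbol{\beta}). \tag{10}$$

We take the derivative of Equation (10) with $\mathbf{G}$ held fixed and set it to zero:

$$\frac{\partial J}{\partial \boldsymbol{\beta}} = 2\mathbf{H}^{T}(\mathbf{H}\boldsymbol{\beta} - \mathbf{Y}) + 2\lambda_1 \mathbf{H}^{T}\mathbf{L}\mathbf{H}\boldsymbol{\beta} + 2\lambda_2 \mathbf{G}\boldsymbol{\beta}, \tag{11a}$$

$$\mathbf{H}^{T}(\mathbf{H}\boldsymbol{\beta} - \mathbf{Y}) + \lambda_1 \mathbf{H}^{T}\mathbf{L}\mathbf{H}\boldsymbol{\beta} + \lambda_2 \mathbf{G}\boldsymbol{\beta} = \mathbf{0}, \tag{11b}$$

$$(\mathbf{H}^{T}\mathbf{H} + \lambda_1 \mathbf{H}^{T}\mathbf{L}\mathbf{H} + \lambda_2 \mathbf{G})\,\boldsymbol{\beta} = \mathbf{H}^{T}\mathbf{Y}, \tag{11c}$$

$$\boldsymbol{\beta} = (\mathbf{H}^{T}\mathbf{H} + \lambda_1 \mathbf{H}^{T}\mathbf{L}\mathbf{H} + \lambda_2 \mathbf{G})^{-1}\mathbf{H}^{T}\mathbf{Y}. \tag{11d}$$

Because $\mathbf{G}$ depends on $\boldsymbol{\beta}$, the two are updated alternately. We use the baseline RVFL solution of Equation (7) as the initial $\boldsymbol{\beta}$. In addition, the Laplacian matrix can be calculated as

$$\mathbf{L} = \mathbf{D} - \mathbf{W}, \tag{12}$$

where $\mathbf{D}$ is a diagonal matrix with $D_{ii} = \sum_{j} W_{ij}$. The similarity matrix $\mathbf{W}$ is built by the Radial Basis Function (RBF):

$$W_{ij} = \exp\!\left(-\frac{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert_2^2}{2\sigma^2}\right), \tag{13}$$

where $\sigma$ is the kernel bandwidth.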
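The replacement of the L2,1-norm term by a weighted quadratic term can be motivated by comparing gradients. The short derivation below is a sketch in assumed notation: $\boldsymbol{\beta}_i$ stands for the $i$th row of the output weight matrix $\boldsymbol{\beta}$ and $\mathbf{G}$ for the diagonal reweighting matrix. It holds only for rows with $\boldsymbol{\beta}_i \neq \mathbf{0}$.

```latex
\begin{aligned}
\lVert \boldsymbol{\beta} \rVert_{2,1}
  &= \sum_{i} \lVert \boldsymbol{\beta}_i \rVert_2, \\
\frac{\partial \lVert \boldsymbol{\beta} \rVert_{2,1}}{\partial \boldsymbol{\beta}_i}
  &= \frac{\boldsymbol{\beta}_i}{\lVert \boldsymbol{\beta}_i \rVert_2}
   = 2\, g_{ii}\, \boldsymbol{\beta}_i,
  \qquad g_{ii} = \frac{1}{2 \lVert \boldsymbol{\beta}_i \rVert_2}, \\
\frac{\partial\, \mathrm{tr}(\boldsymbol{\beta}^{T} \mathbf{G} \boldsymbol{\beta})}{\partial \boldsymbol{\beta}}
  &= 2\, \mathbf{G} \boldsymbol{\beta}
  \qquad \text{with } \mathbf{G} = \mathrm{diag}(g_{11}, g_{22}, \dots) \text{ held fixed.}
\end{aligned}
```

Stacking the row-wise gradients of the L2,1-norm gives $2\mathbf{G}\boldsymbol{\beta}$, the same gradient as the quadratic term when $\mathbf{G}$ is frozen at its current value. This is why alternating between recomputing $\mathbf{G}$ and solving a linear system for $\boldsymbol{\beta}$ is a natural way to handle the non-smooth term. In practice, a small constant is usually added to the row norms to avoid division by zero; that safeguard is an implementation assumption, not something stated in the text.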

The procedure of SLapRVFL is listed in Algorithm 1.

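Require: Training set $\{(\mathbf{x}_i, \mathbf{y}_i)\}_{i=1}^{N}$, test set $\mathbf{X}_{test}$, the number of hidden-layer nodes ($K$), the maximum number of iterations $t_{max}$, and the coefficients $\lambda_1$ and $\lambda_2$;
Ensure: The predictive values $\hat{\mathbf{Y}}_{test}$ of the test set
(1) Randomly initialize all weights and biases between the input layer and the hidden layer. Calculate the hidden-layer output matrix $\mathbf{H}$ (training set) and the Laplacian matrix $\mathbf{L}$ by Equations (2), (12), and (13);
(2) Set $t = 0$ and estimate the initial $\boldsymbol{\beta}^{(0)}$ using Equation (7);
Repeat
(3) Update the diagonal matrix $\mathbf{G}^{(t)}$ with $g_{ii}^{(t)} = 1 / (2 \lVert \boldsymbol{\beta}_i^{(t)} \rVert_2)$;
(4) Update $\boldsymbol{\beta}^{(t+1)}$ via Equation (11d) and set $t = t + 1$;
Until $t = t_{max}$;
(5) Calculate the hidden-layer output matrix $\mathbf{H}_{test}$ (test set);
(6) Estimate $\hat{\mathbf{Y}}_{test}$ by $\hat{\mathbf{Y}}_{test} = \mathbf{H}_{test}\boldsymbol{\beta}$.
Algorithm 1. Algorithm of SLapRVFL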
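The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the uniform initialization, the `exp(-z**2)` Gaussian activation, the RBF bandwidth `sigma`, the `ridge` parameter used for the initial solution, the guard `eps` against zero row norms, and all default values are hypothetical.

```python
import numpy as np

def _features(X, W, b):
    """Direct-link inputs concatenated with Gaussian hidden-layer outputs."""
    return np.hstack([X, np.exp(-(X @ W + b) ** 2)])

def _rbf_laplacian(X, sigma):
    """Graph Laplacian D - S built from an RBF similarity matrix S."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.diag(S.sum(axis=1)) - S

def slap_rvfl_fit(X, y, n_hidden=100, lam1=1e-2, lam2=1e-2, ridge=1e-2,
                  t_max=10, sigma=1.0, seed=0, eps=1e-8):
    rng = np.random.default_rng(seed)
    # Step 1: random, fixed input-to-hidden parameters; H and the Laplacian.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = _features(X, W, b)
    Lap = _rbf_laplacian(X, sigma)
    Y = np.asarray(y, dtype=float).reshape(len(y), -1)
    HtH, HtY, HtLH = H.T @ H, H.T @ Y, H.T @ Lap @ H
    # Step 2: initial output weights from the baseline ridge solution.
    beta = np.linalg.solve(HtH + ridge * np.eye(H.shape[1]), HtY)
    for _ in range(t_max):
        # Step 3: diagonal reweighting matrix from the current row norms.
        norms = np.maximum(np.linalg.norm(beta, axis=1), eps)
        G = np.diag(1.0 / (2.0 * norms))
        # Step 4: closed-form update of the output weights with G fixed.
        beta = np.linalg.solve(HtH + lam1 * HtLH + lam2 * G, HtY)
    return {"W": W, "b": b, "beta": beta}

def slap_rvfl_predict(X, model):
    # Steps 5-6: hidden-layer output for new samples, then linear read-out.
    out = _features(X, model["W"], model["b"]) @ model["beta"]
    return out[:, 0] if out.shape[1] == 1 else out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 7))                  # standardized synthetic features
    y = 60.0 + 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=120)
    model = slap_rvfl_fit(X[:100], y[:100], n_hidden=30)
    pred = slap_rvfl_predict(X[100:], model)
    print("held-out RMSE:", np.sqrt(np.mean((pred - y[100:]) ** 2)))
```

The sketch assumes that the input features are standardized beforehand; with raw clinical units, the Gaussian activations would saturate near zero for most hidden nodes.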

3. Results

We test our model on the benchmark data set and obtain the optimal parameters of the predictor through cross-validation. The SLapRVFL network is compared with other machine learning-based models. In addition, the body composition monitor (BCM) device (Fresenius Medical Care, Bad Homburg, Germany) is also compared with the SLapRVFL network.

3.1. Evaluation Measurements

The 10-fold cross-validation (10-CV) is employed to evaluate the robustness of the methods. Root Mean Square Error (RMSE), R-squared ($R^2$), correlation coefficient ($R$), Bland–Altman analysis, and Empirical Cumulative Distribution Plot (ECDP) [44] are all used in our study. Bland–Altman analysis evaluates the agreement between two different methods and thereby answers the question, “Can these two methods replace each other?” (equivalence).
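The scalar measures and the 10-fold split can be computed as in the sketch below. The helper names are hypothetical, and the shuffled split driven by a fixed seed is an assumption; the text does not say how the folds were drawn.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and prediction."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

if __name__ == "__main__":
    sizes = [len(test) for _, test in kfold_indices(476, k=10)]
    print("test-fold sizes for 476 samples:", sizes)
```

With 476 samples and 10 folds, `np.array_split` produces six folds of 48 samples and four folds of 47, so every sample is used for testing exactly once.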

3.2. Selection of Optimal Parameters

To obtain the optimal parameters of the predictive method, we use a grid search. The parameters to be determined are the number of hidden-layer nodes, the maximum number of iterations, and the coefficients of the Laplacian regularization term and the L2,1-norm term. To select the number of hidden-layer nodes, we fix the maximum number of iterations at 50 and hold both coefficients constant. The number of hidden-layer nodes is varied from 10 to 140 in steps of 10. The results are shown in Figure 2. From 10 to 100 nodes, the more neurons in the hidden layer, the lower the RMSE; beyond 100 nodes, the RMSE gradually increases. We therefore use 100 hidden-layer nodes.

Next, we fix the number of hidden-layer nodes and both coefficients, and gradually increase the number of iterations from 1 to 100 (Figure 3). Once the number of iterations reaches 10, the RMSE drops to its minimum and then oscillates slightly around that value. In our study, the maximum number of iterations is therefore set to 10.

Then, we use the selected number of hidden-layer nodes and iterations to search for the best values of the two coefficients over a grid with a fixed step. Figure 4 shows the RMSE for the different parameter pairs, from which the pair with the lowest RMSE is selected.
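A grid search of this kind can be sketched as below. The code is generic and makes several assumptions: `grid_search` and `cv_rmse` are hypothetical helper names, the powers-of-two grid is only an example because the actual search range is not reproduced here, and `ridge_fit_predict` is a stand-in two-parameter model used purely to keep the sketch runnable, not the SLapRVFL network.

```python
import itertools
import numpy as np

def cv_rmse(fit_predict, X, y, params, k=10, seed=0):
    """Pooled RMSE over a shuffled k-fold split for one parameter setting."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    sq_err = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = fit_predict(X[train], y[train], X[test], **params)
        sq_err.append((pred - y[test]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

def grid_search(fit_predict, X, y, grid, k=10, seed=0):
    """Evaluate every parameter combination and return the best one."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        results.append((cv_rmse(fit_predict, X, y, params, k, seed), params))
    best_score, best_params = min(results, key=lambda r: r[0])
    return best_params, best_score, results

def ridge_fit_predict(X_tr, y_tr, X_te, lam1=1.0, lam2=1.0):
    """Stand-in model: ridge on [X, X^2] with separate penalties lam1, lam2."""
    d = X_tr.shape[1]
    F_tr, F_te = np.hstack([X_tr, X_tr ** 2]), np.hstack([X_te, X_te ** 2])
    mu, y_mean = F_tr.mean(axis=0), y_tr.mean()
    P = np.diag(np.r_[np.full(d, lam1), np.full(d, lam2)])
    A = (F_tr - mu).T @ (F_tr - mu) + P
    w = np.linalg.solve(A, (F_tr - mu).T @ (y_tr - y_mean))
    return (F_te - mu) @ w + y_mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 7))
    y = X @ np.arange(1.0, 8.0) + 0.2 * rng.normal(size=200)
    grid = {"lam1": [2.0 ** p for p in range(-5, 6)],   # example range only
            "lam2": [2.0 ** p for p in range(-5, 6)]}
    best_params, best_score, _ = grid_search(ridge_fit_predict, X, y, grid)
    print(best_params, round(best_score, 4))
```

Reusing the same fold assignment (same `seed`) for every parameter pair keeps the comparison between grid points fair, since all pairs are scored on identical splits.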

3.3. Comparison to Other Predictive Models and BCM

To evaluate our model, SLapRVFL is compared with our previous Multiple Kernel Support Vector Regression (MKSVR) model [39], Multikernel Ridge Regression (MKRR), Linear Regression (LR), an Artificial Neural Network trained with the Back Propagation algorithm (ANN with BP), and the BCM measuring instrument. Clinical dry weight is our reference standard (and also the regression target of the prediction models). The comparisons are listed in Table 2, which shows that SLapRVFL achieves the best RMSE (1.3136). Although the ECDP median value (peak) of MKSVR (0.0082) is closer to zero, Figure 5 shows that SLapRVFL has the least bias and much shorter tails than MKSVR (smaller width). The RMSE of BCM is 1.9694, which is larger than that of SLapRVFL.

3.4. Bland–Altman Analysis

The Bland–Altman plot is a useful tool for evaluating the agreement between predictive methods and clinical DW. In Table 3 and Figure 6, SLapRVFL, MKSVR, LR, ANN (BP), MKRR, and BCM are analyzed via Bland–Altman difference plots. SLapRVFL achieves the narrowest 95% confidence interval (-0.1133 to 0.2866) and the smallest standard deviation (2.2202). In addition, for every predictive model, fewer than 24 samples (5% of the data set) fall outside the agreement interval, so the results of the models are clinically acceptable. SLapRVFL has the fewest samples (20) outside the agreement interval in Table 3. In Figure 6, the two red horizontal dotted lines denote the upper and lower 95% limits of agreement, and the middle blue solid line is the mean difference between the measurement method and clinical DW. When a measurement method agrees well with the clinical method, the two can be substituted for each other (equivalence). If 95% of the points in the data set lie within the agreement range, the measurement method (predictive model) is considered clinically acceptable. The evaluation results show that SLapRVFL can help clinicians assess DW at low cost.
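The quantities behind a Bland–Altman analysis can be computed as in the following sketch. It assumes the conventional definitions (differences taken as prediction minus clinical reference, limits of agreement at the mean difference ± 1.96 standard deviations, and a normal-approximation confidence interval for the mean difference); the function name and the returned keys are hypothetical, and the exact conventions used for Table 3 are not stated in the text.

```python
import numpy as np

def bland_altman(pred, ref):
    """Bias, limits of agreement, and count of points outside the limits."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    diff = pred - ref                      # assumed sign convention
    n = diff.size
    bias = float(diff.mean())              # mean difference (solid line)
    sd = float(diff.std(ddof=1))           # sample standard deviation
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd   # limits of agreement
    n_outside = int(np.sum((diff < lower) | (diff > upper)))
    ci_half = 1.96 * sd / np.sqrt(n)       # half-width of the CI of the bias
    return {
        "bias": bias, "sd": sd, "lower": lower, "upper": upper,
        "n_outside": n_outside, "ratio_outside": n_outside / n,
        "bias_ci": (bias - ci_half, bias + ci_half),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(60.0, 10.0, size=476)           # synthetic reference weights
    pred = ref + rng.normal(0.1, 2.0, size=476)      # synthetic predictions
    print(bland_altman(pred, ref))
```

For orientation, 5% of 476 samples is 23.8, which corresponds to the 24-sample threshold, and 30/476 is about 6.30%. With n = 476 and a standard deviation of 2.2202, 1.96 × SD / √n is about 0.199, close to the half-width of the interval reported for SLapRVFL (about 0.200); this suggests, but does not confirm, that the reported interval is a confidence interval of the mean difference.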

4. Discussion

Because clinical and BCM measurements are limited by their time and cost, this study uses a machine learning method to assess the dry weight of hemodialysis patients. Based on the basic RVFL, we propose a sparse Laplacian regularized RVFL network (SLapRVFL) model. SLapRVFL is compared not only with other machine learning methods (LR, MKRR, ANN with BP, and MKSVR) but also with BCM equipment (commonly used in hospitals). The RMSE and Bland–Altman results of the model are better than those of the BCM instrument. These results suggest that a data-driven predictive model can provide a reference for clinical dry weight assessment.

BCM requires the patient’s weight (before hemodialysis) and height. It is a portable, inexpensive, and noninvasive technology that has been used to measure DW [45, 46]. In the Bland–Altman analysis, SLapRVFL has the fewest points (20) outside the agreement interval, whereas BCM has 30 of 476 points (6.30%) outside it. Our method therefore agrees better with the clinical method.

5. Conclusions

To further improve the robustness of RVFL, we introduce a sparse Laplacian regularization term with the L2,1-norm. In the training process, the graph topology information and the sparse output weight matrix are employed to improve the robustness of the RVFL. Our work provides a new approach to assessing patients’ dry weight. More broadly, machine learning methods have helped solve many analysis tasks in the fields of biology [47–57], pharmacy [58], and medicine [12, 59, 60]. In future research, we will consider collecting more samples, introducing more patient information, and building a predictor based on a deep learning model to assess the dry weight of hemodialysis patients more accurately.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval

This study was approved by the ethics committees of the hospitals (ethical approval Nos. KYLLKS201813 and 2018KY-001). The experimental protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Human Ethics Committees (Wuxi People’s Hospital Ethics Committee and Northern Jiangsu People’s Hospital Ethics Committee).

Written informed consent for publication was obtained from all participants.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Authors’ Contributions

Xiaoyi Guo and Wei Zhou are joint first authors.

Acknowledgments

The authors give their thanks to the Hemodialysis Center of Wuxi People’s Hospital and Northern Jiangsu People’s Hospital for collecting the data in this study. This work is supported by a grant from the National Natural Science Foundation of China (NSFC 61902271, 61772362, and 61972280) and the Natural Science Research of Jiangsu Higher Education Institutions of China (19KJB520014).