Abstract

Background. Liver cirrhosis (LC) is the final stage of most chronic liver diseases and, in China, is mostly caused by chronic hepatitis B (CHB). Liver biopsy is the reference method for the evaluation of liver cirrhosis. However, it is an invasive procedure with inherent risk. The aim of this study was to construct a new classifier based on routine clinical markers for the prediction of HBV-induced LC. Subjects and Methods. We collected routine clinical parameters from 124 LC patients with CHB and 115 CHB patients without LC. A training set (n = 120) and a test set (n = 119) were built for model construction and evaluation, respectively. Results. We describe a new classifier, MLP, for the prediction of LC with CHB. MLP was built with seven routinely available clinical parameters: age, ALT, AST, PT, PLT, HGB, and RDW. With the optimal cutoff, we obtained a sensitivity of 95.2%, a specificity of 84.2%, and an overall accuracy of 89.9% on an independent test set, which were superior to those of FIB-4 and APRI. Conclusions. Our study suggests that an MLP classifier built with a machine learning method on routinely available clinical parameters can discriminate LC from non-LC cohorts. It could be used in clinical practice for the assessment of HBV-induced LC.

1. Introduction

Patients with liver cirrhosis (LC) induced by chronic hepatitis B (CHB) are at high risk of developing hepatocellular carcinoma (HCC) [1-3]. About 350 million people worldwide are chronically infected with hepatitis B virus (HBV) [4]. The lifetime risk of cirrhosis in HBV carriers is estimated to be more than 15% [5]. At present, liver biopsy is still the gold standard for the evaluation of liver fibrosis and cirrhosis [6]. Although histological assessment provides valuable information on the degree of necroinflammation and fibrosis in such patients, it is an invasive procedure associated with a small but finite risk of severe complications (0.5%), patient discomfort, and expense [7]. Moreover, a liver biopsy does not provide information regarding the balance between production and destruction of the extracellular matrix (ECM) or the rate of progression to cirrhosis [8]. Over the past decade, attempts have been made to develop noninvasive methods to assess LC, including physical and biological approaches. Transient elastography (TE), a recently developed noninvasive technique based on a physical approach, has proved to have high diagnostic accuracy for LC [9, 10]. However, the accuracy of TE is highly dependent on the experience of operators and clinicians, and its applicability is more limited than that of serum biomarkers in some patients, such as those with pacemakers or defibrillators and those who are overweight [11]. To date, about 20 numerical scores or indices have been reported, mostly based on routine laboratory parameters [3]. Some models, such as Fibrotest and the AST to Platelet Ratio Index (APRI), have been proposed for clinical application in patients with chronic hepatitis C (CHC) [12, 13]. However, models derived from CHC patients may not be suitable for predicting significant fibrosis and cirrhosis in hepatitis B-related disease and may not reduce the number of liver biopsies. Recently, a validation study that examined 13 panels of indirect blood markers in CHB patients [6], including FIB-4, APRI, and Forns, demonstrated that the performance of these panels for predicting liver fibrosis in CHB patients still needs to be improved.

Consequently, the objective of this work was to construct and evaluate a new classifier for predicting liver cirrhosis in CHB patients using supervised machine learning methods based on the routine clinical parameters.

2. Materials and Methods

2.1. Patients and Data Collection

From October 2010 to March 2013, a total of 239 subjects were enrolled, comprising 124 LC patients with CHB and 115 patients with CHB. All patients were admitted to Jinan Military General Hospital or Changhai Hospital of Second Military Medical University. This study was conducted in accordance with the Declaration of Helsinki and approved by the ethical committees of the hospitals mentioned above. Participation in this work had no influence on the subsequent management of patients. Both HBV-induced chronic hepatitis and liver cirrhosis after HBV infection were diagnosed according to the criteria established by the Chinese Medical Association (Chinese Society of Hepatology and Chinese Society of Infectious Diseases) [14]. None of the participants was coinfected with HIV or hepatitis C virus. Patients were excluded if they consumed >5 g of alcohol per day on average or were taking intravenous drugs or antihypertensive medications. In addition, patients with evidence of a concurrent liver disorder such as primary biliary cirrhosis, autoimmune hepatitis, HCC, or Wilson’s disease were excluded from this study. An abdominal ultrasound scan was carried out on these patients to confirm normal hepatobiliary anatomy and to exclude biliary obstruction and hepatic space-occupying lesions.

Clinical characteristics and laboratory parameters of patients were documented, including age, gender, total bilirubin (TBIL), creatinine (CRE), prothrombin time (PT), albumin (ALB), platelet count (PLT), alanine transaminase (ALT), aspartate aminotransferase (AST), alkaline phosphatase (ALP), red cell distribution width (RDW), hemoglobin (HGB), and mean corpuscular volume (MCV). For the patients with liver cirrhosis, the Child-Pugh score was calculated, as previously described [15]. Only the clinical characteristics and laboratory parameters from the first admission were recorded for patients who received hospital care more than once between October 2010 and March 2013. Similarly, for serial laboratory tests during hospitalization, the results of the first measurement were used.

2.2. Data Processing and Feature Selection

The aim of this study was to construct and evaluate classifiers for the prediction of liver cirrhosis with chronic hepatitis B based on the 13 routinely available clinical parameters mentioned above. Data were randomly divided into a training set and a test set. The training set, consisting of 120 patients (50.2%), was used to build the classifier. The remaining 119 patients (49.8%) formed the test set used for validation. Waikato Environment for Knowledge Analysis (WEKA) [16] was employed for feature selection and classifier construction. Based on the training set, variable selection was performed by using the GeneticSearch-based strategy in WEKA. A list of clinical parameters sorted by the statistical difference between the two classes (i.e., LC and CHB) was obtained and used for classifier construction.
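The random division described above can be illustrated with a minimal sketch. It assumes a simple shuffle of patient identifiers with a fixed seed; the seed, the function name `split_train_test`, and the use of integer identifiers are illustrative assumptions, and the text does not state whether the actual split was stratified by diagnosis.

```python
import random


def split_train_test(records, n_train, seed=0):
    """Randomly partition records into a training set of n_train items and a test set of the rest."""
    if not 0 <= n_train <= len(records):
        raise ValueError("n_train must be between 0 and len(records)")
    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumed seed)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 239 enrolled subjects.
patient_ids = list(range(239))
train_ids, test_ids = split_train_test(patient_ids, n_train=120)
print(len(train_ids), len(test_ids))  # 120 119
```

Fixing the seed makes the partition reproducible, which matters when feature selection and model fitting are both restricted to the training portion.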

2.3. Classifier Construction and Evaluation

Classifiers were constructed on the training set using the multilayered perceptron (MLP) and Naïve Bayes (NB) methods in WEKA. A 10-fold cross-validation was performed to limit overfitting, as previously described [17]. Briefly, all the entries were randomly divided into ten parts; nine parts were used for training and the remaining one for testing. The process was repeated ten times, and each time the accuracy for the true class, the accuracy for the false class, and the total accuracy were calculated. The final accuracy is the average of the accuracies in all ten tests. To evaluate the generalization performance of the two classifiers, in the next step, we carried out the validation on the test set. Accuracy (ACC), sensitivity (SE), specificity (SP), positive and negative predictive values, and likelihood ratios were calculated.
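The cross-validation procedure described above can be sketched in plain Python as follows. This is not the WEKA implementation: the function names, the seed, and the placeholder majority-class learner are assumptions made for illustration, and the toy labels are not study data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, held_out_indices) for each of k random folds."""
    if k > n_samples:
        raise ValueError("k cannot exceed the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, held_out


def cross_validated_accuracy(features, labels, train_and_predict, k=10, seed=0):
    """Average the held-out accuracy over k folds.

    train_and_predict(train_x, train_y, test_x) must return one predicted
    label per row of test_x.
    """
    fold_accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        test_x = [features[i] for i in test_idx]
        predicted = train_and_predict(train_x, train_y, test_x)
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / len(fold_accuracies)


def majority_class(train_x, train_y, test_x):
    """Placeholder learner: always predicts the most frequent training label."""
    most_common = max(sorted(set(train_y)), key=train_y.count)
    return [most_common] * len(test_x)


# Toy data with 120 rows, matching only the size of the training set.
toy_features = [[float(i)] for i in range(120)]
toy_labels = [i % 2 for i in range(120)]
mean_accuracy = cross_validated_accuracy(toy_features, toy_labels, majority_class)
print(f"mean held-out accuracy: {mean_accuracy:.3f}")
```

Any learner with the same `train_and_predict` signature can be substituted for the placeholder, so the fold logic stays separate from the choice of classifier.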

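To further evaluate our classifier, the classifier with optimal cutoff was compared to two reported noninvasive indices using the test set: the APRI and the FIB-4 index. The APRI was calculated as $\mathrm{APRI} = \dfrac{\mathrm{AST}\ [\mathrm{U/L}] \,/\, \mathrm{ULN\ of\ AST}}{\mathrm{PLT}\ [10^{9}/\mathrm{L}]} \times 100$ [18]. The FIB-4 index was calculated as $\text{FIB-4} = \dfrac{\mathrm{age}\ [\mathrm{yr}] \times \mathrm{AST}\ [\mathrm{U/L}]}{\mathrm{PLT}\ [10^{9}/\mathrm{L}] \times (\mathrm{ALT}\ [\mathrm{U/L}])^{1/2}}$ [19].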
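As a worked illustration of the two index formulas above, the following sketch computes both for a single hypothetical patient. The patient values and the upper limit of normal (ULN) of 40 U/L for AST are assumptions chosen to give round numbers; the ULN used in the study is not stated here.

```python
import math


def apri(ast_u_per_l, ast_uln_u_per_l, platelets_1e9_per_l):
    """AST to Platelet Ratio Index: (AST / ULN of AST) / PLT x 100."""
    return 100 * (ast_u_per_l / ast_uln_u_per_l) / platelets_1e9_per_l


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """FIB-4 index: (age x AST) / (PLT x sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


# Hypothetical patient: 50 years old, AST 80 U/L, ALT 64 U/L, PLT 100 x 10^9/L.
# ASSUMED_AST_ULN is an illustrative upper limit of normal, not a study value.
ASSUMED_AST_ULN = 40.0
print(apri(80.0, ASSUMED_AST_ULN, 100.0))  # 2.0
print(fib4(50.0, 80.0, 64.0, 100.0))       # 5.0
```

Both indices rise as platelets fall and AST rises, which is why they track advanced fibrosis; FIB-4 additionally scales with age and is damped by ALT.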

2.4. Statistical Analysis

Data were expressed as mean ± standard deviation (SD). Student’s t-test or the Mann-Whitney test was used to test the difference between mean or median values. To compare the performance of the algorithms, StAR [20] was used to plot receiver operating characteristic (ROC) curves and to statistically compare the area under the curve (AUC) of each ROC curve. All statistical analyses were performed using SPSS version 17.0 (SPSS Inc., Chicago, IL, USA) and GraphPad Prism software (version 5.0). A two-sided P value of less than 0.05 was considered statistically significant.
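For readers unfamiliar with the AUC, the sketch below computes it from raw scores using its rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. This is a generic illustration under that definition, not the StAR implementation, and it does not perform the statistical comparison between curves. The scores are invented.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented scores: higher values are meant to indicate cirrhosis.
lc_scores = [0.9, 0.8, 0.4]
chb_scores = [0.5, 0.3, 0.2]
print(round(auc_from_scores(lc_scores, chb_scores), 3))  # 0.889
```

In this toy example eight of the nine positive/negative pairs are ordered correctly, giving 8/9.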

3. Results

3.1. Patient Characteristics

A total of 239 patients were enrolled in this study. Clinical characteristics and laboratory findings of individuals with HBV-induced LC and those with CHB are summarized in Table 1. There were significant differences between the two groups with respect to age, ALP, ALB, TBIL, PT, HGB, MCV, RDW, PLT, and CRE. The Child-Pugh scores of LC patients were as follows: (40.3%), (36.3%), and (23.4%).

3.2. Dataset and Feature Selection

To construct and evaluate the classifier, the 239 patients were divided into a training set (n = 120) and a test set (n = 119). Details of the patients in the training set and the test set are given in Table 2. The source distribution was similar between the training set and the test set. After data preprocessing, feature variables were evaluated by using the GeneticSearch-based strategy. A panel of seven features was selected for classifier construction based upon the training set: age, ALT, AST, PT, PLT, HGB, and RDW.

3.3. Construction and Evaluation of the MLP Classifier

To construct classifiers for predicting LC with CHB, two supervised machine learning methods including MLP and NB were employed. A preliminary test by tenfold cross-validation on the training set (Table 3) was carried out to evaluate the performance of the two classifiers. MLP showed better results with ACC of 82.2% and SE of 80.6% but gave a poorer SP of 77.6%. NB gave better results on specificity (ACC, 77.5%, SE, 67.7%, and SP, 87.9% for NB). However, the sensitivity of NB was not as high as expected. Meanwhile, the AUC of MLP was higher than that of NB ( ). The ability of a classifier to discriminate data correctly in the test set is known as its generalization performance. We thus compared the generalization performance of the two classifiers. The MLP classifier (Table 4) gave better results on the test set, with SE of 85.5% and SP of 89.5% (overall accuracy 87.4%). The AUC of the MLP classifier was significantly better than that of NB ( ).
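To make concrete how an MLP maps the seven selected parameters to a prediction probability, the following is a minimal sketch of a forward pass through a network with one hidden layer of sigmoid units. The network size, the weights, and the input scaling of the fitted model are not reported in the text, so the hidden-layer size and the all-zero weights below are placeholders. The default cutoff of 0.281 is the value reported in Section 3.4, and treating a probability equal to the cutoff as LC is an assumption.

```python
import math

FEATURE_NAMES = ["age", "ALT", "AST", "PT", "PLT", "HGB", "RDW"]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mlp_probability(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> sigmoid hidden layer -> single sigmoid output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def classify(probability, cutoff=0.281):
    """Label a case as LC when the predicted probability reaches the cutoff (assumed >=)."""
    return "LC" if probability >= cutoff else "non-LC"


# Placeholder network: 4 hidden units and all-zero weights (NOT the fitted model).
n_hidden = 4
hidden_weights = [[0.0] * len(FEATURE_NAMES) for _ in range(n_hidden)]
hidden_biases = [0.0] * n_hidden
output_weights = [0.0] * n_hidden
output_bias = 0.0

scaled_inputs = [0.0] * len(FEATURE_NAMES)  # hypothetical, already-normalized inputs
probability = mlp_probability(
    scaled_inputs, hidden_weights, hidden_biases, output_weights, output_bias
)
print(probability, classify(probability))  # 0.5 LC
```

With zero weights every unit outputs 0.5, so the printed label carries no clinical meaning; it only shows how the cutoff converts a probability into a class.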

3.4. Comparison to the Two Reported Algorithms

The MLP classifier was then compared with two previously published noninvasive indices, APRI and the FIB-4 index. The AUCs for predicting liver cirrhosis on the test set for all three algorithms are shown in Figure 1. The AUCs of the MLP classifier, FIB-4 index, and APRI were 0.942, 0.817, and 0.726, respectively (Table 5). The AUC of the MLP classifier was significantly better than those of FIB-4 ( ) and APRI ( ). Furthermore, the prediction probability value of MLP increased significantly with increasing Child-Pugh score in all liver cirrhosis patients (Figure 2).

The cut-off values suggested by StAR for all three algorithms are given in Table 5. The accuracy, sensitivity, specificity, predictive values, and likelihood ratios of these algorithms were also calculated in the test set. With an optimal cut-off value of 0.281, the MLP classifier showed the best ACC (89.9%), SE (95.2%), and SP (84.2%) when compared to FIB-4 and APRI.
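The diagnostic measures named above all derive from a two-by-two confusion matrix. The sketch below shows the standard definitions. The counts are illustrative only: they were chosen so that sensitivity, specificity, and accuracy round to the reported 95.2%, 84.2%, and 89.9% for a test set of 119 patients, and they are not taken from Table 5; the predictive values and likelihood ratios they produce should therefore not be read as study results.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Illustrative counts (assumed, not from Table 5); they sum to 119.
metrics = diagnostic_metrics(tp=59, fn=3, tn=48, fp=9)
for name in ("sensitivity", "specificity", "accuracy"):
    print(name, round(100 * metrics[name], 1))
```

Note that the likelihood ratios are undefined when specificity is exactly 0 or 1, so a production version would need to guard against division by zero.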

4. Discussion

Liver cirrhosis is the final stage of most chronic liver diseases, characterized histologically by the development of regenerative nodules surrounded by fibrous bands, and is mainly induced by HBV in China [3, 21, 22]. According to the latest European Association for the Study of the Liver (EASL) treatment guidelines on HBV, liver biopsy should be performed in patients with abnormal ALT levels and high HBV DNA levels (>2,000 IU/mL). Although liver biopsy has been the “gold standard” for evaluating the stage of liver fibrosis and cirrhosis, it is limited because it is an invasive procedure with significant expense, manpower issues, and some risks [23]. Therefore, there is a need for a simple, reliable, and noninvasive alternative method for regular monitoring of disease progression [17]. In this study, we selected seven routine clinical parameters for the prediction of HBV-induced liver cirrhosis by statistical comparison between LC and CHB patients: age, ALT, AST, PT, PLT, HGB, and RDW. We then investigated two supervised machine learning methods for predicting liver cirrhosis with these seven parameters. We found that MLP gave the better results of the two classifiers (AUC = 0.942). When compared with the two reported noninvasive algorithms using routine clinical parameters, MLP was superior to both in the independent test set. The MLP classifier with the optimal cut-off gave better accuracy (89.9%), higher sensitivity (95.2%), and acceptable specificity (84.2%) for predicting HBV-induced liver cirrhosis. A diagnostic model is considered good if the AUC is greater than 0.80 and excellent if the AUC is greater than 0.90 [24]. Therefore, we concluded that MLP was an excellent tool for HBV-induced liver cirrhosis prediction. Moreover, our findings confirmed that the predicted probability value of MLP increased with rising Child-Pugh scores in HBV-related liver cirrhosis patients. Since the Child-Pugh score is a well-recognized prognostic index for liver diseases, our result suggests that MLP has potential prognostic value for liver disease. Besides noninvasiveness, two advantages of the MLP classifier should be noted. First, MLP is based on seven routine clinical and laboratory parameters and incurs no additional costs. Therefore, MLP represents a cost-effective tool for HBV-induced liver cirrhosis prediction. Second, the parameters in MLP are easily acquired in clinical practice, even in community hospitals. These advantages may facilitate the clinical utility of the MLP classifier for predicting liver cirrhosis in CHB patients and reduce the number of liver biopsies.

APRI and FIB-4 are two widely used, noninvasive, and inexpensive tools to predict liver cirrhosis. APRI was initially reported for predicting significant fibrosis in patients with CHC [18]. It has satisfactory sensitivity and specificity, together with a high predictive value, and can reduce the frequency with which biopsies need to be carried out to monitor the evolution of CHC [25]. AUC values of APRI in CHB fibrosis-related studies range between 0.541 and 0.86 [26-33]. In a meta-analysis [24], the summary AUC of APRI was 0.75 for HBV-related liver cirrhosis. This is consistent with our result (AUC = 0.726) and suggests that the diagnostic adequacy of APRI as a marker of liver fibrosis is more limited in CHB patients than in patients with CHC. In Sterling’s report [19], the FIB-4 index was first created for predicting significant fibrosis in patients with HIV/HCV coinfection, with an AUC value of 0.737. In our study, the AUC of FIB-4 was greater than that of APRI in the test set, which is consistent with the study of Erdogan et al. [6]. To clarify whether MLP was better than APRI and FIB-4 for the prediction of LC with CHB, we compared these tools in a head-to-head manner. The results showed that the MLP classifier had a higher AUC than APRI or FIB-4. Therefore, we concluded that the MLP classifier had diagnostic efficiency superior to that of APRI and FIB-4, the two most widely used noninvasive and inexpensive tools for liver cirrhosis prediction.

Haydon et al. [34] first used machine learning methods to predict cirrhosis in patients with CHC based on viral and clinical factors. Cazzaniga and colleagues obtained a better AUC for predicting cirrhosis in CHC patients using artificial neural networks [35]. Recently, Wang and his colleagues predicted significant liver fibrosis in CHB patients using an artificial neural network based upon routine and serum markers [36]. In their work, the AUCs of the training, validation, and test sets were 0.883, 0.884, and 0.920, respectively. Although 455 patients were enrolled in their study, only 27.7% of them had significant liver fibrosis, which might introduce predictive bias during modeling. Moreover, their model requires further validation for predicting liver cirrhosis in CHB patients because only 9 cirrhosis patients were included in their study.

In our study, seven common clinical parameters were selected to build the MLP classifier: age, ALT, AST, PT, PLT, HGB, and RDW. Among these seven routine clinical parameters, age, PLT, AST, ALT, PT, and HGB had been reported for predicting significant liver fibrosis [6, 30, 36, 37]. In our previously published study, we found that RDW increased with the worsening of HBV-related liver disease [38]. Similar work was reported by Lou et al. [39]. The mechanisms underlying the increased RDW in liver fibrosis are not clear, but two facts could explain it. First, increased RDW is potentially associated with the inflammatory response during the process of liver fibrosis [40, 41]. Second, the prevalence of renal failure is higher in patients with liver cirrhosis than in the general population [42], while increased RDW is related to impaired renal function [43]. Consistent with this, the levels of serum creatinine and RDW were both higher in liver cirrhosis patients in our study.

There are some limitations in our study. First, only Chinese CHB patients were included. Because of the small sample size, verification of the accuracy, sensitivity, and specificity of the MLP classifier in a larger population from different races and regions would be important before considering clinical use. Second, our work was a retrospective study, and some clinical details of participants, such as body mass index (BMI), cholesterol levels, alpha-fetoprotein (AFP), INR, and duration of disease, were not available. It therefore remains unclear whether including these clinical details in our MLP classifier would improve the diagnostic accuracy. In addition, this deficiency prevented us from comparing our classifier with some reported algorithms, such as the Forns index [44], APGA [45], and PAPAS [27]. However, in recently reported validation studies [6, 33, 46], the FIB-4 index gave the best diagnostic accuracy for the evaluation of hepatic fibrosis in patients with CHB among these models, including the Forns index, APGA, PAPAS, and APRI. Likewise, the FIB-4 index showed a better AUC than APRI with the optimal cutoff value in our work.

5. Conclusions

The main findings of this work are as follows. (1) The MLP classifier was developed for predicting liver cirrhosis in patients with CHB using seven routine clinical parameters, which are low in cost and easily obtained, even in community hospitals. (2) The high AUC of the MLP classifier, which was superior to that of the two reported models, suggests that it could reduce the number of liver biopsies in clinical practice. (3) The MLP classifier has potential prognostic value for liver disease, especially in HBV-related liver cirrhosis patients.

In conclusion, we describe an MLP classifier, a noninvasive, accurate, inexpensive, and easily applied tool for predicting and evaluating liver cirrhosis in CHB patients. Further studies are necessary to verify its accuracy, sensitivity, and specificity in a larger population from different races and regions.

Abbreviations

ALT: Alanine transaminase
AST: Aspartate aminotransferase
ALP: Alkaline phosphatase
AFP: Alpha-fetoprotein
APRI: AST to Platelet Ratio Index
AUC: Area under curve
ACC: Accuracy
ALB: Albumin
BMI: Body mass index
CRE: Creatinine
CHB: Chronic hepatitis B
CHC: Chronic hepatitis C
ECM: Extracellular matrix
HCC: Hepatocellular carcinoma
HGB: Hemoglobin
HBV: Hepatitis B virus
MLP: Multilayered perceptron
MCV: Mean corpuscular volume
NB: Naïve Bayes
PLT: Platelet count
PT: Prothrombin time
RDW: Red cell distribution width
SD: Standard deviation
TBIL: Total bilirubin
TP: True positive
TN: True negative
FP: False positive
FN: False negative
SE: Sensitivity
SP: Specificity
ROC: Receiver operating characteristic
LC: Liver cirrhosis
WEKA: Waikato Environment for Knowledge Analysis.

Conflict of Interests

The authors state that there is no conflict of interests regarding the publication of this paper.

Authors’ Contribution

Yuan Cao, Zhi-De Hu, and Xiao-Fei Liu contributed equally to this work.

Acknowledgment

This work was supported by grants from Shandong Provincial Natural Science Foundation, China (ZR2010HQ027).