Computational and Mathematical Methods in Medicine

Volume 2015 (2015), Article ID 328273, 7 pages

http://dx.doi.org/10.1155/2015/328273

## Application of a Hybrid Method Combining Grey Model and Back Propagation Artificial Neural Networks to Forecast Hepatitis B in China

School of Preclinical Medicine, Guangxi Medical University, No. 22, Shuangyong Road, Nanning, Guangxi 530021, China

Received 20 September 2014; Revised 22 January 2015; Accepted 22 January 2015

Academic Editor: Chung-Min Liao

Copyright © 2015 Ruijing Gan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Accurate incidence forecasting of infectious disease provides potentially valuable insights in its own right. It is critical for early prevention and may contribute to health services management and syndrome surveillance. This study investigates the use of a hybrid algorithm combining the grey model (GM) and back propagation artificial neural networks (BP-ANN) to forecast hepatitis B in China from yearly hepatitis B incidence data and evaluates the method’s feasibility. The results showed that the proposed method outperforms GM(1,1) and GM(2,1) on all the evaluation indexes.

#### 1. Introduction

Hepatitis B is a vaccine-preventable disease caused by the hepatitis B virus (HBV) that can induce potentially fatal liver damage. It has infected approximately 2 billion people worldwide, which represents one-third of the world population. Each year, HBV infection is responsible for about one million deaths worldwide due to liver failure and cirrhosis, and more than 75% of hepatocellular carcinomas worldwide develop from HBV infection [1–3]. HBV is most prevalent in China, South East Asia, sub-Saharan Africa, and the Amazon basin of South America, where health care resources are most limited [4]. In the Chinese population of 1.3 billion individuals, there are estimated to be 93 million HBV carriers. Each year, 300,000 deaths are attributed to chronic hepatitis B, including deaths associated with liver cirrhosis and hepatocellular carcinoma (HCC) [5]. Early prevention of hepatitis B is therefore critical, and accurate forecasting would enable public health officials to evaluate intervention strategies and make educated decisions.

In recent years, mathematical and computational models have gained importance in the public health domain, especially in infectious disease epidemiology, by providing rationales and quantitative analysis to support decision-making and policy-making processes. Many researchers advocate the use of these models as predictive tools [6–12].

Accurate forecasts of hepatitis B can be obtained by analyzing sufficient historical data. However, in China and perhaps some other developing countries, the current public health surveillance system does not collect detailed essential epidemiological information, as it is often difficult to obtain. Forecasts of hepatitis B based only on such limited data will be inaccurate. Methods that can make effective use of limited data are therefore needed.

Grey systems theory, established by Deng, chiefly comprises grey system analysis, modeling, prediction, decision-making, and control. It focuses on uncertainty problems with small samples, discrete data, and incomplete information that are difficult for probability theory and fuzzy mathematics to handle. Grey prediction is an important branch of grey systems theory, which makes scientific, quantitative forecasts about the future states of grey systems. Precise prediction of a system can be performed by generating and extracting useful information from small samples and partially known information [13–15].

Artificial neural networks (ANN) are complex and flexible nonlinear systems with properties not found in other modeling systems. They provide a method of forecasting that captures the relationships among variables, in particular nonlinear relationships. An ANN first learns from a known set of data for a given problem with a known solution (training); the network, inspired by the analytical processes of the human brain, is then able to reconstruct the imprecise rules. Once a model is trained, forecasts can be generated from novel records [16–19].

The aim of this study is to investigate the use of a hybrid method combining the grey model (GM) and back propagation artificial neural networks (BP-ANN) to forecast hepatitis B in China based on yearly hepatitis B incidence data from 2002 to 2012 and to evaluate the method’s prediction performance.

#### 2. Materials and Methods

##### 2.1. Data Sources

The incidence data of hepatitis B for the years 2002 to 2012 were collected from the Ministry of Health of the People’s Republic of China; they are openly published government statistics [20].

##### 2.2. Methods

The proposed method is based on grey systems theory and BP-ANN theory. MATLAB (version R2011b) is used for the statistical analysis.

The incidence data are considered as the original time series $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$, where $n$ is the length of the time series.

Grey prediction models use grey generation, or the effect of sequence operators, to weaken randomness and uncover the hidden laws in the data; through the interchange between difference equations and differential equations, discrete data sequences can be used to establish continuous dynamic differential equations. GM(1,1), a single-variable first-order grey model, is the main and basic model of grey prediction; it can achieve high prediction accuracy with a small sample size (although the sample size must be at least 4). The GM(1,1) model is suitable for sequences that show an obvious exponential pattern and can be used to describe monotonic changes. For nonmonotonic wavelike sequences or saturated sigmoid sequences, one can consider establishing a GM(2,1) model.

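The GM(1,1) model is established as follows.

(1) Let the nonnegative sequence $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$ be the original time sequence, where $n$ is the sample size of the data.

(2) The first-order accumulative generation operation (1-AGO) is used to convert $X^{(0)}$ into $X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right)$, where

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n.$$

(3) Let $Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$ be the sequence generated from $X^{(1)}$ by adjacent neighbor means. That is, $z^{(1)}(k) = 0.5\left(x^{(1)}(k) + x^{(1)}(k-1)\right)$, $k = 2, 3, \ldots, n$. The grey difference equation of GM(1,1) is defined as

$$x^{(0)}(k) + a z^{(1)}(k) = b,$$

where $-a$ and $b$ are referred to as the development coefficient and grey action quantity, respectively.

Then the least squares estimate of the parameters is $\hat{a} = [a, b]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y$, where

$$B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}.$$

(4) The whitenization equation is given by

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b.$$

(5) The forecasting model can be obtained by solving the above equation, which is shown as follows:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \quad k = 1, 2, \ldots, n.$$

(6) The predicted value of the primitive data at time point $k+1$ is extracted:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k).$$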
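The GM(1,1) steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MATLAB code used in the study: the function names `gm11_fit` and `gm11_predict` and the `horizon` argument are hypothetical, and the least squares estimate is computed with the closed-form solution of a two-parameter linear fit.

```python
import math


def gm11_fit(x0):
    """Least squares estimate of (a, b) in x0(k) + a*z1(k) = b."""
    if len(x0) < 4:
        raise ValueError("GM(1,1) needs at least 4 observations")
    # 1-AGO: x1(k) is the running sum of x0
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Adjacent neighbor means z1(k) for k = 2..n
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    # Ordinary least squares for y = slope*z1 + intercept,
    # so that a = -slope and b = intercept.
    m = len(z1)
    sz, sy = sum(z1), sum(y)
    szz = sum(z * z for z in z1)
    szy = sum(z * v for z, v in zip(z1, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    return -slope, intercept


def gm11_predict(x0, horizon=0):
    """Fitted values for the n observations plus `horizon` extra steps."""
    a, b = gm11_fit(x0)  # this sketch assumes a != 0
    steps = len(x0) + horizon
    # Time response of the whitenization equation; k = 0 reproduces x0(1)
    x1_hat = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(steps)]
    # Inverse accumulation restores the original scale
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]
```

Calling `gm11_predict(series, horizon=2)` on a list of yearly values would return the fitted values for the observed years followed by two out-of-sample forecasts.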

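The procedure for the GM(2,1) model is as follows.

(1) For a given sequence of original data $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$, let its sequences of accumulation generation and inverse accumulation generation be $X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right)$ and $\alpha^{(1)}X^{(0)} = \left(\alpha^{(1)}x^{(0)}(2), \ldots, \alpha^{(1)}x^{(0)}(n)\right)$, respectively, where $\alpha^{(1)}x^{(0)}(k) = x^{(0)}(k) - x^{(0)}(k-1)$, $k = 2, 3, \ldots, n$, and the sequence of adjacent neighbor mean generation of $X^{(1)}$ is $Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$.

(2) The GM(2,1) model is

$$\alpha^{(1)}x^{(0)}(k) + a_{1}x^{(0)}(k) + a_{2}z^{(1)}(k) = b,$$

and the whitenization equation is given by

$$\frac{d^{2}x^{(1)}}{dt^{2}} + a_{1}\frac{dx^{(1)}}{dt} + a_{2}x^{(1)} = b.$$

The least squares estimate of the parametric sequence is $\hat{a} = [a_{1}, a_{2}, b]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y$, where

$$B = \begin{bmatrix} -x^{(0)}(2) & -z^{(1)}(2) & 1 \\ -x^{(0)}(3) & -z^{(1)}(3) & 1 \\ \vdots & \vdots & \vdots \\ -x^{(0)}(n) & -z^{(1)}(n) & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} \alpha^{(1)}x^{(0)}(2) \\ \alpha^{(1)}x^{(0)}(3) \\ \vdots \\ \alpha^{(1)}x^{(0)}(n) \end{bmatrix} = \begin{bmatrix} x^{(0)}(2) - x^{(0)}(1) \\ x^{(0)}(3) - x^{(0)}(2) \\ \vdots \\ x^{(0)}(n) - x^{(0)}(n-1) \end{bmatrix}.$$

(3) Solve the whitenization equation. If $X^{(1)*}$ is a special solution of the whitenization equation and $\bar{X}^{(1)}$ is the general solution of the corresponding homogeneous equation $\frac{d^{2}x^{(1)}}{dt^{2}} + a_{1}\frac{dx^{(1)}}{dt} + a_{2}x^{(1)} = 0$, then $X^{(1)*} + \bar{X}^{(1)}$ is the general solution of the GM(2,1) whitenization equation. There are three cases for the general solution of the homogeneous equation: (i) when the characteristic equation $r^{2} + a_{1}r + a_{2} = 0$ has two distinct real roots $r_{1}$ and $r_{2}$, $\bar{X}^{(1)} = c_{1}e^{r_{1}t} + c_{2}e^{r_{2}t}$; (ii) when the characteristic equation has a repeated root $r$, $\bar{X}^{(1)} = e^{rt}\left(c_{1} + c_{2}t\right)$; (iii) when the characteristic equation has two complex conjugate roots $r_{1} = \alpha + i\beta$ and $r_{2} = \alpha - i\beta$, $\bar{X}^{(1)} = e^{\alpha t}\left(c_{1}\cos\beta t + c_{2}\sin\beta t\right)$. A special solution of the whitenization equation takes one of three forms, where $C$ is a constant: (i) when 0 is not a root of the characteristic equation, $X^{(1)*} = C$; (ii) when 0 is one of the two distinct roots of the characteristic equation, $X^{(1)*} = Ct$; and (iii) when 0 is the only root of the characteristic equation, $X^{(1)*} = Ct^{2}$.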
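The parameter estimation and root classification for the second-order grey model can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the helper names `solve_linear`, `gm21_fit`, and `homogeneous_case` and the tolerance `tol` are hypothetical. The sketch stops before fixing the integration constants, because the text does not state which initial conditions are used to determine them.

```python
def solve_linear(A, y):
    """Solve the square system A p = y by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    p = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (M[i][n] - tail) / M[i][i]
    return p


def gm21_fit(x0):
    """Least squares estimate of (a1, a2, b) in
    (x0(k) - x0(k-1)) + a1*x0(k) + a2*z1(k) = b, for k = 2..n."""
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    n = len(x0)
    # Each row of B is [-x0(k), -z1(k), 1]; Y holds the first differences
    rows = [[-x0[k], -0.5 * (x1[k] + x1[k - 1]), 1.0] for k in range(1, n)]
    y = [x0[k] - x0[k - 1] for k in range(1, n)]
    # Normal equations (B^T B) p = B^T Y
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    return tuple(solve_linear(btb, bty))


def homogeneous_case(a1, a2, tol=1e-12):
    """Classify the roots of r^2 + a1*r + a2 = 0."""
    disc = a1 * a1 - 4.0 * a2
    if disc > tol:
        return "two distinct real roots"
    if disc < -tol:
        return "complex conjugate roots"
    return "repeated root"
```

The returned case label indicates which of the three homogeneous solutions applies to the fitted `a1` and `a2`.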

The steps of the forecasting method can be described as follows.

*Step 1 (train the BP-ANN).* To obtain the input of the BP-ANN, the GM(1,1) and GM(2,1) models are each used to predict the original time series of hepatitis B. The two groups of predicted values are taken as the input of the BP-ANN, and the original time series is taken as the output. A three-layer BP-ANN is constructed in this way, and the trained BP-ANN model is obtained by training.

*Step 2 (forecast by the trained BP-ANN).* First, the GM(1,1) and GM(2,1) models are each used to forecast the time series of hepatitis B. The two groups of forecasted data are then taken as the input of the trained BP-ANN. Finally, the forecast of hepatitis B is obtained by running the trained BP-ANN.

The method flow chart is shown in Figure 1.
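The two steps can be illustrated with a small pure-Python network. The source states only that the BP-ANN has three layers, two inputs (the GM(1,1) and GM(2,1) predictions), and one output; the hidden layer size, the tanh activation, the learning rate, the number of epochs, the per-sample updates, and the min-max scaling below are all assumptions made for this sketch, and the names `train_bp_ann` and `fit_hybrid` are hypothetical.

```python
import math
import random


def train_bp_ann(inputs, targets, hidden=4, lr=0.1, epochs=5000, seed=0):
    """Train a three-layer network (inputs -> tanh hidden layer -> linear
    output) by per-sample gradient descent and return a predict function."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = y - t  # derivative of 0.5*(y - t)^2 with respect to y
            for j in range(hidden):
                # Back-propagate through tanh before updating w2[j]
                delta = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * delta
                for i in range(n_in):
                    w1[j][i] -= lr * delta * x[i]
            b2 -= lr * err
    return lambda x: forward(x)[1]


def fit_hybrid(gm11_pred, gm21_pred, observed, **train_args):
    """Step 1: learn the observed values from the two grey-model predictions.
    Returns a function for Step 2 that maps two forecasts to one forecast."""
    values = list(gm11_pred) + list(gm21_pred) + list(observed)
    lo, span = min(values), max(values) - min(values)
    # Min-max scaling (an assumption) keeps inputs and targets in [0, 1]
    scaled_inputs = [[(p - lo) / span, (q - lo) / span]
                     for p, q in zip(gm11_pred, gm21_pred)]
    scaled_targets = [(t - lo) / span for t in observed]
    net = train_bp_ann(scaled_inputs, scaled_targets, **train_args)
    return lambda p, q: net([(p - lo) / span, (q - lo) / span]) * span + lo
```

Given in-sample predictions from the two grey models and the observed series, `fit_hybrid` returns a function that combines a new pair of grey-model forecasts into a single hybrid forecast. With so few yearly observations, a small hidden layer seems a reasonable choice to limit overfitting, although the study itself may have used different settings.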