Mathematical Problems in Engineering

Volume 2015, Article ID 832621, 13 pages

http://dx.doi.org/10.1155/2015/832621

## Blink Number Forecasting Based on Improved Bayesian Fusion Algorithm for Fatigue Driving Detection

^{1}School of Information and Control, Nanjing University of Information Science & Technology, Nanjing 210044, China

^{2}Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, China

^{3}School of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, China

^{4}The NEXTRANS Center, Purdue University, West Lafayette, IN 47907, USA

^{5}School of Electronic and Information Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, China

Received 9 January 2015; Accepted 1 May 2015

Academic Editor: Yakov Strelniker

Copyright © 2015 Wei Sun et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

An improved Bayesian fusion algorithm (BFA) is proposed for forecasting the blink number in a continuous video. It assumes that, at a given prediction interval, the blink number is correlated with the blink numbers of only a few previous intervals. Under this assumption, the weights of the component predictors in the improved BFA are calculated from their prediction performance over only a few intervals rather than over all intervals. Therefore, compared with the conventional BFA, the improved BFA is more sensitive to disturbances in the component predictors and adjusts their weights more rapidly. To determine the most relevant intervals, the grey relation entropy-based analysis (GREBA) method is proposed, which can be used to analyze the relevancy between the historical data flows of blink number and the data flow at the current interval. Three single predictors, that is, the autoregressive integrated moving average (ARIMA), radial basis function neural network (RBFNN), and Kalman filter (KF), are designed and incorporated linearly into the BFA. Experimental results demonstrate that the improved BFA clearly outperforms the conventional BFA in both accuracy and stability; in addition, fatigue driving can be accurately warned against in advance based on the blink number forecasted by the improved BFA.

#### 1. Introduction

Fatigue driving is one of the major causes of serious accidents in transportation. Statistics show that traffic accidents caused by driver fatigue account for about 20% of the total number of accidents and more than 40% of serious traffic accidents [1, 2]. Experts agree that the actual contribution of fatigue driving to road accidents may be much higher [3]. The frequent road traffic accidents, serious casualties, and property losses caused by fatigue driving place a heavy burden on society and families. Thus, the accurate and efficient detection of driver fatigue is very important [4].

Many methods have been developed recently for detecting driver fatigue. They include physiological feature detection, such as EEG [5, 6], heart rate variability (HRV) [7, 8], and electrooculogram (EOG) [9]; facial feature detection, such as blink [10, 11] and yawning [12, 13]; and vehicle behavior detection, for instance, lane deviation [14, 15] and steering angle movement [16, 17]. Among these, the methods based on physiological features are intrusive: they require detection electrodes to be placed on the skin, and drivers feel uncomfortable when the electrodes stay in contact with the skin for a long time. Methods based on vehicle behavior are easily affected by the size and shape of the vehicle, as well as by the driving habits of the driver [18]. Methods based on facial characteristics, especially blink-based detection using machine vision, are attracting more and more researchers [18–20]. The objective of this study is to propose a new forecasting algorithm for the blink number and to verify its effectiveness according to the forecasting results.

For fatigue driving detection methods based on eye blinking, most researchers focus on real-time detection algorithms that quickly detect the open or closed state of the eyes and recognize whether a driver is currently fatigued. However, these detection methods usually cannot meet the requirement of real-time processing. Detecting and recognizing fatigue characteristics is computationally costly, especially for fusion detection methods based on multiple fatigue features [21], which need to detect and recognize many more characteristic parameters [22]. Moreover, statistics show that if drivers responded only half a second faster when traffic accidents occur, sixty percent of traffic accidents could be avoided. Therefore, if fatigue driving can be warned against before traffic accidents happen, using a detection instrument or forecasting method, most traffic accidents can be avoided [22, 23]. Furthermore, physiological research shows that fatigue is a gradual process in which drivers pass from consciousness to drowsiness [24]. Therefore, if we develop a predictor that forecasts the blink number of the driver in a given time interval from previous time intervals, we can recognize fatigue driving in advance and also leave more time for fatigue feature extraction and detection algorithm enhancement. In short, the advantage of the predictor approach is that, with appropriately set time intervals, the future driving state can be estimated in advance from the past and current states. Meanwhile, the driving state does not need to be detected in real time, which saves computation time for other, more important processing tasks.

Currently, many successful applications of single-predictor forecasting algorithms have been reported in various fields, for instance, fault diagnosis [25, 26], transportation flow forecasting [27], and time-series prediction [28]. On this basis, a variety of methods have been put forward for goal state prediction, including the Kalman filtering (KF) model [29], the nonparametric regression model [30], and the autoregressive integrated moving average (ARIMA) model [31]. Generally, these prediction methods can be categorized as statistical time series analysis methods, which base their predictions on historical data analysis. One benefit of statistical time series analysis methods is that they can make very good predictions when the goal state varies temporally. However, due to their linear properties, these methods are inadequate for capturing rapid variations of the goal state. To overcome this problem, numerous studies have used machine learning methods such as artificial neural networks (ANN) [32, 33] and support vector machines (SVM) [34] as alternative predictors. A machine learning method is able to approximate a goal state of any degree of complexity without prior knowledge of the problem. In addition, because of its ability to learn from data, it can capture the underlying relationships of the goal state even when they are not apparent.

Though the previously mentioned prediction approaches are powerful and useful for goal state prediction and can generate accurate results for certain patterns, each has its own drawbacks. None of them can maintain excellent prediction performance under all application conditions [35, 36]. Generally, a driver's fatigue is related not only to the duration of continuous driving but also to the current time period. For example, drivers are more likely to feel fatigued at 3:00–5:00 and 14:00–16:00 than at other times, and drivers usually do not feel fatigued until after three hours of continuous driving. Additionally, the fluctuation of the blink number from nonfatigue to fatigue is very obvious and abrupt when detected by a device or instrument. Therefore, a forecasting algorithm based on a single predictor is hardly suitable for the abrupt fluctuation of the blink number when drivers drive continuously for a long time. This is because the goal state generally exhibits a spatiotemporal behavior characterized by irregular randomness, and it is very difficult for a single prediction method to capture such a disturbed pattern.

In view of these deficiencies, some researchers have turned to multivariate modeling, in which models are developed by combining multiple methods to take advantage of the merits of each algorithm. A feasible fusion method that can effectively combine the predictions of single predictors is the Bayesian fusion algorithm (BFA) proposed by Petridis et al. [37]. The BFA generates a prediction by a weighted fusion of the forecasts of all its component predictors, based on posterior probabilities and the Bayesian rule. However, the BFA pays no attention to the relevance between the historical data flows and the current data flow. It assumes that, at a particular interval, the weights of the component predictors depend on their cumulative prediction performance over all past intervals. This can make the prediction quite insensitive to large fluctuations in the prediction accuracy of the component predictors. To overcome this problem, an improved BFA is proposed in this paper. The underlying assumption is that the data flow of blink number at a particular prediction interval is affected only by the data flows from a few previous time intervals, namely those with comparatively higher relevancy to it. Based on this assumption, only the prediction errors of the component predictors at these few intervals need to be considered when calculating their weights.

Simulation experiments of fatigue driving show that, compared with the conventional BFA, the improved BFA is more sensitive to fluctuations of the component predictors, adjusts their weights more rapidly, and obtains better forecasting accuracy and stability. Therefore, it can be used to judge in advance whether drivers are fatigued, based on the blink number forecasted for the next time period.

#### 2. Shortcoming of Conventional BFA

The conventional Bayesian fusion algorithm was originally proposed by Petridis et al. [37]. Its general idea is summarized as follows.

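Let $y_t$ denote the actual blink number at time interval $t$. Then, we have

$$y_t = \hat{y}_t^k + e_t^k, \tag{1}$$

where $k$ indexes the $k$th predictor, $\hat{y}_t^k$ is the predicted blink number at time interval $t$ by the $k$th predictor, and $e_t^k$ is the corresponding prediction error.

For a certain time interval, we can use $Z$ to denote the uncertainty about which predictor is the best model, with $Z = k$ indicating the $k$th predictor. The posterior probability of the $k$th predictor to be the best model at time interval $t$ is defined as

$$p_t^k = \Pr\left(Z = k \mid y_t, y_{t-1}, \ldots, y_1\right). \tag{2}$$

Then, according to Bayesian rule, we can obtain

$$p_t^k = \frac{\Pr\left(y_t \mid Z = k, y_{t-1}, \ldots, y_1\right) p_{t-1}^k}{\sum_{j=1}^{K} \Pr\left(y_t \mid Z = j, y_{t-1}, \ldots, y_1\right) p_{t-1}^j}, \tag{3}$$

where $K$ is the total number of component predictors; notice that

$$\Pr\left(y_t \mid Z = k, y_{t-1}, \ldots, y_1\right) = \Pr\left(e_t^k = y_t - \hat{y}_t^k\right). \tag{4}$$

Assuming that the prediction error $e_t^k$ is a Gaussian white noise time series with zero mean and standard deviation $\sigma_k$, we have

$$\Pr\left(e_t^k = y_t - \hat{y}_t^k\right) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\frac{\left(y_t - \hat{y}_t^k\right)^2}{2\sigma_k^2}\right). \tag{5}$$

Combining (3), (4), and (5) yields

$$p_t^k = \frac{\dfrac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\dfrac{\left(e_t^k\right)^2}{2\sigma_k^2}\right) p_{t-1}^k}{\sum_{j=1}^{K} \dfrac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\dfrac{\left(e_t^j\right)^2}{2\sigma_j^2}\right) p_{t-1}^j}. \tag{6}$$

Then, at time interval $t$, the prediction of the conventional BFA can be written as the linear combination of the output of all the predictors as follows:

$$\hat{y}_t = \sum_{k=1}^{K} p_t^k\, \hat{y}_t^k, \tag{7}$$

where $\hat{y}_t$ is the prediction generated by the conventional BFA and $\hat{y}_t^k$ is the result estimated by the $k$th predictor.

Similar to (6), $p_{t-1}^k$, $k = 1, 2, \ldots, K$, is formulated as

$$p_{t-1}^k = \frac{\dfrac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left(-\dfrac{\left(e_{t-1}^k\right)^2}{2\sigma_k^2}\right) p_{t-2}^k}{\sum_{j=1}^{K} \dfrac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\dfrac{\left(e_{t-1}^j\right)^2}{2\sigma_j^2}\right) p_{t-2}^j}. \tag{8}$$

Combining (6) and (8), the normalizing sum of (8) cancels between numerator and denominator, and we have

$$p_t^k = \frac{\left(\dfrac{1}{\sqrt{2\pi}\,\sigma_k}\right)^{2} \exp\left(-\dfrac{\left(e_t^k\right)^2 + \left(e_{t-1}^k\right)^2}{2\sigma_k^2}\right) p_{t-2}^k}{\sum_{j=1}^{K} \left(\dfrac{1}{\sqrt{2\pi}\,\sigma_j}\right)^{2} \exp\left(-\dfrac{\left(e_t^j\right)^2 + \left(e_{t-1}^j\right)^2}{2\sigma_j^2}\right) p_{t-2}^j}. \tag{9}$$

Similarly, we can formulate explicit expressions for $p_{t-2}^k, p_{t-3}^k, \ldots, p_1^k$. Substituting them into (9), we have

$$p_t^k = \frac{\left(\dfrac{1}{\sqrt{2\pi}\,\sigma_k}\right)^{t} \exp\left(-\sum_{i=1}^{t} \dfrac{\left(e_i^k\right)^2}{2\sigma_k^2}\right) p_0^k}{\sum_{j=1}^{K} \left(\dfrac{1}{\sqrt{2\pi}\,\sigma_j}\right)^{t} \exp\left(-\sum_{i=1}^{t} \dfrac{\left(e_i^j\right)^2}{2\sigma_j^2}\right) p_0^j}, \tag{10}$$

where $p_0^k$ is the prior probability of the $k$th predictor.

From (10), we can see that, rather than depending only on the prediction error at prediction interval $t$, the weight $p_t^k$ depends on the prediction errors of all past intervals. This characteristic makes the conventional BFA very inert to the fluctuating accuracy of component predictors. If the dominant predictor is no longer the most accurate, it will take many intervals to reduce the dominant status of that predictor, thus imposing a negative impact on the predictions of the conventional BFA.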
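The inertia described above can be illustrated with a small numerical sketch. It is not the authors' implementation: the function names, the two-predictor error values, the unit standard deviations, and the uniform starting weights are all assumptions chosen for illustration. The update applies the recursive rule of (6), that is, Gaussian likelihood of the current error times the previous weight, followed by normalization.

```python
import numpy as np

def gaussian_likelihood(errors, sigmas):
    """Zero-mean Gaussian density of each predictor's error (std sigma_k)."""
    errors = np.asarray(errors, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    return np.exp(-errors**2 / (2.0 * sigmas**2)) / (np.sqrt(2.0 * np.pi) * sigmas)

def update_weights(prev_weights, errors, sigmas):
    """One recursive update: likelihood times previous weight, then normalize."""
    unnormalized = gaussian_likelihood(errors, sigmas) * np.asarray(prev_weights, dtype=float)
    return unnormalized / unnormalized.sum()

def fuse(weights, predictions):
    """Weighted linear combination of the component predictions."""
    return float(np.dot(weights, predictions))

# Hypothetical scenario (assumed values, not data from the paper):
# predictor 0 is accurate for 20 intervals, then predictor 1 becomes accurate.
sigmas = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])          # assumed uniform prior
errors_phase1 = np.array([0.5, 2.0])    # predictor 0 is better
errors_phase2 = np.array([2.0, 0.0])    # predictor 1 is better

for _ in range(20):
    weights = update_weights(weights, errors_phase1, sigmas)
weights_after_phase1 = weights.copy()

# Count how many intervals pass before predictor 1 overtakes predictor 0.
intervals_to_switch = 0
while weights[1] <= weights[0]:
    weights = update_weights(weights, errors_phase2, sigmas)
    intervals_to_switch += 1

fused = fuse(weights_after_phase1, [10.0, 14.0])
print("weights after phase 1:", weights_after_phase1)
print("intervals needed to switch:", intervals_to_switch)
print("fused prediction after phase 1:", fused)
```

In this toy setting the accumulated evidence from the first phase has to be undone interval by interval, so the newly accurate predictor stays almost unweighted for many intervals even though its errors are clearly smaller.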

This problem arises because the conventional BFA does not consider the correlation between the historical data flows and the data flow at the prediction interval. Generally, only the data flows of blink number at the latest few intervals may strongly correlate with the data flow at a given prediction interval. For the data flows of blink number at earlier time intervals, they may have less impact on the current data flow.

#### 3. Improved Bayesian Fusion Algorithm

##### 3.1. Selection of Correlative Data Flow

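Most forecasting methods consider the blink data flows from only a few selected intervals. Inspired by this idea, this paper hypothesizes that, at a particular prediction interval, the data flow of blink number is affected only by the data flows of the few past intervals that have comparatively higher relevance with it. With this assumption, the weight of each component predictor in the BFA at a given prediction interval depends on the accumulated prediction performance over a few selected intervals rather than over all intervals. Based on this assumption, (1) can be rewritten as

$$y_i = \hat{y}_i^k + e_i^k, \quad i \in T_t, \tag{11}$$

where $y_i$, $\hat{y}_i^k$, and $e_i^k$ are the actual blink number, the forecast of the $k$th predictor, and its prediction error at interval $i$, and $T_t$ represents the set of the previous intervals at which the data flows of blink number have a comparatively higher correlation with the data flow at prediction interval $t$.

Following the same inferential procedures from (2) to (6) and from (9) to (10), we have

$$p_t^k = \frac{\left(\dfrac{1}{\sqrt{2\pi}\,\sigma_k}\right)^{n} \exp\left(-\sum_{i \in T_t} \dfrac{\left(e_i^k\right)^2}{2\sigma_k^2}\right) p_0^k}{\sum_{j=1}^{K} \left(\dfrac{1}{\sqrt{2\pi}\,\sigma_j}\right)^{n} \exp\left(-\sum_{i \in T_t} \dfrac{\left(e_i^j\right)^2}{2\sigma_j^2}\right) p_0^j}, \tag{12}$$

where $p_t^k$ is the weight of the $k$th of $K$ component predictors, $\sigma_k$ is the standard deviation of its prediction error, $p_0^k$ is its prior probability, and $n$ represents the dimension of set $T_t$. From (12), we can see that the weights of the component predictors depend only on their prediction errors at the $n$ previous intervals in $T_t$. Therefore, theoretically, the weights calculated by (12) are more sensitive to the fluctuating accuracy of component predictors, and the number of data flows used to calculate $p_t^k$ is also greatly reduced. Substituting it into (7), we can obtain the prediction of the improved BFA.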
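As a rough sketch of how weights restricted to a few selected intervals behave, the following code computes Gaussian-likelihood weights from the errors at a chosen index set only. The function name `windowed_weights`, the uniform default prior, the log-domain evaluation, and the error values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def windowed_weights(error_history, sigmas, selected, prior=None):
    """Predictor weights from the errors at the selected intervals only.

    error_history: array of shape (T, K), error of each of K predictors per interval.
    selected: indices of the past intervals judged relevant (the set of size n).
    prior: optional prior weights; a uniform prior is assumed when omitted.
    """
    errors = np.asarray(error_history, dtype=float)[list(selected)]
    sigmas = np.asarray(sigmas, dtype=float)
    n, num_predictors = errors.shape
    if prior is None:
        prior = np.full(num_predictors, 1.0 / num_predictors)
    # Work in the log domain to avoid underflow when many intervals are used.
    log_weight = (
        -n * np.log(np.sqrt(2.0 * np.pi) * sigmas)
        - (errors**2).sum(axis=0) / (2.0 * sigmas**2)
        + np.log(prior)
    )
    log_weight -= log_weight.max()
    weights = np.exp(log_weight)
    return weights / weights.sum()

# Hypothetical errors: predictor 0 is accurate for 20 intervals,
# then predictor 1 becomes accurate for the latest 3 intervals.
history = np.vstack([np.tile([0.5, 2.0], (20, 1)), np.tile([2.0, 0.0], (3, 1))])
sigmas = [1.0, 1.0]

w_all = windowed_weights(history, sigmas, selected=range(len(history)))
w_win = windowed_weights(history, sigmas, selected=[20, 21, 22])
print("weights using all intervals:   ", w_all)
print("weights using last 3 intervals:", w_win)
```

With all intervals included, the older errors keep predictor 0 dominant; with only the three most recent intervals selected, the weight moves to predictor 1 at once. Here the selected set is simply the latest intervals, whereas the paper chooses it by relevance analysis.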

##### 3.2. Grey Relation Entropy-Based Analysis

According to the discussion in the previous section, one challenge to implementing the improved BFA is how to identify the time interval set where the data flows of blink number have comparatively higher correlation with the data flows at the prediction interval. Because (12) includes a nonlinear, complex exponential function, the regression method is not applicable here. An alternative method is the grey relation entropy-based analysis (GREBA) [38, 39]. It has been extensively applied for relevance analysis in various disciplines, such as logistics [40], economics [41], and engineering [42]. In this paper, we use this method for relevance analysis between current and historical data flows of blink number.

Assume that, at the prediction interval $t$, the blink number is affected by the data flows of its previous $m$ intervals, listed as $t-1, t-2, \ldots, t-m$; $m$ should be large enough to cover most of the important data flows. We use $X_0$ and $X_i$ to express the target data flow time sequence at the $t$th interval and the alternative data flow time sequence at the $(t-i)$th interval, respectively; $i = 1, 2, \ldots, m$, where $m$ is the number of alternative data flow time sequences; $l = 1, 2, \ldots, n$, where $n$ is the length of the target and alternative data flow time sequences; and $X_0 = \{x_0(1), \ldots, x_0(n)\}$, $X_i = \{x_i(1), \ldots, x_i(n)\}$. Here, the sequences $X_0$ and $X_i$ are comparable because they are obtained from the same data sequence; therefore, it is unnecessary to normalize the original data. The grey relation coefficient between $X_0$ and $X_i$ is then calculated as

$$\xi_i(l) = \frac{\min_i \min_l \left|x_0(l) - x_i(l)\right| + \rho \max_i \max_l \left|x_0(l) - x_i(l)\right|}{\left|x_0(l) - x_i(l)\right| + \rho \max_i \max_l \left|x_0(l) - x_i(l)\right|},$$

where $\rho$ is the distinguishing coefficient, which takes a value between 0 and 1 and can be adjusted to make a better distinction between the target sequence and the alternative sequence.

To satisfy the rule of grey entropy, the grey relation coefficients should be transformed into a grey relation density $P_i(l)$, calculated by

$$P_i(l) = \frac{\xi_i(l)}{\sum_{l=1}^{n} \xi_i(l)}.$$

Once the grey relation density is obtained, the entropy of the grey relation coefficient of each alternative $i$, $E_i$, is then computable, which represents the relevance degree of the grey relation coefficient. The calculation is shown below as

$$H_i = -\sum_{l=1}^{n} P_i(l) \ln P_i(l), \qquad E_i = \frac{H_i}{H_{\max}},$$

where $H_i$ is the grey entropy between the target data flow time sequence $X_0$ and the alternative data flow time sequence $X_i$ and $H_{\max} = \ln n$ is the maximum grey entropy, which guarantees $0 \le E_i \le 1$.

Finally, by multiplying the entropy and the average grey relation coefficient of alternative $i$, we can obtain the grey relevancy grade (GRG). In this study, the GRG is defined as the numerical measurement of the relevancy between the alternative data flow time sequence $X_i$ and the target data flow time sequence $X_0$, which is calculated as

$$R_i = E_i \cdot \frac{1}{n} \sum_{l=1}^{n} \xi_i(l),$$

where $R_i$ is the GRG of the alternative time sequence $X_i$ with respect to the target sequence $X_0$. It is distributed between 0 and 1. Based on the definition of the GRG, it can be seen that the higher $R_i$ is, the more relevance there is between the alternative data flow time sequence $X_i$ and the target data flow time sequence $X_0$.
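The GREBA steps above (grey relation coefficient, grey relation density, relative entropy, and GRG) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the choice of 0.5 for the distinguishing coefficient, the handling of identical sequences, and the sample sequences are hypothetical and not taken from the paper.

```python
import numpy as np

def grey_relevancy_grades(target, alternatives, rho=0.5):
    """Grey relevancy grade of each alternative sequence w.r.t. the target.

    target: sequence of length n (n >= 2).
    alternatives: array of shape (m, n), one alternative sequence per row.
    rho: distinguishing coefficient (0.5 is an assumed, commonly used value).
    """
    target = np.asarray(target, dtype=float)
    alts = np.asarray(alternatives, dtype=float)
    n = target.shape[0]
    diff = np.abs(alts - target)               # absolute differences, shape (m, n)
    d_min, d_max = diff.min(), diff.max()      # global min and max over all i and l
    if d_max == 0.0:
        # All sequences coincide with the target; treat every coefficient as 1.
        coeff = np.ones_like(diff)
    else:
        coeff = (d_min + rho * d_max) / (diff + rho * d_max)
    density = coeff / coeff.sum(axis=1, keepdims=True)   # grey relation density
    entropy = -(density * np.log(density)).sum(axis=1)   # grey entropy per alternative
    relative_entropy = entropy / np.log(n)                # divide by maximum entropy ln(n)
    return relative_entropy * coeff.mean(axis=1)          # entropy times mean coefficient

def most_relevant(grades, top):
    """Indices of the `top` alternatives with the highest grades."""
    return np.argsort(-np.asarray(grades))[:top]

# Hypothetical blink-number sequences (assumed values for illustration only).
target = [3, 4, 5, 6, 7]
alternatives = [
    [3, 4, 5, 6, 8],   # close to the target
    [7, 2, 9, 1, 4],   # far from the target
]
grg = grey_relevancy_grades(target, alternatives)
ranked = most_relevant(grg, top=1)
print("grey relevancy grades:", grg)
print("most relevant alternative:", ranked)
```

The alternative that tracks the target closely receives the higher grade, so it would be the one retained when selecting the relevant intervals for the weight calculation.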

##### 3.3. Procedures of the Improved BFA

Based on the above discussions, the procedures of the improved BFA are described in Figure 1 and detailed steps are as follows.