Abstract

Driver fatigue is a leading cause of traffic accidents, and extracting effective fatigue features is critical for recognition accuracy and traffic safety. To address this problem, this paper proposes a new method of driver fatigue feature extraction based on facial image sequences. In this method, each facial image in the sequence is first divided into nonoverlapping blocks of the same size, and Gabor wavelets are employed to extract multiscale and multiorientation features. The mean value and standard deviation of each block's features are then calculated. Considering that the facial manifestation of human fatigue is a dynamic process that develops over time, each block's features are analyzed across the sequence. Finally, the AdaBoost algorithm is applied to select the most discriminating fatigue features. The proposed method was tested on a self-built database that includes a wide range of human subjects of different genders, poses, and illuminations under real-life fatigue conditions. Experimental results show the effectiveness of the proposed method.

1. Introduction

Traffic safety is a problem faced by most countries in the world. Road traffic accidents number about 10 million each year [1], accounting for roughly 90% of global safety incidents; a majority of these accidents are caused by driver fatigue. The ability to effectively identify driver fatigue and issue timely warnings is therefore key to safe driving.

Current methods for the detection of driver fatigue can be broadly grouped into three categories. The first is identification based on physiological signals, such as electroencephalogram (EEG), electrocardiogram (ECG), pulse, and blood pressure [2–4]. The second is identification through driver behavior, including the driver's grip force on the steering wheel, speed, acceleration, and braking [5, 6]. The first category achieves high recognition accuracy but is not easily accepted by drivers, as measuring heart rate, breathing rate, and brain activity is intrusive; the second can be implemented nonintrusively but is subject to several limitations, including driver experience, vehicle type, and driving conditions. With the rapid development of computer vision technology, a third category has been put forward: fatigue detection based on computer vision, which uses optical sensors or cameras to capture fatigue features and then analyzes whether a driver is fatigued. Owing to its nonintrusiveness and high recognition accuracy, this approach has become increasingly practical and popular.

In recent decades, computer-vision-based methods have worked primarily by extracting features of the eyes and mouth. The most representative is the Perclos (percentage of eyelid closure) method [7]. Wierwille et al. [8] first used Perclos as a metric of the degree of fatigue in a research project funded by NHTSA to monitor driver fatigue. Dinges et al. [9] then showed that Perclos could accurately reflect the driver's fatigue level. Other scholars have tried to detect fatigue through other eye features. Heitmann et al. [10] tried to detect the physical and mental condition of the driver through gaze direction and changes in pupil diameter. Ito et al. [11] used eye blink frequency to detect fatigue. Azim et al. [12] extracted features from the eyes and mouth to monitor fatigue. Liu et al. [13] combined a Kalman filter and mean shift to track the eyes, extracting eye motion information as driver features. Dong and Wu [14] used the distance between the eyelids to decide whether the driver was fatigued.

Mouth features are also often extracted to detect fatigue. Wang and Shi [15] identified and located the mouth using a priori knowledge and then used the degree of mouth openness to determine whether the driver was yawning. Fan et al. [16] extracted features through analysis of the degree of mouth openness. Chu et al. [17] used the Fisher classifier to extract the mouth position and shape, used the geometry of the mouth region as feature values, and assembled these features into an eigenvector as the input of a three-level BP network, whose output distinguished three mental states: normal, speaking, and dozing.

The review above shows that previous work focused only on features of the eyes or mouth. As a result, the extracted features could not fully represent the state of driver fatigue. Moreover, these methods assume that the positions of the eyes and mouth can be accurately located, which is difficult under real-life driving conditions. In addition, in previous work, the features of a single image were used to detect whether the driver is tired. In some cases these methods achieved good results, but the facial manifestation of human fatigue is a process that changes over time; methods based on a single face image cannot reflect this dynamic process, leading to a high false detecting rate. Therefore, in this paper we extract fatigue features from the whole face by multiscale and multiorientation Gabor wavelets and then analyze the features across the image sequence. Finally, the AdaBoost algorithm is applied to select the most discriminating fatigue features.

The rest of the paper is organized as follows. Section 2 introduces feature extraction based on Gabor wavelets. Section 3 presents fatigue feature extraction from the image sequence. The approach to feature selection based on AdaBoost is discussed in Section 4. Experimental results are given in Section 5, and conclusions follow in Section 6.

2. Features Extraction Based on Gabor Wavelets

Currently, most extracted fatigue features are geometric features of the eyes or mouth. However, accurate extraction of these features requires precise positioning, which is difficult under real-life driving conditions. This problem can be addressed by analyzing the texture features of fatigue. In recent decades, many scholars have explored methods of texture feature extraction, including those based on the cooccurrence matrix [18] and wavelet transformation [19–21]. Smith and Chang [22] extracted texture features from wavelet subbands, and experiments showed that this achieves better classification results. Ma and Manjunath [23] compared varieties of wavelet transformations, such as orthogonal wavelets, biorthogonal wavelets, tree-structured wavelets, and Gabor wavelets, and found the Gabor wavelet transform to be optimal. Therefore, this paper uses Gabor wavelets to extract the texture features of fatigue.

2.1. Gabor Wavelets Transformation

Daugman [24] extended the Gabor filter from one dimension to two dimensions. The two-dimensional Gabor wavelets, a set of filter components of different orientations and different scales, can analyze gray-level changes of an image at various scales and orientations. They have been used in a variety of image-processing applications such as face recognition [25, 26]. The two-dimensional Gabor wavelets can be defined as

$$\psi_{u,v}(z) = \frac{\|k_{u,v}\|^2}{\sigma^2} \exp\!\left(-\frac{\|k_{u,v}\|^2 \|z\|^2}{2\sigma^2}\right) \left[\exp(i\, k_{u,v} \cdot z) - \exp\!\left(-\frac{\sigma^2}{2}\right)\right],$$

where $u$ and $v$, respectively, represent the orientation and scale of the Gabor wavelets, $i$ is the complex operator, $\sigma$ defines the bandwidth of the wavelet filter, $z = (x, y)$ is the pixel coordinates, $k_v = k_{\max}/f^v$ is the wavelet frequency, and $k_{u,v} = k_v e^{i\phi_u}$ with $\phi_u = \pi u / 8$ represents the wave vector.

When a subject is fatigued, the features of different facial behaviors appear at different scales. In order to extract all the important fatigue features, the paper convolves the face image with 5-scale, 8-orientation Gabor wavelets, defined as

$$G_{u,v}(x, y) = I(x, y) * \psi_{u,v}(x, y),$$

where $*$ represents the convolution operation, $I(x, y)$ is the pixel value at the point $(x, y)$, and $G_{u,v}(x, y)$ is the filtering result at orientation $u$ and scale $v$.

Before the Gabor wavelets are applied to each image in the sequence, the original face images are preprocessed: converted to gray scale and normalized to a size of 64 × 64. Each image in the sequence is then convolved with the 40 Gabor wavelets, resulting in 40 multiscale and multiorientation feature images. Figure 1 shows the procedure.
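To make the filtering step concrete, the following is a minimal Python sketch of the 5-scale, 8-orientation filter bank, following the standard two-dimensional Gabor wavelet form given above. The kernel size and the parameter values (σ = 2π, k_max = π/2, f = √2) are common defaults from the face-analysis literature and are assumptions of this sketch, not values reported in the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(u, v, size=31, sigma=2 * np.pi, k_max=np.pi / 2, f=np.sqrt(2)):
    """Complex 2D Gabor wavelet at orientation u (0..7) and scale v (0..4)."""
    k = k_max / f ** v                          # wavelet frequency k_v
    phi = np.pi * u / 8                         # orientation angle phi_u
    kx, ky = k * np.cos(phi), k * np.sin(phi)   # wave vector k_{u,v}
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = (k ** 2 / sigma ** 2) * np.exp(
        -k ** 2 * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    # oscillatory carrier minus the DC compensation term
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * carrier

def gabor_magnitudes(image):
    """Convolve one 64x64 grayscale image with all 40 wavelets."""
    return np.stack([np.abs(fftconvolve(image, gabor_kernel(u, v), mode="same"))
                     for v in range(5) for u in range(8)])   # (40, 64, 64)

face = np.random.rand(64, 64)     # stand-in for a preprocessed, normalized face
maps = gabor_magnitudes(face)     # 40 multiscale, multiorientation feature maps
```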

2.2. Block Features Extraction

Using Gabor wavelets to extract multiorientation and multiscale features from the whole face leads to an extremely high dimensionality, which is not conducive to fast feature extraction. To solve this problem, each face image in the sequence is divided into nonoverlapping blocks of the same size, Gabor wavelets are employed to extract multiorientation and multiscale features, and finally the mean value and standard deviation of each block's features are calculated. Figure 2 illustrates the process of block feature extraction.

The mean value and standard deviation of each block's features are defined as follows.

(1) Calculate the mean value of all features in the block:
$$\mu_b = \frac{1}{N} \sum_{(x, y) \in b} \left|G_{u,v}(x, y)\right|.$$

(2) Calculate the standard deviation of all features in the block:
$$\sigma_b = \sqrt{\frac{1}{N} \sum_{(x, y) \in b} \left(\left|G_{u,v}(x, y)\right| - \mu_b\right)^2},$$

where $b$ stands for the block and $N$ is the number of all features in the block.

In this paper, the face image is divided into 8 × 8 blocks. When an image of size 64 × 64 is transformed by the Gabor wavelets, the feature dimensionality reaches 64 × 64 × 40 = 163,840. By calculating the mean and standard deviation of each block's features, however, the dimensionality is reduced to 2 × 40 × 8 × 8 = 5,120, which is much smaller than 163,840 and is appropriate for fast feature extraction.
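Below is a minimal sketch of the block-statistics step under the same assumptions as the previous sketch: the 40 magnitude maps of a 64 × 64 image are tiled into 8 × 8 nonoverlapping blocks, and only each block's mean and standard deviation are kept, reducing 163,840 raw responses to the 5,120 features counted above.

```python
import numpy as np

def block_features(maps, blocks=8):
    """maps: (40, 64, 64) Gabor magnitudes -> (5120,) block mean/std vector."""
    n, h, w = maps.shape
    bh, bw = h // blocks, w // blocks          # 8x8 pixels per block here
    # reshape so each nonoverlapping block becomes one contiguous tile
    tiles = maps.reshape(n, blocks, bh, blocks, bw)
    means = tiles.mean(axis=(2, 4))            # (40, 8, 8) per-block means
    stds = tiles.std(axis=(2, 4))              # (40, 8, 8) per-block std devs
    return np.concatenate([means.ravel(), stds.ravel()])  # 2*40*8*8 = 5120

features = block_features(maps)                # 'maps' from the previous sketch
assert features.size == 5120
```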

3. Extraction of Features from the Sequence of Images

In previous work, the features of a single image were used to detect whether the driver is tired, and in some cases these methods achieved good results. But the facial manifestation of human fatigue is a process that changes over time, and methods based on a single face image may produce a high false detecting rate. For example, when a person is tired, he may yawn with his mouth wide open, so the height of the mouth can be used to detect fatigue. However, the mouth also opens wide during sneezing, and the difference between sneezing and yawning often cannot be captured by the features of a single image. Observing the two processes, the height of the mouth changes quickly during a sneeze but relatively slowly during a yawn, so the information hidden in a sequence of images makes it easy to discriminate between them. Based on this observation, human fatigue is a state that develops over time and should be reflected by dynamic features. Therefore, this paper uses nine formulas to extract features from the image sequence that reflect the dynamic process of fatigue: peak value, mean, standard deviation, root of mean square, shape factor, skewness, kurtosis, crest factor, and pulse index.

For a given sequence of $T$ images, each image is labeled $I_t$, where $t = 1, 2, \ldots, T$ is the index of the image, and each image is divided into blocks. The features of a given block across the sequence are denoted $x_t$, so the block's feature sequence can be represented as $X = \{x_1, x_2, \ldots, x_T\}$. This paper fuses the features $x_t$ of each block at the same position, same scale, and same orientation by nine formulas, defined as follows:

(1) peak value: $X_p = \max_{t} |x_t|$;
(2) mean: $\bar{x} = \frac{1}{T} \sum_{t=1}^{T} x_t$;
(3) standard deviation: $\sigma = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} (x_t - \bar{x})^2}$;
(4) root of mean square: $X_{\mathrm{rms}} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} x_t^2}$;
(5) shape factor: $S_f = X_{\mathrm{rms}} \big/ \frac{1}{T} \sum_{t=1}^{T} |x_t|$;
(6) skewness: $S_k = \frac{1}{T} \sum_{t=1}^{T} \left(\frac{x_t - \bar{x}}{\sigma}\right)^3$;
(7) kurtosis: $K = \frac{1}{T} \sum_{t=1}^{T} \left(\frac{x_t - \bar{x}}{\sigma}\right)^4$;
(8) crest factor: $C_f = X_p / X_{\mathrm{rms}}$;
(9) pulse index: $I_f = X_p \big/ \frac{1}{T} \sum_{t=1}^{T} |x_t|$,

where $x_t$ is the feature of the block in image $t$, $X$ is the block's feature sequence, $T$ is the number of images in a sequence, $\bar{x}$ is the mean of $x_t$, and $\sigma$ is the standard deviation of $x_t$. Figure 3 illustrates the procedure of feature extraction in the sequence.
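Here is a minimal sketch of the nine statistics applied to one block feature tracked over a $T$-image sequence, using the time-domain definitions reconstructed above. The sample ($T-1$) normalization of the standard deviation is an assumption; the paper does not state which normalization it uses.

```python
import numpy as np

def sequence_features(x):
    """x: one block feature over T frames -> the nine sequence statistics."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))                 # (1) peak value
    mean = x.mean()                          # (2) mean
    std = x.std(ddof=1)                      # (3) standard deviation (sample)
    rms = np.sqrt(np.mean(x ** 2))           # (4) root of mean square
    abs_mean = np.mean(np.abs(x))
    return np.array([
        peak,
        mean,
        std,
        rms,
        rms / abs_mean,                      # (5) shape factor
        np.mean(((x - mean) / std) ** 3),    # (6) skewness
        np.mean(((x - mean) / std) ** 4),    # (7) kurtosis
        peak / rms,                          # (8) crest factor
        peak / abs_mean,                     # (9) pulse index
    ])

track = np.random.rand(5)                    # one block feature over T = 5 frames
stats = sequence_features(track)             # 9 dynamic features for this block
```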

Through the nine formulas above, all the fatigue features are extracted. But these features are not of equal importance; some may be redundant or irrelevant. In the next section, the AdaBoost algorithm is used to select the most discriminating fatigue features.

4. Features Selection

AdaBoost, introduced by Freund and Schapire [27], is a learning algorithm that selects a small number of weak classifiers from a large pool of weak classifiers to construct a strong classifier. Due to its good generalization capability, low implementation complexity, and fast performance, AdaBoost has been used successfully in face detection [28, 29], face recognition [30], facial expression recognition [31], and other applications. In this paper, the AdaBoost algorithm is used to select fatigue features: each weak classifier corresponds to exactly one feature, so selecting weak classifiers amounts to selecting features.

The training set with $l$ positive samples and $m$ negative samples is represented as $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, where $y_i = 0$ marks a negative sample and $y_i = 1$ a positive sample, and $n = l + m$ is the total number of samples. The algorithm to select features is as follows:

(1) initialize weights: $w_{1,i} = 1/(2m)$ for $y_i = 0$ and $w_{1,i} = 1/(2l)$ for $y_i = 1$;
(2) normalize weights: $w_{t,i} \leftarrow w_{t,i} / \sum_{j=1}^{n} w_{t,j}$;
(3) set weak classifiers: $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$ and $h_j(x) = 0$ otherwise, where $f_j$ is a feature, $p_j \in \{+1, -1\}$ is a parity for the inequality sign, and $\theta_j$ is the classification threshold of the weak classifier;
(4) select the classifier $h_t$ of minimum weighted error $\varepsilon_t$: $\varepsilon_t = \min_{f, p, \theta} \sum_{i} w_{t,i} \left|h(x_i, f, p, \theta) - y_i\right|$;
(5) update the weights: $w_{t+1,i} = w_{t,i} \beta_t^{1 - e_i}$, where $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$ and $e_i = 0$ if sample $x_i$ is classified correctly, $e_i = 1$ otherwise;
(6) if $t < T$, go back to (2); else end the cycle.
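Below is a minimal sketch of this selection loop in the Viola-Jones style reconstructed above: each weak classifier is a one-feature threshold (decision stump), so picking the minimum-weighted-error stump in each round selects one feature. The exhaustive threshold search over observed feature values is a simplification assumed by this sketch.

```python
import numpy as np

def best_stump(X, y, w):
    """Return (feature index j, parity p, threshold theta, weighted error)."""
    best = (0, 1, 0.0, np.inf)
    for j in range(X.shape[1]):
        for theta in np.unique(X[:, j]):
            for p in (1, -1):
                pred = (p * X[:, j] < p * theta).astype(int)
                err = np.sum(w * (pred != y))
                if err < best[3]:
                    best = (j, p, theta, err)
    return best

def adaboost_select(X, y, rounds):
    """X: (n, d) features, y in {0, 1}. Return indices of selected features."""
    m, l = np.sum(y == 0), np.sum(y == 1)
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))   # step (1)
    selected = []
    for _ in range(rounds):
        w = w / w.sum()                                   # step (2)
        j, p, theta, err = best_stump(X, y, w)            # steps (3)-(4)
        selected.append(j)
        if err <= 0:          # a perfect stump separates the data; stop early
            break
        pred = (p * X[:, j] < p * theta).astype(int)
        beta = err / (1.0 - err)                          # step (5):
        w = w * beta ** (pred == y)                       # downweight correct hits
    return selected

X = np.random.rand(100, 20)             # toy data: 100 samples, 20 features
y = (X[:, 3] > 0.5).astype(int)         # feature 3 is informative by design
print(adaboost_select(X, y, rounds=5))  # feature 3 should be picked first
```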

After all iterations of the above algorithm, test samples and training samples are used for cross validation. Finally, the most discriminating fatigue features are chosen.

5. Experiments

In order to verify the effectiveness of the proposed method, this section tests it on the self-built database, which contains a wide range of human subjects of different genders, poses, and illuminations under real-life fatigue conditions. Experimental results are presented below.

5.1. Experimental Data

There are many public databases for face detection and facial expression recognition, but no public driver fatigue database is available. To test the proposed method, the research team built a fatigue face database. Video data were captured with Web cameras at 320 × 240 pixel resolution and a transmission rate of 20 frames/s. The database contains 1000 image sequences of 30 men and 20 women. Examples of image sequences are shown in Figure 4.

In order to reflect the real driving environment, sequences with different illuminations and poses are included in the database, as shown in Figure 5.

To verify the advantage of extracting fatigue features from image sequences, the database also includes some image sequences of pseudofatigue like sneezing, as shown in Figure 6.

5.2. Decision of the Number of Blocks

In this paper, each face image in the sequence is divided into nonoverlapping blocks of the same size. The number of blocks affects both the feature dimensionality and the recognition accuracy. On one hand, if the number is too small, each block area is too large, and the extracted features are too coarse to precisely describe the fatigue state. On the other hand, too many blocks cause a high feature dimensionality.

Considering the above, the number of blocks is examined in the following situations: 1 × 1 = 1, 2 × 2 = 4, 2 × 4 = 8, 4 × 4 = 16, 4 × 8 = 32, 8 × 8 = 64, 8 × 16 = 128, and 16 × 16 = 256. In order to test the recognition accuracy of the different situations, a support vector machine is used to classify fatigue. First, 500 sequences of 25 people are randomly selected as training samples, and the other 500 sequences of 25 people serve as test samples; then the training set and test set are exchanged for cross validation. The commonly used performance criterion, the correct rate, is adopted, defined as

$$\text{correct rate} = \frac{N_{aa} + N_{ff}}{N_a + N_f},$$

where $N_{aa}$ represents the number of alert samples classified as alert, $N_{ff}$ represents the number of fatigue samples classified as fatigue, and $N_a$ and $N_f$ are the numbers of alert and fatigue test samples, respectively. Experimental results are shown in Figure 7.
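As a small illustration, the following sketches the evaluation protocol: a two-fold swap (train on one half, test on the other, then exchange) with an SVM, scored by the correct rate above. The 0 = alert / 1 = fatigue coding and the toy data shapes are assumptions of this sketch.

```python
import numpy as np
from sklearn.svm import SVC

def correct_rate(y_true, y_pred):
    """(alert classified alert + fatigue classified fatigue) / all test samples."""
    n_aa = np.sum((y_true == 0) & (y_pred == 0))  # alert recognized as alert
    n_ff = np.sum((y_true == 1) & (y_pred == 1))  # fatigue recognized as fatigue
    return (n_aa + n_ff) / y_true.size

# toy stand-ins: 1000 sequences, 5120-dim block features, binary labels
X = np.random.rand(1000, 5120)
y = np.random.randint(0, 2, 1000)
half = 500
rates = []
for train, test in [(slice(0, half), slice(half, None)),
                    (slice(half, None), slice(0, half))]:  # swap the two folds
    clf = SVC().fit(X[train], y[train])
    rates.append(correct_rate(y[test], clf.predict(X[test])))
print(np.mean(rates))
```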

As can be seen from Figure 7, the correct rate increases with the number of blocks; when the number reaches 8 × 8 = 64, the correct rate peaks at 98.3%. When the number of blocks is increased further, the correct rate decreases. This result confirms the preceding analysis. Therefore, the final number of blocks used in this paper is 8 × 8 = 64.

5.3. Analysis of Selected Features

In Section 4, the AdaBoost algorithm is used to select the most discriminating features. In order to verify the effectiveness of the selected features, the following two aspects are discussed.

5.3.1. Relationship between the Dimension and Correct Rate

Different numbers of selected features lead to different recognition accuracies, so we need to analyze how many features should be selected to obtain a high correct rate. An experiment on the relationship between the number of selected features and the correct rate was performed; the results are shown in Figure 8.

As can be seen from Figure 8, when the number of selected features is low, the correct rate increases quickly, while at higher dimensions it increases slowly. The best correct rate (98.3%) is reached when the number of selected features is 346, far smaller than the total number of features (64 × 5 × 8 × 9 × 2 = 46,080). This is appropriate for fast extraction and accurate classification.

5.3.2. Distribution of the Selected Features

In the proposed method, the AdaBoost algorithm is used to select the most discriminative features. To observe the characteristics of the selected features intuitively, their distribution is discussed in this section. The first analysis concerns the scale and orientation distribution of the selected features; Table 1 shows the results.

It can be seen from Table 1 that the selected features are distributed across every scale and orientation, which shows that multiscale and multiorientation Gabor wavelets are necessary to extract the fatigue features. Meanwhile, the number varies across scales and orientations. For example, in terms of scale, more selected features lie on scales 3 and 4, while scales 0, 1, and 2 carry fewer. Similarly, the distribution of the selected features differs across orientations.

Figure 9 shows the division of the face image and uses brightness to show the distribution of the selected features across blocks: the lighter the block, the more features were selected from it, and vice versa. The distribution of the selected features across blocks is shown in Figure 10, where the value on the x-axis corresponds to the block number in Figure 9. From the histogram, it can be concluded that features of different blocks contribute quite differently to fatigue identification. On nearly one-third of the face, no feature is selected, while in the eye and mouth regions the numbers of selected features are large, together accounting for more than 50% of the selected features, indicating their large contribution to fatigue identification and explaining why previous work focused on the eyes and mouth. But many features are also selected from other parts of the face that previous work ignored. Thus, this paper's approach of extracting fatigue features from the whole face can more accurately reflect the state of driver fatigue.

5.4. Advantages of Extracting Sequence Features

This section compares the method based on image sequences with the method based on a single image to verify the advantage of extracting fatigue features from sequences. The first, method 1, is the one proposed in this paper; the second, method 2, uses no sequence features. Since the camera captures images faster than human expressions change, consecutive frames are nearly identical, so only the middle frame of every five frames is used. In method 2, each chosen middle frame is treated as a single face image to detect fatigue. In method 1, once the system has collected five successive middle frames out of twenty-five frames of the sequence, features are extracted by the proposed algorithm and an SVM classifier is used to recognize fatigue. To evaluate their performance, the false detecting rate and the missing rate are introduced, defined as

$$\text{false detecting rate} = \frac{N_f}{N_d}, \qquad \text{missing rate} = \frac{N_m}{N_c + N_m},$$

where $N_d$ represents the number of actually detected fatigue events, $N_c$ the number of accurately detected fatigue events, $N_f$ the number of detected pseudofatigue events, and $N_m$ the number of fatigue events not detected.
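For clarity, the two rates as reconstructed above follow directly from the four counts; the function and variable names below are illustrative, not from the paper.

```python
def false_detecting_rate(n_pseudo, n_detected):
    """Pseudofatigue detections (N_f) over all fatigue detections (N_d)."""
    return n_pseudo / n_detected

def missing_rate(n_missed, n_correct):
    """Undetected fatigue events (N_m) over all true fatigue events (N_c + N_m)."""
    return n_missed / (n_correct + n_missed)

# e.g. 2 pseudofatigue detections out of 40 detections; 3 missed of 41 true events
print(false_detecting_rate(2, 40), missing_rate(3, 38))
```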

The results are shown in Table 2, where the missing rates of methods 1 and 2 are almost the same but the false detecting rate of method 2 is higher. This is mainly because the facial manifestation of fatigue changes over time, and a single image cannot truly reflect the state of fatigue; as a result, pseudofatigue such as sneezing is judged as fatigue, causing false detections.

In order to reflect the advantage of the proposed method more intuitively, image sequences containing a stage of increasing fatigue, a stage of diminishing fatigue, and a pseudofatigue stage were captured by the camera. The sequence lasts about 20 s and includes 400 frames. Experimental results of methods 1 and 2 are shown in Figures 11 and 12, respectively. In these figures, the y-axis indicates the degree of fatigue: a value below 0 indicates an alert state, a value above 0 indicates a fatigue state, and the higher the value, the more fatigued the driver.

Figure 13 compares the fitted results of method 1, method 2, and the real situation. Both methods track the real fatigue level closely while fatigue is increasing or decreasing. But when fatigue disappears and the driver sneezes, around the 200th frame, method 2 makes an error while method 1 stays close to the real situation. This result demonstrates the advantage of the method proposed in this paper, which not only ensures the accuracy of fatigue detection but also reduces the false judgments caused by occasional mouth opening.

5.5. Comparison with the Existing Methods

In this section, two existing methods are compared with this paper's driver fatigue feature extraction to determine which is more effective. One method, proposed by Fan et al. [16], extracts features through analysis of the degree of mouth openness. The other, proposed by Azim et al. [12], extracts features from the eyes and mouth to monitor fatigue. The experiment, which uses the same data as Section 5.4, was run on a PC with a Core i5 2.66 GHz CPU, 4 GB of memory, and the Windows XP operating system. Figure 14 shows the result.

As can be seen from Figure 14, the result of this paper's method is superior to those of Azim et al. [12] and Fan et al. [16]. There are two advantages.

(a) This paper's method achieves the highest correct rate. The average correct rates of Fan et al. [16] and Azim et al. [12] are 84.7% and 91.4%, respectively, while this paper's method achieves a very encouraging 98.2%. Figure 14 also shows that this paper's method is closer to the real situation. The main reason is that this paper's method extracts fatigue features from the whole face, whereas the methods of Azim et al. [12] and Fan et al. [16] extract features only from the mouth or eyes, losing some important information; as a result, their extracted features cannot fully represent the state of driver fatigue.

(b) The false detecting rates of Azim et al. [12] and Fan et al. [16] are higher than that of this paper's method, as shown in Figure 14. From the 200th frame to the 250th frame, there is sneezing in the video. The methods of Azim et al. [12] and Fan et al. [16] cannot effectively distinguish sneezing from yawning, resulting in false detections, mainly because they do not analyze the dynamic process of fatigue. This paper's method avoids this problem by using the nine formulas to extract dynamic features from the sequence of images.

6. Conclusions and Future Work

Fatigue detection is a key step in constructing a driver fatigue monitoring system. One of the core issues is how to extract and select effective features from driver images. In this paper, multiscale and multiorientation Gabor wavelets are used to extract fatigue features from the whole face, overcoming the loss of important fatigue features that occurs when extracting features from the eyes or mouth only. In addition, taking into account the fact that the facial manifestation of human fatigue is a dynamic process that changes over time, fatigue features are extracted from image sequences. Experimental results show that the proposed method reduces both the false detecting rate and the missing rate. Future work will focus on detecting driver fatigue in special circumstances, such as when the driver wears sunglasses or a mask.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.