Abstract

We present an intelligent driver identification system that addresses vehicle theft by modeling dynamic human behaviors. We propose to recognize illegitimate drivers through their driving behaviors. Because human driving behavior is a dynamic biometric feature that is complex and difficult to imitate compared with static features such as passwords and fingerprints, we argue that using dynamic human features for enhanced security applications is more effective. In this paper, we first describe our experimental platform for collecting and modeling human driving behaviors. We then compare the fast Fourier transform (FFT), principal component analysis (PCA), and independent component analysis (ICA) for data preprocessing. Using the support vector machine (SVM) as the machine learning method, we derive individual driving behavior models and demonstrate the procedure for recognizing different drivers by analyzing the corresponding models. The experimental results of the learning algorithms and their evaluation are described.

1. Introduction

Automobiles are by now indispensable to our personal lives, as well as to business, public services, and even national security, but car theft is a real problem that seriously threatens automobile security. According to the National Insurance Crime Bureau, a vehicle is stolen every 25 seconds in the U.S.A. Each year, over 1.2 million vehicles are stolen across the country, causing a loss of 8 billion US dollars. Therefore, work on vehicle security is significant.

Not surprisingly, many classical intelligent technologies are already well established within the automotive industry for vehicle security. GM has developed a system [1] that can supply vehicle information to the infrastructure once the owner has reported a theft. However, it offers no way for the vehicle to detect the theft automatically. A large number of vehicle companies have commercialized intelligent electronic car keys [2]; however, their “intelligence” is limited to activating the vehicle remotely. Deeper research into the intelligent recognition of vehicle theft will boost vehicle intelligence significantly.

In the past decade, a number of groups have participated in research on intelligent vehicles, which has led to projects including ARGO [3], ROVA [4], and MOSFET [5]. These vehicles mainly apply machine vision techniques to perform road-condition analysis and assist people in driving.

In recent years, the paradigm of learning human behaviors has attracted a considerable amount of attention. It is difficult to translate the desired instructions into specific and proper code statements. In the past decade, several researchers have proposed various experimental designs and applications [6–8]. Significant research on learning skills directly from humans for intelligent vehicles has been conducted primarily by the Navlab group at CMU [9] and our group at CUHK [10, 11].

The support vector machine (SVM) has recently become popular in machine learning. SVM is a new learning-by-example paradigm spanning a broad range of classification, regression, and density estimation problems. This systematic approach, motivated by statistical learning theory, combines ideas from various scientific branches: mathematical programming, which exploits quadratic programming for convex optimization; functional analysis, which indicates adequate methods for kernel representations; and machine learning theory, which explores the large margin classifier concept [12]. It was first introduced by Vapnik and co-workers and is described in more detail in [13, 14]. The roots of this approach, the so-called support vector (SV) methods of constructing the optimal separating hyperplane for pattern recognition, had already been presented and used in machine learning in [15]. The SV technique was generalized to nonlinear separating surfaces in [16] and was further extended to constructing decision rules in the nonseparable case [17]. The training task involves the optimization of a convex cost function, which yields a technique without local minima.

SVM has been applied to many areas, such as pattern recognition, regression, and equalization [18, 19]. It has been adopted in applications such as dynamic robot control [20], space robot control [21], image classification [22], and human dynamic gait recognition [23].

In this paper, we focus on utilizing dynamic human behavior models for vehicle security applications (preventing vehicles from being stolen). A methodology for modeling dynamic human behaviors is proposed. By learning from driving performances, the intelligent classifier can be embedded into an IC-based car key, through which the vehicle security system can identify valid drivers based on the way the vehicle is driven and the way the drivers behave. When an illegitimate driver uses the car and the demonstrated driving behaviors do not match the specified model, the car will be able to stop running automatically and deliver alarm signals accordingly. In [24, 25], we constructed a steering-by-wire vehicle and its steering interface, which will be able to serve as an on-road test platform for vehicle security.

We highlight the following aspects of our system in this paper. First, live biometric features in dynamic human behaviors are adopted, which enhances the security of the proposed system. Second, since we collect the signals directly from the human driving controls, which include steering, acceleration, and braking, we do not utilize other car dynamics and environmental variables such as the car's yaw angle with respect to the road, the lateral offset to the road's center, and the road curvature. Therefore, no complicated sensor is required, which makes the system robust and efficient in real time. Third, the intelligent security system is easy to install on a normal vehicle by adding functional modules. The absence of complicated requirements means that little space and time are needed for system installation, and drivers are not distracted by the addition of the in-car system. Finally, we develop a methodology to capture the characteristics of human behaviors and analyze them as computational representations. It is easily extended to other applications.

The paper is organized as follows. In Section 2, technical descriptions of the experimental hardware and software platforms are presented. In Section 3, we extract features from the data collected from the experimental platform. Thereafter, we utilize SVM to classify the features of human driving behavior data in Section 4. Section 5 studies and analyzes the experimental results. Section 6 concludes the paper.

2. Experimental Platform and Data Collection

2.1. Overview

In this section, the technical descriptions of the implemented hardware and software platforms are presented. Figure 1 shows the architecture of the experimental intelligent vehicle security system. We design an experimental platform that consists of three parts: a real-time graphic simulator offering full controls, including steering as well as the brake and acceleration pedals; a sensory system with a processor circuit board to capture human driving behavior data; and an analysis system to model and identify human behavior. While the human subject is “driving” on the driving simulator in part 1, the signals of his or her driving performance are captured by part 2 and then transferred to part 3 for modeling.

Figure 2 shows the hardware architecture of the proposed intelligent vehicle security system. The sensory module is installed in the vehicle (we use a simulated driving system instead) to measure the values of steering, braking, and acceleration. The data is sent in real time to the monitoring module, with the driver models embedded in the IC-based car key. The recognition result from the monitoring module is then sent to the in-car user interface and to the driving data recorder. If an illegitimate driver is detected, the alarm module is activated and the alarm message can be delivered through the wireless communication network.

2.2. Driving Simulation Subsystem

Figure 3 shows the diagram of the proposed driving simulation subsystem. In this system, we develop a simulated driving environment that bears substantial resemblance to a comparable real driving task. Although some aspects of a real driving task, including the realism of the driving controls and variable road conditions, cannot be modeled well in a simulated driving environment, we choose a simulated driving task because it embodies the qualities that meet the criteria for comparing and identifying individual driving behavior. Moreover, the focus of this paper is the analysis of human behaviors itself.

In the simulation subsystem, we use one PC to provide the simulated driving environment for the driver, including a rendered 3-D graphics display as well as realistic controls, namely the steering wheel and the pedals for acceleration and braking. We adopt a set of commercial racing game controllers from Logitech (Figures 4(a) and 4(b)) and the racing game “Need for Speed: Underground” from Electronic Arts. The racing game provides 3-D road scenes with a dynamic traffic stream (Figure 4(c)). The human subject uses the front view and the game controller to drive in the simulated environment, as if sitting in the driving seat of a real car. In this experiment, the host PC used to run the racing game is based on a Pentium 4 2.0 GHz processor with 1024 MB RAM; the graphics card is a Matrox Parhelia 128 MB, which supports 3 monitors at a time for wide-angle display.

2.3. Data Sensing and Capturing Subsystem

In the data capturing subsystem, a processor circuit board (Figure 4(d)) is used to sense and gather the individual driving behavior data from the driving environment simulation subsystem. With the driving control sensing device, 3 analog signals are gathered and digitized by an A/D converter at a sampling period of 100 ms, and the digitized values are then sent to the microcomputer (ATmega8535L). The received data can be represented by $X(t) = [x_a(t), x_b(t), x_s(t)]$, where $x_a(t)$ represents the normalized acceleration value, $x_b(t)$ represents the normalized braking value, and $x_s(t)$ represents the normalized steering value.

$X(t)$ is a time sequence of 3-dimensional data (shown in Figure 5) and is transferred to the human behavior analysis subsystem through the RS232 port for further processing.

2.4. Data Analysis Subsystem

In the human behavior analysis subsystem, the methods introduced in the following sections are applied to the retrieved human behavior data. To identify drivers from their driving skills, a human behavior model library for each driver is generated from the corresponding behavior data. Once the models are ready, we implement them as the classifier in the system, which responds to the real-time individual driving performance.

Before modeling human driving behaviors, we apply data preprocessing methods to the data collected from the previous subsystem. Fast Fourier transform (FFT), principal component analysis (PCA), and independent component analysis (ICA) are investigated in this paper. The output of this data preprocessing module is used for support vector machine (SVM) modeling and evaluation. These methodologies are presented as follows.

3. Feature Extraction

In this section, we apply data preprocessing methods to the data collected from the aforementioned experimental platform. Data reduction and feature selection are necessary and important in preprocessing for human behavior modeling: poor feature selection significantly reduces system performance and can even cause the whole recognition procedure to fail. Among several feature extraction methods, fast Fourier transform (FFT), principal component analysis (PCA), and independent component analysis (ICA) are investigated in this paper.

3.1. Fast Fourier Transform

To determine the extent of preprocessing human behaviors, we consider factors such as the existence of a preprocessing algorithm, its necessity, its complexity, and its generality. We select the fast Fourier transform. In fact, if we have a function $f(t)$ whose Fourier transform is $F(\omega)$, then when we shift it by a constant time $t_0$, that is, $f(t - t_0)$, its Fourier transform is $e^{-j\omega t_0} F(\omega)$; that is, time shifting affects the phase only, and the magnitude remains constant throughout.
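The shift property can be checked numerically. The following is a minimal sketch, assuming a discrete signal and a circular delay (the discrete analogue of the continuous time shift); the test trace and the helper names `dft` and `magnitudes` are illustrative only and are not part of the original system.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); adequate for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def magnitudes(signal):
    """Magnitude spectrum of the signal."""
    return [abs(c) for c in dft(signal)]

# Hypothetical steering-like trace: two sinusoids sampled at 64 points.
n = 64
trace = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 7 * t / n)
         for t in range(n)]

shift = 5
shifted = trace[-shift:] + trace[:-shift]  # circular delay by `shift` samples

# The magnitude spectra agree up to rounding error ...
max_diff = max(abs(a - b) for a, b in zip(magnitudes(trace), magnitudes(shifted)))
# ... while the complex coefficients themselves differ in phase.
phase_change = abs(dft(trace)[3] - dft(shifted)[3])
```

This is why magnitude spectra are a convenient feature here: a driving maneuver produces the same magnitude features regardless of where it falls inside the analysis segment.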

Although the Fourier transform does not explicitly show the time localization of frequency components, the time localization can be presented by suitably prewindowing the signal in the time domain [26]. Accordingly, the short time Fourier transform (STFT) [27] of a signal $x(t)$ is defined as

$$\mathrm{STFT}_x(t, \omega) = \int x(\tau)\, \gamma^{*}(\tau - t)\, e^{-j\omega\tau}\, d\tau.$$

The STFT at time $t$ is the Fourier transform of the signal $x(\tau)$ multiplied by a shifted analysis window $\gamma^{*}(\tau - t)$ centered around $t$. All integrals are from $-\infty$ to $\infty$. Because multiplication by the relatively short window $\gamma^{*}(\tau - t)$ effectively suppresses the signal outside a neighborhood around the analysis time point $\tau = t$, the STFT is simply a local spectrum of the signal $x(\tau)$ around the analysis time $t$. The windows can be overlapped to prevent loss of information. Although human behavior is a nonstationary stochastic process over a long interval, it can be considered stationary over a short time interval. Thus, the STFT should give a good spectral representation of the human behaviors during that time interval.
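A discrete version of this idea can be sketched as follows. The Hann window, the window length, and the hop size are assumptions chosen for illustration; the paper does not specify the window used.

```python
import cmath
import math

def stft(signal, window_len, hop):
    """Return one magnitude spectrum per overlapping, Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (window_len - 1))
              for i in range(window_len)]
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(window_len)]
        # Keep only the non-negative frequency bins of the local spectrum.
        spectrum = [
            abs(sum(seg[t] * cmath.exp(-2j * math.pi * k * t / window_len)
                    for t in range(window_len)))
            for k in range(window_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames

# Nonstationary toy signal: a slow oscillation followed by a faster one.
sig = ([math.sin(2 * math.pi * 2 * t / 32) for t in range(64)] +
       [math.sin(2 * math.pi * 8 * t / 32) for t in range(64)])

frames = stft(sig, window_len=32, hop=16)  # 50% overlap between frames
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
```

The dominant bin moves from the low frequency in the early frames to the high frequency in the late frames, which a single Fourier transform over the whole signal would not localize in time.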

3.2. Principal Component Analysis

The method can be described in brief as follows. Suppose that we have two sets of training samples: $A$ and $B$. The number of training samples in each set is $N$. $\Phi_i$ represents each eigenvector produced by principal component analysis (PCA). Each of the training samples, including positive samples and negative samples, can be projected onto the axis extended by the corresponding eigenvector. By analyzing the distribution of the projected points, we can roughly select the eigenvectors that carry more human behavior information. The following is a detailed description of the process.
(1) For a certain eigenvector $\Phi_i$, compute its mapping result according to the two sets of training samples. The result can be described as $\lambda_{i,j}$ $(1 \le i \le M,\ 1 \le j \le 2N)$.
(2) Train a classifier $f_i$ using a simple method such as a perceptron or a neural network that can separate $\lambda_{i,j}$ into two groups, the specific valid driver or others, with a minimum error $E(f_i)$.
(3) If $E(f_i) > \theta$, then we delete this eigenvector from the original set of eigenvectors.

$M$ is the number of eigenvectors and $2N$ is the total number of training samples. $\theta$ is the predefined threshold. The remaining eigenvectors are selected.
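The selection procedure can be sketched in a few lines. As an assumption made for brevity, the simple classifier is replaced here by the best single-threshold rule on the projected values rather than a perceptron or neural network, and the candidate directions are passed in directly instead of being computed by PCA. All function names and the toy data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def min_threshold_error(direction, positives, negatives):
    """Smallest error rate of any single-threshold rule on the 1-D projections."""
    proj = sorted([(dot(direction, x), 1) for x in positives] +
                  [(dot(direction, x), 0) for x in negatives])
    n, n_pos = len(proj), len(positives)
    n_neg = n - n_pos
    best = min(n_pos, n_neg)  # threshold placed below every projected point
    left_pos = left_neg = 0
    for k in range(1, n + 1):
        value, label = proj[k - 1]
        left_pos += label
        left_neg += 1 - label
        if k < n and proj[k][0] == value:
            continue  # a threshold cannot split equal projections
        err_a = left_pos + (n_neg - left_neg)  # left = others, right = valid driver
        err_b = left_neg + (n_pos - left_pos)  # left = valid driver, right = others
        best = min(best, err_a, err_b)
    return best / n

def select_eigenvectors(eigenvectors, positives, negatives, theta):
    """Keep the directions whose minimum projection error does not exceed theta."""
    return [e for e in eigenvectors
            if min_threshold_error(e, positives, negatives) <= theta]

# Toy samples: the two groups differ along the first axis only.
pos = [(2.0, 0.0), (3.0, 1.0), (2.5, -1.0)]
neg = [(-2.0, 0.0), (-3.0, 1.0), (-2.5, -1.0)]
kept = select_eigenvectors([(1.0, 0.0), (0.0, 1.0)], pos, neg, theta=0.25)
```

In this toy case the first direction separates the groups perfectly and is kept, while the second carries no discriminative information and is deleted.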

3.3. Independent Component Analysis

Apart from PCA, we also propose using independent component analysis (ICA) to reduce the dimensions of the data inputs for human behavior modeling. Independent component analysis is a statistical method which transforms an observed multidimensional vector into components that are statistically as independent as possible.

A fixed-point algorithm is employed for independent component analysis [28]. The goal of the ICA algorithm is to search for a linear combination of the prewhitened data $\tilde{x}(t)$, where $y(t) = w^{T}\tilde{x}(t)$, such that the negentropy (non-gaussianity) is maximized. $w$ is assumed to be bounded to have a norm of 1 and $g'$ is the derivative of $g$. The fixed-point algorithm [28] is as follows:
(1) Generate an initial random vector $w_{k-1}$, $k = 1$.
(2) $w_k = E\{\tilde{x}\, g(w_{k-1}^{T}\tilde{x})\} - E\{g'(w_{k-1}^{T}\tilde{x})\}\, w_{k-1}$.
(3) $w_k = w_k / \|w_k\|$.
(4) Stop if converged ($\|w_k - w_{k-1}\|$ is smaller than a certain defined threshold). Otherwise, increment $k$ by 1 and return to step (2).

If the process converges successfully, the vector $w_k$ produced can be converted to one of the underlying independent components by $w_k^{T}\tilde{x}(t)$, $t = 1, 2, \ldots, m$. Due to the whitening process, the columns of the matrix $B$ are orthonormal. By projecting the current estimates onto the subspace orthogonal to the columns of the matrix $B$ that were found previously, we are able to retrieve the components one after the other.
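A minimal one-unit sketch of the fixed-point iteration is given below for a 2-D toy mixture. Several choices are assumptions for illustration: the nonlinearity is taken to be the hyperbolic tangent, whitening uses the closed-form eigendecomposition of a 2x2 covariance matrix, and the convergence test compares directions up to sign (a common practical variant) rather than the plain difference of successive vectors. The sources and mixing weights are synthetic.

```python
import math
import random

def whiten_2d(points):
    """Center 2-D points and rescale them to identity covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(p[0] - mx, p[1] - my) for p in points]
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    d = sum(y * y for _, y in centered) / n
    # Closed-form eigendecomposition of the covariance [[a, b], [b, d]].
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    half_gap = math.hypot((a - d) / 2, b)
    lam1, lam2 = (a + d) / 2 + half_gap, (a + d) / 2 - half_gap
    return [((c * x + s * y) / math.sqrt(lam1), (-s * x + c * y) / math.sqrt(lam2))
            for x, y in centered]

def fastica_one_unit(z, tol=1e-8, max_iter=200, seed=0):
    """Estimate one unmixing direction w from whitened data z."""
    rng = random.Random(seed)
    angle = rng.uniform(0, 2 * math.pi)
    w = (math.cos(angle), math.sin(angle))  # random unit start vector
    n = len(z)
    for _ in range(max_iter):
        e_zg0 = e_zg1 = e_dg = 0.0
        for x, y in z:
            g = math.tanh(w[0] * x + w[1] * y)
            e_zg0 += x * g
            e_zg1 += y * g
            e_dg += 1.0 - g * g  # derivative of tanh
        new = (e_zg0 / n - e_dg / n * w[0], e_zg1 / n - e_dg / n * w[1])
        norm = math.hypot(new[0], new[1])
        new = (new[0] / norm, new[1] / norm)
        converged = abs(abs(new[0] * w[0] + new[1] * w[1]) - 1.0) < tol
        w = new
        if converged:
            break
    return w

def correlation(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    return cov / math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))

# Two synthetic independent sources, linearly mixed into two observed channels.
rng = random.Random(1)
s1 = [rng.uniform(-1, 1) for _ in range(2000)]
s2 = [rng.uniform(-1, 1) for _ in range(2000)]
mixed = [(0.8 * u + 0.3 * v, 0.2 * u + 0.9 * v) for u, v in zip(s1, s2)]

w = fastica_one_unit(whiten_2d(mixed))
component = [w[0] * x + w[1] * y for x, y in whiten_2d(mixed)]
best_match = max(abs(correlation(component, s1)), abs(correlation(component, s2)))
```

The recovered component should align closely with one of the two sources, up to sign and scale, which is the usual ambiguity of ICA.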

4. Learning via Support Vector Machine

In this paper, SVM is applied within the framework of modeling human behaviors for intelligent vehicle security application. Inherent complexities and the nonlinearity of human dynamic behavior make mathematical modeling difficult, hindering the use of conventional methods for process modeling and condition monitoring.

4.1. Mathematical Description

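In SVM, the basic idea is to map the data $x$ into a high-dimensional feature space $\mathcal{F}$ via a nonlinear mapping $\Phi$, and to do linear classification or regression in this space:

$$f(x) = (\omega \cdot \Phi(x)) + b, \qquad \Phi: \mathbb{R}^{n} \to \mathcal{F},\ \omega \in \mathcal{F}, \tag{4}$$

where $b$ is a threshold. Thus, linear regression in a high-dimensional (feature) space corresponds to nonlinear regression in the low-dimensional input space $\mathbb{R}^{n}$. Note that the dot product in (4) between $\omega$ and $\Phi(x)$ would have to be computed in this high-dimensional space (which is usually intractable) if we were not able to use the kernel, which eventually leaves us with dot products that can be implicitly expressed in the low-dimensional input space $\mathbb{R}^{n}$. Since $\Phi$ is fixed, we determine $\omega$ from the data by minimizing the sum of the empirical risk $R_{\mathrm{emp}}[f]$ and a complexity term $\|\omega\|^{2}$, which enforces flatness in feature space:

$$R_{\mathrm{reg}}[f] = R_{\mathrm{emp}}[f] + \lambda\|\omega\|^{2} = \sum_{i=1}^{l} C\big(f(x_i) - y_i\big) + \lambda\|\omega\|^{2}, \tag{5}$$

where $l$ denotes the sample size $(x_1, \ldots, x_l)$, $C(\cdot)$ is the penalty term, and $\lambda$ is a regularization constant. For a large set of loss functions, (5) can be minimized by solving a quadratic programming problem, which is uniquely solvable. It can be shown that the vector $\omega$ can be written in terms of the data points

$$\omega = \sum_{i=1}^{l} (\alpha_i - \alpha_i^{*})\, \Phi(x_i), \tag{6}$$

with $\alpha_i, \alpha_i^{*}$ being the solution of the aforementioned quadratic programming problem. $\alpha_i$ and $\alpha_i^{*}$ have an intuitive interpretation as forces pushing and pulling the estimate $f(x_i)$ towards the measurements $y_i$. Taking (6) and (4) into account, we are able to rewrite the whole problem in terms of dot products in the low-dimensional input space

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^{*}) \big(\Phi(x_i) \cdot \Phi(x)\big) + b = \sum_{i=1}^{l} (\alpha_i - \alpha_i^{*})\, k(x_i, x) + b, \tag{7}$$

where $\alpha_i, \alpha_i^{*}$ are Lagrangian multipliers, and $x_i$ are support vectors.

In (7), we introduce a kernel function $k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. As explained in [29], any symmetric kernel function $k$ satisfying Mercer's condition corresponds to a dot product in some feature space.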
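Once the multipliers are known, the kernel expansion in (7) can be evaluated without ever forming the feature-space mapping. The sketch below assumes a Gaussian kernel with width `sigma` and uses hand-picked support vectors and coefficients (each coefficient standing for a difference of Lagrangian multipliers); these values are illustrative and do not come from a trained model.

```python
import math

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian kernel; the 2*sigma^2 scaling is one common convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision_value(x, support_vectors, coeffs, b, kernel=rbf_kernel):
    """Evaluate f(x) as a weighted sum of kernel values plus the threshold b."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical two-support-vector model: one vector per class.
svs = [(0.0, 0.0), (2.0, 2.0)]
coeffs = [1.0, -1.0]

near_first = decision_value((0.0, 0.0), svs, coeffs, b=0.0)   # positive side
near_second = decision_value((2.0, 2.0), svs, coeffs, b=0.0)  # negative side
midpoint = decision_value((1.0, 1.0), svs, coeffs, b=0.0)     # on the boundary
```

For classification, the sign of the returned value gives the predicted class; only the support vectors contribute to the sum, which keeps evaluation cheap enough for an embedded classifier.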

4.2. Approach

We propose to use SVM to model human driving behaviors. We treat the human driving data, as a time sequence, as the input vector of human dynamic behavior features. Since the SVM is capable of classification, we use the human behavior data to “train” the SVM classifiers in the human dynamic behavior feature space. For each driver, we want to design an SVM model that separates that driver from the other drivers. The task is to build models across individuals by using the SVM training procedure. Since a two-class SVM model can only classify two different classes of data, more than one two-class SVM is utilized in our application, and they are organized in a hierarchical manner. If there are $n$ drivers to be recognized, $n$ SVMs will be connected together as shown in Figure 6.
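One way to read the hierarchical arrangement is as a chain of two-class models queried in turn. The sketch below is an assumption about that structure, since Figure 6 is not reproduced here: each model answers "this driver versus the others", the first positive answer wins, and a sample rejected by every model is treated as an unknown driver. The decision functions are stand-in lambdas, not trained SVMs.

```python
def identify_driver(feature_vector, driver_models):
    """Query an ordered list of (name, decision_fn) pairs; return the first match."""
    for name, decide in driver_models:
        if decide(feature_vector) > 0:  # positive value means "this driver"
            return name
    return None  # rejected by every model: treat as an unauthorized driver

# Hypothetical decision functions standing in for trained two-class SVMs.
models = [
    ("Meng", lambda v: v[0] - 0.5),
    ("Ou", lambda v: v[1] - 0.5),
]

first = identify_driver((0.9, 0.1), models)
second = identify_driver((0.1, 0.9), models)
unknown = identify_driver((0.1, 0.1), models)
```

Returning `None` for a sample that no model accepts corresponds to the alarm case in the security application.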

In any predictive learning task, such as classification, an appropriate representation of examples as well as the model and parameter estimation method should be selected to obtain a high level of performance of the learning machine. The traditional statistical approach to estimating models from data is based on parametric estimation. Its need to assume an underlying dependency with a simple, known parametric form limits its applicability in practice. Recent approaches allow a wide class of models of varying complexity to be chosen. The task of learning then amounts to selecting the model of optimal complexity and estimating parameters from training data. Under the SVM approach, the parameters usually to be chosen are the following.
(1) The penalty term $C$, which determines the tradeoff between the complexity of the decision function and the number of training examples misclassified.
(2) The mapping function $\Phi$.
(3) The kernel function $k$ such that $k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$.

5. Experimental Study

In this section, we conduct experiments based on the proposed methodology for recognizing driver identities by analyzing driving performances. In order to estimate the performance of the proposed system, we invited 7 human subjects to take part in the experiment: Meng, Ou, Ye, Huang, Wang, Wu, and Shen. They were asked to “drive” on the designed experimental platform individually. The raw data of their driving behaviors is collected by the Data Sensing and Capturing Subsystem. The recorded data is then analyzed by the aforementioned Data Analysis Subsystem. Our objective is to identify the driving data using trained SVM models. We use the accuracy rate of the SVM classifications to evaluate the performance of the proposed system. The experimental results of applying different data preprocessing methods and choosing different SVM parameters when modeling human driving behaviors are shown in what follows.

5.1. Preprocess Data Analysis

In the first series of experiments, we run different data preprocessing methods for the optimization. The raw data is captured at a rate of 10 Hz and overlapping windows are applied on the data to cut the data into segments. Each segment is 40 seconds long and can be considered as a matrix of size $400 \times 3$.
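The segmentation step can be sketched as follows. At 10 Hz, a 40-second window holds 400 samples of the three control channels. The amount of overlap is not stated in the text, so the 20-second hop used here is an assumption, as are the function and parameter names.

```python
def segment(samples, rate_hz=10, window_s=40, hop_s=20):
    """Cut a list of (acceleration, brake, steering) samples into overlapping windows."""
    win = rate_hz * window_s  # samples per segment
    hop = rate_hz * hop_s     # samples between successive segment starts
    return [samples[i:i + win] for i in range(0, len(samples) - win + 1, hop)]

# 100 seconds of placeholder data sampled at 10 Hz.
recording = [(0.0, 0.0, 0.0)] * 1000
segments = segment(recording)
```

With these settings a 100-second recording yields four segments, each a 400-row by 3-column matrix; a shorter hop produces more, and more strongly overlapping, training segments from the same recording.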

We then apply FFT, PCA, and ICA to reduce the input size to the SVM for classification. The following steps are performed on each data segment.
(1) Apply an FFT of order 20 to transform each column of data of size 400 into 20, so the result retrieved from each data segment is a matrix of size $20 \times 3$. Then, align the data into a single row as a $1 \times 60$ vector.
(2) Divide the raw data matrix into 10 parts by time sequence and align these $40 \times 3$ matrices into a single matrix. Extract two features from the resulting data matrix using PCA. With 2 PCs, the features retrieved are aligned to form a vector.
(3) Divide the raw data matrix into 10 parts by time sequence and align these $40 \times 3$ matrices into a single matrix. Extract two features from the resulting data matrix using ICA. With 2 ICs, the features retrieved are aligned to form a vector.

We compare data preprocessing using PCA and ICA with that using FFT. We simply train an SVM to distinguish one tester, Meng, from all testing data in order to evaluate the performance of the three methods of feature selection and data reduction. We have 2 groups of data containing 348 raw data segments in total: 104 segments representing the behaviors of driver Meng (the authorized driver) and 244 segments representing non-Meng drivers (the unauthorized drivers). These segments are sent to the SVM for training and testing. Following the aforementioned rules, each segment is processed into a vector as the input to the SVM.

The three data preprocessing methods are tested independently, and the SVM testing results are shown in Table 1 to compare the feature selection ability of each method. As seen, FFT achieves the best performance among the three feature selection and data reduction methods, so we choose FFT as the data preprocessing method for the further procedures.

Next we examine the effect of different FFT parameters on the classification results. Table 2 shows the test results of classification using different sampling times (length of data segment) and different FFT orders (size of input vector). Sampling times from 10 seconds to 160 seconds and FFT orders of 5, 10, and 20 are tested. Using the sampling time $T$ (in seconds) and the FFT order $K$ means that each data segment is a matrix of size $10T \times 3$, as the original sampling rate of the hardware is 10 Hz. The FFT transforms each column of data of size $10T$ into $K$, and the result retrieved is a matrix of size $K \times 3$. Then the data is aligned into a single row as a $1 \times 3K$ vector for the SVM training and testing.
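The mapping from a data segment to an SVM input vector can be sketched as below. The text does not define "FFT order" precisely; this sketch assumes it means keeping the magnitudes of the first `order` frequency coefficients of each control channel, and it uses a naive DFT for self-containment. The function name and the synthetic segment are hypothetical.

```python
import cmath
import math

def fft_features(segment, order=20):
    """Flatten the first `order` spectral magnitudes of each channel into one row."""
    n = len(segment)
    features = []
    for ch in range(len(segment[0])):
        column = [row[ch] for row in segment]
        for k in range(order):
            coeff = sum(column[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            features.append(abs(coeff) / n)  # normalized magnitude of bin k
    return features

# Synthetic 40 s segment at 10 Hz: constant throttle, no braking,
# and a steering oscillation with 4 cycles per segment.
seg = [(0.5, 0.0, math.sin(2 * math.pi * 4 * t / 400)) for t in range(400)]
vec = fft_features(seg, order=20)
```

A 400-by-3 segment with order 20 gives a 60-element vector; the constant throttle shows up in the first entry of its block and the steering oscillation in bin 4 of the steering block.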

Table 2 shows the average success rate of identifying driver Meng from all testing data, using different sampling times and FFT orders. Different combinations of sampling time and FFT order lead to different classification rates. A shorter sampling time and a smaller FFT order may cause the loss of important features in the human behavior signal, while a longer time and a larger order may feed a mixed, redundant signal to the classification procedure; both lower the recognition rate. Comparing the success rates, a sampling time of 40 seconds and an FFT order of 20 achieve the best results. In addition, a data segment length of 40 seconds, namely, the time interval at which the system examines the current driver, is efficient: it avoids huge data segments for computation while providing an adequate examination frequency for testing driving performance in real time.

5.2. Models Design

Training an SVM requires the selection of parameters that influence the ensuing model performance. Therefore, to achieve a good model, those parameters have to be chosen correctly. Examples, as stated earlier, are the cost function $C$ and the mapping function $\Phi$. In the first part of our experiments, we consider the Gaussian radial basis function (RBF) as the kernel function. The RBF kernel is very advantageous in complex nonseparable classification problems due to its ability of nonlinear input mapping. It has the form $k(x_i, x_j) = \exp(-\|x_i - x_j\|^{2} / (2\sigma^{2}))$, and subsequently $\sigma$ (defined as being the kernel width) is an important parameter to be chosen.

In this series of experiments, we run the SVM classifier with several values of the penalty term $C$ and the kernel width $\sigma$, trying to determine which combination of parameters gives the best model, that is, the one that best expresses the causal relation among the variables that govern the quality within the driving platform. This is assessed through the evaluation of performance accuracy. One possible way is to divide the original data into a training set and a validation set for model evaluation. Figure 7 shows the testing accuracy as a function of the kernel width $\sigma$, parameterized by $C$, for identification of driver Meng. Figure 8(a) shows the variation of the number of SVs versus $C$ and $\sigma$, and Figure 8(b) shows the variation of the number of learning iterations versus $C$ and $\sigma$. A fixed stopping tolerance is used for solving the optimization problem.
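The parameter sweep can be organized as a simple exhaustive search. In the sketch below, `evaluate` is a hypothetical callable that would train a model on the training set with the given penalty term and kernel width and return its accuracy on the validation set; here it is replaced by a made-up smooth surface so that the example runs on its own, and its values are not experimental results.

```python
import math
from itertools import product

def grid_search(evaluate, c_values, width_values):
    """Return (best_accuracy, best_c, best_width) over all parameter pairs."""
    best = None
    for c, width in product(c_values, width_values):
        acc = evaluate(c, width)
        if best is None or acc > best[0]:
            best = (acc, c, width)
    return best

def toy_accuracy(c, width):
    """Stand-in for 'train on the training set, score on the validation set'."""
    return 1.0 - 0.1 * abs(math.log10(c) - 1.0) - 0.05 * abs(width - 2)

best = grid_search(toy_accuracy, c_values=[1, 10, 100], width_values=[1, 2, 4])
```

Scoring on held-out validation data rather than on the training data is what keeps the sweep from simply rewarding the most flexible, overfitted setting.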

In the second part of our experiments, we consider a polynomial kernel as the kernel function. Two of its kernel coefficients are kept at their default settings in this experiment, and the remaining kernel parameter is subsequently an important parameter to be chosen.

Figure 9 shows the testing accuracy as a function of the kernel parameter, parameterized by the penalty term $C$, for identification of driver Meng. Figure 10(a) shows the variation of the number of SVs versus $C$ and the kernel parameter, and Figure 10(b) shows the variation of the number of learning iterations versus $C$ and the kernel parameter. A fixed stopping tolerance is used for solving the optimization problem.

From the results shown above, a larger $C$ corresponds to a smaller number of SVs as well as a larger number of training iterations. A larger $C$ also corresponds to a more poorly balanced classification, meaning a biased classifier model with lower accuracy on one class but higher accuracy on the other, and a larger kernel parameter has the same effect. These results require further explanation that takes both the penalty term $C$ and the kernel parameter into account. For nonseparable data, the penalty term $C$ is able to reduce the training errors in the working data set. Therefore, the margin is an indicator of the generalization accuracy. In the absence of a method to compute the best tradeoff between the regularization term and the training errors, the balance sought by the SVM technique is hard to find. Thus, a larger $C$ corresponds to a higher penalty on training errors, and overfitting clearly occurs. On the other hand, the higher the kernel parameter becomes, the greater the variety of decision boundaries that can be formed, giving rise to a more complex model. The added flexibility initially decreases the generalization error as the model can better fit the data. However, there is the danger that this can lead to overfitting as well.

Given the requirements of the proposed system, we aim to achieve high classification accuracy as well as low computational cost. The aim of the identification system is for the vehicle to judge whether the person driving is its own driver, so we give Meng's success rate higher priority. It is found experimentally that the polynomial kernel, with the best combination of penalty term and kernel parameter found in these experiments, has the highest success rate for Meng. With the same method, we train 7 SVM models for all 7 human subjects and use the SVM network to form a multiclass classifier. Testing with our testing samples and evaluating the identification accuracy, we obtain the final results in Table 3. As seen, we achieve an average success rate of over 85%.

5.3. Discussion

In this section, we demonstrate that SVM is a feasible parametric model for our proposed application. The first aspect investigated is the use of preprocessing methods for feature extraction from large human dynamic behavior data for modeling purposes. The implementation can be extended to larger data sets and to different formulations of the multiclass problem. The feature extraction method based on FFT gives the best data reduction results compared with PCA and ICA in the presented experiments of modeling human driving behaviors through SVM. FFT establishes a one-to-one mapping between the time domain and the frequency domain and preserves information from the original signal, ensuring that important features are not lost as a result of the transformation. Under the experimental criteria in this paper, FFT is shown to perform better than PCA and ICA in modeling human dynamic behavior for driver identification. Although PCA and ICA are often used for input reduction, they are not always useful because the variance of a signal is not necessarily related to the importance of the variable. Human behavior contains many low-frequency signal components, and FFT can retain the energy in this frequency range, whereas PCA and ICA work poorly because there are too many isotropically distributed clusters. By reducing the redundancy in the input data, the training process of the human driving behavior model becomes more efficient. After the unnecessary information is removed from the inputs, not only are the key characteristics of the human behavior data retained, but the modeling power of the SVM is also improved.

Besides the choice of preprocessing methods, the SVM model design is an important issue in this section. We have discussed the application of multiclass SVM classifiers and compared different SVM parameters for identifying different drivers. The basic idea of SVM is to determine the structure of the classifier by minimizing the bounds of the training error and the generalization error. The SVs close to the decision boundary determine the efficacy of the classifier. Based on the results from our application, SVM with a polynomial kernel achieves the better performance. Our results demonstrate that SVMs have the potential to distinguish reliably among our test subjects: individuals can be identified by the multiclass SVM classifiers with a success rate of over 85%, which verifies that the proposed SVM modeling method is valid and useful against the vehicle theft problem.

6. Conclusion and Future Work

In this paper, we focus on utilizing dynamic human behavior models for vehicle security applications (preventing vehicles from being stolen). By learning from driving performances, the intelligent classifier can be embedded into an IC-based car key, through which the vehicle security system can identify valid drivers based on the way the vehicle is driven and the way the drivers behave.

6.1. Conclusions

We proposed an innovative driver identification system for detecting vehicle theft based on dynamic human behaviors. Such dynamic and stochastic features are difficult to handle with traditional mathematical methods. We compared FFT, PCA, and ICA for data preprocessing and showed that FFT performs better in processing human dynamic behavior. Thereafter, a machine learning method based on SVM was applied. We discussed the application of multiclass SVM classifiers and compared the performance of different SVM parameters. SVM with a polynomial kernel performs better than with the other kernel functions.

6.2. Future Work

Choosing the best parameters can be time consuming, especially if a systematic approach is not used or knowledge of the problem does not aid proper selection, since we have to rely upon guessing and trial-and-error techniques. Therefore, an interactive grid search model selection method can be studied, which may further enhance the accuracy.

Acknowledgments

This work is partially supported by Hong Kong RGC CUHK417605, Hong Kong ITF ITP/003/09AP, GHP/006/09SZ, the grant from Key Laboratory of Robotics and Intelligent System, Guangdong Province (2009A060800016), the Knowledge Innovation Program of the Chinese Academy of Sciences Grant No. KGCX2-YW-156, and the grant from Shenzhen Hong Kong Innovation Circle.