Abstract

Vehicle type recognition is a demanding application of wireless sensor networks (WSN). In many cases, sensor nodes detect and recognize vehicles from their acoustic or seismic signals using wavelet-based or spectral feature extraction methods. Such methods, while providing convincing results, are demanding in computational power and energy and are difficult to implement on low-cost sensor nodes with limited resources. In this paper, we investigate the use of the time encoded signal processing (TESP) algorithm for vehicle type recognition. The conventional TESP algorithm, although effective for speech signal feature extraction, is not suitable for the more complex vehicle sound signal. To solve this problem, an improved time encoded signal processing (ITESP) algorithm is proposed as the feature extraction method according to the characteristics of the vehicle sound signal. Recognition is accomplished using the support vector machine (SVM) and the k-nearest neighbor (KNN) classifier. The experimental results indicate that the vehicle type recognition system with ITESP features gives much better performance than the one with conventional TESP features.

1. Introduction

Along with the development of communication technology, wireless sensor networks (WSN) are playing an increasingly important role in our daily life. Recent advances in wireless communications, electronics, and ubiquitous computing, in combination with intensive research in the field of WSN, have changed the way we interact with the physical environment [1-3]. Vehicle type recognition, which can be used in intrusion detection, transportation, and border monitoring, is a significant and demanding application of WSN. In most cases, vehicle recognition relies on the fast Fourier transform (FFT) [4], wavelet transform (WT) [5, 6], or Hilbert-Huang transform (HHT) [7] to extract frequency or time-frequency features from the signals acquired by acoustic and seismic sensors. These methods, while providing convincing results, are demanding in computational power and energy and are difficult to implement on low-cost sensor nodes with limited resources. In WSN, the sensor nodes usually process signals locally to reach a decision rather than transmitting the raw measurements. Because of the limited network bandwidth and the energy constraints of sensor nodes, low-complexity and low-energy algorithms are preferred for recognizing the vehicle type [8]. In this paper, we investigate the use of time-domain encoding and feature extraction methods and propose an improved time encoded signal processing (ITESP) algorithm. The conventional time encoded signal processing (TESP) algorithm, which uses a symbol table with 29 characters to encode the time-domain information of the signal, has performed well in speech recognition [9]. The advantages of the method are its computational simplicity and low memory requirements. However, it turns out that TESP is not suitable for the vehicle sound signal, which is more complex.
To solve this problem, the improved time encoded signal processing feature extraction method is proposed for the vehicle sound signal recognition. Figure 1 shows the flow chart of vehicle type recognition system based on ITESP algorithm.

The rest of the paper is organized as follows. In Section 2, the methodologies of TESP and ITESP algorithms are discussed, respectively. Section 3 describes the simulation experiments of feature extraction based on ITESP algorithm. The experimental results on the recognition performance of the feature extraction methods are discussed in Section 4. Finally, Section 5 presents discussion and conclusion.

2. Methodology

2.1. TESP Algorithm

TESP is a digital language that originated as a means of coding signals for speech recognition; it describes a signal waveform according to its real and complex zeros, based on a mathematical representation of waveforms. The TESP quantisation procedure encodes signals according to the period between two consecutive zero-crossings and the shape of the curve contained within it [10]. This period is called an epoch. The TESP procedure can be described in four simple steps.

Step 1. Divide the signal into successive epochs.

Step 2. Characterize each epoch with two descriptors, duration and shape, as follows.
(i) Duration (D), which is the number of samples between two successive real zeros and provides information on the fundamental frequency of the waveform.
(ii) Shape (S), which is the number of local minima (for a positive epoch) or the number of local maxima (for a negative epoch). The shape of an epoch contains harmonic information of the signal.

Step 3. Map each epoch, from its corresponding D/S descriptors, to a predefined symbol table.
The encoding procedure maps every epoch of the waveform into a two-dimensional space whose dimensions are D and S. This two-dimensional space can become very large, depending on the bandwidth and the complexity of the signal. To reduce the number of descriptors needed, a quantization method is used to create a one-dimensional symbol stream from the two-dimensional space. D/S pairs are mapped to a character using a symbol table created beforehand, so that the D/S space is approximated with fewer characters [11].

Step 4. Create a fixed-dimensions matrix containing the appearance probability of each symbol in the entire waveform. This matrix will be used for the recognition task.
To make the methodology clearer, we give an example here [10]. Figure 2 shows an epoch encoded into its TESP parameters D and S.
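Steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `split_epochs` and `describe_epoch` are hypothetical, zero-valued samples are assumed to belong to positive epochs, and the partial epochs at the start and end of a recording are kept rather than discarded.

```python
def split_epochs(samples):
    """Split a sampled waveform into epochs (runs of samples with the same sign)."""
    epochs = []
    current = []
    for x in samples:
        positive = x >= 0  # assumption: zero-valued samples count as positive
        if current and (current[0] >= 0) != positive:
            # Sign change: a zero-crossing closes the current epoch.
            epochs.append(current)
            current = []
        current.append(x)
    if current:
        epochs.append(current)
    return epochs


def describe_epoch(epoch):
    """Return (D, S): duration in samples and number of interior local extrema."""
    d = len(epoch)
    positive = epoch[0] >= 0
    s = 0
    for prev, mid, nxt in zip(epoch, epoch[1:], epoch[2:]):
        if positive and mid < prev and mid < nxt:
            s += 1  # local minimum inside a positive epoch
        elif not positive and mid > prev and mid > nxt:
            s += 1  # local maximum inside a negative epoch
    return d, s


signal = [1, 3, 2, 4, 1, -1, -2, -1, -3, -1, 2, 1]
descriptors = [describe_epoch(e) for e in split_epochs(signal)]
```

In this toy signal the first positive epoch has five samples and one interior local minimum, so its descriptor pair is (5, 1).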

The encoding of a waveform using the aforementioned coding scheme results in a one-dimensional symbol stream. This symbol stream can be further processed to create a one-dimensional matrix of size $M$ (where $M$ is the total number of characters in the symbol table) that contains the number of appearances of each symbol in the symbol stream, normalized by the number of epochs, called the S-matrix. It can be created using the following expression:

$$S_i = \frac{1}{N}\sum_{n=1}^{N} x_i(n), \quad 1 \le i \le M,$$

where $S_i$, $n$, and $N$ represent the $i$th element of the S-matrix, the $n$th epoch of the signal, and the total number of epochs in the waveform, respectively. With $t_n$ denoting the symbol describing the $n$th epoch,

$$x_i(n) = \begin{cases} 1, & t_n = i, \\ 0, & \text{otherwise.} \end{cases}$$
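As a rough illustration of Step 4, the sketch below builds the one-dimensional symbol-frequency matrix (the S-matrix in TESP terminology) from a symbol stream. It assumes that symbols are integers in the range 0 to `table_size - 1` and that counts are normalized into appearance probabilities; the function name `s_matrix` is hypothetical.

```python
def s_matrix(symbols, table_size):
    """Appearance probability of each symbol in a one-dimensional symbol stream."""
    counts = [0] * table_size
    for t in symbols:
        counts[t] += 1  # one more epoch described by symbol t
    total = len(symbols)
    if total == 0:
        return [0.0] * table_size
    return [c / total for c in counts]


# Toy stream of five epochs encoded with a four-character table.
features = s_matrix([0, 2, 2, 1, 2], table_size=4)
```

The result always has `table_size` entries regardless of the length of the recording, which is what makes it usable as a fixed-length feature vector for a classifier.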

2.2. ITESP Algorithm

In the conventional TESP algorithm, the standard symbol table used in Step 3, which contains 29 characters, has been found to be sufficient for describing speech signals but may not be suitable for the vehicle sound signal, which is more complex. In this paper, according to the characteristics of the vehicle sound signals, an extended symbol table with 40 characters is designed. Based on the symbol stream encoded with this table, the one-dimensional S-matrix is constructed using the appearance probability of each symbol. In addition, the two-dimensional A-matrix is constructed using the appearance probability of the two identical consecutive symbols, in order to obtain more accurate features of the signal. The flow chart of the feature extraction procedure based on the ITESP algorithm is shown in Figure 3.

3. Simulation Experiment of Feature Extraction

3.1. Data Acquisition

In this paper, the sound signals of two typical vehicle types (wheeled vehicles and tracked vehicles) are selected as samples to evaluate the performance of the feature extraction method. The wheeled vehicle sound signals were recorded during a real-world WSN experiment in Chengdu, China, and the data set was gathered from 15 microphone sensors deployed on three different roads. These signals were sampled at 22050 Hz and quantized with 8 bits per sample. Since most tracked vehicles are military vehicles, which are difficult to record in a real environment, the sound signals of tracked vehicles were downloaded from the sensor website. The sensor website indicates that the tracked vehicle data were gathered from two different sensor types, geophone and microphone, each sampled at 4096 Hz with 16-bit accuracy. Figure 4 shows a comparison of the two types of vehicle signal in the time domain.

3.2. Signal Preprocessing

To reduce the complexity of data processing, the signals are first downsampled to 4096 Hz. Moreover, the vehicle sound signals must be filtered before the encoding procedure for three main reasons.
(1) To minimize the number of symbols needed for the symbol table by keeping only the important frequency range of the signal. In this way the dimensions of the S-matrix are minimized.
(2) To eliminate high-frequency "flicker" on the waveform, which can be translated to local minima or maxima inside an epoch, thus increasing its S descriptor.
(3) To prevent the introduction of quantization noise.

By analyzing the sound signals and their main noise sources, we find that the frequency content of the vehicle sounds lies mainly below 800 Hz. Therefore, an 800 Hz low-pass Butterworth filter is employed. Furthermore, frequency components below 50 Hz are not very important for the recognition task [12]. Such low-frequency components can be ignored without decreasing performance, which significantly reduces the maximum value of the D descriptor.
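The low-pass stage can be sketched in plain Python as follows. The filter order is not stated in the text, so a second-order Butterworth section designed with the bilinear transform is assumed here; the function names are hypothetical, and the resampling step and the removal of components below 50 Hz are omitted.

```python
import math


def butterworth_lowpass_biquad(cutoff_hz, fs_hz):
    """Normalized coefficients (b, a) of a second-order Butterworth low-pass."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    q = 1.0 / math.sqrt(2.0)  # Q = 1/sqrt(2) gives the Butterworth response
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / (2.0 * a0), (1.0 - cos_w0) / a0, (1.0 - cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Filter a sequence with a direct form I second-order section."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Assumed settings taken from the text: 800 Hz cutoff at a 4096 Hz sampling rate.
b, a = butterworth_lowpass_biquad(800.0, 4096.0)
```

A higher-order filter would give a steeper roll-off at the cost of more multiplications per sample, which matters on a resource-limited sensor node.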

3.3. Feature Extraction Based on ITESP
3.3.1. One-Dimensional S-Matrix

The recognition performance and the S-matrix length depend on the symbol table used. In most cases of speech recognition, the standard 29-character symbol table shown in Table 1 is employed, allowing for a maximum D of 35 and a maximum S of 5. The standard symbol table is optimized for speech signals, so its performance on vehicle sound signals needs to be examined. Figure 5 shows the S-matrices of the two types of vehicle sound signal based on the 29-character symbol table.

As can be seen from Figure 5, the differences between the appearance probabilities for the two types of sound signal are small and scattered, so in theory they cannot yield a high recognition rate. The main reason for these undesirable S-matrices is that the standard 29-character symbol table is not suitable for the vehicle sound signal, whose frequency distribution is quite different from that of the speech signal. The low-frequency part of the vehicle sound signal is lower than that of the speech signal, which means that larger values of D are needed to describe the waveforms. In addition, the vehicle sound signal contains more harmonic components; consequently, the values of S should be somewhat larger. According to these characteristics of the vehicle sound signal, an extended symbol table with 40 characters is designed to obtain a more effective S-matrix. Table 2 shows the extended 40-character symbol table.

Compared with Table 1, the characters in Table 2 change more frequently as D and S increase, resulting in higher separability of different D/S descriptors. For the vehicle sound signal, which contains more harmonic information, the extended symbol table captures more time-domain features of the signal. The S-matrices of the two types of vehicle sound signal based on the 40-character symbol table are shown in Figure 6.

Figure 6 indicates that, after the signals are encoded using the 40-character symbol table, the S-matrices of the different signals differ clearly. Thus, we can assume that the S-matrices extracted from wheeled vehicles and tracked vehicles are different enough to enable recognition.

3.3.2. Two-Dimensional A-Matrix

Using the extended 40-character symbol table, the sound signal is encoded and a one-dimensional symbol stream is generated. In Section 3.3.1, the S-matrix is obtained by calculating the appearance probability of each symbol. In order to obtain more accurate time-domain features of the signal, we use the appearance probability of the two identical consecutive symbols to construct the two-dimensional A-matrix. Figure 7 shows the feature distribution of the A-matrices of the two types of vehicle sound signal, where the x-axis and y-axis represent the symbols and the z-axis represents the appearance probability.

As can be seen from Figure 7, the two-dimensional A-matrices of the wheeled and tracked vehicle sound signals show obvious differences in feature distribution. Compared with the one-dimensional S-matrix, the A-matrix shows not only the probability features of each symbol but also the spatial probability features of the symbols. Thus, more accurate time-domain features of the signal are obtained, which can further improve the recognition performance.
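One possible reading of the two-dimensional matrix is sketched below: entry (i, j) holds the probability that symbol i is immediately followed by symbol j, so the diagonal entries correspond to identical consecutive symbols. This interpretation, the lag of one epoch, and the function name `a_matrix` are assumptions made for illustration and are not taken from the paper.

```python
def a_matrix(symbols, table_size, lag=1):
    """Appearance probability of symbol pairs separated by `lag` epochs."""
    counts = [[0] * table_size for _ in range(table_size)]
    pairs = len(symbols) - lag
    if pairs <= 0:
        return [[0.0] * table_size for _ in range(table_size)]
    for first, second in zip(symbols, symbols[lag:]):
        counts[first][second] += 1  # symbol `first` followed by `second`
    return [[c / pairs for c in row] for row in counts]


# Toy stream encoded with a three-character table.
pair_features = a_matrix([0, 1, 1, 2, 1], table_size=3)
```

For a 40-character table this yields 1600 entries instead of 40, which is consistent with the two-dimensional features being more expensive to compute and classify than the one-dimensional ones.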

4. Recognition Experimental Results

After the time-domain features are extracted from the S-matrix and the A-matrix based on ITESP, a classifier is employed to evaluate the performance of the proposed algorithms. In this paper, we use two classifiers for comparison, namely, the k-nearest neighbor (KNN) classifier and the support vector machine (SVM).

4.1. k-Nearest Neighbor Algorithm

In pattern recognition, the KNN algorithm is a nonparametric method for classifying objects based on the closest training examples in the feature space [13]. KNN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The KNN algorithm is among the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor. The KNN algorithm has been extensively studied in the field of computational geometry and used in many applications [14].
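The majority-vote rule described above can be sketched in a few lines of Python. The distance metric is not specified in the text, so Euclidean distance is assumed here; the function name `knn_classify` and the toy feature vectors are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_features, train_labels, query, k=3):
    """Assign `query` to the most common class among its k nearest neighbors."""
    # Sort training examples by Euclidean distance to the query (assumed metric).
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["wheeled", "wheeled", "tracked", "tracked"]
label = knn_classify(train_x, train_y, [0.2, 0.1], k=3)
```

Because all distances are computed at query time, the cost of each classification grows with the size of the training set and the length of the feature vector.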

4.2. Support Vector Machine

SVM, originally introduced by Vapnik, has been shown to be effective in learning linear and nonlinear decision boundaries and is successfully used in many applications [15-18]. SVM performs classification by constructing an N-dimensional hyperplane that optimally separates the data into two categories. SVM models are closely related to neural networks. In fact, an SVM model using a sigmoid kernel function is equivalent to a two-layer perceptron neural network. SVM has been used very successfully in recent years as a substitute for neural networks. A basic SVM can handle only two-class classification. To use SVM in multiclass classification, the problem must be broken down into several two-class classification tasks. The effectiveness of SVM depends on the selection of the kernel, the kernel's parameters, and the soft margin parameter $C$. A common choice is a Gaussian kernel, which has a single parameter $\sigma$. In this paper, the SVM algorithm is implemented using the Gaussian kernel defined by

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right),$$

where $\sigma$ is a user-defined variance parameter. The best combination of $C$ and $\sigma$ is often selected by a grid search with exponentially growing sequences of $C$ and $\sigma$. Typically, each combination of parameter choices is checked using cross validation, and the parameters with the best cross validation accuracy are picked. The final model, which is used for testing and for classifying new data, is then trained on the whole training set using the selected parameters.
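The kernel and the parameter search can be sketched as follows, assuming the common Gaussian form K(x, y) = exp(-||x - y||^2 / (2 sigma^2)). The SVM training itself is not implemented: `cv_accuracy` is a hypothetical callable that would train an SVM for a given pair of parameters and return its cross validation accuracy, and all function names are illustrative.

```python
import math
from itertools import product


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel value for two feature vectors (assumed 2*sigma^2 form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def exponential_grid(base, low_exp, high_exp):
    """Exponentially growing sequence base**low_exp, ..., base**high_exp."""
    return [base ** e for e in range(low_exp, high_exp + 1)]


def grid_search(c_values, sigma_values, cv_accuracy):
    """Return (accuracy, C, sigma) for the best-scoring parameter pair."""
    best = None
    for c, sigma in product(c_values, sigma_values):
        accuracy = cv_accuracy(c, sigma)  # hypothetical: cross-validated SVM score
        if best is None or accuracy > best[0]:
            best = (accuracy, c, sigma)
    return best


c_grid = exponential_grid(2.0, -1, 2)       # 0.5, 1, 2, 4
sigma_grid = exponential_grid(2.0, -1, 2)
```

In practice `cv_accuracy` would wrap whichever SVM library is available on the development machine; only the selected parameters and the trained model need to be transferred to the sensor node.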

4.3. Recognition Results

In the recognition experiments, for the wheeled vehicle type, a heavy wheeled truck and a sedan car were recorded multiple times while moving on the roads. The tracked vehicle sound signals downloaded from the sensor website also cover different vehicle types. We selected 80 signals for each vehicle type as the experimental dataset, which was randomly divided into two subsets: 30 signals for training and the other 50 for validation. The recognition rate is defined as the percentage ratio of the number of vehicle sounds correctly recognized to the total number of sounds considered for recognition. To evaluate the performance of the proposed ITESP algorithm, we take the conventional TESP algorithm based on the 29-character symbol table for comparison. The computational time is measured on a laptop with an Intel i3-2310M 2.1 GHz processor using the MATLAB commands tic and toc. Tables 3 and 4 compare the recognition results of the different feature extraction methods using the KNN classifier and the SVM classifier, respectively.
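The evaluation protocol can be sketched in Python as follows. The experiments in the paper were run in MATLAB; here `time.perf_counter` is used as a rough analogue of tic and toc, and the function names and the fixed random seed are assumptions made for illustration.

```python
import random
import time


def split_dataset(items, n_train, seed=None):
    """Randomly divide items into a training subset and a validation subset."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def recognition_rate(predicted, actual):
    """Percentage of sounds whose predicted class matches the true class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


start = time.perf_counter()                       # analogue of tic
train, validation = split_dataset(range(80), n_train=30, seed=0)
elapsed_seconds = time.perf_counter() - start     # analogue of toc
```

With 80 signals per vehicle type, this split yields 30 training and 50 validation signals for each type, matching the proportions stated above.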

From Tables 3 and 4, we can see that, of the two classifiers, the SVM exhibits better performance than KNN in both recognition rate and computational time, indicating that the SVM classifier is more suitable for a vehicle sound recognition system. Of the three feature extraction methods, the ITESP algorithm (with either the 40-character S-matrix or the 40-character A-matrix) obtains a much higher recognition rate than the TESP algorithm. Compared with the recognition rate of the conventional TESP algorithm (51%), the recognition rates using the proposed S-matrix and A-matrix reach 84% and 87%, respectively. However, the computational time of the A-matrix (2.37 s) is longer than that of the S-matrix (0.79 s). Therefore, the choice between the S-matrix and the A-matrix as the feature extraction method should depend on whether the recognition system places more weight on recognition rate or on computational time.

However, for the entire vehicle type recognition WSN system, the network lifetime depends mainly on energy consumption, because the batteries are difficult to recharge [19]. We therefore compare, in the same sensor network, the energy performance of the network when using the time-domain signal processing method based on the ITESP algorithm and when using the time-frequency domain signal processing method based on the wavelet transform (WT). The initial energy of the network is set to 200 J. The residual energy over the network lifetime is compared in Figure 8.

Figure 8 shows that the WSN system using WT consumes energy much faster than the one using the ITESP algorithm, indicating that the vehicle type recognition WSN system based on ITESP saves network energy and extends the network lifetime more effectively.

5. Conclusion

In this paper, we presented a promising method for vehicle type recognition, using improved time encoded signal processing (ITESP) for signal encoding and the support vector machine for classification. To address the problem that the conventional TESP algorithm is effective for speech signal feature extraction but not suitable for the vehicle sound signal, we designed an extended 40-character symbol table and used the symbol stream encoded with this table to construct the one-dimensional S-matrix and the two-dimensional A-matrix as the time-domain features of the sound signal. Experimental results indicated that the ITESP methods provided higher recognition rates than conventional TESP in distinguishing two types of vehicles (wheeled and tracked) from their sound signatures. Compared with the feature extraction method based on time-frequency domain analysis (WT), the ITESP algorithm consumes less energy. Our future work will focus on the recognition performance with more vehicle types, hardware implementation of the classifiers on prototype sensor nodes, and field testing of the system.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by grants from the National Science Foundation of China (nos. 61272448, 61302028), the STMSP project (no. 2012RZ0005), and the Foundation of Sichuan University Early Career Researcher Award (nos. 2012SCU11036, 2012SCU11070).