Abstract

A feature extraction method using Mel frequency cepstrum coefficients (MFCC) based on an acoustic vector sensor is studied in this paper. The pressure and particle velocity signals of underwater targets are simulated, and MFCC features are extracted from them to verify the feasibility of the method. A feature extraction experiment on two kinds of underwater targets is carried out, and the targets are classified and recognized by a backpropagation (BP) neural network using fusion of multiple kinds of information. The results show that MFCC, first-order differential MFCC, and second-order differential MFCC features are effective for recognizing these underwater targets, that the recognition rate using the particle velocity signal is higher than that using the pressure signal, and that the recognition rate can be improved by using fusion features.

1. Introduction

Underwater target recognition is a key technology for submarines and surface ships, and it is a prerequisite for effectively confronting the enemy [1].

Feature extraction is not only a key step in underwater target classification and recognition but also an important area of underwater acoustic signal processing. Auditory feature extraction methods developed for nonstationary speech signals have given satisfactory results in speech signal processing [2]. Recently, auditory feature extraction has been applied to underwater target radiated noise, and a higher target recognition probability has been obtained [3].

In this paper, an auditory feature extraction method combined with acoustic vector sensor technology is studied for underwater target recognition. A higher recognition rate is obtained by fusing the Mel frequency cepstrum coefficients (MFCC), first-order differential MFCC features, and second-order differential MFCC features by using the Fisher criterion with correlated distance (fdc). To test the validity of the proposed method, an experiment is carried out in which the sound of two different kinds of underwater targets is recorded with a single acoustic vector sensor. The paper is organized as follows. Section 2 presents the theory of the proposed method, including the simulation of the sound pressure and particle velocity signals of underwater targets, the MFCC model, and the fdc criterion. Section 3 provides the simulation analysis of the proposed method. Section 4 gives the feature extraction results for the two kinds of underwater targets and the target identification results obtained with artificial neural networks. Section 5 summarizes the results and makes suggestions for future research.

2. Theory

2.1. The Modeling and Simulation of the Scalar and Vector Signal of Underwater Target

The spectrum of ship radiated noise is composed of a line spectrum and a continuous spectrum. The line spectrum is mainly concentrated below 1.5 kHz and is often used as the basic feature for distinguishing underwater vehicles [4]. The main sound sources with line spectrum features are mechanical noise and propeller noise, both of which are periodic; therefore a periodic signal is often used to model the line spectrum [5].

The acoustic pressure is modeled as a superposition of $N$ line spectrum components, where $N$ is the number of line spectra, $\omega_i$ is the angular frequency of the $i$th component, $c$ is the sound propagation speed in water, and $r$ is the distance between the acoustic vector sensor and the sound source.

Based on the Euler equation $\rho\,\partial \mathbf{v}/\partial t = -\nabla p$, the particle velocity $\mathbf{v}$ is obtained from the gradient of the acoustic pressure $p$, where $\rho$ represents the density of water. According to the above theoretical analysis, the acoustic signals of two kinds of underwater targets are simulated. The line spectrum frequencies of the first target are 42 Hz, 109 Hz, 274 Hz, 216 Hz, 320 Hz, 442 Hz, and 549 Hz. The line spectrum frequencies of the second target are 90 Hz, 186 Hz, 288 Hz, 381 Hz, 478 Hz, 655 Hz, and 900 Hz. The sampling frequency is 5 kHz. The amplitude of each line spectrum component is randomly distributed in the range 0.6 to 1 Pa. The noise is white Gaussian noise with the same bandwidth as the signal. Because the particle velocity has dipole directivity, the SNR obtained with the particle velocity is higher than that obtained with the sound pressure. The power spectra of the sound pressure and particle velocity signals of the two kinds of targets are shown in Figures 1 and 2, respectively.
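The simulation described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: each line spectrum component is taken to be a cosine with a random initial phase, the particle velocity is derived with the far-field plane-wave relation $v = p/(\rho c)$ along the propagation direction, and the function names `simulate_line_spectrum` and `plane_wave_velocity` are hypothetical. The dipole directivity gain of the velocity channel is not modeled here.

```python
import numpy as np

def simulate_line_spectrum(freqs_hz, fs=5000.0, n_samples=6000, snr_db=None, rng=None):
    """Sum of line spectrum components with amplitudes drawn from [0.6, 1] Pa.

    Assumption: components are cosines with uniformly random initial phases.
    If snr_db is given, white Gaussian noise is added at that SNR.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(n_samples) / fs
    freqs = np.asarray(freqs_hz, dtype=float)
    amps = rng.uniform(0.6, 1.0, size=freqs.size)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    # One row per component, then sum over components.
    p = (amps[:, None] * np.cos(2.0 * np.pi * freqs[:, None] * t + phases[:, None])).sum(axis=0)
    if snr_db is not None:
        noise_power = np.mean(p ** 2) / 10.0 ** (snr_db / 10.0)
        p = p + rng.normal(0.0, np.sqrt(noise_power), size=p.shape)
    return t, p

def plane_wave_velocity(p, rho=1000.0, c=1500.0):
    """Assumed far-field plane-wave relation: v = p / (rho * c)."""
    return np.asarray(p) / (rho * c)

# Line spectrum frequencies of the first target, as listed in the text.
target1 = [42, 109, 274, 216, 320, 442, 549]
t, pressure = simulate_line_spectrum(target1, snr_db=0.0, rng=np.random.default_rng(1))
velocity = plane_wave_velocity(pressure)
```

The nominal values `rho=1000.0` and `c=1500.0` are typical figures for water and are not taken from the paper.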

2.2. MFCC Model

The human ear can easily distinguish various sounds, and auditory features represent the characteristics of acoustic signals well [6]. For this reason, auditory feature extraction based on the Mel frequency cepstrum coefficient (MFCC) has been put forward.

The relation between the Mel frequency $f_{\mathrm{mel}}$ and the actual frequency $f$ of the signal can be expressed as $f_{\mathrm{mel}} = 2595 \log_{10}(1 + f/700)$, where the unit of the actual frequency $f$ is Hz. According to Zwicker and Fastl [7], the critical bandwidth varies with frequency in a way that is consistent with the growth of the Mel frequency. The critical bands are approximately linearly distributed with a bandwidth of about 100 Hz below 1000 Hz, and the bandwidth increases logarithmically above 1000 Hz. A series of bandpass filters spaced on the Mel scale forms the Mel filter group.
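As a small illustration, the Hz-to-Mel mapping can be coded as below. The sketch assumes the widely used form $f_{\mathrm{mel}} = 2595 \log_{10}(1 + f/700)$; the function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Assumed mapping: f_mel = 2595 * log10(1 + f / 700), with f in Hz."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(f_mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(f_mel, dtype=float) / 2595.0) - 1.0)

# Equal steps on the Mel axis correspond to widening steps in Hz.
edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(2500.0), 32))
```

With this mapping, 1000 Hz corresponds to approximately 1000 Mel, and the spacing of `edges_hz` grows with frequency, which is what makes the higher Mel filters wider.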

The steps of the MFCC calculation are as follows. Firstly, the FFT is applied to each frame to obtain the frequency domain signal, from which the energy spectrum is computed. Secondly, the energy spectrum is passed through the Mel filter group, the logarithm of each filter output is taken, and the discrete cosine transform (DCT) is applied to obtain the MFCC.
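The calculation steps above can be sketched for a single frame as follows. This is a minimal sketch, not the authors' implementation: the triangular filter shape, the Hamming window, the unnormalized DCT-II, and the names `mel_filterbank` and `mfcc_frame` are assumptions, while the 30 filters, 20 coefficients, 6000-point frame, and 5 kHz sampling frequency follow the values used in the text.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters (assumed shape) equally spaced on the Mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz_pts = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (ctr - lo)
        falling = (hi - freqs) / (hi - ctr)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc_frame(frame, fs, n_filters=30, n_coeffs=20):
    """FFT -> energy spectrum -> Mel filter group -> logarithm -> DCT."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(frame.size)        # assumed window
    energy = np.abs(np.fft.rfft(windowed)) ** 2      # energy spectrum
    filter_out = mel_filterbank(n_filters, frame.size, fs) @ energy
    log_out = np.log(filter_out + 1e-12)             # small offset avoids log(0)
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))  # DCT-II
    return dct_basis @ log_out

coeffs = mfcc_frame(np.random.default_rng(0).normal(size=6000), fs=5000.0)
```

Applying `mfcc_frame` to every sample or frame gives one 20-dimension feature vector per frame.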

First-order differential Mel frequency cepstrum coefficients (ΔMFCC) are calculated from the MFCC of neighboring frames: the $n$th frame of ΔMFCC, $d(n)$, is a weighted difference of the MFCC frames $c(n-k), \ldots, c(n+k)$, where $c(n)$ is the $n$th frame of MFCC and $k$ is a constant that is usually taken as 2 [3].

Using the same method, the second-order differential Mel frequency cepstrum coefficients (ΔΔMFCC) are calculated by applying the same operation to the frames of ΔMFCC.
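One common way to compute differential coefficients is the regression form $d(n) = \sum_{i=1}^{k} i\,[c(n+i) - c(n-i)] \,/\, (2\sum_{i=1}^{k} i^2)$. The sketch below uses that form as an assumption; the normalization used in the paper may differ, and the edge padding and the function name `delta` are likewise illustrative choices.

```python
import numpy as np

def delta(features, k=2):
    """Regression-style differential coefficients (assumed normalization).

    features: array of shape (n_frames, n_dims); frames beyond the ends are
    replaced by copies of the first and last frame.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((k, k), (0, 0)), mode="edge")
    num = np.zeros_like(features)
    for i in range(1, k + 1):
        ahead = padded[k + i : k + i + n_frames]    # c(n + i)
        behind = padded[k - i : k - i + n_frames]   # c(n - i)
        num += i * (ahead - behind)
    return num / (2.0 * sum(i * i for i in range(1, k + 1)))

mfcc = np.random.default_rng(0).normal(size=(165, 20))
d_mfcc = delta(mfcc)        # first-order differential MFCC
dd_mfcc = delta(d_mfcc)     # second-order differential MFCC
```

Applying `delta` twice gives the second-order coefficients, which mirrors the statement that the same method is reused.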

2.3. Fisher-Ratio and Correlated Distance Criterion

fdc is a feature selection method that combines the traditional Fisher-ratio criterion with the correlation distance of each feature dimension [8].

Suppose that the data set contains $C$ classes $\omega_1, \omega_2, \ldots, \omega_C$, with $n_i$ samples in class $\omega_i$, and let $x^{(k)}$, $m_i^{(k)}$, and $m^{(k)}$ represent, respectively, a sample, the mean of the samples of the $i$th class, and the mean of all samples in the $k$th dimension. The Fisher criterion of each dimension can be expressed as $F(k) = S_B^{(k)} / S_W^{(k)}$, where $S_B^{(k)}$ represents the interclass variance of the $k$th-dimension feature in the training sample set and $S_W^{(k)}$ represents the within-class variance.

Equation (9) is the expression of the within-class variance of the $k$th-dimension feature, and (11) is the expression of the interclass variance of the $k$th-dimension feature.
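A per-dimension Fisher criterion of this kind can be sketched as follows. The normalization is an assumption: the interclass variance is taken as the sum over classes of the squared distance between the class mean and the overall mean, and the within-class variance as the sum over classes of the mean squared deviation from the class mean. The function name `fisher_ratio` is hypothetical, and the correlated distance part of fdc is not included.

```python
import numpy as np

def fisher_ratio(X, y):
    """Ratio of interclass to within-class variance for every feature dimension.

    X: array of shape (n_samples, n_dims); y: class label of each sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    s_b = np.zeros(X.shape[1])   # interclass variance per dimension
    s_w = np.zeros(X.shape[1])   # within-class variance per dimension
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        s_b += (class_mean - overall_mean) ** 2
        s_w += ((Xc - class_mean) ** 2).mean(axis=0)
    return s_b / s_w

# Two classes that differ only in dimension 0.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(100, 2)), rng.normal(size=(100, 2)) + [5.0, 0.0]])
y = np.repeat([0, 1], 100)
ratios = fisher_ratio(X, y)
```

In this toy example the ratio of dimension 0 is much larger than that of dimension 1, which is the behavior a feature selection criterion should show.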

The correlated distance is built in three steps. First, a distance $d(\mathbf{x}, \mathbf{y})$ between two vectors $\mathbf{x}$ and $\mathbf{y}$ is defined. Second, for a $D$-dimensional feature vector, the correlation distance of each dimension is defined from the distances between the $k$th dimension and the $l$th dimension of the $i$th class. Third, the correlation distance of the $k$th-dimension feature over all classes is defined, and the fdc criterion is defined from it together with the Fisher criterion.

3. Simulation

The sound pressure and particle velocity signals of the two types of underwater vehicles simulated in Section 2.1 are used for MFCC feature extraction. There are 165 sound pressure data samples and 165 particle velocity data samples, with 6000 snapshots in each sample. The Mel filter group used for MFCC feature extraction consists of 30 bandpass filters. The results of MFCC feature extraction are shown in Figures 3 and 4.

An acoustic vector sensor can simultaneously measure the acoustic pressure and the particle velocity components at one point of the sound field. Recently, acoustic vector sensor technology has been widely used in underwater noise measurement, underwater target detection, and noise source location [9].

The results of MFCC feature extraction from the sound pressure signals of the two kinds of vehicles are shown in Figure 3. Figure 3(a) shows the MFCC features of 165 samples of the first kind of vehicle signal, where the $x$-axis is the dimension number of MFCC, the $y$-axis is the sample number, and the $z$-axis is the amplitude of the MFCC features; the axes of the following plots are the same as in Figure 3(a). Figure 3(a) shows that the feature vectors of the 165 samples of the first kind of vehicle are similar. Figure 3(b) shows the MFCC features of 165 samples of the second kind of vehicle signal, whose feature vectors are similar too. Comparing Figure 3(a) with Figure 3(b), the MFCC features of different vehicles are quite different, so they can be used to distinguish underwater targets. Figures 3(c) and 3(d) show the ΔMFCC features of the first and second kinds of vehicle, respectively, and Figures 3(e) and 3(f) show the corresponding ΔΔMFCC features, each with 165 samples. The features obtained by these multiple methods can be fused to distinguish and recognize the targets.

Figure 4 shows the results of MFCC feature extraction for the two kinds of vehicles using particle velocity signals instead of sound pressure signals. Figures 4(a), 4(c), and 4(e) show the MFCC, ΔMFCC, and ΔΔMFCC features, respectively, of the first kind of vehicle with 165 samples, and Figures 4(b), 4(d), and 4(f) show the same features of the second kind of vehicle with 165 samples. The MFCC features of the individual samples are more similar to each other than those obtained from the sound pressure signals, because the SNR is improved by the dipole directivity of the particle velocity.

Figures 3 and 4 show that the MFCC features of the same kind of underwater vehicle are relatively consistent, while the MFCC features of different underwater vehicles are quite different. The within-class variance $S_W$ and the interclass variance $S_B$ can also be used to quantify the similarity of different samples of one kind of vehicle signal as well as the difference between the signals of different vehicles. The within-class variance of one kind of vehicle and the interclass variance of the two kinds of vehicle are shown in Figure 5, which shows clearly that the interclass variance is larger than the within-class variance. MFCC features can therefore be used to distinguish different targets.

In addition, the singular values of MFCC features based on the particle velocity signal are smaller than those based on the sound pressure signal, which benefits the recognition of different underwater targets.

To test the effectiveness of the MFCC features, the two kinds of underwater vehicles, whose pressure and particle velocity signals are simulated, are distinguished by using a BP neural network with a single hidden layer [10]. The number of input layer nodes is the feature dimension, the number of output layer nodes is 2, and the number of hidden layer nodes is determined by (16): $n_h = \sqrt{n_i + n_o} + a$, where $n_h$ is the number of hidden layer nodes, $n_i$ is the number of input layer nodes, $n_o$ is the number of output layer nodes, and $a$ is an adjusting constant whose value ranges from 0 to 10 [10]. Half of the samples are randomly selected as the training set, and the remaining samples are used as the test set.
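The network sizing and the data split can be illustrated with a short sketch. It assumes the commonly used empirical rule $n_h = \sqrt{n_i + n_o} + a$ for the hidden layer and rounds the result to the nearest integer, which is a choice made here; the names `hidden_nodes` and `split_half` are hypothetical.

```python
import math
import numpy as np

def hidden_nodes(n_in, n_out, a):
    """Assumed empirical rule: sqrt(n_in + n_out) + a, rounded to an integer."""
    return round(math.sqrt(n_in + n_out) + a)

def split_half(n_samples, rng):
    """Randomly assign half of the sample indices to training, the rest to test."""
    order = rng.permutation(n_samples)
    half = n_samples // 2
    return order[:half], order[half:]

n_hidden_60 = hidden_nodes(60, 2, a=5)   # 60-dimension features, 2 output nodes
n_hidden_15 = hidden_nodes(15, 2, a=5)   # 15-dimension fused features
train_idx, test_idx = split_half(165, np.random.default_rng(0))
```

The value `a=5` is only an example within the stated range of 0 to 10.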

The 20-dimension MFCC features, 20-dimension first-order differential MFCC features, and 20-dimension second-order differential MFCC features are combined to obtain 60-dimension features for recognition. The recognition probability under different SNR is shown in Table 1.

Feature fusion is carried out to improve the performance of target recognition. The feature fusion process is shown in Figure 6.

The 20-dimension MFCC features, 20-dimension first-order differential MFCC features, and 20-dimension second-order differential MFCC features, which are extracted above, are screened by fdc. For each of the three feature types, the first 5 dimensions ranked by fdc are retained, and these are combined to form 15-dimensional joint features for recognition.
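The selection and fusion step can be sketched as follows, assuming that a score per dimension (for example, an fdc value) is already available for each feature type and that the dimensions with the largest scores are the ones kept. The function name `fuse_top_dims` and the random placeholder data are illustrative only.

```python
import numpy as np

def fuse_top_dims(groups, scores, n_keep=5):
    """Keep the n_keep highest-scoring dimensions of each group and concatenate.

    groups: list of arrays of shape (n_samples, n_dims)
    scores: list of arrays of shape (n_dims,), one score per dimension
    """
    fused = []
    for feats, score in zip(groups, scores):
        top = np.argsort(score)[::-1][:n_keep]   # indices of the largest scores
        fused.append(feats[:, np.sort(top)])     # keep the original dimension order
    return np.hstack(fused)

rng = np.random.default_rng(0)
# Placeholder data standing in for MFCC, first-order, and second-order features.
feature_groups = [rng.normal(size=(165, 20)) for _ in range(3)]
group_scores = [rng.uniform(size=20) for _ in range(3)]
joint = fuse_top_dims(feature_groups, group_scores)   # shape (165, 15)
```

Selecting within each feature type separately keeps all three kinds of information represented in the 15-dimensional joint feature.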

The 15-dimension feature samples are sent into the BP neural network classifier to identify the targets, and the identification results are shown in Table 2.

Comparing Tables 1 and 2, the recognition probability using the 15-dimensional fusion features is higher than that using the 60-dimensional combined features.

4. Experimental Data Analysis

The target recognition experiment based on a single acoustic vector sensor was carried out in Songhua Lake, Jilin Province, China. The radiated noise of two kinds of vehicles, a ship and a speed boat, was measured with a single acoustic vector sensor. The ship radiated noise data are divided into 182 frames and the speed boat radiated noise data are divided into 239 frames, with 6000 snapshots in each frame. MFCC features are extracted from every frame of the sound pressure and particle velocity signals of the ship and the speed boat. There are 30 bandpass filters in the Mel filter group. The feature extraction results using the sound pressure signal and the particle velocity signal are shown in Figures 7 and 8, respectively.

Figure 7 shows the MFCC features of the sound pressure signals of the ship and the speed boat. Figure 7(a) shows the MFCC features of 182 frames of the ship signal; the feature vectors of these frames are largely similar. Compared with the simulation results, the similarity between feature vectors obtained from the experimental data is lower because of noise. Figure 7(b) shows the MFCC features of 239 frames of the speed boat signal, whose feature vectors are also largely similar. Comparing Figures 7(a) and 7(b), the MFCC features of different vehicles are quite different, so they can be used to distinguish underwater targets. Figures 7(c) and 7(d) show the ΔMFCC features of the ship and speed boat signals, respectively, and Figures 7(e) and 7(f) show the corresponding ΔΔMFCC features. The features obtained from the experimental data by these multiple methods can be fused to distinguish and recognize the targets.

Figure 8 shows the results of MFCC feature extraction for the ship and the speed boat using the particle velocity signals instead of the sound pressure signals. Figures 8(a), 8(c), and 8(e) show the MFCC, ΔMFCC, and ΔΔMFCC features, respectively, of the ship with 182 frames, and Figures 8(b), 8(d), and 8(f) show the same features of the speed boat with 239 frames. The MFCC features of the individual frames are more similar to each other than those obtained from the sound pressure signals, because the SNR is improved by the dipole directivity of the particle velocity. Comparing Figure 8(a) with 8(b), 8(c) with 8(d), and 8(e) with 8(f) shows that the features extracted from the particle velocity signals can be used to recognize the different underwater targets.

As seen from the above, MFCC, first-order differential MFCC, and second-order differential MFCC features can be used as effective features for underwater target recognition. Fusion of the three kinds of features based on the fdc criterion is then studied. The fdc results are shown in Figure 9.

Figure 9 shows that the fdc values of the different features, which include MFCC, first-order differential MFCC, and second-order differential MFCC, are different.

The MFCC features of the sound pressure and particle velocity signals of the ship and the speed boat described above are used, respectively, to recognize the targets with the BP neural network.

Half of the samples are randomly selected as the training sets and the remaining as the test sets. The identification results by using sound pressure and particle velocity are shown in Tables 3 and 4, respectively.

When SNR is 0 dB, Tables 3 and 4 show that the recognition probability is higher under higher SNR. When SNR is −5 dB, the recognition probability is more than 85%; when the SNR is −10 dB, the recognition probability decreases greatly. Meanwhile, the recognition probability using MFCC features based on particle velocity signal is higher than that based on sound pressure signals.

The 15-dimension feature samples are sent into the BP neural network classifier to identify the targets. The original signals in the experiment are strong enough to be regarded as free of noise interference, and the recognition probability is 100%. White Gaussian noise is added to the real data to obtain different SNR values for testing the recognition performance.
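Adding white Gaussian noise at a prescribed SNR can be sketched as follows. The sketch treats the recorded signal as noise-free, as the text does, and scales the noise variance from the measured signal power; the function name `add_white_noise` and the test tone are illustrative assumptions.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise at the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

# Example: a 100 Hz tone sampled at 5 kHz, degraded to an SNR of -5 dB.
clean = np.sin(2.0 * np.pi * 100.0 * np.arange(100000) / 5000.0)
noisy = add_white_noise(clean, snr_db=-5.0, rng=np.random.default_rng(0))
```

Repeating this for a range of `snr_db` values gives the test conditions under which the recognition probability can be compared.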

Figure 10 shows the recognition probability obtained with MFCC, first-order differential MFCC, second-order differential MFCC, and fusion features under different SNR. The recognition probability using the fusion features is higher than that using any one of the three features alone, and the recognition probability using the particle velocity signal is higher than that using the sound pressure signal.

Identification results using fusion features of sound pressure and particle velocity signals, as well as the combination of sound pressure and particle velocity signals, are shown in Table 5.

Table 5 shows that the recognition probability using the particle velocity signal is higher than that using the sound pressure signal, and that combined processing of the sound pressure and particle velocity signals further improves the recognition probability of underwater targets.

5. Conclusion

In this paper, MFCC feature extraction using sound pressure and particle velocity signals is studied. The conclusions are summarized as follows.

Firstly, the feature extraction and recognition results show that the MFCC, first-order differential MFCC, and second-order differential MFCC features can be used as effective features for underwater target identification.

Secondly, the Fisher-ratio and correlated distance show that the contribution of each feature dimension is different, and fusing the three features by using the fdc criterion can improve the recognition probability of underwater target signals.

Thirdly, the recognition probability using the particle velocity signal is higher than that using the sound pressure signal. Both the scalar and vector signals of an underwater target can be used for feature extraction in order to improve the performance of target recognition.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (Grant no. 11674075).