Journal of Sensors
Volume 2016, Article ID 7864213, 11 pages
http://dx.doi.org/10.1155/2016/7864213
Research Article

Feature Extraction of Underwater Target Signal Using Mel Frequency Cepstrum Coefficients Based on Acoustic Vector Sensor

Lanyue Zhang,1,2 Di Wu,1,2 Xue Han,1,2 and Zhongrui Zhu1,2

1Science and Technology on Underwater Acoustic Laboratory, Harbin Engineering University, Harbin 150001, China
2College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China

Received 31 May 2016; Revised 8 September 2016; Accepted 11 October 2016

Academic Editor: Andreas Schütze

Copyright © 2016 Lanyue Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

A feature extraction method using Mel frequency cepstrum coefficients (MFCC) based on an acoustic vector sensor is studied in this paper. The pressure and particle velocity signals of underwater targets are simulated, and MFCC features are extracted from them to verify the feasibility of the method. A feature extraction experiment on two kinds of underwater targets is carried out, and the targets are classified and recognized by a backpropagation (BP) neural network using fusion of multiple kinds of information. The results show that MFCC, first-order differential MFCC, and second-order differential MFCC features can be used as effective features to recognize the underwater targets, that the recognition rate obtained with the particle velocity signal is higher than that obtained with the pressure signal, and that the recognition rate can be improved by using fusion features.

1. Introduction

Underwater target recognition is a key technology for submarines and surface ships. It is a prerequisite for effectively confronting the enemy [1].

Feature extraction is not only the key process of underwater target classification and recognition but also an important area of underwater acoustic signal processing. Auditory feature extraction methods have been developed for nonstationary speech signals, and satisfactory results have been obtained in speech signal processing [2]. Recently, auditory feature extraction has been applied to underwater target radiated noise, and a higher target recognition probability has been obtained [3].

In this paper, an auditory feature extraction method combined with acoustic vector sensor technology is studied for underwater target recognition. A higher recognition rate is obtained by fusing the Mel frequency cepstrum coefficients (MFCC), first-order differential MFCC features, and second-order differential MFCC features using the Fisher criterion with correlated distance (fdc). To test the validity of the proposed method, an experiment is carried out in which the sound of two different kinds of underwater targets is recorded with a single acoustic vector sensor. The paper is organized as follows. Section 2 presents the theory of the proposed method, including the simulation of the sound pressure and particle velocity signals of underwater targets, the MFCC model, and the fdc criterion. Section 3 provides the simulation analysis of the proposed method. Section 4 gives the feature extraction results for the two kinds of underwater targets and the target identification results obtained with artificial neural networks. Section 5 summarizes the results and makes suggestions for future research.

2. Theory

2.1. The Modeling and Simulation of the Scalar and Vector Signal of Underwater Target

The spectrum of ship radiated noise is composed of a line spectrum and a continuous spectrum. The line spectrum is mainly concentrated below 1.5 kHz and is often used as the basic feature to distinguish underwater vehicles [4]. The main sound sources with line spectrum features are mechanical noise and propeller noise, both of which are periodic; therefore a periodic signal is often used to model the line spectrum [5].

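The mathematical expression of the acoustic pressure is given as

$$p(r,t)=\sum_{n=1}^{N}\frac{A_n}{r}\exp\left[j\omega_n\left(t-\frac{r}{c}\right)\right],$$

where $N$ is the number of line spectrum components, $A_n$ and $\omega_n$ are the amplitude and angular frequency of the $n$th component, $c$ is the sound propagation speed in water, and $r$ is the distance between the acoustic vector sensor and the sound source.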

Based on the Euler formula $\rho\,\partial\mathbf{v}/\partial t=-\nabla p$, the particle velocity can be expressed as

$$v(r,t)=\sum_{n=1}^{N}\frac{A_n}{\rho c r}\left(1+\frac{c}{j\omega_n r}\right)\exp\left[j\omega_n\left(t-\frac{r}{c}\right)\right],$$

where $\rho$ represents the density of water. According to the above theoretical analysis, the acoustic signals of two kinds of underwater targets are simulated. The line spectrum frequencies of the signal for the first target are 42 Hz, 109 Hz, 274 Hz, 216 Hz, 320 Hz, 442 Hz, and 549 Hz. The line spectrum frequencies of the signal for the second target are 90 Hz, 186 Hz, 288 Hz, 381 Hz, 478 Hz, 655 Hz, and 900 Hz, and the sampling frequency is 5 kHz. The signal amplitude of each line spectrum component is randomly distributed in the range 0.6 to 1 Pa. The noise is white Gaussian noise with the same bandwidth as the signal. Because the particle velocity has dipole directivity, the SNR can be improved by using the particle velocity compared with using the sound pressure. The power spectra of the sound pressure and particle velocity signals of the two kinds of targets are shown in Figures 1 and 2, respectively.

Figure 1: The power spectrum diagram of the first kind of underwater vehicle. (a) The power spectrum of the sound pressure signal. (b) The power spectrum of the particle velocity signal.
Figure 2: The power spectrum diagram of the second kind of underwater vehicle. (a) The power spectrum of the sound pressure signal. (b) The power spectrum of the particle velocity signal.
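
The simulated signals described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' simulation code: the signal is taken as a real-valued sum of cosines at a fixed distance, the initial phases are drawn at random, and the function name `simulate_line_spectrum`, the `seed` argument, and the 0 dB SNR in the usage line are hypothetical choices. The line frequencies, the 5 kHz sampling frequency, the 0.6 to 1 amplitude range, and the 6000 snapshots per sample come from the text.

```python
import math
import random

def simulate_line_spectrum(freqs_hz, fs=5000.0, n_samples=6000, snr_db=0.0, seed=0):
    """Sum of sinusoids with random amplitudes plus white Gaussian noise."""
    rng = random.Random(seed)
    # Amplitudes uniformly distributed in [0.6, 1.0], as stated in the text.
    amps = [rng.uniform(0.6, 1.0) for _ in freqs_hz]
    # Random initial phases are an assumption; the text does not specify them.
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs_hz]
    clean = [
        sum(a * math.cos(2.0 * math.pi * f * n / fs + ph)
            for a, f, ph in zip(amps, freqs_hz, phases))
        for n in range(n_samples)
    ]
    # Scale the noise so that the sample has the requested SNR in dB.
    signal_power = sum(x * x for x in clean) / n_samples
    noise_std = math.sqrt(signal_power / (10.0 ** (snr_db / 10.0)))
    noisy = [x + rng.gauss(0.0, noise_std) for x in clean]
    return clean, noisy

# Line spectrum frequencies (Hz) of the two simulated targets.
TARGET_1 = [42, 109, 274, 216, 320, 442, 549]
TARGET_2 = [90, 186, 288, 381, 478, 655, 900]

clean, noisy = simulate_line_spectrum(TARGET_1, snr_db=0.0)
```

Calling the function repeatedly with different seeds would give a set of samples comparable in size to the 165 samples per target used in Section 3.
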
2.2. MFCC Model

The human ear can easily distinguish various sounds, and auditory features represent the characteristics of various acoustic signals well [6]. For this reason, an auditory feature extraction method based on the Mel frequency cepstrum coefficient (MFCC) has been put forward.

The relation between the Mel frequency and the actual frequency of a signal can be expressed as

$$f_{\mathrm{mel}}=2595\log_{10}\left(1+\frac{f}{700}\right),$$

where the unit of the actual frequency $f$ is Hz. According to Zwicker and Fastl [7], the critical bandwidth varies with frequency in a way that is consistent with the growth of the Mel frequency. Below 1000 Hz the critical bands are distributed approximately linearly, with a bandwidth of about 100 Hz, while above 1000 Hz the bandwidth increases logarithmically. A series of bandpass filters forms the Mel filter group.
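
As a small illustration of the Mel scale, the sketch below converts between Hz and Mel and places filter center frequencies at equal Mel spacing. It assumes the widely used mapping $2595\log_{10}(1+f/700)$; the function names, the 30 filters, and the 0 to 2500 Hz band (half of the 5 kHz sampling frequency) are illustrative choices, not a description of the authors' filter design.

```python
import math

def hz_to_mel(f_hz):
    """Map a frequency in Hz to the Mel scale (common 2595/700 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse mapping from Mel back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_low_hz, f_high_hz):
    """Center frequencies (Hz) of n_filters bands equally spaced on the Mel scale."""
    mel_low, mel_high = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (mel_high - mel_low) / (n_filters + 1)
    return [mel_to_hz(mel_low + step * (i + 1)) for i in range(n_filters)]

centers = mel_center_frequencies(30, 0.0, 2500.0)
```

The centers are close together at low frequencies and spread out at high frequencies, which is the behavior of the critical bands described above.
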

The steps of MFCC calculation are as follows. Firstly, the FFT is applied to each frame to obtain the frequency-domain signal, from which the energy spectrum is calculated. Secondly, the energy spectrum is passed through the Mel filter group, the logarithm of each filter output is taken, and the discrete cosine transform (DCT) is applied to obtain the MFCC.
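
The calculation steps can be sketched for a single frame as follows. This is a minimal, dependency-free illustration rather than the authors' implementation: a direct DFT stands in for the FFT, the filters are assumed to be triangular with edges equally spaced on the Mel scale, the zeroth DCT coefficient is skipped, and a small floor of 1e-12 guards the logarithm. All function names are hypothetical; the 30 filters and 20 coefficients follow the values used later in the paper.

```python
import cmath
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (an FFT would be used in practice)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters (an assumption) with edges equally spaced on the Mel scale."""
    mel_max = hz_to_mel(fs / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [hz * n_fft / fs for hz in edges_hz]  # fractional DFT bin positions
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if left < k <= center:
                row.append((k - left) / (center - left))
            elif center < k < right:
                row.append((right - k) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, fs, n_filters=30, n_coeffs=20):
    """FFT -> energy spectrum -> Mel filter group -> logarithm -> DCT."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), fs)
    # The 1e-12 floor avoids log(0) for empty bands (illustrative choice).
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-12) for row in bank]
    # DCT-II of the log filter-bank energies, skipping the zeroth coefficient.
    return [
        sum(e * math.cos(math.pi * q * (m + 0.5) / n_filters)
            for m, e in enumerate(log_energy))
        for q in range(1, n_coeffs + 1)
    ]

fs = 5000.0
frame_a = [math.cos(2.0 * math.pi * 274.0 * t / fs) for t in range(256)]
frame_b = [math.cos(2.0 * math.pi * 900.0 * t / fs) for t in range(256)]
mfcc_a = mfcc(frame_a, fs)
mfcc_b = mfcc(frame_b, fs)
```

The two test tones at 274 Hz and 900 Hz excite different Mel bands, so their coefficient vectors differ, which is the property the recognition stage relies on.
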

First-order differential Mel frequency cepstrum coefficients (ΔMFCC) are calculated as follows:

$$d(n)=\frac{1}{\sqrt{\sum_{i=-k}^{k} i^{2}}}\sum_{i=-k}^{k} i\,c(n+i),$$

where $d(n)$ is the $n$th frame of the first-order differential Mel frequency cepstrum coefficients, $c(n)$ is the $n$th frame of the Mel frequency cepstrum coefficients, and $k$ is a constant which is usually taken as 2 [3].

Using the same method, the second-order differential Mel frequency cepstrum coefficients (ΔΔMFCC) are calculated as follows:

$$d^{(2)}(n)=\frac{1}{\sqrt{\sum_{i=-k}^{k} i^{2}}}\sum_{i=-k}^{k} i\,d(n+i),$$

where $d^{(2)}(n)$ is the $n$th frame of the second-order differential Mel frequency cepstrum coefficients.
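
A short sketch of the differential features is given below. It assumes one common form of the difference, a weighted sum of neighboring frames normalized by the square root of the sum of squared weights, with the constant set to 2 as in the text. Repeating the first and last frames at the edges is an assumption, and the function name `delta` and the toy data are illustrative only.

```python
import math

def delta(frames, k=2):
    """First-order difference of a sequence of feature vectors (one per frame)."""
    norm = math.sqrt(sum(i * i for i in range(-k, k + 1)))
    n_frames = len(frames)
    out = []
    for n in range(n_frames):
        vec = [0.0] * len(frames[n])
        for i in range(-k, k + 1):
            # Repeat the edge frames when n + i falls outside the sequence (assumption).
            src = frames[min(max(n + i, 0), n_frames - 1)]
            for q, value in enumerate(src):
                vec[q] += i * value
        out.append([v / norm for v in vec])
    return out

# Toy data: six frames with two coefficients each, both linear ramps.
mfcc_frames = [[float(n), 2.0 * n] for n in range(6)]
d1 = delta(mfcc_frames)  # first-order differential features
d2 = delta(d1)           # second-order differential features, same method applied again
```

For a linear ramp the interior first-order values are constant, which gives a quick sanity check of the weights.
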

2.3. Fisher-Ratio and Correlated Distance Criterion

fdc is a feature selection method that combines the traditional Fisher-ratio criterion with the correlation distance between the dimensions of the features [8].

Suppose that the data set contains $C$ categories $\omega_1,\omega_2,\ldots,\omega_C$ and that $n_i$ samples are included in the $i$th category, where $x^{(k)}$, $m_i^{(k)}$, and $m^{(k)}$ represent, respectively, a sample, the $i$th class sample average, and the mean value of all samples in the $k$th dimension. The Fisher criterion of each dimension can be expressed as

$$F(k)=\frac{S_B(k)}{S_W(k)},$$

where $S_B(k)$ represents the interclass variance of the $k$th dimension feature in the training sample set and $S_W(k)$ represents the within-class variance.

Equation (9) is the expression of the within-class variance of the $k$th dimension feature, and (11) is the expression of the interclass variance of the $k$th dimension feature.
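
The Fisher ratio of each dimension can be illustrated with a short sketch. The exact normalization of the two variances is an assumption here: the within-class variance is taken as the average over classes of the per-class variance, and the interclass variance as the average squared deviation of the class means from the overall mean. The function name and the toy data are hypothetical.

```python
def fisher_ratio(classes):
    """Per-dimension Fisher ratio for a list of classes, each a list of feature vectors."""
    n_dims = len(classes[0][0])
    all_samples = [x for cls in classes for x in cls]
    ratios = []
    for k in range(n_dims):
        overall_mean = sum(x[k] for x in all_samples) / len(all_samples)
        between, within = 0.0, 0.0
        for cls in classes:
            class_mean = sum(x[k] for x in cls) / len(cls)
            between += (class_mean - overall_mean) ** 2
            within += sum((x[k] - class_mean) ** 2 for x in cls) / len(cls)
        # Average over classes (assumed normalization; it cancels in the ratio).
        between /= len(classes)
        within /= len(classes)
        ratios.append(between / within)  # assumes a nonzero within-class variance
    return ratios

# Toy example: dimension 0 separates the classes, dimension 1 does not.
class_a = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
class_b = [[10.0, 2.0], [11.0, 3.0], [12.0, 4.0]]
ratios = fisher_ratio([class_a, class_b])
```

A large ratio marks a dimension whose class means are far apart relative to the spread inside each class, so such dimensions are the ones worth keeping.
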

The distance between two vectors $x$ and $y$ is defined as follows:

For a multidimensional feature vector, the correlation distance of each dimension is defined as follows, taken between the $k$th dimension and the $l$th dimension of the $i$th class:

The correlation distance of the $k$th dimension feature over all classes is defined as follows:

The fdc criterion is defined as follows:

3. Simulation

The sound pressure and particle velocity signals of the two types of underwater vehicles simulated in Section 2.1 are used for MFCC feature extraction. There are 165 sound pressure data samples and 165 particle velocity data samples, with 6000 snapshots in each sample. The Mel filter group used for MFCC feature extraction consists of 30 bandpass filters. The results of MFCC feature extraction are shown in Figures 3 and 4.

Figure 3: The MFCC features of sound pressure simulation signals of two kinds of vehicles. (a) The MFCC features of the first kind of vehicle. (b) The MFCC features of the second kind of vehicle. (c) The ΔMFCC features of the first kind of vehicle. (d) The ΔMFCC features of the second kind of vehicle. (e) The ΔΔMFCC features of the first kind of vehicle. (f) The ΔΔMFCC features of the second kind of vehicle.
Figure 4: The MFCC features of particle velocity simulation signals of two kinds of vehicles. (a) The MFCC features of the first kind of vehicle. (b) The MFCC features of the second kind of vehicle. (c) The ΔMFCC features of the first kind of vehicle. (d) The ΔMFCC features of the second kind of vehicle. (e) The ΔΔMFCC features of the first kind of vehicle. (f) The ΔΔMFCC features of the second kind of vehicle.

An acoustic vector sensor can simultaneously measure the acoustic pressure and particle velocity components at one point of sound field. Lately, acoustic vector sensor technology has been widely used in underwater noise measurement, underwater target detection, and noise source location [9].

The results of MFCC feature extraction from the sound pressure signals of the two kinds of vehicles are shown in Figure 3. Figure 3(a) shows the MFCC features of the 165 samples of the first kind of vehicle signal, where the $x$-axis is the dimension number of MFCC, the $y$-axis is the sample number, and the $z$-axis is the amplitude of the MFCC features; the axes of the following plots are the same as those of Figure 3(a). Figure 3(a) shows that the feature vectors of the 165 samples of the first kind of vehicle are similar. Figure 3(b) shows the MFCC features of the 165 samples of the second kind of vehicle signal, whose feature vectors are similar to each other, too. Comparing Figure 3(a) with Figure 3(b), the MFCC features of different vehicles are quite different, so they can be used to distinguish underwater targets. Figures 3(c) and 3(d) show the ΔMFCC features of the first and second kinds of vehicle, respectively, and Figures 3(e) and 3(f) show the corresponding ΔΔMFCC features, each with 165 samples. Multiple-dimension features are thus obtained by using multiple methods, and they can be fused to distinguish and recognize the targets.

Figure 4 demonstrates results of MFCC feature extraction of two kinds of vehicles by using particle velocity signals instead of sound pressure signals, where Figures 4(a), 4(c), and 4(e) are results of MFCC, ΔMFCC, and ΔΔMFCC features of the first kind of vehicle by using particle velocity signals with 165 samples, respectively, and Figures 4(b), 4(d), and 4(f) are results of MFCC, ΔMFCC, and ΔΔMFCC features of the second kind of vehicle by using particle velocity signals with 165 samples, respectively. MFCC features of each sample are more similar than those using sound pressure signals, because SNR is improved due to dipole directivity of particle velocity.

Figures 3 and 4 show that the MFCC features of the same kind of underwater vehicle are relatively consistent, while the MFCC features of different underwater vehicles are quite different. The within-class variance and the interclass variance can also be used to quantify the similarity of different samples of one kind of vehicle’s signals, as well as the difference between different vehicles’ signals. The within-class variance of one kind of vehicle and the interclass variance of the two kinds of vehicle are shown in Figure 5, which shows clearly that the interclass variance is larger than the within-class variance. So MFCC features can be used to distinguish different targets.

Figure 5: The comparison of the within class variance of one kind of vehicle and the interclass variance of two different kinds of vehicle.

Therefore MFCC features can be used to distinguish these different underwater targets. Singular values of MFCC features based on particle velocity signal are smaller than those based on sound pressure signal, which benefits recognition of different underwater targets.

To test the effectiveness of the MFCC features, the two kinds of underwater vehicles, whose pressure and particle velocity signals are simulated, are distinguished by using a BP neural network. A BP neural network with a single hidden layer is selected [10]. The number of input layer nodes is the feature dimension, the number of output layer nodes is 2, and the number of hidden layer nodes is determined by (16):

$$h=\sqrt{m+n}+a, \qquad (16)$$

where $h$ is the number of hidden layer nodes, $m$ is the number of input layer nodes, $n$ is the number of output layer nodes, and $a$ is an adjusting constant whose value ranges from 0 to 10 [10]. Half of the samples are randomly selected as the training set, and the remaining samples are used as the test set.
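
The network sizing and the data split can be illustrated as follows. The sketch assumes the common empirical rule in which the hidden layer size is the square root of the sum of the input and output node counts plus the adjusting constant, rounded to an integer. The value of the constant used below, the rounding, the seed, and the function names are illustrative assumptions, not values reported in the paper.

```python
import math
import random

def hidden_nodes(n_inputs, n_outputs, a):
    """Empirical hidden-layer size: sqrt(m + n) + a, rounded to the nearest integer."""
    return int(round(math.sqrt(n_inputs + n_outputs) + a))

def split_half(samples, seed=0):
    """Randomly assign half of the samples to training and the rest to testing."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    half = len(indices) // 2
    train = [samples[i] for i in indices[:half]]
    test = [samples[i] for i in indices[half:]]
    return train, test

# 60 input features and 2 output classes; a = 5 is an arbitrary illustrative choice.
h = hidden_nodes(n_inputs=60, n_outputs=2, a=5)
train, test = split_half(list(range(165)))
```

With an odd number of samples such as 165, the split cannot be exactly even, so this sketch puts the extra sample in the test set.
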

The 20-dimension MFCC features, 20-dimension first-order differential MFCC features, and 20-dimension second-order differential MFCC features are combined to obtain 60-dimension features used for recognition. The recognition probability is shown under different SNR in Table 1.

Table 1: The recognition probability by using 60-dimension MFCC combined features of sound pressure and particle velocity of simulation signals.

Feature fusion is carried out to improve the performance of target recognition. The feature fusion process is shown in Figure 6.

Figure 6: The block diagram of MFCC feature fusion based on Fisher-ratio with correlated distance.

The 20-dimension MFCC features, 20-dimension first-order differential MFCC features, and 20-dimension second-order differential MFCC features extracted above are optimized by using fdc. For each kind of feature, the first 5 dimensions according to fdc are obtained. These 5 dimensions of the MFCC features, first-order differential MFCC features, and second-order differential MFCC features are combined to form 15-dimensional joint features for recognition.
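
The selection and combination step can be sketched as below. The sketch assumes that "the first 5 dimensions" means the 5 dimensions with the largest criterion value in each feature set; the scores used here are made-up numbers standing in for fdc values, and the function names are hypothetical.

```python
def top_k_dimensions(scores, k=5):
    """Indices of the k largest scores, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def fuse(feature_sets, score_sets, k=5):
    """Keep the k best-scoring dimensions of each feature set and concatenate them."""
    fused = []
    for features, scores in zip(feature_sets, score_sets):
        fused.extend(features[i] for i in top_k_dimensions(scores, k))
    return fused

# Toy 20-dimension vectors for MFCC, first-order, and second-order features.
mfcc_vec = [float(i) for i in range(20)]
d1_vec = [100.0 + i for i in range(20)]
d2_vec = [200.0 + i for i in range(20)]

# Made-up per-dimension scores standing in for the fdc values.
scores_mfcc = [float(i) for i in range(20)]
scores_d1 = [float(19 - i) for i in range(20)]
scores_d2 = [-(i - 9.3) ** 2 for i in range(20)]

fused = fuse([mfcc_vec, d1_vec, d2_vec], [scores_mfcc, scores_d1, scores_d2])
```

The same index sets would be applied to every sample, so that all 15-dimensional vectors passed to the classifier share the same layout.
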

The 15-dimensional features of the samples are fed into the BP neural network classifier to identify the targets, and the identification results are shown in Table 2.

Table 2: The recognition probability by using 15-dimension MFCC fusion features of sound pressure and particle velocity signals.

Comparing Tables 1 and 2, the recognition probability obtained with the 15-dimensional fusion features is higher than that obtained with the 60-dimensional combined features.

4. Experimental Data Analysis

The target recognition experiment based on a single acoustic vector sensor was carried out in Songhua Lake, Jilin Province, China. The radiated noise of two kinds of underwater targets, a ship and a speed boat, was measured with a single acoustic vector sensor. The ship radiated noise data are divided into 182 frames and the speed boat radiated noise data are divided into 239 frames, with 6000 snapshots in each frame. MFCC features are extracted from every frame of the sound pressure and particle velocity signals of the ship and the speed boat. There are 30 bandpass filters in the Mel filter group. The feature extraction results obtained with the sound pressure signal and the particle velocity signal are shown in Figures 7 and 8, respectively.

Figure 7: The MFCC features of sound pressure signals of two kinds of underwater targets. (a) The MFCC features of radiation noise of ship. (b) The MFCC features of radiation noise of speed boat. (c) The ΔMFCC features of radiation noise of ship. (d) The ΔMFCC features of radiation noise of speed boat. (e) The ΔΔMFCC features of radiation noise of ship. (f) The ΔΔMFCC features of radiation noise of speed boat.
Figure 8: The MFCC features of particle velocity signals of two kinds of underwater targets. (a) The MFCC features of radiation noise of ship. (b) The MFCC features of radiation noise of speed boat. (c) The ΔMFCC features of radiation noise of ship. (d) The ΔMFCC features of radiation noise of speed boat. (e) The ΔΔMFCC features of radiation noise of ship. (f) The ΔΔMFCC features of radiation noise of speed boat.

Figure 7 shows the MFCC features of the sound pressure signals of the ship and the speed boat. Figure 7(a) shows the MFCC features of the 182 frames of the ship signal, whose feature vectors are broadly similar. Compared with the simulation results, the similarity among the feature vectors obtained from the experimental data is lower because of the effect of noise. Figure 7(b) shows the MFCC features of the 239 frames of the speed boat signal, whose feature vectors are also largely similar. Comparing Figures 7(a) and 7(b), the MFCC features of the different vehicles are quite different, so they can be used to distinguish underwater targets. Figures 7(c) and 7(d) show the ΔMFCC features of the ship and speed boat signals, respectively, and Figures 7(e) and 7(f) show the corresponding ΔΔMFCC features. Multiple-dimension features are thus obtained from the experimental data by using multiple methods, and they can be fused to distinguish and recognize the targets.

Figure 8 demonstrates results of MFCC features extraction of ship and speed boat by using the particle velocity signals instead of sound pressure signals, where Figures 8(a), 8(c), and 8(e) are results of MFCC, ΔMFCC, and ΔΔMFCC features of the ship by using particle velocity signals with 182 frames, respectively, and Figures 8(b), 8(d), and 8(f) are results of MFCC, ΔMFCC, and ΔΔMFCC features of the speed boat by using particle velocity signals with 239 frames, respectively. MFCC features of each frame are more similar than those using sound pressure signals, because SNR is improved due to dipole directivity of particle velocity. Comparing Figure 8(a) with Figure 8(b), comparing Figure 8(c) with Figure 8(d), and comparing Figure 8(e) with Figure 8(f), the features extracted from the different particle velocity signals can be used to recognize the different underwater targets.

As seen from above, MFCC, first-order differential MFCC, and second-order differential MFCC features can be used as effective features of underwater target recognition. The fusion features based on fdc criterion with three kinds of features are researched. These results are shown in Figure 9.

Figure 9: The fdc of MFCC features based on sound pressure signals and particle velocity signals. (a) Each dimension ratio of MFCC features. (b) Each dimension ratio of ΔMFCC features. (c) Each dimension ratio of ΔΔMFCC features.

Figure 9 shows that the fdc of the different features, which includes MFCC, first-order differential MFCC, and second-order differential MFCC, are different.

MFCC features of sound pressure and particle velocity signals of ship and speed boat which have been mentioned above are, respectively, used to recognize targets by using BP Neural Networks method.

Half of the samples are randomly selected as the training sets and the remaining as the test sets. The identification results by using sound pressure and particle velocity are shown in Tables 3 and 4, respectively.
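
For illustration, a single-hidden-layer network trained by backpropagation can be written in plain Python as below. This is a toy sketch under stated assumptions and not the classifier used in the experiment: sigmoid activations, a squared-error loss, a learning rate of 0.5, 200 epochs, 4 hidden nodes, and synthetic two-dimensional Gaussian clusters in place of MFCC features are all illustrative choices, and the data are split into the first and second halves of an interleaved list instead of at random.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class BPNetwork:
    """Single-hidden-layer network trained by plain stochastic gradient descent."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one neuron; the last entry is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in self.w2]
        return hidden, output

    def train(self, samples, labels, epochs=200, lr=0.5):
        for _ in range(epochs):
            for x, label in zip(samples, labels):
                hidden, output = self.forward(x)
                target = [1.0 if j == label else 0.0 for j in range(len(output))]
                # Output-layer error terms (squared error, sigmoid derivative).
                delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
                # Backpropagate to the hidden layer before updating w2.
                delta_hid = [
                    h * (1.0 - h) * sum(d * self.w2[j][i] for j, d in enumerate(delta_out))
                    for i, h in enumerate(hidden)
                ]
                for j, d in enumerate(delta_out):
                    for i, v in enumerate(hidden + [1.0]):
                        self.w2[j][i] -= lr * d * v
                for i, d in enumerate(delta_hid):
                    for k, v in enumerate(x + [1.0]):
                        self.w1[i][k] -= lr * d * v

    def predict(self, x):
        _, output = self.forward(x)
        return output.index(max(output))

# Synthetic, well-separated two-class data standing in for extracted features.
rng = random.Random(1)
labels = [n % 2 for n in range(40)]
samples = [[rng.gauss(3.0 * y, 0.3), rng.gauss(3.0 * y, 0.3)] for y in labels]

net = BPNetwork(n_in=2, n_hidden=4, n_out=2)
net.train(samples[:20], labels[:20])
correct = sum(net.predict(x) == y for x, y in zip(samples[20:], labels[20:]))
accuracy = correct / 20
```

The recognition probability reported in the tables corresponds to this kind of accuracy on the held-out test set, computed separately for each SNR.
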

Table 3: Target identification probability using MFCC features based on sound pressure signal under different SNR.
Table 4: Target identification probability using MFCC features based on particle velocity signal under different SNR.

Tables 3 and 4 show that the recognition probability is higher under higher SNR. When the SNR is −5 dB, the recognition probability is more than 85%; when the SNR is −10 dB, the recognition probability decreases greatly. Meanwhile, the recognition probability using MFCC features based on the particle velocity signal is higher than that based on the sound pressure signal.

The 15-dimensional features of the samples are fed into the BP neural network classifier to identify the targets. The original signals in the experiment are strong enough to be regarded as free of noise interference, and the recognition probability is 100%. White Gaussian noise is added to the measured data to obtain different SNRs, so that the recognition performance can be tested under different SNR.

Figure 10 demonstrates the recognition probability by using MFCC, first-order differential MFCC, second-order differential MFCC features, and fusion features under different SNR. The identification probability using fusion features is higher than that with the use of one kind of these three features, and the recognition probability using particle velocity signal is higher than that using sound pressure signal.

Figure 10: The comparison of recognition probability under different SNR. (a) The comparison of recognition probability of sound pressure signal. (b) The comparison of recognition probability of particle velocity signal.

Identification results using fusion features of sound pressure and particle velocity signals, as well as the combination of sound pressure and particle velocity signals, are shown in Table 5.

Table 5: The recognition probability by using 15-dimension MFCC fusion features of sound pressure signal, particle velocity signal, and combined processing of sound pressure and particle velocity.

Table 5 shows that the recognition probability obtained with the particle velocity signal is higher than that obtained with the sound pressure signal, and that combined processing of sound pressure and particle velocity further improves the recognition probability of underwater targets.

5. Conclusion

In this paper, MFCC feature extraction using sound pressure and particle velocity signals is studied. The conclusions are summarized as follows.

Firstly, the feature extraction and recognition results show that the MFCC feature, first-order differential MFCC feature, and second-order differential MFCC feature can be used as effective features for underwater target identification.

Secondly, by calculating the Fisher-ratio and correlated distance, it can be found that the contribution of each dimension feature is different, and those three features fused by using fdc criterion can improve the recognition probability of underwater target signal.

Thirdly, the recognition probability by using particle velocity signal is higher than that with the use of sound pressure signal. Scalar and vector signals of underwater target signal can be used for feature extraction in order to improve the performance of target recognition.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (Grant no. 11674075).

References

  1. J. Zhihong and Q. Jianli, “The research of underwater target recognition technology,” Ship Science Technology, no. 4, pp. 38–44, 1999.
  2. L. Zhenbo, Z. Xinhua, and Z. Jin, “Feature extraction of ship-noise based on mel frequency cepstrum coefficients,” Ship Science and Technology, vol. 266, no. 2, pp. 51–54, 2004.
  3. W. Bin, S. Tinggen, S. Xuehua, and X. Shenghua, “The MFCC feature extraction based on the noise situation,” Microcomputer Information, vol. 24, no. 1, pp. 224–226, 2008.
  4. M. Zhiyou, Y. Yingchun, and W. Chaohui, “Further feature extraction and its application on speaker recognition,” Journal of Circuits and Systems, vol. 8, no. 2, pp. 130–133, 2003.
  5. L. Geming, S. Chao, and Y. Yixin, “Feature extraction of passive sonar target based on two cepstrums,” Journal of Northwestern Polytechnical University, vol. 26, no. 3, pp. 276–281, 2008.
  6. M. Buscema, “Back propagation neural networks,” Substance Use & Misuse, vol. 33, no. 2, pp. 276–281, 2008.
  7. E. Zwicker and H. Fastl, Psychoacoustic Facts and Models, Springer, Berlin, Germany, 1999.
  8. E. P. Frigieri, P. H. S. Campos, A. P. Paiva, P. P. Balestrassi, J. R. Ferreira, and C. A. Ynoguti, “A mel-frequency cepstral coefficient-based approach for surface roughness diagnosis in hard turning using acoustic signals and gaussian mixture models,” Applied Acoustics, vol. 113, pp. 230–237, 2016.
  9. T. L. Hemminger and Y.-H. Pao, “Detection and classification of underwater acoustic transients using neural networks,” IEEE Transactions on Neural Networks, vol. 5, no. 5, pp. 712–718, 1994.
  10. L. Zhenbo and Z. Xinhua, “Auditory feature extraction of noise radiated from an underwater target,” Systems Engineering and Electronics, vol. 26, no. 12, pp. 1801–1803, 2004.