In the complex ocean environment, robust and effective feature extraction is the key issue in ship radiated noise recognition. Because traditional feature extraction methods are susceptible to the inevitable environmental noise, the type of vessel, and the speed of the ship, their recognition accuracy degrades significantly. Hence, we propose a robust time-frequency analysis method that combines resonance-based sparse signal decomposition (RSSD) and Hilbert marginal spectrum (HMS) analysis. First, the observed signals are decomposed by RSSD into a high resonance component, a low resonance component, and a residual component; RSSD is a nonlinear signal analysis method based not on frequency or scale but on resonance. The high resonance component consists of multiple simultaneous sustained oscillations, the low resonance component consists of nonoscillatory transients, and the residual component is white Gaussian noise. Because ship radiated noise is a low-frequency periodic oscillatory signal, the high resonance component is taken as the purified ship radiated noise. RSSD is thus suited to noise suppression for low-frequency oscillatory signals. Second, the HMS of the high resonance component is extracted by the Hilbert-Huang transform (HHT) as the feature vector. Finally, a support vector machine (SVM) is adopted as the classifier. Real audio recordings are used in experiments under different signal-to-noise ratios (SNRs). The experimental results indicate that the proposed method achieves better recognition performance than the traditional method under different SNRs.

1. Introduction

Ship radiated noise is mainly generated by the vibration of the engine, the various propulsion plants, the auxiliary equipment, and the cavitation of the propeller. In essence, the ship radiated signal is a low-frequency periodic oscillatory signal. The signal received by a passive sonar system can be described as a narrowband structure embedded in broadband noise generated by the underwater acoustic channel, which consists of ambient noise and nonoscillatory transient interference [1, 2]. Recognition of ship radiated noise has extensive military and economic applications, such as ocean bottom exploration [3] and underwater vehicle detection [4]. Recognizing ship radiated noise transmitted over a long range is a complex problem because of the inevitable environmental noise; it consists of feature extraction and classification of the audio signals collected by the hydrophone. The features presented to the classifier directly influence the final recognition performance. Robust feature extraction methods are therefore essential for improving recognition accuracy and have attracted much attention [5].

Many methods have been developed to recognize ship radiated signals. These methods draw on three kinds of signal processing technologies: time domain analysis, frequency domain analysis, and time-frequency analysis. Most previous work on automatic recognition of ship radiated noise has dealt with extracting features from the frequency domain, utilizing the fast Fourier transform (FFT) power spectrum [6–10]. However, it is well known that time-frequency analysis has clear advantages for ship radiated noises, which are nonstationary signals. A method that combined the short-time Fourier transform (STFT) and the relevance vector machine (RVM) was described in [11]. Methods based on the wavelet transform for underwater target classification were presented in [12–14]. The Hilbert-Huang transform was applied in [15–17] to extract the features of underwater signals.

However, the traditional time-frequency analysis methods are sensitive to the environmental noise generated by the complex underwater acoustic channel, which directly degrades the recognition accuracy. Many techniques have been proposed for denoising corrupted signals. The denoising methods can be classified into two categories: filter-based methods and wavelet decomposition-based methods. The main idea of the classical filter-based denoising method is to find the appropriate center frequency and bandwidth so as to retain the narrowband component and filter out the noise and interference; in-band noise and interference, however, cannot be removed this way. The theoretical foundation of the wavelet decomposition denoising method is the principle of multiresolution analysis [18]. RSSD is a wavelet decomposition-based denoising method that was proposed to decompose nonstationary and nonlinear signals into a high resonance component, a low resonance component, and a residual component [19]. Note that RSSD is a nonlinear signal analysis method based not on frequency or scale but on resonance [20]. The advantages of the RSSD algorithm include the following: (1) it can filter out in-band noise and interference that cannot be removed by classical frequency-based methods, because it is frequency-independent in nature; (2) it can extract the oscillatory signal signature from signals severely obscured by interference without filtering the signals; and (3) it does not require prior information about the signals [18].

To classify ship radiated noises with high recognition accuracy, a robust time-frequency analysis method combining RSSD and HHT is proposed in this paper. RSSD is applied to extract the high resonance signal, which represents the ship radiated noise with its periodic oscillatory property, and to eliminate the low resonance signal and the residual signal, which denote the transient signal and the white Gaussian noise, respectively. In this way RSSD purifies the observed signals. HHT is employed for feature extraction, and SVM is used as the classifier. Real audio recordings are used in experiments under different signal-to-noise ratios (SNRs).

2. Methodology

The observed signal can be modeled as follows [15]:
$$x(t) = s(t) + n(t), \tag{1}$$
where $x(t)$ is the original signal collected by the hydrophone, $s(t)$ is the clean signal of ship radiated noise, and $n(t)$ denotes the complex environmental noise.

A robust time-frequency analysis method combining resonance-based sparse signal decomposition and Hilbert-Huang transform is proposed as shown in Figure 1. First, the ship radiated noise is decomposed into high resonance component, low resonance component, and residual component by RSSD, where high resonance component is multiple simultaneous sustained oscillations, low resonance component is nonoscillatory transients, and residual component is white Gaussian noises. High resonance component is the purified ship radiated noise. Second, HMS of high resonance component is extracted by Hilbert-Huang transform as the feature vector. Finally, SVM is adopted as the classifier.

2.1. Resonance-Based Sparse Signal Decomposition

Sparse signal representation, morphological component analysis (MCA), and the tunable Q-factor wavelet transform (TQWT) are adopted in the RSSD algorithm [11]. TQWT is applied to acquire the basis functions of the high-Q transform and the low-Q transform and to obtain the corresponding transform coefficients for signal decomposition [21]. MCA, which is a general method for signal decomposition based on sparse representations, is utilized to decompose signals into a high resonance component, a low resonance component, and a residual component [22, 23].

2.1.1. Resonance Properties of Ship Radiated Noise

The resonance properties of a signal are quantified by the quality factor, or Q-factor, which describes its degree of resonance. The Q-factor is defined as the ratio of the center frequency to the bandwidth and is expressed as follows [24]:
$$Q = \frac{f_c}{BW}, \tag{2}$$
where $f_c$ is the center frequency and $BW$ is the bandwidth. A higher Q-factor indicates that the signal consists of more oscillatory cycles and loses energy at a lower rate. A signal with a high Q-factor is considered underdamped: it oscillates at a specific frequency with a slowly attenuating amplitude. In contrast, a signal with a low Q-factor is considered overdamped, and there are no sustained oscillations in overdamped signals. Ship radiated noise is a low-frequency oscillatory signal with a high Q-factor and a high resonance property.
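As a small numerical illustration of the definition above, the following sketch computes the Q-factor for two hypothetical pulses. The center frequencies and bandwidths are illustrative values chosen here, not measurements from the recordings used in this paper.

```python
def q_factor(center_hz, bandwidth_hz):
    """Q-factor: ratio of center frequency to bandwidth (both in Hz)."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return center_hz / bandwidth_hz

# Hypothetical pulses sharing a 100 Hz center frequency.
narrowband_q = q_factor(100.0, 10.0)    # narrow band: many oscillatory cycles
broadband_q = q_factor(100.0, 200.0)    # wide band: transient-like pulse
print(narrowband_q, broadband_q)
```

Both pulses occupy the same frequency region, yet their Q-factors differ by a factor of 20, which is why a resonance-based separation can succeed where a frequency-based filter cannot.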

2.1.2. Tunable Q-Factor Wavelet Transform

TQWT is a wavelet transform with a flexible Q-factor for discrete-time signals. The wavelet transform can be adjusted in line with the oscillatory characteristic of the observed signal. The transform is based on a real-valued scaling factor (dilation factor) and implemented by a perfect reconstruction oversampled filter bank with real-valued sampling factors [21].

RSSD acquires the basis function libraries of the high Q-factor transform and the low Q-factor transform and calculates the corresponding transform coefficients by TQWT. Two-channel filter banks are adopted in TQWT, as shown in Figure 2, where $H_0(\omega)$ and $H_1(\omega)$ represent the low-pass and high-pass filters, respectively. The subband signal $v_0(n)$ has a sampling rate of $\alpha f_s$, where $f_s$ is the sampling rate of the input signal $x(n)$. Likewise, the subband signal $v_1(n)$ has a sampling rate of $\beta f_s$. Here $\alpha = 1 - \beta/r$ and $\beta = 2/(Q+1)$, where $\alpha$ and $\beta$ are the scaling parameters for low-pass and high-pass scaling, respectively, and $r$ is the oversampling rate or redundancy [25].
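The mapping from the user-facing parameters to the two scaling parameters can be sketched as follows. This assumes the standard TQWT relations $\beta = 2/(Q+1)$ and $\alpha = 1 - \beta/r$ from [21]; the function name and the values $Q = 4$, $r = 3$ are illustrative choices, not the settings used in the experiments.

```python
def tqwt_scaling(Q, r):
    """Return (alpha, beta), the low-pass and high-pass scaling parameters."""
    if Q < 1 or r <= 1:
        raise ValueError("expected Q >= 1 and r > 1")
    beta = 2.0 / (Q + 1.0)    # high-pass scaling
    alpha = 1.0 - beta / r    # low-pass scaling
    return alpha, beta

fs = 10_000.0                              # input sampling rate in Hz
alpha, beta = tqwt_scaling(Q=4.0, r=3.0)   # illustrative parameter values
print("alpha =", alpha, "beta =", beta)
print("subband rates (Hz):", alpha * fs, beta * fs)
```

Because $\alpha + \beta = 1 + \beta(1 - 1/r)$, any $r > 1$ makes the two subband rates sum to more than the input rate, which is the oversampling the transform relies on.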

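The transfer functions of the low-pass filter $H_0(\omega)$ and the high-pass filter $H_1(\omega)$ are constructed as follows [21]:
$$H_0(\omega) = \begin{cases} 1, & |\omega| \le (1-\beta)\pi, \\ \theta\!\left(\dfrac{|\omega| + (\beta - 1)\pi}{\alpha + \beta - 1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\ 0, & \alpha\pi \le |\omega| \le \pi, \end{cases} \qquad H_1(\omega) = \begin{cases} 0, & |\omega| \le (1-\beta)\pi, \\ \theta\!\left(\dfrac{\alpha\pi - |\omega|}{\alpha + \beta - 1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\ 1, & \alpha\pi \le |\omega| \le \pi, \end{cases} \tag{3}$$
where $\theta(\omega) = \tfrac{1}{2}(1 + \cos\omega)\sqrt{2 - \cos\omega}$ for $|\omega| \le \pi$, $0 < \alpha < 1$, and $0 < \beta \le 1$. Note that the filter bank must be strictly oversampled to localize the filter responses well, so it meets $\alpha + \beta > 1$.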
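The two frequency responses can be evaluated numerically as in the sketch below. It assumes the piecewise definitions of [21] with the transition function $\theta(\omega) = \tfrac{1}{2}(1+\cos\omega)\sqrt{2-\cos\omega}$; the function names and the parameter values are illustrative. The final line checks the power-complementary property $|H_0(\omega)|^2 + |H_1(\omega)|^2 = 1$ that underlies perfect reconstruction.

```python
import numpy as np

def theta(w):
    """Transition function, defined for |w| <= pi; theta(0) = 1, theta(pi) = 0."""
    return 0.5 * (1.0 + np.cos(w)) * np.sqrt(2.0 - np.cos(w))

def tqwt_filters(w, alpha, beta):
    """Low-pass H0 and high-pass H1 responses on a frequency grid w in [-pi, pi]."""
    a = np.abs(np.asarray(w, dtype=float))
    lo = (1.0 - beta) * np.pi          # end of the low-pass passband
    hi = alpha * np.pi                 # start of the high-pass passband
    width = alpha + beta - 1.0         # positive when strictly oversampled
    if width <= 0:
        raise ValueError("need alpha + beta > 1")
    trans = (a > lo) & (a < hi)        # transition band
    H0 = np.where(a <= lo, 1.0, 0.0)
    H1 = np.where(a >= hi, 1.0, 0.0)
    H0[trans] = theta((a[trans] + (beta - 1.0) * np.pi) / width)
    H1[trans] = theta((alpha * np.pi - a[trans]) / width)
    return H0, H1

# Illustrative parameters (Q = 4, r = 3), not the experimental settings.
beta = 2.0 / (4.0 + 1.0)
alpha = 1.0 - beta / 3.0
w = np.linspace(-np.pi, np.pi, 1001)
H0, H1 = tqwt_filters(w, alpha, beta)
print(np.allclose(H0**2 + H1**2, 1.0))
```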

The most significant parameters of the TQWT are the quality factor $Q$, the oversampling rate $r$, and the number of stages or levels $J$. $Q$ measures how oscillatory the wavelets are. Note that the selection of the Q-factor is subject to $Q \ge 1$. Higher values of the Q-factor generate more oscillatory wavelets. In particular, $Q = 1$ is appropriate when oscillatory components do not exist in the observed signal. $r$ denotes the redundancy factor of the TQWT. When the two-channel filter bank is iterated on the TQWT's low-pass output indefinitely to perform a wavelet transform, the wavelet transform is oversampled by a factor of $r$. $r \ge 3$ is generally recommended; for $r$ close to 1, the transition bands of $H_0(\omega)$ and $H_1(\omega)$ are relatively narrow and the time domain response is not well localized. $J$ represents the number of filter banks. The total number of subbands is $J + 1$: each filter bank contributes its high-pass output, and the final filter bank also contributes its low-pass output [24–26].
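The statement that the iterated transform is oversampled by $r$ can be checked with a short calculation. The sketch below assumes that iterating the filter bank on the low-pass output gives the level-$j$ high-pass subband a rate of $\beta\alpha^{j-1} f_s$ and the final low-pass subband a rate of $\alpha^{J} f_s$, with $\beta = 2/(Q+1)$ and $\alpha = 1 - \beta/r$; the parameter values are illustrative.

```python
def subband_rates(Q, r, J, fs=1.0):
    """Sampling rates of the J high-pass subbands and the final low-pass subband."""
    beta = 2.0 / (Q + 1.0)
    alpha = 1.0 - beta / r
    high = [beta * alpha ** (j - 1) * fs for j in range(1, J + 1)]
    low = alpha ** J * fs
    return high, low

# Illustrative values: Q = 4, r = 3, with a small and a large number of levels.
for levels in (10, 200):
    high, low = subband_rates(Q=4.0, r=3.0, J=levels)
    total = sum(high) + low            # total rate relative to the input rate
    print(levels, len(high) + 1, total)
```

The total rate equals $r(1 - \alpha^{J}) + \alpha^{J}$, so it approaches $r$ as $J$ grows, and the number of subbands is $J + 1$.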

2.1.3. Morphological Component Analysis

In RSSD, the high and low resonance components are represented sparsely by performing the high-Q transform and the low-Q transform. MCA is a method of nonlinear signal decomposition based on sparse representation [22, 23]. Given a ship radiated signal $x$, where $x = x_H + x_L$, assume that $x_H$ and $x_L$ can be represented sparsely in bases $\Phi_H$ and $\Phi_L$. The aim of MCA is to estimate $x_H$ and $x_L$ individually, which can be determined by minimizing the objective function as follows:
$$J(w_H, w_L) = \left\| x - \Phi_H w_H - \Phi_L w_L \right\|_2^2 + \lambda_H \left\| w_H \right\|_1 + \lambda_L \left\| w_L \right\|_1 \tag{4}$$
with respect to the coefficient vectors $w_H$ and $w_L$, where $\lambda_H$ and $\lambda_L$ are the regularization parameters, which are chosen manually according to the power distribution of the high resonance and low resonance components.

The estimates $\hat{w}_H$ and $\hat{w}_L$ are obtained by minimizing (4) by MCA; then $\hat{x}_L = \Phi_L \hat{w}_L$ and $\hat{x}_H = \Phi_H \hat{w}_H$ serve as the low resonance and high resonance signals, respectively. A variant of the split augmented Lagrangian shrinkage algorithm (SALSA) is applied to solve the MCA problem iteratively [19].
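The structure of the MCA problem can be illustrated with a deliberately simplified sketch. Two substitutions are made here and should not be read as the method of this paper: an orthonormal DCT basis stands in for the high-Q TQWT and the identity basis stands in for the low-Q TQWT, and plain iterative soft thresholding (ISTA) replaces SALSA. The objective is still a squared residual plus two weighted $\ell_1$ penalties; all names, the test signal, and the regularization weights are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; its rows are cosine atoms (oscillatory basis)."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def soft_threshold(v, thr):
    """Proximal operator of thr * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def mca_ista(x, lam_osc, lam_tr, n_iter=200):
    """Split x into an oscillatory part (sparse in DCT) and a transient part."""
    n = x.size
    C = dct_matrix(n)
    w_osc = np.zeros(n)              # coefficients in the cosine basis
    w_tr = np.zeros(n)               # coefficients in the identity basis
    lipschitz = 4.0                  # 2 * ||[C^T, I]||^2 for two orthonormal bases
    cost = []
    for _ in range(n_iter):
        resid = x - C.T @ w_osc - w_tr
        # Gradient step on the squared residual, then shrinkage, for both blocks.
        w_osc = soft_threshold(w_osc + (2.0 / lipschitz) * (C @ resid), lam_osc / lipschitz)
        w_tr = soft_threshold(w_tr + (2.0 / lipschitz) * resid, lam_tr / lipschitz)
        resid = x - C.T @ w_osc - w_tr
        cost.append(resid @ resid + lam_osc * np.abs(w_osc).sum() + lam_tr * np.abs(w_tr).sum())
    return C.T @ w_osc, w_tr, cost

# Hypothetical test signal: a sustained cosine plus three isolated spikes.
n = 256
t = np.arange(n)
x = np.cos(2 * np.pi * 8 * t / n)
x[[40, 130, 200]] += 3.0
x_osc, x_tr, cost = mca_ista(x, lam_osc=0.5, lam_tr=0.5)
print(cost[0], cost[-1])
```

With this step size the objective value cannot increase from one iteration to the next, which gives a simple sanity check on the implementation.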

2.2. Hilbert Marginal Spectrum Analysis

Hilbert marginal spectrum analysis is based on HHT, which comprises empirical mode decomposition (EMD) and Hilbert spectral analysis [26]. As a crucial part of HHT, EMD is a technique for processing nonlinear and nonstationary signals. Such signals are decomposed directly and adaptively into a set of intrinsic mode functions (IMFs). Each IMF represents an oscillatory mode, which is a subset of the frequency components of the observed signal. The modes are nearly orthogonal to each other, and they are linear components of the original signal. Two basic properties must be satisfied for a candidate to be called an IMF. First, the number of extrema and the number of zero crossings must either be equal or differ at most by one. Second, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero at any point [27–29].

For the high resonance signal $x_H(t)$ obtained by RSSD, the IMFs are obtained from $x_H(t)$ by the following steps [26]:
(1) Determine the local maxima and minima of $x_H(t)$.
(2) Acquire the upper and lower envelopes $e_{\max}(t)$ and $e_{\min}(t)$ by interpolating all local maxima and all local minima, respectively, using cubic splines.
(3) Calculate the point-by-point mean of the upper and lower envelopes by the equation $m(t) = [e_{\max}(t) + e_{\min}(t)]/2$.
(4) Generate a preliminary candidate mode function by subtracting the local mean from the high resonance signal: $d(t) = x_H(t) - m(t)$.
(5) Test whether $d(t)$ satisfies the two properties required of an IMF. If it does, set $c_i(t) = d(t)$. Otherwise, repeat steps (1)–(4) with $d(t)$ as the input until $d(t)$ satisfies the properties.
(6) Once an IMF is confirmed, the residual is calculated by $r_i(t) = x_H(t) - c_i(t)$ and is regarded as the input signal of step (1). Steps (1)–(5) are repeated to yield the other IMFs. The iteration is terminated when $r_i(t)$ is monotonic.

After obtaining all the IMFs, the high resonance signal can be expressed as
$$x_H(t) = \sum_{i=1}^{N} c_i(t) + r_N(t), \tag{5}$$
where $c_i(t)$ are the IMFs, $N$ is the number of IMFs, and $r_N(t)$ is the final residual, whose amplitude and frequency are approximately zero.
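The sifting procedure can be sketched in a few lines. Two simplifications are made here and are not part of the procedure described above: the envelopes use linear interpolation (`np.interp`) instead of cubic splines, and a fixed number of sifting passes replaces the explicit test of the two IMF properties. Function names and the test signal are hypothetical.

```python
import numpy as np

def local_extrema(x):
    """Indices of the interior local maxima and minima of x."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    t = np.arange(x.size)
    maxima, minima = local_extrema(x)
    if maxima.size < 2 or minima.size < 2:
        return None                      # too few extrema: x is treated as residual
    upper = np.interp(t, maxima, x[maxima])   # simplified (linear) upper envelope
    lower = np.interp(t, minima, x[minima])   # simplified (linear) lower envelope
    return x - 0.5 * (upper + lower)

def emd(x, max_imfs=8, n_sift=10):
    """Decompose x into a list of IMF-like modes and a final residual."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        candidate = sift_once(residual)
        if candidate is None:
            break
        for _ in range(n_sift - 1):      # fixed number of passes (simplification)
            refined = sift_once(candidate)
            if refined is None:
                break
            candidate = refined
        imfs.append(candidate)
        residual = residual - candidate  # the next mode is extracted from what is left
    return imfs, residual

# Hypothetical two-tone test signal sampled at 1 kHz.
fs = 1000.0
t = np.arange(2000) / fs
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
imfs, residual = emd(x)
print(len(imfs), np.allclose(sum(imfs) + residual, x))
```

By construction the modes and the final residual add back to the input, mirroring the decomposition identity; the quality of the individual modes depends on the envelope interpolation and stopping rule, which are simplified here.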

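As the other key part of HHT, the Hilbert transform is applied to each IMF $c_i(t)$. The Hilbert transform is defined as
$$\hat{c}_i(t) = \frac{1}{\pi} P \int_{-\infty}^{\infty} \frac{c_i(\tau)}{t - \tau}\, d\tau, \tag{6}$$
where $P$ denotes the Cauchy principal value. Then the analytic signal $z_i(t)$ can be constructed as follows:
$$z_i(t) = c_i(t) + j\hat{c}_i(t) = a_i(t)\, e^{j\varphi_i(t)}, \tag{7}$$
where the amplitude of the preenvelope $a_i(t)$ and the instantaneous phase $\varphi_i(t)$ are expressed as
$$a_i(t) = \sqrt{c_i^2(t) + \hat{c}_i^2(t)}, \qquad \varphi_i(t) = \arctan\frac{\hat{c}_i(t)}{c_i(t)}. \tag{8}$$

Hence, the instantaneous frequency $\omega_i(t)$ can be calculated as the time derivative of the instantaneous phase as follows:
$$\omega_i(t) = \frac{d\varphi_i(t)}{dt}. \tag{9}$$
After performing the Hilbert transform on each IMF, the original high resonance signal can be expressed as
$$x_H(t) = \operatorname{Re} \sum_{i=1}^{N} a_i(t) \exp\!\left( j \int \omega_i(t)\, dt \right). \tag{10}$$

Equation (10) provides the representation of the amplitude and the instantaneous frequency as functions of time in a three-dimensional plot. Such a time-frequency distribution of the amplitude is referred to as the Hilbert spectrum and is expressed as follows:
$$H(\omega, t) = \operatorname{Re} \sum_{i=1}^{N} a_i(t) \exp\!\left( j \int \omega_i(t)\, dt \right). \tag{11}$$

According to the Hilbert spectrum, the Hilbert marginal spectrum is defined as follows [27–29]:
$$h(\omega) = \int_{0}^{T} H(\omega, t)\, dt, \tag{12}$$
where $T$ is the duration of the signal.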
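A discrete version of the marginal spectrum can be sketched as follows: each IMF is converted to its analytic signal, the instantaneous amplitude and frequency are computed, and the amplitude is accumulated over time into frequency bins. The FFT-based analytic signal, the bin layout, and all names are implementation choices made for this illustration, not details given in the paper.

```python
import numpy as np

def analytic_signal(c):
    """Analytic signal c(t) + j*H[c](t), built by zeroing negative frequencies."""
    n = c.size
    spec = np.fft.fft(c)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * gain)

def hilbert_marginal_spectrum(imfs, fs, n_bins=125, f_max=None):
    """Accumulate instantaneous amplitude over time into frequency bins."""
    f_max = fs / 2.0 if f_max is None else f_max
    edges = np.linspace(0.0, f_max, n_bins + 1)
    hms = np.zeros(n_bins)
    for c in imfs:
        z = analytic_signal(np.asarray(c, dtype=float))
        amp = np.abs(z)[:-1]                             # instantaneous amplitude
        phase = np.unwrap(np.angle(z))                   # instantaneous phase
        inst_f = np.diff(phase) * fs / (2.0 * np.pi)     # instantaneous frequency, Hz
        keep = (inst_f >= 0.0) & (inst_f < f_max)
        idx = np.digitize(inst_f[keep], edges) - 1
        np.add.at(hms, idx, amp[keep] / fs)              # amplitude times dt
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hms

# Hypothetical single mode: a 50 Hz tone sampled at 1 kHz for 1 s.
fs = 1000.0
t = np.arange(1000) / fs
mode = np.cos(2 * np.pi * 50 * t)
freqs, hms = hilbert_marginal_spectrum([mode], fs)
peak_hz = freqs[np.argmax(hms)]
print(peak_hz)
```

The resulting vector `hms`, restricted to the low-frequency range of interest, is the kind of feature vector that would be passed to the classifier.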

2.3. Support Vector Machine

Support vector machine (SVM), first proposed by Vladimir Vapnik, is a machine learning method based on the Vapnik-Chervonenkis dimension theory of statistical learning theory and the structural risk minimization principle [30, 31]. In order to recognize different categories of the observed signals, the margin is maximized by determining a separating hyperplane in SVM. In this paper, SVM is adopted as a classifier, because it has been shown that SVM can achieve a remarkable classification performance when it is applied to audio signals.

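In the two-class case, assume a given training set $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ are the input training data, which satisfy $x_i \in \mathbb{R}^d$, and $y_i$ are the category labels, which satisfy $y_i \in \{-1, +1\}$. A linear hyperplane is defined as follows:
$$w^T x + b = 0, \tag{13}$$
where $w$ denotes the normal vector of the hyperplane and $b$ represents the offset. The above equation is called the optimal separating hyperplane when it is subject to the following constraints:
$$y_i \left( w^T x_i + b \right) \ge 1, \quad i = 1, \ldots, n. \tag{14}$$

We need to solve the following convex optimization problem in $w$ and $b$ to obtain the optimal hyperplane:
$$\min_{w,\, b,\, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i \left( w^T x_i + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \tag{15}$$
where $\xi_i$ are slack variables and $C$ is a positive real constant that controls the penalty for misclassified samples. In the primal weight space this is a constrained optimization problem; we formulate the Lagrangian function of the linear soft margin optimization problem, take the conditions for optimality, and finally solve the problem in the dual space.

For the nonlinear classification problem, the SVM classifier can be formulated as
$$f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right), \tag{16}$$
where $\alpha_i$ represents the Lagrange multipliers and $K(x_i, x)$ is a kernel function. A widely used kernel function, the Gaussian radial basis function (RBF), is used in this work and is given by the following equation:
$$K(x_i, x) = \exp\!\left( -\frac{\|x - x_i\|^2}{2\sigma^2} \right), \tag{17}$$
where $\sigma$ is the kernel width.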
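The kernel classifier can be illustrated with a small sketch that evaluates the decision rule for given support vectors. It assumes the Gaussian RBF kernel written in the common form $\exp(-\|x - x_i\|^2 / (2\sigma^2))$; the support vectors, multipliers, offset, and kernel width below are hypothetical values chosen by hand, not the result of training on the ship noise features.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def svm_decision(x_new, support_x, support_y, multipliers, b, sigma):
    """Sign of the kernel expansion over the support vectors."""
    K = rbf_kernel(np.atleast_2d(x_new), support_x, sigma)
    return np.sign(K @ (multipliers * support_y) + b)

# Hypothetical trained model with one support vector per class.
support_x = np.array([[0.0, 0.0], [2.0, 2.0]])
support_y = np.array([-1.0, 1.0])
multipliers = np.array([1.0, 1.0])
b, sigma = 0.0, 1.0

queries = np.array([[0.1, 0.0], [2.0, 1.9]])
labels = svm_decision(queries, support_x, support_y, multipliers, b, sigma)
gram = rbf_kernel(support_x, support_x, sigma)
print(labels)
```

In practice the multipliers and offset come from solving the dual problem, for which an established SVM library would normally be used.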

3. Experiment and Analysis

3.1. The Description for the Recorded Data

Real audio recordings of ship radiated noises (A, B, C, and D), illustrated in Figure 3, are adopted as the target signals in the recognition experiments; they are acquired from four different types of ships. Noise A and noise B are downloaded from Discovery of Sound in the Sea (http://www.dosits.org/galleries/audio/anthropogenicsounds/ship), where noise A was recorded at a distance of 3.2 km from the hydrophone and noise B at a distance of about 1.7 km. Noise C and noise D were acquired by a hydrophone with a sampling frequency of 10 kHz in the shallow sea off the west coast of the Taiwan Strait, with the hydrophone at a depth of about 25 m and the observed ship about 1 km from the hydrophone. Each audio recording has 30,000 samples, lasts about 3 s, and contains inevitable noise; nevertheless, the original recordings are regarded as the clean signals. The spectra of the four real audio signals are shown in Figure 4. Although the spectra have different distributions, they share the property that the main spectral energy is concentrated below 200 Hz.

3.2. The Experimental Results of Feature Extraction

To evaluate the recognition accuracy of the proposed method, the experiment is performed under different SNRs. A fragment of 8192 samples of each real audio signal is used as the input to RSSD. We select the real audio signal of noise A as the target signal. When RSSD is run, the parameters $Q$, $r$, and $J$ are set separately for the high-Q TQWT and for the low-Q TQWT. Figure 5 shows that the high resonance component is separated by the RSSD algorithm, while the transient signals and the white Gaussian noise are largely eliminated by discarding the low resonance and residual components. The spectra of the corresponding components are shown in Figure 6. By calculation, 85.7% of the spectral energy of the high resonance component is concentrated below 250 Hz, whereas the spectrum of the low resonance component is spread over a range of 1000 Hz. In other words, the high resonance component is narrowband and the low resonance component is broadband in the frequency domain. The frequency spectrum of the high resonance component is similar to that of the original signal. That is, the inherent periodic oscillatory information of the ship is well preserved by extracting the high resonance component.
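The proportion of spectral energy below a cutoff frequency, such as the figure quoted above for 250 Hz, can be computed as in the following sketch. The function name and the synthetic two-tone signal are hypothetical; the fragment length of 8192 samples and the 10 kHz sampling frequency follow the text.

```python
import numpy as np

def energy_fraction_below(x, fs, cutoff_hz):
    """Fraction of the one-sided spectral energy of x that lies below cutoff_hz."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return power[freqs < cutoff_hz].sum() / power.sum()

# Synthetic stand-in for a high resonance component: a strong 100 Hz tone
# plus a weak 1000 Hz tone (not data from the recordings).
fs = 10_000.0
t = np.arange(8192) / fs
component = np.sin(2 * np.pi * 100 * t) + 0.1 * np.sin(2 * np.pi * 1000 * t)
fraction = energy_fraction_below(component, fs, cutoff_hz=250.0)
print(round(100 * fraction, 1), "% of the energy lies below 250 Hz")
```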

After RSSD, EMD is applied to the high resonance component. The IMFs extracted from the high resonance signal of noise A are shown in Figure 7. It is observed that the amplitude and frequency of the modes decrease as the mode order increases. The feature vectors are then extracted by the HMS method in order to classify the ship radiated signals. The four types of ship radiated signals are processed by the above-mentioned method to obtain the HMS feature vectors. The HMS feature vectors are compared in Figure 8. For comparison, the HHT algorithm without RSSD is used as the traditional method. The simulations are run on type A and type D at 10 dB, and the HMS feature vectors of the proposed and traditional methods are compared over the frequency range below 150 Hz, because the main energy of the HMS is concentrated there. The distance between the HMS features of noise A and noise D in Figure 8(a) is larger than that in Figure 8(b). A clearer distinction between the feature vectors means a higher recognition rate.

3.3. The Recognition Results

SVM acts as the classifier in this work. The recognition accuracy experiments for all observed signals are conducted at different SNRs by adding white Gaussian noise to the original signals, and the results are listed in Table 1. The correct recognition rates of both methods deteriorate as the SNR decreases. At the different SNRs, most recognition rates of the four ships obtained by the proposed method are higher than those of the traditional method, and the average rates of the proposed method exceed those of the traditional method by at least 3.7% at 10 dB. For a visual comparison, the recognition results are shown in Figure 9. The results demonstrate that the recognition performance of the proposed method is superior to that of the traditional method.
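Adding white Gaussian noise at a prescribed SNR can be sketched as follows. The noise is rescaled so that the ratio of signal power to noise power matches the target exactly for the given realization; the function name, the seed, and the synthetic clean signal are hypothetical.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.size)
    p_signal = np.mean(signal ** 2)
    p_noise_target = p_signal / 10.0 ** (snr_db / 10.0)
    noise *= np.sqrt(p_noise_target / np.mean(noise ** 2))   # match the target power
    return signal + noise

# Synthetic clean signal standing in for a recording (10 kHz, 3 s).
fs = 10_000.0
t = np.arange(30_000) / fs
clean = np.sin(2 * np.pi * 60 * t)
noisy = add_awgn(clean, snr_db=10.0, rng=np.random.default_rng(0))
measured_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print(measured_snr)
```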

4. Conclusion

This paper proposed a robust time-frequency analysis method combining RSSD and HHT. RSSD purifies the observed signals by extracting the high resonance signal, which is the ship radiated noise, and by eliminating the interference of transient signals and white Gaussian noise, which are represented by the low resonance signal and the residual signal, respectively. HHT is employed for feature extraction, and SVM is used as the classifier. Experiments at different SNRs show that the proposed method outperforms the traditional method and is robust to the interference of transient signals and white Gaussian noise. The proposed algorithm is suitable for the classification of ship radiated noises, which are nonstationary and oscillatory signals.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61471308 and 61471309).