Abstract

Marine ambient noise (AN) is a nonlinear and nonstationary signal. Traditional dispersion entropy (DE) analyzes marine AN at a single scale only, which easily loses information. To address this problem, we introduce multiscale dispersion entropy (MDE) and propose a new feature extraction method for marine AN based on it. We used MDE, multiscale permutation entropy (MPE), multiscale permutation Lempel–Ziv complexity (MPLZC), and multiscale dispersion Lempel–Ziv complexity (MDLZC) to carry out feature extraction and classification experiments on six ANs. The experimental results show that, for all four methods, the feature extraction effect improves and the average recognition rate (ARR) rises as the number of features increases. With the same number of features, the MDE-based method has the best feature extraction effect and the highest ARR for the six ANs among the four methods.

1. Introduction

Marine animals have extremely sophisticated vocal and sound processing systems. They use sound to communicate, navigate, locate objects, find food, and escape natural enemies [1–3]. However, serious noise pollution damages the auditory system of marine animals and can even kill them. To protect the diversity of marine organisms, it is necessary to study marine ambient noise (AN) [4–6].

Marine AN is a nonlinear and nonstationary signal. Traditional feature extraction methods are mainly aimed at linear and stationary signals, so they have difficulty analyzing marine AN [7–10]. Feature extraction methods based on nonlinear dynamics can analyze marine AN effectively. Common nonlinear dynamic parameters include Lempel–Ziv complexity (LZC), sample entropy (SE), and permutation entropy (PE). LZC relies on binary conversion and has weak antinoise ability, which often loses useful information in the time series [11–14]. SE is complex and time-consuming to calculate, so it is not suitable for real-time monitoring. PE is simple to calculate, but it does not consider the magnitude relationship between amplitudes [15–18].

Dispersion entropy (DE) is another important indicator of signal complexity [19]. Compared with LZC, SE, and PE, DE considers the magnitude relationship between amplitudes, is fast to calculate, and has strong antinoise ability; therefore, DE has been widely used in underwater acoustics and fault diagnosis [20–22]. However, marine ANs are extremely complex, and DE analyzes them at a single scale only, so it cannot fully reflect the effective information they carry. We therefore adopted a feature extraction method for ANs based on multiscale dispersion entropy (MDE). Since MDE is based on multiscale analysis, it reflects the ANs more comprehensively [23].

At present, MDE is widely used in medicine and fault diagnosis, but it has not been applied to marine AN [24–28]. In this paper, MDE is used to study marine AN for the first time, with good results. The rest of this paper is organized as follows: Section 2 introduces the basic principle of DE in detail; Section 3 proposes a feature extraction method based on MDE and describes its specific steps; Section 4 carries out feature extraction and classification experiments on six ANs; finally, Section 5 summarizes this paper.

2. Theory

2.1. Dispersion Entropy

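DE is one of the physical quantities that measure the complexity of a time series. It considers the relationship between amplitudes and is stable and fast to compute. The specific calculation steps are as follows:

(1) Given a time series $x = \{x_1, x_2, \ldots, x_N\}$, map it to $y = \{y_1, y_2, \ldots, y_N\}$ according to the normal cumulative distribution function. Assuming that the expectation is $\mu$ and the variance is $\sigma^2$, the mapping result is as follows:

$$y_j = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x_j} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt,$$

where $y_j \in (0, 1)$.

(2) $y$ is linearly transformed to $z^c$:

$$z_j^c = R(c \cdot y_j + 0.5),$$

where the range of $z_j^c$ is within $[1, c]$; $R$ represents a rounding function; $c$ indicates the category.

(3) Embed a vector $z_i^{m,c}$:

$$z_i^{m,c} = \{z_i^c, z_{i+d}^c, \ldots, z_{i+(m-1)d}^c\}, \quad i = 1, 2, \ldots, N-(m-1)d,$$

where $m$ represents the embedding dimension; $d$ represents the time delay.

(4) The number of dispersion patterns corresponding to the embedded vector: assuming $z_i^c = v_0, z_{i+d}^c = v_1, \ldots, z_{i+(m-1)d}^c = v_{m-1}$, the dispersion pattern corresponding to $z_i^{m,c}$ is $\pi_{v_0 v_1 \cdots v_{m-1}}$. It is composed of $m$ parts, in which each part has $c$ values. Therefore, there are $c^m$ dispersion patterns corresponding to $z_i^{m,c}$.

(5) The probability of each dispersion pattern:

$$p(\pi_{v_0 v_1 \cdots v_{m-1}}) = \frac{\mathrm{Number}(\pi_{v_0 v_1 \cdots v_{m-1}})}{N-(m-1)d},$$

where $\mathrm{Number}(\pi_{v_0 v_1 \cdots v_{m-1}})$ indicates the number of embedded vectors assigned to the dispersion pattern $\pi_{v_0 v_1 \cdots v_{m-1}}$.

(6) According to the formula of Shannon entropy, DE can be defined as

$$\mathrm{DE}(x, m, c, d) = -\sum_{\pi=1}^{c^m} p(\pi_{v_0 v_1 \cdots v_{m-1}}) \ln p(\pi_{v_0 v_1 \cdots v_{m-1}}).$$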
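The six steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation. The function name `dispersion_entropy` and the default values of `m`, `c`, and `d` are assumptions chosen for the example and are not the settings used in the experiments. The mean and standard deviation are estimated from the series itself, and class labels are clamped to 1..c to guard against floating-point saturation of the normal CDF.

```python
import math
from collections import Counter


def dispersion_entropy(x, m=2, c=3, d=1):
    """Dispersion entropy of sequence x, using the natural logarithm.

    m (embedding dimension), c (number of classes) and d (time delay)
    have illustrative defaults; they are assumptions, not the paper's values.
    """
    n = len(x)
    count = n - (m - 1) * d  # number of embedded vectors
    if count <= 0:
        raise ValueError("series too short for the chosen m and d")
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma == 0:
        return 0.0  # constant series: a single pattern, zero entropy
    # Step 1: map each sample to (0, 1) with the normal CDF.
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]
    # Step 2: map to classes 1..c. floor(c*y) + 1 equals rounding c*y + 0.5
    # with halves rounded up; min() clamps the case y == 1.0.
    z = [min(c, math.floor(c * yj) + 1) for yj in y]
    # Steps 3-4: build embedded vectors and count each dispersion pattern.
    patterns = Counter(
        tuple(z[i + k * d] for k in range(m)) for i in range(count)
    )
    # Steps 5-6: relative frequencies and Shannon entropy.
    return -sum((k / count) * math.log(k / count) for k in patterns.values())
```

The result always lies between 0 and ln(c^m), the entropy of a uniform distribution over all possible dispersion patterns.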

2.2. Multiscale Dispersion Entropy

DE measures a time series at a single scale only, which often loses series information. To solve this problem, DE is combined with multiscale analysis, which gives the MDE measure. MDE reduces the loss of sequence information, resists interference well, and is fast to calculate. The calculation process is as follows:

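Firstly, given a time series $x = \{x_1, x_2, \ldots, x_N\}$ of total length $N$, the results of coarse graining are as follows:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \quad 1 \le j \le \lfloor N/\tau \rfloor,$$

where $\tau$ represents the scale factor, $\tau = 1, 2, \ldots$; $\lfloor N/\tau \rfloor$ is the integer part of $N/\tau$, indicating the length of the coarse-grained series.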
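As a small illustration of the coarse-graining step, the sketch below averages consecutive non-overlapping blocks of `tau` samples and drops any incomplete block at the end. The function name `coarse_grain` is hypothetical.

```python
def coarse_grain(x, tau):
    """Average non-overlapping blocks of tau samples.

    The output has len(x) // tau points; trailing samples that do not
    fill a complete block are discarded.
    """
    if tau < 1:
        raise ValueError("scale factor must be a positive integer")
    length = len(x) // tau
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(length)]
```

With a sample of 5000 points, scale factor 10 leaves a coarse-grained series of 500 points, so larger scale factors estimate the entropy from shorter series.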

Secondly, the coarse-grained sequences corresponding to different scale factors have different DEs. The DEs of all coarse-grained sequences are calculated.

Finally, the average value of DE of all coarse-grained sequences is taken as the result of MDE, denoted $\mathrm{MDE}(x, m, c, d, \tau)$ at scale factor $\tau$, where $x$ represents the original time series; $m$ represents the embedding dimension; $c$ represents the category; $d$ represents the time delay.

3. Feature Extraction Method of ANs

The flow chart of the feature extraction method for six ANs based on MDE is shown in Figure 1, where SF1 is the abbreviation of scale factor 1, SF2 is the abbreviation of scale factor 2, and so on. MDE1 stands for the MDE under SF1, MDE2 stands for the MDE under SF2, and so on. The specific steps are as follows:
(1) Each of the six ANs is sampled into 100 samples of 5000 sampling points each, which serve as the input.
(2) The MDE of each AN is calculated from SF1 to SF10, giving MDE1 to MDE10.
(3) Single feature extraction, double feature extraction, and multifeature extraction are carried out in turn.
(4) The best feature or feature combination, that is, the one with the highest average recognition rate (ARR), is selected.
(5) A K-nearest neighbor (KNN) classifier is adopted to classify each AN.
(6) The highest ARR for the six ANs is obtained.
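Step (1) can be illustrated with a short sketch. The paper does not say how the 100 samples are cut from each recording, so the code below assumes consecutive, non-overlapping windows; the function name `split_into_samples` is hypothetical.

```python
def split_into_samples(signal, sample_len=5000, n_samples=100):
    """Cut a recording into n_samples consecutive windows of sample_len points.

    Non-overlapping windows are an assumption; the source only gives
    the sample count and the sample length.
    """
    needed = sample_len * n_samples
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} points, got {len(signal)}")
    return [
        signal[i * sample_len:(i + 1) * sample_len] for i in range(n_samples)
    ]
```

Under this assumption, 100 samples of 5000 points use exactly the 500000 sampling points taken for each AN in Section 4.1.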

4. Feature Extraction of ANs

4.1. Six ANs

Six different marine ANs from the National Park Service were selected as the research objects in this paper. The selected data are heavy rain on sea surface (HR), light rain on sea surface (LR), light wind at the sea surface-underwater recording (LW), moderate wind on the sea surface-underwater recording (MW), snowfall on sea surface (SN), and wind and ship noise on underwater hydrophone (W–S). 500000 sampling points are taken for each AN. Figure 2 shows the time domain waveforms of the six ANs.

4.2. Single Feature Extraction and Classification

In order to compare the feature extraction effect of the four complexity parameters for each AN, the parameters common to multiscale permutation entropy (MPE), MDE, multiscale permutation Lempel–Ziv complexity (MPLZC), and multiscale dispersion Lempel–Ziv complexity (MDLZC), namely the embedding dimension $m$ and the time delay $d$, are set to the same values for all four. The category $c$ is set for MDE and MDLZC only; MPE and MPLZC do not need the parameter $c$. For each group of feature extraction experiments, we take 100 samples of each type of AN, each containing 5000 sampling points, and extract the MPE, MDE, MPLZC, and MDLZC of the six ANs from SF1 to SF10. In order to compare the recognition ability of the four complexity parameters, the six ANs are classified by the KNN algorithm. For each complexity parameter, 50 samples of each AN are selected as training samples, and the remaining 50 samples of each AN are used as test samples. Figure 3 shows the single feature distributions and classification results corresponding to the highest ARR.
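The classification protocol described here (train on half of the samples of each AN, test on the other half, then average the per-class recognition rates) can be sketched as follows. The paper does not give the value of K, the distance metric, or which 50 samples form the training set, so this sketch assumes K = 1, Euclidean distance, and the first `n_train` samples of each class as training data. The function names are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Label of query by majority vote among its k nearest training points."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def average_recognition_rate(features_by_class, n_train=50, k=1):
    """Return (per-class recognition rates, their mean).

    features_by_class maps a class label to a list of feature tuples.
    The first n_train tuples of each class train the classifier (an
    assumption); the remaining tuples are used for testing.
    """
    train_x, train_y = [], []
    for label, samples in features_by_class.items():
        for s in samples[:n_train]:
            train_x.append(s)
            train_y.append(label)
    rates = {}
    for label, samples in features_by_class.items():
        test = samples[n_train:]
        correct = sum(
            knn_predict(train_x, train_y, s, k) == label for s in test
        )
        rates[label] = correct / len(test)
    return rates, sum(rates.values()) / len(rates)
```

A single feature is passed as a one-element tuple such as `(mde1,)`; double and multifeature extraction only change the tuple length.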

Comparing each feature distribution map with the corresponding classification results at the highest ARR, we can see that the MPLZC values of the six ANs are mixed together, and the number of misidentified samples of each AN is the largest; the MPE values of LR, LW, MW, and W–S are mixed together, and the number of misidentified samples of each AN is quite large; MDE and MDLZC overlap less than MPE and MPLZC in the feature distributions of the six ANs, but the number of misidentified samples of each AN is still large. We conclude that a single MPLZC, MPE, MDE, or MDLZC feature discriminates the six ANs poorly, and that MPLZC has the worst discrimination effect.

To compare the recognition results of each complexity parameter for the six ANs more easily, we calculate the recognition rate of each AN and the highest ARR under the four complexity parameters. Table 1 shows the highest ARR of the single feature.

It can be observed from Table 1 that the ARR of each of the four feature extraction methods for the six ANs is lower than 75.0%. Among them, the MDE-based feature extraction method has the best result, but its recognition rate only reaches 72%, which is much lower than 89%; the recognition rate of MPLZC for MW and W–S is 0, and the ARR of MPLZC for the six ANs is the lowest; apart from MPLZC, the other three feature extraction methods have their lowest recognition rate for LR. It can be concluded that it is difficult to accurately distinguish the six ANs by using the single feature extraction method.

4.3. Double Feature Extraction and Classification

In order to further improve the recognition rate of the six ANs, we used double feature extraction methods based on MPE, MDE, MPLZC, and MDLZC to extract features from and classify the six ANs. Figure 4 shows the double feature distributions and classification results corresponding to the highest ARR.

From Figure 4, we can see that, compared with the single feature extraction method, each double feature extraction method distinguishes more samples and has a better discrimination effect on the six ANs; compared with MPLZC and MPE, the MDE and MDLZC distributions of the six ANs overlap less; MDE and MDLZC distinguish LR better than MPE and MPLZC, and they also have a better discrimination effect on LW and MW; MPLZC has the most misidentified samples for the six ANs, and MDE has the fewest; among the six ANs, HR has the fewest misidentified samples. The results show that, compared with MPE and MPLZC, MDE and MDLZC can better distinguish the six ANs.

The highest average recognition rates of double features for six ANs are calculated, in which (1, 4) represents double features of complexity parameter under SF1 and SF4 and so on. Table 2 shows the highest ARR of double features.
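The paper reports the best pair of scale factors, such as (1, 4), but does not state how the pair is found. One plausible reading is an exhaustive search over all combinations of scale factors, sketched below. The function name `best_scale_subset` and the `score` callable are hypothetical; `score` stands for any routine that returns the ARR obtained with a given combination of scale factors.

```python
from itertools import combinations


def best_scale_subset(n_scales, n_features, score):
    """Exhaustively search scale-factor combinations for the highest score.

    n_scales:   number of scale factors available (SF1..SFn_scales)
    n_features: size of each combination (2 for double features)
    score:      callable mapping a tuple of scale factors to an ARR
    """
    best_combo, best_score = None, float("-inf")
    for combo in combinations(range(1, n_scales + 1), n_features):
        s = score(combo)
        if s > best_score:  # ties keep the first combination found
            best_combo, best_score = combo, s
    return best_combo, best_score
```

With ten scale factors there are only 45 pairs, 120 triples, and 252 five-feature combinations, so an exhaustive search of this kind stays cheap.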

As shown in Table 2, the recognition rate of the MPE-based feature extraction method for HR and SN is 100%, the recognition rate of the feature extraction method based on MDE for HR and W–S is 100%, the recognition rate of the feature extraction method based on MPLZC for HR; compared with the other three feature extraction methods, MPLZC has the lowest recognition rate for the six ANs; the four feature extraction methods have the highest recognition rate for HR and the lowest recognition rate for LR, and this is consistent with Figure 3. The results show that the double feature extraction method identifies the six ANs better than the single feature extraction method, and that MDE distinguishes the six ANs better than MPE, MPLZC, and MDLZC.

4.4. Multifeature Extraction and Classification

To further verify the effectiveness of MDE in distinguishing the six ANs, the multifeature extraction method based on the four complexity parameters mentioned in Section 4.2 is adopted. Figure 5 shows the triple feature distributions and classification results corresponding to the highest ARR.

It can be seen from Figure 5 that, after the triple feature extraction method is adopted, the recognition effect for the six ANs is further improved; the MDE and MDLZC distributions of the six ANs are linear, and the MPE and MPLZC distributions are blocky; for MDE and MDLZC, the feature distributions of the six ANs overlap less than for MPE and MPLZC; the MPLZC distributions of the six ANs overlap most seriously, and the number of misidentified samples for each AN is the largest; for MPE, the total number of misidentified samples of the six ANs is greater than for MDE and MDLZC; for MDE, no samples of HR, SN, and W–S are misidentified, and only one sample of MW is misidentified. To summarize, compared with MPE, MPLZC, and MDLZC, MDE has the best recognition effect for the six ANs.

We calculate the highest ARR for each number of features separately, as shown in Table 3, where (1, 4, 5) denotes triple features for complexity parameters under SF1, SF4, and SF5, (1, 3, 4, 5) denotes four features for complexity parameters under SF1, SF3, SF4, and SF5, and so on.

It can be seen from Table 3 that for the multifeature extraction methods based on MPE, MDE, MPLZC, and MDLZC, respectively, the recognition rate of the six ANs increased with the increase of the number of features; under the same number of features, the multifeature extraction method based on MPLZC has the lowest recognition rate, and the recognition rate of multifeature extraction method based on MDE is the highest and reaches 96.3% when the number of extracted features is 5; the highest recognition rate of the proposed method is at least 2.6% higher than that of the other three multifeature extraction methods. In conclusion, compared with the other three multifeature extraction methods, the proposed method has the highest recognition rate and can distinguish six ANs more accurately.

5. Conclusions

In this paper, MDE was introduced as an improved algorithm of DE, and we proposed a feature extraction method for six ANs based on MDE. The effectiveness of the proposed method is verified by the feature extraction experiments on six ANs, and the main conclusions are as follows:
(1) In order to reflect the complexity of the ANs more comprehensively, MDE is introduced, which combines the coarse graining process with DE. Experiments show that MDE can better reflect the complexity of each AN.
(2) In this paper, we proposed a feature extraction method for six ANs based on MDE. In the single feature and double feature extraction experiments, compared with the feature extraction methods based on MPLZC, MPE, and MDLZC, the feature extraction method based on MDE has the highest ARR for the six ANs.
(3) In the multifeature extraction experiments, the proposed method always has the highest ARR under the same number of features, and the ARR reaches 96.3% when the number of features is 5. The proposed method can therefore distinguish the six ANs more accurately.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares that they have no conflicts of interest.