Journal of Sensors
Volume 2018, Article ID 6593037, 14 pages
Research Article

A GTCC-Based Underwater HMM Target Classifier with Fading Channel Compensation

1Research & Development Centre, Bharathiar University, Coimbatore 641046, India
2Department of Electronics, Cochin University of Science and Technology, Cochin 682022, India

Correspondence should be addressed to Shameer K. Mohammed; shameer.hyderali@gmail.com

Received 19 August 2017; Revised 16 November 2017; Accepted 14 January 2018; Published 3 April 2018

Academic Editor: Juan C. Cano

Copyright © 2018 Shameer K. Mohammed et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Underwater acoustic target classifiers have many applications in military and security areas, where a high degree of prediction accuracy is needed; this makes classifier efficiency and reliability an important subject. Classifiers are often trained with known acoustic target specimens and their characteristic feature sets and are tested with measurements obtained from the sonar deployed in the surveillance or observation zone. The selection of source-specific deterministic features in an automatic target recognition (ATR) system is very significant, since it determines the reliability, efficiency, and success rate of the classifier. The robustness of the gammatone cepstral coefficients (GTCC) in combination with the statistical Euclidean distance, artificial neural network (ANN), and hidden Markov model (HMM) classifiers has been investigated, and its performance is compared with that of other feature extraction schemes. The classifier performance has been analyzed in Rayleigh fading conditions, based on which the performance is enhanced by incorporating an autoregressive (AR) Rayleigh fading channel compensation. The performance of the classifier in different operating conditions is investigated with underwater target signals consisting of real field data collected during expeditions, and the results are presented in this paper.

1. Introduction

The ambient acoustic environment of the ocean is complex and includes a variety of noise sources of manmade as well as natural origin. Identifying the noise sources in the ocean is of prime importance because of its diverse commercial and military sonar applications, which include detection, underwater monitoring, and classification missions. The ocean’s random heterogeneity induces various noises, distortions, and signal-degrading agents that affect the acoustic emanations from the target. The source-specific deterministic features capable of disclosing their generating mechanism are extracted from the target emanations received by the hydrophone array. The features thus extracted should be capable of providing a unique set of classification clues.

Underwater acoustic situations are generally considered as an additive white Gaussian noise scenario [1]. In such a situation, gammatone cepstral coefficients may improve the performance of an underwater classifier system, since they are found to be effective in additive ambient noise scenarios [2]. Recently, Lian et al. [3] investigated the feasibility of gammatone cepstral features in underwater acoustics and proposed them for target classification. However, the experiments were carried out with a limited amount of field data at three different noise conditions. Binesh et al. [4] proposed an HMM underwater target classifier with spectral features and Rayleigh fading channel compensation, which highlights the probabilistic approach of the HMM in effectively modeling the random temporal fluctuations in the underwater acoustic scenario compared to other supervised classifiers. While modeling, HMMs derive a sequential relationship between the past and present temporal points in the signal [5]. Focusing on the implementation of an efficient underwater target classifier for real field data at different conditions, we investigated the applicability of gammatone cepstral coefficients in combination with an HMM classifier as a prototype, with a considerable amount of data for reliable validation, and the results are presented.

The Rayleigh fading effect existing in the underwater channel due to the random heterogeneity of the ocean induces significant variations in the signal-to-noise ratio (SNR) of the signal. This leads to ambiguities in the HMM classifier’s decision-making process, which further affects the classifier performance. The Rayleigh fading effect can be modeled effectively using an autoregressive (AR) method, enabling the analysis of the propagation effects [6]. An attempt has been made to analyze the channel effects by employing an AR model. The capability of an AR method in modeling underwater Rayleigh fading channel and its after-effects in the classifier performance have been investigated. The robustness of the GTCC features in combination with Euclidean distance, ANN, and HMM classifiers has been investigated, and its performance has been compared with those of other features like linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and nonnegative matrix factorization (NMF).

The performance comparison has been carried out with signals from precollected database as well as with the field data collected during various expeditions. The analysis of the prototype classifier under different conditions unveils the robustness of the proposed GTCC-based HMM classifier.

2. Related Work

A number of research papers have been published in the open literature in the areas of underwater noise, feature extraction techniques such as spectral analysis and cepstral analysis, and classification algorithms. Such studies highlight the functional and operational requirements of various existing classifier systems, along with several feature extraction techniques for representing the deterministic features of the targets, as well as the feature selection criteria adopted in the implementation of various classifiers such as statistical classifiers, artificial neural network classifiers, and hidden Markov model classifiers.

2.1. Ocean Noise Characteristics

Wenz [7] addresses the basic objectives and challenges in underwater acoustic research, considering various characteristics of ambient noise, radiated noise, and self-noise. The paper also reviews the major problems in noise measurement, noise reduction, and its prevention. Pieng et al. [8] highlight the collection of ambient noise data and the structured compilation of the collected information into a useful database. Kamal et al. [9] present an attempt to lift the latent underwater source signals from the observed noise mixture utilizing independent component analysis (ICA). The results show that with the ICA technique, the ambient noise levels are mitigated to a considerably tolerable level in practical applications. Gray [10] proposes a model for the acoustic source strength of the blade-rate line tonals produced by vessels that could yield a statistical representation of the source level distribution and the frequency of propeller blade-rate acoustic energy for the vessels. Arveson and Vendittis [11] present the results of the studies carried out on the radiated noise of a ship with a direct-drive low-speed diesel engine. Most modern merchant ships are powered by this type of diesel engine. The results highlight the dominating frequency components in the radiated noise as well as its contributing sources like the engine and auxiliaries. Bouvet and Schwartz [12], from the statistical study conducted on underwater noises, state that the underwater background noise characteristics are very similar to Gaussian and the vessel noise can be effectively described by a Gaussian mixture model. Supriya et al. [13] propose a method based on spectral subtraction for alleviating the acoustic ambient noise with underwater acoustic receivers.

2.2. Feature Extraction

Han et al. [14] propose a new algorithm for MFCC feature extraction with increased accuracy and less computational power compared to the conventional methods, which is found to be very efficient for hardware implementation. Van der Merwe and du Preez [15] describe a methodology for the estimation of LPC coefficients utilizing mel-scale frequency warping by means of a two-step process in which the LPC coefficients are represented using a bilinear transform. Imai [16] reports new techniques for cepstral analysis synthesis on signal frequency scale and effectively represents the acoustic signal spectral envelope by means of log spectrum on a mel-frequency scale. Furui [17] proposes a new technique for automatic acoustic recognition with cepstral coefficients extracted utilizing LPC analysis, the results of which show that the proposed method performs better under different transmission conditions.

Eronen [18] reports a comparison on the performance of an acoustic classification system with several feature extraction methods. Both MFCC and LPCC coefficients are calculated, and the best performance was obtained with a feature set consisting of two sets of MFCC. Kumari and Jayanna [19] compare the robustness of LPCC and MFCC features in acoustic recognition with limited amount of data. The results highlight that the equal error rate is higher for MFCC compared to LPCC. Azimi-Sadjadi et al. [20] establish the supremacy of GTCC features over MFCC features in representing the spectral characteristics of nonspeech audio signals, especially at low frequencies. Mohankumar et al. [21] present an underwater target classifier with GTCC features extracted from the bispectrum of the noise. The robustness of the GTCC features is investigated in combination with a neural network classifier using back-propagation algorithm. GTCC features are found to yield acceptable success rates for the classification. Lian et al. [3] investigate the feasibility of gammatone filter-based features in the underwater acoustics target classification. The robustness of modified gammatone frequency cepstral coefficients (MGFCC) features has been analyzed in combination with support vector machine classifier. The performance evaluation has been carried out with limited amount of real field data at different conditions. The results indicate that the MGFCC features are feasible for underwater acoustic target classification. When compared with MFCC features, the GFCC features are found to yield robust classification results under all noise conditions.

2.3. Classifiers

Aiello et al. [22] investigate the applicability of artificial neural networks in acoustic recognition. The success rates of the neural network-based recognition system with various spectral analysis models such as autocorrelation and the mel cepstrum are compared, and a hybrid system consisting of a combination of two different sets of coefficients is also proposed for the performance improvement. Uhrig [23] describes the architecture of artificial neural networks and its resemblance to biological neural networks, along with the historical aspects of their evolution. The paper also describes the various parameters and methodologies adopted for training the neural networks. Allim and Hashem [24] highlight a neural network classifier for sonar signals. The paper also discusses and compares the various factors affecting the shape of the echoes returning from underwater targets like submarines, vessels, and mines. Eapen [25] proposes a neural network underwater target classifier that operates in the presence of random noise. The signals received by the hydrophone are fed to the neural network, and the network is made to adapt to the changes triggered by the targets. The proposed classifier uses the back-propagation algorithm, which is widely accepted for multilayer perceptron-like networks. Collins et al. [26] present a comparison of the relative performance of a number of classical classification methods, namely, K-nearest neighbour, Mahalanobis, Euclidean distance, and weighted Euclidean distance. The studies show that the neural network method could outperform all the other classical techniques. Mohankumar et al. [27] highlight the feasibility of realizing an intelligent classifier for marine noise signals, with the help of artificial intelligence, using higher-order cepstral features.

Jahangir et al. [28] present a classification algorithm using hidden Markov models. In the paper, recognition of targets based on target Doppler signatures is described. The paper highlights the moderate data requirement of hidden Markov models for training, which is considered as the major advantage of HMM. Runkle et al. [29] present a new method for target identification in multiaspect target scenarios using hidden Markov models. Couvreur et al. [30] discuss the automatic classification of environmental noise sources from their acoustic signatures using hidden Markov models. The performance of the proposed system is analyzed for the classification of different types of noises and is found to outperform the human listeners as well as conventional classifiers. Binesh et al. [31] present a methodology for a design and performance analysis of an HMM-based underwater signal classification system, utilizing the discrete sine transform-based target-specific features. Binesh et al. [4] address the underwater target classification problem with HMMs. Discrete sine transform-based features along with channel fading compensation have been incorporated for the refinement of the HMM underwater classifiers. The encouraging performance of HMM classifiers in underwater target classification under different operating conditions is presented.

3. Methodology

Acoustic emanations from underwater noise sources require special processing techniques in automatic target classifier systems because of their distinctive features and propagation characteristics. Target-specific characteristic features can be employed for the design and implementation of underwater target classifiers. The extracted feature set of the targets is the dominant factor governing the classifier performance in the recognition of underwater targets. The proposed system extracts the characteristic features using GTCC, which are then used to train an HMM classifier with 20 states. The various stages of the design methodology are detailed as follows.

3.1. Signal Preprocessing—Spectral Subtraction

In the case of underwater acoustic target classifiers, when the signal spectrum is superimposed by a noise spectrum additively, it can be safely assumed that the spectrum of the collected signal is a linear sum of the signal spectrum and the noise spectrum [32]. This assumption is used to enhance a noise-corrupted target signal by subtracting an estimated average noise spectrum from the noisy signal spectrum, restoring the spectrum of the target signal observed in additive noise [33], and this is called the spectral subtraction technique. A common scenario of the additive noise source is the self-noise of the receiving vessel itself. A pictorial representation of a typical self-noise generation mechanism in underwater data acquisition is shown in Figure 1. The major sources of self-noise are the propeller noise, flow noise, machinery noise, and radiated ship noise. The self-noise depends on the propeller speed, type of the propeller, number of blades, depth of the propeller, the size and shape of the hull, speed of the ship, the position of the transducer array, the vibrations in the machinery, engine noise, and onboard auxiliaries.

Figure 1: The sources of ship self-noise.

When self-noise $d(n)$ is added to the target signal $s(n)$, the noisy signal $y(n)$ can be represented as

$$y(n) = s(n) + d(n), \quad 0 \le n \le N-1, \tag{1}$$

where $n$ is the time index and $N$ is the number of samples under consideration. The objective of signal preprocessing is to find the enhanced target signal $\hat{s}(n)$ from the observed $y(n)$, with the assumption that $d(n)$ is uncorrelated with $s(n)$.

The algorithm of spectral subtraction divides the input signal into segments of uniform length, and these time domain signals are transformed to the frequency domain using the discrete Fourier transform. If an estimate $|\hat{D}(k)|$ of the self-noise magnitude spectrum can be obtained, then an approximation $\hat{S}(k)$ of the signal of interest can be estimated from the noisy spectrum $Y(k)$ as

$$\hat{S}(k) = \left( |Y(k)| - |\hat{D}(k)| \right) e^{j \angle Y(k)}. \tag{2}$$

The noise magnitude spectrum can be estimated as a running average of those signal blocks determined to be primarily noise alone. The average noise magnitude spectrum is then subtracted from the magnitude spectrum of the incoming signal, discarding the negative differences to avoid undesirable spectral excursions in the output. The modified magnitude spectrum is further supplemented with the phase spectrum for generating modified spectrum, and the output signal is recovered via an inverse FFT.
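The procedure described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the frame length, the non-overlapping framing, and the function names are assumptions made for the example, and a naive DFT is used only to keep the sketch dependency-free (a practical system would use an FFT library and overlapping windows).

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT; a real system would use an FFT library instead."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(noisy, noise_only, frame_len=32):
    """Magnitude spectral subtraction over non-overlapping frames.

    noisy      : samples of the observed signal (target plus self-noise)
    noise_only : samples judged to contain self-noise alone
    frame_len  : hypothetical frame length, chosen only for this sketch
    Samples beyond the last full frame are dropped.
    """
    # Average noise magnitude spectrum over the noise-only frames.
    noise_frames = [noise_only[i:i + frame_len]
                    for i in range(0, len(noise_only) - frame_len + 1, frame_len)]
    noise_mag = [0.0] * frame_len
    for frame in noise_frames:
        for k, value in enumerate(dft(frame)):
            noise_mag[k] += abs(value) / len(noise_frames)

    enhanced = []
    for i in range(0, len(noisy) - frame_len + 1, frame_len):
        cleaned = []
        for k, value in enumerate(dft(noisy[i:i + frame_len])):
            # Subtract the noise estimate and discard negative differences.
            magnitude = max(abs(value) - noise_mag[k], 0.0)
            # Reuse the phase of the noisy frame.
            cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
        enhanced.extend(idft(cleaned))
    return enhanced
```

With a stationary tonal interference and a target tone in a different frequency bin, this sketch returns the target tone almost unchanged; with broadband, time-varying self-noise the residual depends strongly on how well the noise-only blocks are chosen.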

3.2. Gammatone Cepstral Coefficients

Gammatone cepstral coefficients (GTCC) are biologically inspired features computed by applying gammatone filter bank to the spectrum of the signal, followed by the application of the logarithmic and the discrete cosine transformation. Gammatone filter banks are nonuniform overlapping band-pass filters similar to mel-filter banks, which model the response of the human auditory system.

The impulse response of a gammatone filter is the product of a gamma distribution function and a sinusoidal tone centered at frequency $f_c$ [2, 34], represented as

$$g(t) = K t^{n-1} e^{-2\pi B t} \cos(2\pi f_c t + \varphi), \quad t \ge 0, \tag{3}$$

where $K$ is the amplitude factor, $n$ is the filter order, $f_c$ is the center frequency in hertz, $\varphi$ is the phase shift, and $B$ is the equivalent rectangular bandwidth, which determines the duration of the impulse response. The center frequencies of the filters in the gammatone bank follow the equivalent rectangular bandwidth (ERB) scale used in psychoacoustics. The ERB approximates the bandwidth of the auditory filter at each point of human hearing by the bandwidth of a rectangular filter that has the same peak transmission and passes the same total power for a white noise input. The equation describing the value of ERB as a function of center frequency, $F$ (in hertz), is

$$\mathrm{ERB}(F) = 24.7\left(\frac{4.37F}{1000} + 1\right). \tag{4}$$

The ERB value at a center frequency of 1 kHz is approximately 132 Hz, which corresponds to one step of ERB number in the ERB scale. The mathematical expression relating ERB number to the frequency, $F$ (in hertz), is given as

$$\mathrm{ERB}_{\mathrm{number}}(F) = 21.4 \log_{10}\left(\frac{4.37F}{1000} + 1\right). \tag{5}$$
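As a numerical check of the ERB relations, the sketch below assumes the widely used Glasberg and Moore forms, ERB(F) = 24.7(4.37F/1000 + 1) and ERB number = 21.4 log10(4.37F/1000 + 1), which reproduce the value of about 132 Hz at 1 kHz quoted above. The function names, the uniform spacing of center frequencies on the ERB-number scale, and the default filter order of 4 are illustrative assumptions, not details taken from the paper.

```python
import math


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth (Hz) at centre frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_number(f_hz):
    """Position of f_hz on the ERB-number scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_number_to_hz(erb_n):
    """Inverse of erb_number()."""
    return (10.0 ** (erb_n / 21.4) - 1.0) * 1000.0 / 4.37


def centre_frequencies(f_low, f_high, n_filters):
    """Centre frequencies spaced uniformly on the ERB-number scale."""
    low, high = erb_number(f_low), erb_number(f_high)
    step = (high - low) / (n_filters - 1)
    return [erb_number_to_hz(low + i * step) for i in range(n_filters)]


def gammatone_impulse_response(fc, fs, duration=0.05, order=4,
                               amplitude=1.0, phase=0.0):
    """Sampled g(t) = K t^(n-1) exp(-2 pi B t) cos(2 pi fc t + phi).

    B is taken directly as the ERB at fc, following the text; the
    default order and duration are assumptions made for this sketch.
    """
    bandwidth = erb_bandwidth(fc)
    response = []
    for i in range(int(duration * fs)):
        t = i / fs
        envelope = amplitude * t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
        response.append(envelope * math.cos(2.0 * math.pi * fc * t + phase))
    return response
```

Because the filters are spaced uniformly in ERB number, they are packed more densely at low frequencies, which is where ship-radiated noise carries much of its energy.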

The complete process flow of the GTCC feature extraction is depicted in Figure 2. The GTCC feature estimation process divides the preemphasized incoming signal stream into finite-width frames and applies a windowing function to each frame. The signal is windowed into subframes with a size of 10–50 ms [2]. The windowing ensures short-duration stationarity when the signal exhibits nonstationary characteristics, facilitates spectro-temporal analysis, and increases the efficiency of the extracted features compared with a rectangular window. The effect of frame discontinuities is removed, and the frames are converted into the frequency domain using the Fourier transform. The frequency domain frames are mapped onto the ERB scale by the gammatone filter bank, and the logarithm followed by the discrete cosine transform converts the result into the cepstrum domain.
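The framing, windowing, and cepstral steps of this flow can be illustrated as follows. This is a sketch under stated assumptions: a Hamming window is assumed as the tapering function, the DCT is the unnormalised type-II form, and the gammatone filter-bank energies are taken as given inputs rather than computed here.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window of length n (assumed window type for this sketch)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(frame, window):
    """Taper a frame to reduce the effect of frame-edge discontinuities."""
    return [s * w for s, w in zip(frame, window)]


def dct2(values):
    """Unnormalised type-II discrete cosine transform."""
    n = len(values)
    return [sum(v * math.cos(math.pi / n * (i + 0.5) * k)
                for i, v in enumerate(values))
            for k in range(n)]


def cepstral_coefficients(band_energies, n_coeffs, floor=1e-12):
    """Log-compress filter-bank energies and decorrelate them with a DCT."""
    log_energies = [math.log(max(e, floor)) for e in band_energies]
    return dct2(log_energies)[:n_coeffs]
```

For a signal sampled at 8 kHz, for example, a 25 ms frame with a 10 ms hop would correspond to `frame_signal(samples, 200, 80)`; these particular values are only an illustration within the 10–50 ms range mentioned above.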

Figure 2: GTCC feature extraction.

3.3. Vector Quantization

Vector quantization (VQ) is a widely used technique for data compression, attained by mapping vectors from a large vector space onto a finite set of Voronoi regions. Each region is represented by a centroid of the actual signal space distribution. The Voronoi region associated with the $i$th centroid is represented as

$$V_i = \left\{ \mathbf{x} \in \mathbb{R}^k : \lVert \mathbf{x} - \mathbf{y}_i \rVert \le \lVert \mathbf{x} - \mathbf{y}_j \rVert \ \text{for all } j \ne i \right\}. \tag{6}$$

The VQ process projects a $k$-dimensional vector in the vector space $\mathbb{R}^k$ into a finite set of regions expressed by the vectors $\mathbf{y}_i$, $i = 1, 2, \ldots, L$, that is, the set of all possible reconstruction vectors. Each vector $\mathbf{y}_i$ is termed a codeword, and the collection of all codewords is termed a codebook. The feature vectors for training the classifier are mapped into codewords before the training stage.
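The mapping of feature vectors to codewords amounts to a nearest-neighbour search over the codebook. The sketch below assumes Euclidean distance and uses hypothetical function names; the resulting codeword indices are the discrete observation symbols that a discrete HMM consumes.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def quantize(vector, codebook):
    """Index of the nearest codeword, i.e. the Voronoi region containing vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def encode(vectors, codebook):
    """Map a sequence of feature vectors to a sequence of codeword indices."""
    return [quantize(v, codebook) for v in vectors]
```

For instance, with the two-codeword codebook `[(0, 0), (10, 10)]`, the vectors `(1, 1)`, `(9, 8)`, and `(4, 4)` are encoded as the symbol sequence `[0, 1, 0]`.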

3.4. Rayleigh Fading Compensation

The ocean’s surface, bottom, and reflectors, along with scatterers having random heterogeneity, introduce multipath effects in underwater acoustic propagation channels, thereby resulting in the formation of additional sub-Eigen path components, which override the contribution of the dominant component severely. In such a scenario, the interaction of signals travelling along different paths causes fluctuations of the received signal in both amplitude and phase, due to the phenomenon known as multipath Rayleigh fading, thereby affecting the SNR of the signal significantly. The Clarke model of the Rayleigh fading channel is given as

$$g(t) = \frac{1}{\sqrt{N}} \sum_{n=1}^{N} \exp\left\{ j\left( 2\pi f_d t \cos\alpha_n + \phi_n \right) \right\}, \tag{7}$$

where $N$ is the number of propagation paths, $\alpha_n$ is the wave direction and $\phi_n$ is the initial phase of the $n$th path, and $f_d$ is the Doppler frequency.

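The envelope of the underwater acoustic (UWA) channel with multipath effects obeying the Rayleigh distribution is approximated by an autoregressive (AR) model using the correlation matching property [35]. The time domain recursion for a $p$th-order AR process and the corresponding spectrum are given as

$$x[n] = -\sum_{k=1}^{p} a_k x[n-k] + w[n], \tag{8}$$

$$S_{xx}(f) = \frac{\sigma_p^2}{\left| 1 + \sum_{k=1}^{p} a_k e^{-j 2\pi f k} \right|^2}, \tag{9}$$

where $a_k$ are the AR model filter coefficients, $w[n]$ is a complex white Gaussian noise process with zero mean, $x[n]$ is the simulator output, $\sigma_p^2$ is the variance of $w[n]$, and $S_{xx}(f)$ is the power spectral density (PSD) of the AR process. The relationship between the desired autocorrelation function (ACF) $R_{xx}[k]$ of the fading model and the AR coefficients $a_k$ is given by [6]

$$R_{xx}[k] = \begin{cases} -\sum_{m=1}^{p} a_m R_{xx}[k-m], & k \ge 1, \\ -\sum_{m=1}^{p} a_m R_{xx}[-m] + \sigma_p^2, & k = 0. \end{cases} \tag{10}$$

For $k = 1, 2, \ldots, p$, (10) in matrix form is given as

$$\mathbf{R}_{xx}\mathbf{a} = -\mathbf{v}, \tag{11}$$

where $\mathbf{R}_{xx}$ is the $p \times p$ autocorrelation matrix with entries $[\mathbf{R}_{xx}]_{i,m} = R_{xx}[i-m]$, $\mathbf{a} = [a_1, a_2, \ldots, a_p]^T$, and $\mathbf{v} = [R_{xx}[1], R_{xx}[2], \ldots, R_{xx}[p]]^T$.

From the given desired ACF sequence, the AR filter coefficients can be determined by solving this set of Yule-Walker equations. These equations can be solved efficiently by the Levinson-Durbin recursion to obtain the channel parameters [36]. The unique solution to the Yule-Walker equations is given by [37]

$$\mathbf{a} = -\mathbf{R}_{xx}^{-1}\mathbf{v}, \qquad \sigma_p^2 = R_{xx}[0] + \sum_{k=1}^{p} a_k R_{xx}[-k]. \tag{12}$$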
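The Levinson-Durbin solution of the Yule-Walker equations can be sketched as below. The sign convention assumed here is x[n] = -sum_k a_k x[n-k] + w[n], and the sketch handles only real-valued ACF sequences; the function name and the toy ACF in the usage note are assumptions for illustration and are not the fading ACF used in the paper.

```python
def levinson_durbin(acf, order):
    """Solve the Yule-Walker equations for a real-valued ACF.

    acf   : [R[0], R[1], ..., R[order]], the desired autocorrelation values
    order : AR model order p
    Returns (a, noise_variance), where a[k] holds a_(k+1) of the model
    x[n] = -sum_k a_k x[n-k] + w[n] and noise_variance is the variance
    of the driving white noise w[n].
    """
    a = []
    error = acf[0]
    for m in range(1, order + 1):
        # Correlation between the current prediction error and lag m.
        acc = acf[m] + sum(a[k] * acf[m - 1 - k] for k in range(m - 1))
        reflection = -acc / error
        # Order-update of the coefficients, then append the new one.
        a = [a[k] + reflection * a[m - 2 - k] for k in range(m - 1)] + [reflection]
        error *= 1.0 - reflection ** 2
    return a, error
```

As a sanity check, an ACF of the form R[k] = 0.9^k belongs to a first-order process, so a third-order fit should return coefficients close to [-0.9, 0, 0] and a noise variance close to 0.19. The recursion costs O(p^2) operations instead of the O(p^3) of a general matrix inversion.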

The interaction of signals travelling along different paths in the underwater channel induces multipath Rayleigh fading, which affects the SNR of the signal significantly. The signal therefore requires fading compensation; without it, the ATR performance drops drastically. The proposed HMM classifier system incorporates a prototype communication system wholly for the purpose of estimating the Rayleigh fading, thereby compensating the ill effects induced in the signals by the underwater channel [4]. The Rayleigh fading model represented by (7) is modeled using an AR method as in (8), (9), (10), (11), and (12) and is assumed to provide a fading effect close to the actual one. This Rayleigh fading model is integrated in the prototype communication system for compensating the channel fading effects.

The prototype communication system fundamentally consists of quadrature amplitude modulation and demodulation subsystems. In the quadrature amplitude modulation stage, the signal data are represented by the modulated output of two carrier components which are in quadrature. The prototype communication system emits a pilot signal along with the quadrature amplitude modulated signals, which can effectively contribute to the fading compensation as in [4, 38]. The receiver subsystem estimates the fading of the signal, utilizing the fading information of the channel extracted with reference to the pilot signal. The Rayleigh fading compensation to the affected signal is provided by utilizing transformations of linear interpolation with given suitable conditions. The demodulator subsystem provides the channel parameter estimation for minimizing the effects of Rayleigh fading.
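The idea of pilot-aided compensation with linear interpolation can be illustrated with a generic sketch. It is not the scheme of [4, 38]: the pilot spacing, the flat (single-tap) complex channel gain, and all function names are assumptions made for the example.

```python
def estimate_channel(received, pilot_positions, pilot_symbol):
    """Channel gain at each pilot position: received pilot / known pilot."""
    return {p: received[p] / pilot_symbol for p in pilot_positions}


def interpolate_gain(gains, index):
    """Linearly interpolate the complex gain between the surrounding pilots."""
    positions = sorted(gains)
    # Outside the pilot span, hold the nearest pilot estimate.
    if index <= positions[0]:
        return gains[positions[0]]
    if index >= positions[-1]:
        return gains[positions[-1]]
    for left, right in zip(positions, positions[1:]):
        if left <= index <= right:
            weight = (index - left) / (right - left)
            return (1 - weight) * gains[left] + weight * gains[right]


def compensate(received, pilot_positions, pilot_symbol):
    """Divide every received symbol by the interpolated channel gain."""
    gains = estimate_channel(received, pilot_positions, pilot_symbol)
    return [r / interpolate_gain(gains, i) for i, r in enumerate(received)]
```

When the channel gain varies linearly between pilots, this recovers the transmitted QAM symbols exactly; for faster fading, the pilots would need to be spaced more closely than the channel's coherence interval.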

3.5. Hidden Markov Models

Hidden Markov models (HMMs) provide an effective architecture for classifying distinct targets in multiple-target scenarios, when they are accurately designed and trained. A hidden Markov model can be defined as a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the model tries to determine the hidden parameters from the observable outputs [5]. The HMM consists of a finite set of states, and each state is associated with a probability distribution.

The transitions between the different states are governed by a set of probabilities called transition probabilities $a_{ij}$. In the second stochastic process, the state is not directly visible, but the observations influenced by the state are visible. Each state has a probability distribution over the possible output tokens, termed the observation probability $b_j(k)$. An HMM is defined by the number of states $N$, initial state probability $\pi$, transition probability $A$, and observation probability $B$. The HMM of a stochastic process is defined as in the following equations:

$$\lambda = (A, B, \pi), \tag{13}$$

$$A = \{a_{ij}\}, \quad a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le N, \tag{14}$$

$$B = \{b_j(k)\}, \quad b_j(k) = P(o_t = v_k \mid q_t = S_j), \quad 1 \le k \le M, \tag{15}$$

$$\pi = \{\pi_i\}, \quad \pi_i = P(q_1 = S_i), \tag{16}$$

where $a_{ij}$ is the probability that the state at time $t+1$ is $S_j$, when the state at time $t$ is $S_i$; $b_j(k)$ is the probability that symbol $v_k$ is emitted in state $S_j$; $q_t$ is the state at time $t$; and $M$ represents the number of distinct observations per state.

The topology of the HMM used in this work is a hybrid HMM, which can be used for both transient and continuous signals. In this work, all the targets under consideration are modeled using ergodic HMM. The training signals are end pointed manually, and the silent frames are stripped off from the test utterances with the help of an automatic detector. The training of the HMM is achieved using GTCC feature set, which is clustered using K-means algorithm by setting centroids and stored as a matrix. The cluster centroids and cluster number for each data are indexed in an array separately. The sum of squares is minimized iteratively using a function for quantized and unique clusters for modeling the HMM.

Learning the HMM includes the adjustment of the parameters $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$ given the observation sequence $O$. Among the several optimization criteria for learning, maximum likelihood (ML), an effective optimization technique to maximize the probability of a given observation sequence $O^c$ for an HMM of class $c$, is given by [5, 39]

$$\lambda_c^{*} = \arg\max_{\lambda} P(O^c \mid \lambda). \tag{17}$$

The model that maximizes this quantity is found with an iterative procedure, the well-known Baum-Welch algorithm, by choosing appropriate model parameters.
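One re-estimation step of the Baum-Welch algorithm for a discrete-observation HMM can be sketched as follows. This is a textbook-style illustration for a single short observation sequence, not the authors' training code: it omits the scaling needed for long sequences and the pooling over multiple training records, and the function names are assumptions.

```python
def forward(obs, pi, A, B):
    """Forward variables alpha[t][i] = P(o_1..o_t, q_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(obs, A, B):
    """Backward variables beta[t][i] = P(o_(t+1)..o_T | q_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(obs, pi, A, B):
    """One EM update; returns (new_pi, new_A, new_B, likelihood of the input model)."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    # gamma[t][i]: probability of being in state i at time t given obs.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    # xi_sum[i][j]: expected number of transitions from state i to state j.
    xi_sum = [[sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                   for t in range(T - 1)) / likelihood
               for j in range(n)] for i in range(n)]
    new_pi = gamma[0][:]
    new_A = [[xi_sum[i][j] / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, likelihood
```

Repeating the step never decreases the likelihood of the training sequence, which is the property that makes the procedure a practical way to approach the maximum-likelihood model, although it may stop at a local maximum that depends on the initial parameters.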

3.6. Implementation Scheme

HMM is trained with the vector-quantized features of the underwater target signals belonging to the training data, and the state transition probabilities as in (14) are estimated for each target. The HMM parameters, observation symbol probability, and initial state distributions as in (15) and (16) are estimated using the Baum-Welch algorithm. For each target, the model parameters are estimated and indexed in the database for utilization in the test phase. The block schematic representation of the HMM classifier design is depicted in Figure 3. The flowchart of the target class association prior to HMM classifier design is shown in Figure 4.

Figure 3: HMM classifier design.
Figure 4: Flowchart of class association prior to HMM design.

During the test phase, for an unknown target signal, the GTCC features as in (3), (4), and (5) are extracted, and normalized likelihood emission probabilities are estimated. From the likelihood estimates collected for each model, the target is recognized as the class whose trained model shows the maximum similarity to the test profile. For the field collected data, the self-noise compensation is effected in the proposed system as in (2), for improving the performance of the classifier.
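The recognition step can be sketched as scoring the test symbol sequence against every stored target model and picking the best one. The scaled forward algorithm below is a standard way to compute the log-likelihood; normalising by the sequence length is an assumption about what "normalized likelihood" means here, and the model dictionary layout and names are hypothetical.

```python
import math


def log_likelihood(obs, pi, A, B):
    """log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    n_states = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                     for j in range(n_states)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        # Rescale to avoid numerical underflow on long sequences.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def classify(obs, models):
    """Return (best_label, scores) using length-normalised log-likelihoods.

    models maps a target label to its (pi, A, B) parameters.
    """
    scores = {label: log_likelihood(obs, *params) / len(obs)
              for label, params in models.items()}
    return max(scores, key=scores.get), scores
```

In use, `obs` would be the sequence of codeword indices obtained by vector-quantizing the GTCC features of the unknown signal, and `models` would hold one trained parameter set per target in the database.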

4. Performance Evaluation

The performance validation of the proposed GTCC-based HMM classifier has been carried out with the enhanced field collected underwater signals as well as signals from precollected open sources. An HMM classifier with 20 states has been trained with GTCC-extracted features. For training the HMM, the transition probability matrix corresponding to each target in the training set has been estimated. Subsequently, normalized likelihood emission probability parameters, which determine the performance of the classifier, are estimated from the training data. The performance of the GTCC-based HMM classifier with self-noise conditions and under a Rayleigh fading environment is investigated. The effect of fading on the classifier performance is analyzed, and the required fading compensation is provided. The improvement in classifier performance obtained by reducing the self-noise in the field data with spectral subtraction is also analyzed.

The performance of the proposed classifier has been compared with that of the Euclidean distance and ANN classifiers under different operating conditions: ideal conditions, self-noise environments, and multipath Rayleigh fading channels. The robustness of the GTCC features has been investigated against LPC, LPCC, MFCC, and NMF features in combination with Euclidean distance, ANN, and HMM classifiers, and the results are presented.

4.1. Field Data Collection

For the collection of field data, two cruises were made on the Fishery Oceanographic Research Vessel Sagar Sampada, an Indian research vessel equipped to carry out multidisciplinary research in oceanography, with the objective of monitoring, collecting, and characterizing acoustic emanations from target noise sources. The recording equipment used includes a compact data acquisition system, a noise analyzer, steel buoys, B&K 8104 hydrophones, preamplifiers, computational platforms, and a portable battery backup. The data acquisition system makes use of a steel buoy with a standard linear hydrophone additive array, comprising transducers. The outputs of the transducer elements are combined linearly to form a single output [40]. The first cruise, #321 of the vessel during December 3–15, 2013, was in the south-eastern Indian Ocean. Several possible self-noise interferences in real-time recording conditions were investigated, on the basis of which the recording instruments were reconfigured and validated. The cruise was also intended for formulating different recording strategies. Recordings from 7 stations with different depth profiles (ranging from 30 m to 2845 m) were collected during the cruise. The steel buoy deployment in the ocean for data acquisition from the Sagar Sampada is shown in Figure 5.

Figure 5: Buoy deployment.

The second data collection expedition, cruise #339, during May 2–16, 2015, was carried out in the North Eastern Arabian Sea. The recordings were taken with propeller “off” and engine and auxiliaries in “on” condition. The buoy was deployed only up to a maximum of 0.6 km distance from ship, and the target emanations are recorded. The averaged self-noise level of the ship obtained during the experiments at different conditions is 170 dB re μPa.

Further, three acoustic surveys were carried out during December 2–4, 2014, in the Cochin backwaters for field data collection near the shipping channels of Kochi International Container Transshipment Terminal in trawler boats. The engine of the boat was turned off during recording in order to avoid self-noise. Several noises that emanated from biological as well as manmade noise sources were collected. Since these recordings were not affected by self-noise, spectral subtraction is not applied to enhance the signals.

For the development of an efficient underwater target classifier capable of being deployed in a real-world scenario, the field data should be collected at different sea states. The observations of a particular target at different conditions are difficult to acquire due to the practical limitations in the measurement conditions. However, a few target records were collected at different sea states during the expeditions. In this work, a minimum of ten targets are considered for the reliable validation of the classifier system. Variations in sea states can be regarded as different SNR conditions due to the changes in background shipping noise and transmission loss. For simulating various sea states, the recordings were augmented with background shipping noise and white Gaussian noise at different SNRs. The target records used for evaluating the classifier system are presented in Table 1. Targets labelled as , , and are from the precollected database available in the Department of Electronics, Cochin University of Science and Technology, and the remaining records were collected from the field. Target records of , , , , and were collected at different sea states during the experiments. The precollected database is also augmented in order to obtain different sea states. The duration of the target records used for the classes “Ship” and “Marine species” is 4 minutes and 1 minute, respectively.
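Augmenting a recording with noise at a prescribed SNR comes down to scaling the noise so that the ratio of mean signal power to mean noise power matches the target value. The sketch below illustrates this; the function names, the unit-variance Gaussian generator, and the fixed seed are assumptions for the example, and a recorded shipping-noise segment could be passed in place of the synthetic noise.

```python
import math
import random


def mean_power(x):
    """Mean squared value of a sample sequence."""
    return sum(v * v for v in x) / len(x)


def white_gaussian_noise(n, seed=0):
    """Zero-mean, unit-variance Gaussian noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def add_noise_at_snr(signal, noise, snr_db):
    """Scale noise to the requested SNR (in dB) and add it to signal.

    Returns (mixture, scaled_noise).
    """
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled = [gain * v for v in noise]
    return [s + d for s, d in zip(signal, scaled)], scaled
```

Sweeping `snr_db` over a range of values yields a family of records from a single recording, which is how different sea states can be emulated when field observations at those states are not available.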

Table 1: Target records.
4.2. Results and Discussions

The performance of the GTCC-based HMM target classifier has been evaluated with the underwater target records detailed in Table 1, under different conditions. The field data collected during the expeditions have been affected by the self-noise of the ship. Thus a signal preconditioning stage by spectral subtraction has been carried out prior to feature extraction. For demonstrating the effect of self-noise in the spectrogram, dolphin calls observed from the field affected by self-noise are shown in Figure 6 and the retrieved signal with spectral subtraction is depicted in Figure 7.

Figure 6: Spectrogram of observations.
Figure 7: The spectrogram of the retrieved dolphin call.
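As a rough illustration of the preconditioning step, the per-bin rule of power spectral subtraction can be sketched as follows. This is a simplified sketch under stated assumptions: the power spectra are taken as already computed for one frame, the self-noise estimate is assumed to come from a target-free segment, and the spectral floor `floor` is an illustrative parameter that is not taken from the paper.

```python
def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate from each frequency bin.

    Bins where the estimate exceeds the observation are clamped to a small
    fraction of the noise power instead of becoming negative.
    """
    cleaned = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        residual = p_noisy - p_noise
        cleaned.append(max(residual, floor * p_noise))
    return cleaned


# Hypothetical three-bin frame: two bins dominated by the target, one by self-noise.
frame_power = [4.0, 9.0, 1.0]
self_noise_power = [1.0, 1.0, 2.0]
enhanced = spectral_subtract(frame_power, self_noise_power)
```

In a complete implementation the enhanced magnitude is usually recombined with the phase of the noisy frame before the inverse transform.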

The training set for the prototype HMM classifier consists of target noises from 6 vessels and 4 biological species. GTCC features have been extracted from these signals and subsequently vector quantized as in (6). The K-means algorithm with K = 8 is used for vector quantization, and the HMM classifier has been trained with the clustered features. The value of K was selected by trial and error, considering the performance and complexity of the system: the classifier performance obtained for K = 4 was not satisfactory, whereas K = 16 gave no significant improvement in performance. Figure 8 portrays the K-means-clustered GTCC feature vector distribution of the 10 targets used for HMM training. The centroid distribution shows that the GTCC features separate the targets of the various categories clearly, which in turn can be utilized for developing class-specific target models and for estimating the normalized likelihood HMM stochastic parameter.

Figure 8: K-means-clustered GTCC features of the 10 targets.
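The vector quantization step can be sketched with a plain Lloyd's-algorithm K-means, shown here with K = 8 as in the text. The two-dimensional synthetic vectors are hypothetical placeholders for GTCC feature vectors, and the random initialization and fixed iteration count are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, centroids):
    """Index of the centroid closest to the vector (its codebook symbol)."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(vector, centroids[k]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for idx in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == idx]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[idx] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(v, centroids) for v in vectors]
    return centroids, labels


# Hypothetical feature vectors: 8 loose groups of 20 two-dimensional points.
data_rng = random.Random(1)
features = [[data_rng.gauss(c, 0.1), data_rng.gauss(-c, 0.1)]
            for c in range(8) for _ in range(20)]

centroids, labels = kmeans(features, k=8)
```

The resulting `labels` form the discrete observation sequence that a discrete HMM would be trained on.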

The HMM classifier has been trained with the clustered GTCC features for estimating the target models as in (13) for each class. For training the HMM, codebook size is optimally selected as 16. The number of states for the HMM is optimized as 20, based on the training performance validation on random state initialization during various trials, considering the scalability and upgradability of the classifier system. The validation data set has been used for this optimization. For each target model, the state transition probability matrix , observation probability matrix , and initial state probability () have been estimated using the iterative Baum-Welch algorithm. The transition matrix consists of 400 elements corresponding to the 20 states, where each element corresponds to the probability of a particular state transition to another. The model is utilized in the testing phase of the HMM target classifier.
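The dimensions of the model parameters mentioned above can be made concrete with a small sketch that builds a random, row-stochastic starting point for a 20-state model with a 16-symbol codebook. The names `transition`, `emission`, and `initial` are labels chosen for this sketch, and random initialization is only one possible starting point for Baum-Welch re-estimation, which is not shown.

```python
import random


def random_stochastic_rows(n_rows, n_cols, rng):
    """Matrix whose rows are positive and sum to 1."""
    rows = []
    for _ in range(n_rows):
        raw = [rng.random() + 1e-3 for _ in range(n_cols)]
        total = sum(raw)
        rows.append([value / total for value in raw])
    return rows


N_STATES = 20    # number of hidden states reported in the text
N_SYMBOLS = 16   # codebook size reported in the text

rng = random.Random(0)
transition = random_stochastic_rows(N_STATES, N_STATES, rng)  # 20 x 20 = 400 entries
emission = random_stochastic_rows(N_STATES, N_SYMBOLS, rng)   # 20 x 16 entries
initial = random_stochastic_rows(1, N_STATES, rng)[0]         # length-20 vector

n_transition_entries = sum(len(row) for row in transition)
```

Each row of `transition` holds the probabilities of moving from one state to every state, which is why the rows, and not the columns, sum to one.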

In the testing phase of the classifier, GTCC features are extracted from the unknown target signal, and normalized likelihood emission probabilities are estimated. From the maximum likelihood values obtained, the best match among the trained HMM target models is identified, and the corresponding target label is assigned. The iterative likelihood values estimated during the optimization of a typical target are depicted in Figure 9.

Figure 9: HMM likelihood performance.
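The decision rule of the testing phase can be sketched as a maximum-likelihood choice among trained models, with the likelihood of a quantized observation sequence computed by the scaled forward algorithm. The two toy models below (2 states, 2 symbols) and their labels are hypothetical; the exact likelihood normalization used in the paper is not reproduced here.

```python
import math


def log_likelihood(observations, initial, transition, emission):
    """Scaled forward algorithm: log P(observations | model) for a discrete HMM."""
    n_states = len(initial)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n_states))
                * emission[j][observations[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)          # rescale at every step to avoid underflow
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(observations, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(observations, *models[label]))


# Hypothetical models given as (initial, transition, emission) tuples.
models = {
    "ship": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "dolphin": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
best = classify([0, 0, 1, 0, 0, 0], models)
```

The sketch assumes that every observation has nonzero probability under each model; otherwise the scaling factor becomes zero and the logarithm is undefined.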

The performance of the proposed GTCC HMM classifier has been evaluated with target records of 10 classes comprising 4550 samples, that is, 455 frames of each class from the recordings. Because of the limited data available, the samples are considered at frame level, with a frame duration of 130 ms for each target sample. In order to obtain statistically relevant results, the proposed GTCC-based HMM classifier system has been validated with a 10-fold cross-validation approach. The classifier was trained with a training set of 3640 samples of the 10 target classes, which corresponds to 80% of the target records. In each of the 10 folds, 10% (364 samples) of the training record was held out for validation, and validation and training were carried out on disjoint data sets to avoid overfitting. The averaged validation efficiency obtained for the system is 95.1%. The validation performance during the 10-fold cross-validation is depicted in Figure 10. The test phase utilizes the remaining 20% (910) of the samples, augmented at different SNR levels to represent different sea states. The SNR levels used for augmentation range from 0 dB to 20 dB (0 dB, 5 dB, 10 dB, 15 dB, and 20 dB). The success rates obtained for the HMM, ANN, and Euclidean classifiers at different SNRs are given in Table 2.

Figure 10: K-fold cross-validation performance.
Table 2: Classifiers performance comparison.
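The sample counts quoted for the train/test split and the 10-fold cross-validation can be checked with a short sketch. The contiguous, unshuffled folds are an assumption made for brevity; in practice the samples would normally be shuffled, possibly per class, before splitting.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs with disjoint index lists."""
    fold_size = n_samples // n_folds
    indices = list(range(n_samples))
    for fold in range(n_folds):
        start = fold * fold_size
        stop = start + fold_size if fold < n_folds - 1 else n_samples
        val_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, val_idx


TOTAL = 10 * 455             # 10 classes x 455 frames = 4550 samples
N_TRAIN = TOTAL * 8 // 10    # 80% of the records -> 3640 samples
N_TEST = TOTAL - N_TRAIN     # remaining 20% -> 910 samples

folds = list(kfold_indices(N_TRAIN, 10))
```

Each fold holds out 364 of the 3640 training samples and trains on the other 3276.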

The averaged performance of the system at different SNRs under various test conditions is detailed in Table 3. The success rate of the proposed GTCC-based HMM classifier is compared with those of the ANN and Euclidean distance classifiers under ambient noise-free conditions. The ANN classifier is a multilayer perceptron feedforward network with a sigmoid activation function, trained with the back-propagation algorithm. It consists of one input layer, one hidden layer with 20 neurons, and one output layer. The performance of the proposed classifier is encouraging, with a success rate of 93%, whereas the ANN and Euclidean distance classifiers achieve 89% and 86%, respectively, under ambient noise-free conditions.

Table 3: Classifiers performance comparison.
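The Euclidean distance baseline is not specified in detail in this section. One common reading, assumed here, is a minimum-distance classifier that assigns a feature vector to the class whose mean feature vector is nearest. The class labels and two-dimensional vectors below are hypothetical.

```python
import math


def class_means(features_by_class):
    """Mean feature vector of each class."""
    return {label: [sum(col) / len(vectors) for col in zip(*vectors)]
            for label, vectors in features_by_class.items()}


def euclidean_classify(vector, means):
    """Label of the class mean closest to the vector in Euclidean distance."""
    return min(means, key=lambda label: math.dist(vector, means[label]))


# Hypothetical training features for two classes.
training = {
    "ship": [[1.0, 1.0], [1.2, 0.8]],
    "dolphin": [[5.0, 5.0], [4.8, 5.2]],
}
means = class_means(training)
predicted = euclidean_classify([1.0, 1.2], means)
```

Unlike the HMM, this rule ignores the temporal ordering of frames, which is one plausible reason for its lower success rates in the comparison.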

The performance of the system trained with targets obscured by self-noise, with and without spectral subtraction compensation, is presented in Table 3. The performance of all the systems under consideration was found to be drastically reduced by self-noise. The success rates of the HMM, ANN, and Euclidean distance classifiers under self-noise conditions are 47%, 41%, and 38%, respectively, when trained without knowledge of the self-noise, and improve to 74%, 67%, and 63%, respectively, when trained with self-noise-affected signals. Although this yields better results, training the classifiers with self-noise-affected signals may not be a reliable method where self-noise prevails, as the success rates vary with the noise levels on different occasions. Hence the self-noise was removed by spectral subtraction before training, and the success rates obtained for the HMM, ANN, and Euclidean classifiers are 85%, 79%, and 68%, respectively. The proposed HMM classifier yields a better success rate than the ANN and Euclidean distance classifiers.

The target recognition capability of all the classifiers, trained with nonfaded signals, was also found to be drastically reduced when operated in Rayleigh fading channels without channel compensation. The success rates obtained are 49%, 44%, and 38%, respectively, as presented in Table 3. The success rates increase to 83%, 78%, and 75% for the HMM, ANN, and Euclidean distance classifiers, respectively, when the classifiers are trained with the faded signals. With the Rayleigh fading compensation, the success rates obtained for the HMM, ANN, and Euclidean classifiers are 89%, 84%, and 81%, respectively.

The best state sequence obtained for ship 1 affected by self-noise, Rayleigh fading, and signal alone, using the Baum-Welch algorithm for a 20-state HMM, is depicted in Figure 11. The variations demonstrated in the HMM state sequence by the self-noise and fading effects induce ambiguities in the decision making of the classifier, thereby affecting the performance. As detailed in Table 3, with the AR Rayleigh fading compensation, it is observed that the success rate of all the classifiers has improved significantly. With the AR fading compensation, the proposed classifier has been found to perform well compared to other classifiers.

Figure 11: HMM 1 (ship 1, signal alone), HMM 2 (ship 1, signal obscured by Rayleigh fading), and HMM 3 (ship 1, signal obscured by self-noise).

The confusion matrices of the ANN and HMM classifiers obtained for test data augmented at the 10 dB noise level are depicted in Figures 12 and 13, respectively. The test data consist of 910 samples of 10 targets, that is, 91 samples of each target. From the confusion matrices of both ANN and HMM, it has been observed that targets and were highly misclassified compared to other targets. For the ANN classifier, 16% of and 13% of have been misclassified as and , respectively; whereas in HMM classifier, 8% of and 10% of have been misclassified as and , respectively. is a hopper dredger and is a grab hopper dredger, where both targets share similar attributes. The HMM classifier is observed to perform better than the ANN classifier in this case. The decreased performance obtained for these similar targets is due to the ambiguity that arises in prediction at increased noise levels.

Figure 12: Confusion matrix—ANN.
Figure 13: Confusion matrix—HMM.
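The per-target misclassification percentages quoted above are row-normalized entries of a confusion matrix. A minimal sketch of that computation follows; the three-class label lists are hypothetical and far smaller than the 910-sample test set.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def row_percentages(matrix):
    """Express each row as percentages of that class's sample count."""
    return [[100.0 * count / sum(row) if sum(row) else 0.0 for count in row]
            for row in matrix]


# Hypothetical labels: classes 0 and 1 are similar targets that get confused.
true_labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
predicted_labels = [0, 0, 0, 1, 1, 1, 0, 1, 2, 2]

matrix = confusion_matrix(true_labels, predicted_labels, n_classes=3)
percentages = row_percentages(matrix)
```

The off-diagonal entries of `percentages` correspond to the misclassification rates between pairs of targets, and the diagonal entries to the per-target success rates.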

The success rate of the HMM underwater target classifier with other feature extraction schemes, namely LPC, LPCC, MFCC, and NMF, has been compared with that of GTCC features and is presented in Table 4. The feature coefficients were extracted from target data waveforms covering the frequency range [50, 6400] Hz, using the respective feature extraction schemes. Both the training and validation data have been used to estimate the parameters of the LPC, LPCC, MFCC, and NMF configurations. In all the feature extraction schemes, the data are sampled at 13.2 kHz. The numbers of feature coefficients selected for LPC, LPCC, MFCC, NMF, and GTCC are 30, 27, 26, 31, and 25, respectively, based on the performance obtained during various trials using a forward feature selection algorithm. The success rates obtained for each feature extraction scheme during the trials are depicted in Figure 14. The GTCC method gives the highest success rate, 89%, with the AR Rayleigh fading compensation.

Table 4: Features-performance comparison.
Figure 14: Performance at different feature coefficients.

Under similar conditions, the success rates of the Euclidean distance and ANN classifiers were also estimated and are tabulated in Table 4. The proposed GTCC-based HMM classifier with the AR Rayleigh fading compensation achieves a better success rate than the Euclidean and ANN classifiers with the other feature extraction schemes, especially in the fading conditions of a typical underwater channel. The computational time taken for testing an instance is 18.05 ms for the HMM classifier and 20.82 ms for the ANN classifier. The computation time is system dependent; the computational platform used in this work was an Intel® Xeon® CPU at 3.07 GHz with 8 GB of memory. The proposed system can be implemented efficiently for real-time underwater signals, which supports its applicability in underwater monitoring, detection, and classification missions.

5. Conclusion

In this work, an underwater target classifier based on GTCC and an HMM with 20 states has been described and evaluated on real field data. The implementation scheme involves GTCC feature extraction followed by training, validation, and testing of the HMM classifier. Classifier performance under ideal, self-noise, and Rayleigh fading conditions, with and without compensation, has been investigated. The proposed classifier with the fading compensation yielded an encouraging success rate of 89%, higher than those of the ANN and Euclidean distance classifiers. The robustness of GTCC has been analyzed against that of LPC, LPCC, MFCC, and NMF features. GTCC features are found to perform well in combination with HMM and ANN, with success rates of 89% and 84%, respectively. The improved classification capability and robustness of the proposed system for real-time underwater signals in fading channel conditions make the system suitable for underwater naval applications such as monitoring, target detection, and classification. Modified GTCC features and further refinement of the HMM classifier can improve the performance of the prototype.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

The authors gratefully acknowledge the Naval Research Board, New Delhi, India, for the financial assistance provided, and the Centre for Ocean Electronics (CUCENTOL) and the Department of Electronics, Cochin University of Science and Technology, Kerala, India, for extending all the facilities, including the underwater test facility (UTF), during the research. The authors are grateful to the Ministry of Earth Sciences (MoES), New Delhi, India, and the Centre for Marine Living Resources and Ecology (CMLRE), Cochin, India, for the Research Vessel FORV Sagar Sampada during the field data collection expedition, and to the Indian Institute of Technology (IIT) Delhi for the research collaboration.


  1. L. A. Pflug, P. M. Jackson, J. W. Ioup, and G. E. Ioup, “Moment analysis of ambient noise dominated by local shipping,” in Proceedings of 8th Workshop on Statistical Signal and Array Processing, pp. 271–274, Corfu, Greece, Greece, 1996. View at Publisher · View at Google Scholar
  2. X. Valero and F. Alias, “Gammatone cepstral coefficients: biologically inspired features for non-speech audio classification,” IEEE Transactions on Multimedia, vol. 14, no. 6, pp. 1684–1689, 2012. View at Publisher · View at Google Scholar · View at Scopus
  3. Z. Lian, K. Xu, J. Wan, and G. Li, “Underwater acoustic target classification based on modified GFCC features,” in 2017 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), pp. 258–262, Chongqing, China, 2017. View at Publisher · View at Google Scholar · View at Scopus
  4. T. Binesh, P. R. S. Pillai, and M. H. Supriya, “An efficient HMM underwater signal classifier with enhanced fading channel performance,” Journal of Circuits, Systems and Computers, vol. 23, no. 09, article 1450121, 2014. View at Publisher · View at Google Scholar · View at Scopus
  5. L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989. View at Publisher · View at Google Scholar · View at Scopus
  6. K. E. Baddour and N. C. Beaulieu, “Autoregressive modeling for fading channel simulation,” IEEE Transactions on Wireless Communications, vol. 4, no. 4, pp. 1650–1662, 2005. View at Publisher · View at Google Scholar · View at Scopus
  7. G. M. Wenz, “Review of underwater acoustics research: noise,” The Journal of the Acoustical Society of America, vol. 51, no. 3B, pp. 1010–1024, 1972. View at Publisher · View at Google Scholar · View at Scopus
  8. T. S. Pieng, K. T. Beng, P. Venugopalan, M. A. Chitre, and J. R. Potter, “Development of a shallow water ambient noise database,” in Proceedings of the 2004 International Symposium on Underwater Technology (IEEE Cat. No.04EX869), pp. 169–173, Taipei, Taiwan, 2004. View at Publisher · View at Google Scholar
  9. S. Kamal, M. H. Supriya, and P. R. S. Pillai, “Mitigating ambient noise in underwater acoustic receivers using independent component analysis,” in 2011 International Symposium on Ocean Electronics (SYMPOL), Kochi, India, 2011. View at Publisher · View at Google Scholar · View at Scopus
  10. L. M. Gray, “Source level model for propeller blade rate radiation for the world’s merchant fleet,” The Journal of the Acoustical Society of America, vol. 67, no. 2, pp. 516–522, 1980. View at Publisher · View at Google Scholar · View at Scopus
  11. P. T. Arveson and D. J. Vendittis, “Radiated noise characteristics of a modern cargo ship,” The Journal of the Acoustical Society of America, vol. 107, no. 1, pp. 118–129, 2000. View at Publisher · View at Google Scholar · View at Scopus
  12. M. Bouvet and S. C. Schwartz, “Underwater noises: statistical modelling, detection and normalization,” The Journal of the Acoustical Society of America, vol. 83, no. 3, pp. 1023–1033, 1988. View at Publisher · View at Google Scholar · View at Scopus
  13. M. Supriya, S. Kamal, and P. R. S. Pillai, “Alleviation of uniform ambient noise in underwater acoustic communication receivers using spectral subtraction,” in Société Française d'Acoustique, Nantes, France, 2012.
  14. K.-P. Pun, W. Han, C.-F. Chan, and C.-S. Choy, “An efficient MFCC extraction method in speech recognition,” in Proceedings 2006 IEEE International Symposium on Circuits and Systems, pp. 145–148, Island of Kos, Greece, 2006. View at Publisher · View at Google Scholar
  15. C. J. Van der Merwe and J. A.d. Preez, “Calculation of LPC-based cepstrum coefficients using mel-scale frequency warping,” in Proceedings of South African Symposium on Communications and Signal Processing, pp. 17–21, Pretoria, South Africa, 1991. View at Publisher · View at Google Scholar
  16. S. Imai, “Cepstral analysis synthesis on the mel frequency scale,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, p. 96, Boston, Massachusetts, USA, 1983. View at Publisher · View at Google Scholar
  17. S. Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 254–272, 1981. View at Publisher · View at Google Scholar · View at Scopus
  18. A. Eronen, “Comparison of features for musical instrument recognition,” in Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No. 01TH8575), pp. 21–24, New Platz, NY, USA, 2001. View at Publisher · View at Google Scholar
  19. T. R. J. Kumari and H. S. Jayanna, “Comparison of LPCC and MFCC features and GMM and GMM-UBM modeling for limited data speaker verification,” in 2014 IEEE International Conference on Computational Intelligence and Computing Research, pp. 95–103, Coimbatore, India, 2015. View at Publisher · View at Google Scholar · View at Scopus
  20. M. R. Azimi-Sadjadi, M. Robinson, A. A. Jamshidi, and G. J. Dobeck, “A biologically inspired adaptive underwater target classification using a multi-aspect decision feedback unit,” in Proceedings of the MTS/IEEE International Conference on Oceans, vol. 1, pp. 38–45, Biloxi, MI, USA, 2002. View at Publisher · View at Google Scholar
  21. K. Mohankumar, M. H. Supriya, and P. R. Saseendran Pillai, “Bispectral gammatone cepstral coefficient based neural network classifier,” in 2015 IEEE Underwater Technology (UT), Chennai, India, 2015. View at Publisher · View at Google Scholar · View at Scopus
  22. M. Aiello, A. Cataliotti, and S. Nuccio, “A comparison of spectrum estimation techniques for nonstationary signals in induction motor drive measurements,” IEEE Transactions on Instrumentation and Measurement, vol. 54, no. 6, pp. 2264–2271, 2005. View at Publisher · View at Google Scholar · View at Scopus
  23. R. E. Uhrig, “Introduction to artificial neural networks,” in Proceedings of the Electronic Technology Directions to the Year 2000, pp. 36–62, Adelaide, SA, Australia, 1995. View at Publisher · View at Google Scholar
  24. O. A. Allim and H. F. Hashem, “Automatic recognition of the sonar signals using neural network,” in Proceedings of the Fifteenth National Radio Science Conference (Cat. No. 98EX109), pp. 1–8, Cairo, Egypt, 1998. View at Publisher · View at Google Scholar
  25. A. Eapen, “Neural network for underwater target detection,” in Proceedings of the IEEE Conference on Neural Networks for Ocean Engineering, pp. 91–98, Washington, DC, USA, 1991. View at Publisher · View at Google Scholar
  26. A. K. Patel, W. A. Wright, and P. R. Collins, “Target classification using neural and classical techniques,” in Proceedings of the Third International Conference on Artificial Neural Networks, pp. 238–242, Brighton, UK, 1993.
  27. K. Mohankumar, M. H. Supriya, and P. R. S. Pillai, “Implementation of a neural network based bicepstral classifier for marine noise sources,” in 2011 International Symposium on Ocean Electronics (SYMPOL), pp. 157–162, Kochi, India, 2011. View at Publisher · View at Google Scholar · View at Scopus
  28. M. Jahangir, K. M. Ponting, and J. W. O'Loghlen, “Robust Doppler classification technique based on hidden Markov models,” IEEE Proceedings-Radar, Sonar and Navigation, vol. 150, no. 1, pp. 33–36, 2003. View at Publisher · View at Google Scholar · View at Scopus
  29. P. R. Runkle, P. K. Bharadwaj, L. Couchman, and L. Carin, “Hidden Markov models for multiaspect target classification,” IEEE Transactions on Signal Processing, vol. 47, no. 7, pp. 2035–2040, 1999. View at Publisher · View at Google Scholar · View at Scopus
  30. C. Couvreura, V. Fontaine, P. Gaunard, and C. G. Mubikangieya, “Automatic classification of environmental noise events by hidden Markov model,” in Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 6, pp. 3609–3612, Seattle, WA, USA, 1998. View at Publisher · View at Google Scholar · View at Scopus
  31. T. Binesh, M. H. Supriya, and P. R. S. Pillai, “Discrete sine transform based HMM underwater signal classifier,” in 2011 International Symposium on Ocean Electronics (SYMPOL), pp. 152–156, Kochi, India, 2011. View at Publisher · View at Google Scholar · View at Scopus
  32. M. H. Supriya, S. Kamal, and P. R. S. Pillai, “Reduction of self-noise effects in onboard acoustic receivers of vessels using spectral subtraction,” in Proceedings of the Acoustics Nantes Conference, pp. 3793–3798, Nantes, France, 2012.
  33. S. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979. View at Publisher · View at Google Scholar · View at Scopus
  34. R. Schlüter, I. Bezrukov, H. Wagner, and H. Ney, “Gammatone features and feature combination for large vocabulary speech recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 4, Honolulu, HI, USA, 2007. View at Publisher · View at Google Scholar · View at Scopus
  35. J. G. Proakis, Digital Communication, McGraw-Hill, 3rd edition, 1983.
  36. A. A. Giordano and F. M. Hsu, Least Square Estimation with Applications to Digital Signal Processing, Wiley, New York, NY, USA, 1985.
  37. S. S. Haykin, Adaptive Filter Theory, Englewood Cliffs, NJ, USA, Prentice-Hall, 2nd edition, 1991.
  38. I. W. Selesnick and C. S. Burrus, “Generalized digital Butterworth filter design,” in 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, vol. 3, pp. 1367–1370, Atlanta, GA, USA, 1996. View at Publisher · View at Google Scholar
  39. J. Bilmes, What HMMs Can Do, Technical report, University of Washington, 2002.
  40. R. Urick, Principles of Underwater Sound, McGraw-Hill, USA, 1975.