Mathematical Problems in Engineering

Volume 2015, Article ID 439264, 13 pages

http://dx.doi.org/10.1155/2015/439264

## A New Method of Blind Source Separation Using Single-Channel ICA Based on Higher-Order Statistics

^{1}Department of Information Engineering, University of Electronic Science and Technology of China, Chengdu, China

^{2}College of Urban Railway Transportation, Shanghai University of Engineering and Science, Shanghai, China

Received 1 April 2015; Accepted 22 July 2015

Academic Editor: Carla Roque

Copyright © 2015 Guangkuo Lu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Existing methods based on independent component analysis (ICA) give little practical guidance on separating single-channel real-world data, which in many fields are nonlinear, nonstationary, and even chaotic. To address this problem, this paper proposes a three-step method. In the first step, the measured signal, which is assumed to be a piecewise higher-order stationary time series, is divided into a series of higher-order stationary segments by a modified segmentation algorithm. In the second step, the state space is reconstructed and the single-channel signal is transformed into a pseudo multiple-input multiple-output (MIMO) model using a nonlinear analysis method based on higher-order statistics (HOS). In the last step, ICA is performed on the pseudo-MIMO data to decompose the single-channel recording into its underlying independent components (ICs), and the ICs of interest are then extracted. The effectiveness of the higher-order single-channel ICA (SCICA) method is validated in experiments with measured data. The proposed method is also shown, through explicit formulae and simulations, to be more robust under different SNRs and/or embedding dimensions.

#### 1. Introduction

As one of the most attractive solutions to the blind source separation (BSS) problem, independent component analysis (ICA) has a strong practical background and wide applications in multiway data analysis, including biomedicine [1], image processing [2], telecommunications [3], geophysical research [4, 5], and the physics of musical instruments [6, 7], because it combines information theory, optimization theory, probability, matrix theory, and mathematical statistics. Indeed, several recent papers have successfully applied standard linear ICA for instantaneous mixtures to natural phenomena such as explosion-quakes and tremor at the Stromboli and Erebus volcanoes [8, 9] and to the acoustical and mechanical vibrational field in an organ pipe [10]. In particular, segmentation combined with an ICA approach has already been proposed, in an intuitive way, to obtain information on very-long-period waves from water-level oscillations [11, 12]; there, ICA, the intertime occurrence, and the reconstruction of asymptotic dynamics are adopted after a preanalysis in the frequency domain. ICA thus appears more appropriate for investigating nonlinear systems than analyses based on the Fourier transform, even though several tidal behaviours have been identified by frequency-domain methods.

Generally, the number of sensors must be no less than the number of sources for BSS to work. In real cases, however, one often has just a single measurement of a specific physical variable, from which information on the underlying source mechanism has to be derived. The task faced by researchers in this case, namely extracting characteristics from a single experimental series, is both important and difficult because of the lack of prior information. Single-channel BSS (SCBSS) methods have been proposed to extract the independent features using only one transducer. These methods employ ICA [13] to find the independent features of interest from signals decomposed by oversampling [14], principal component analysis (PCA) [15], short-time Fourier transform (STFT) [16], wavelet transform (WT) [17], empirical mode decomposition (EMD) [18, 19], and so forth. The SCBSS methods based on oversampling and PCA can separate the single-channel signal only when the source signals in the linear mixture are stationary and independent. The methods based on STFT and WT can handle nonstationarity, but they cannot separate a nonlinear time series of the kind generated by most natural [20, 21] and artificial systems in fields such as wireless communication, radar, and sonar engineering. The SCBSS method based on EMD can be applied directly to single-channel signals that are nonlinear, nonstationary, and even chaotic, but it breaks down if any source signal is not an intrinsic mode function (IMF). In particular, if the mixture contains spike pulses or the source signals have different times of arrival (TOA), which introduces spurious extrema into the mixture, the EMD algorithm suffers from IMF confusion, and a number of phantom sources can appear in the decomposed signals. To solve these problems, a different approach is needed.

In a dynamical embedding framework, the measured data can be assumed to be generated by the nonlinear interaction of just a few degrees of freedom, with additive noise, which suggests the existence of an unobservable deterministic generator of the observed data. In this case, the reconstructed phase space (RPS) can be used to uncover as much information as possible about the underlying generators based only on the measured data [22], and the ICA algorithm can then be performed on the embedding matrix to extract its underlying ICs, as in the SCICA method proposed by James and Lowe [23, 24]. However, the SCICA method can successfully separate the measured signal only when the time series is nonlinear and stationary. Moreover, the key step of the SCICA algorithm is to convert the one-dimensional time series into an equivalent multidimensional one through the RPS method. To achieve a better result, a larger embedding dimension should be taken, which can greatly increase the computational complexity. To overcome this shortcoming, Ma [25] developed a method in which the stationary segments are first obtained by the Bernaola-Galván (BG) segmentation algorithm [26], the embedding dimension is then reduced by singular spectrum analysis (SSA) [27], and the ICs are finally generated by ICA. Ma’s method reduces the computational complexity relative to SCICA while achieving better performance. Unfortunately, although a suitable window length, which is crucial for the resolution of SSA, is computed in Ma’s method by means of multiple autocorrelation, some subjective factors are introduced into the computation.
Essentially, SSA is a linear method based on the covariance matrix, which reflects the linear relationship among the source signals and cannot reflect their intrinsic nonlinear relationship, although SSA-based methods have been successfully applied to signal processing for nonlinear dynamical systems. In particular, the eigenvalues of the covariance matrix cannot be used to select a series of features for reconstructing the original time series if the signal-to-noise ratio (SNR) is too low or the embedding dimension is not correctly selected. Moreover, the multistage SSA algorithm [25] proposed to extract the feature signal under strong noise greatly increases the computational complexity.

To address the aforementioned problems, a modified method based on HOS is developed in this paper. Section 2 introduces the signal model and the problem to be solved. Section 3 presents the HOS-based SCICA method, and Section 4 reports simulations that verify its effectiveness. Finally, Section 5 concludes the paper.

*Notations.* Hereinafter, bold uppercase letters denote matrices; bold lowercase letters stand for column vectors and lowercase letters represent scalars. $(\cdot)^{T}$, $|\cdot|$, and $\|\cdot\|_{2}$ denote transpose, absolute value, and Frobenius-2 norm, respectively. $E[\cdot]$ is the expectation operator. $[\mathbf{A}]_{ij}$ is the $(i,j)$th entry of $\mathbf{A}$. $*$ denotes convolution. $\mathbb{R}$ denotes the real number domain. $x$ and $\hat{x}$ denote the true value and the estimate of variable $x$, respectively.

#### 2. Problem Statement

##### 2.1. Data Model

Generally, the observed single-channel signal $x(t)$ can be modeled as a single-channel instantaneous linear mixture (SCILM) of $N$ unknown independent signals $s_i(t)$:

$$x(t) = \sum_{i=1}^{N} a_i s_i(t) + n(t),$$

where $a_i$ is the weight of the $i$th source signal and $n(t)$ is zero-mean additive white Gaussian noise of unknown covariance. ICA cannot be applied directly when only one sensor is available. In this scenario, the key is to convert the single time series into a multidimensional time series before ICA is used to separate the preprocessed signals. Under different additional assumptions, the single-channel data can be reconstructed into different pseudo-MIMO models by different decomposition methods, such as PCA, STFT, WT, SSA, and EMD.
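As a minimal sketch of this data model, the snippet below synthesizes a single-channel mixture of the form x(t) = Σ a_i s_i(t) + n(t). The function name `make_scilm`, the two example sources, the weights, and the noise level are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def make_scilm(sources, weights, noise_std, rng):
    """Return x(t) = sum_i a_i * s_i(t) + n(t) for sources shaped (N, T).

    Hypothetical helper for illustration only.
    """
    sources = np.asarray(sources, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Zero-mean white Gaussian noise, one sample per time step.
    noise = rng.normal(0.0, noise_std, size=sources.shape[1])
    return weights @ sources + noise

rng = np.random.default_rng(0)
t = np.arange(1000) / 1000.0
# Two assumed sources: a sinusoid and a square wave.
s = np.vstack([np.sin(2 * np.pi * 5 * t),
               np.sign(np.sin(2 * np.pi * 13 * t))])
x = make_scilm(s, weights=[1.0, 0.5], noise_std=0.1, rng=rng)
```

Only `x` would be available to a separation algorithm; `s` and the weights play the role of the unknowns.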

##### 2.2. Single-Channel ICA

When the actual data is treated as a nonlinear time series with additive noise, generated by the nonlinear interaction of just a few degrees of freedom, the SCICA algorithm can be used to solve the SCBSS problem. RPS is the first and foremost step when dynamical systems theory is used to analyze a nonlinear time series. In [22], Takens shows that the map $\Phi: M \to \mathbb{R}^{2d+1}$ defined by $\Phi(\mathbf{y}) = \left(h(\mathbf{y}), h(\varphi(\mathbf{y})), \ldots, h(\varphi^{2d}(\mathbf{y}))\right)$ is an embedding, where $M$ is the $d$-dimensional state space, $\varphi: M \to M$ is a twice continuously differentiable diffeomorphism that describes the dynamics of the system, and $h: M \to \mathbb{R}$ is a twice continuously differentiable function representing the observation of a single state variable. Generally, the embedding dimension must be large enough to capture the necessary information. Then, for a nonlinear time series $x(t)$, the state of the unobservable system at time $t$ is given by

$$\mathbf{x}(t) = \left[x(t), x(t + \tau T_s), \ldots, x(t + (m-1)\tau T_s)\right]^{T},$$

where $\tau$ is the lag, $m$ is the embedding dimension, and $T_s$ is the sampling interval.
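The delay-coordinate construction can be sketched as follows. This is an illustrative helper, not code from the paper: the name `delay_embed`, the forward-delay convention, and the choice to express the lag in samples are assumptions.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Build an (m, K) embedding matrix from a 1-D series.

    Row k holds the series delayed by k*tau samples, so each column is
    one delay vector [x[n], x[n + tau], ..., x[n + (m - 1) * tau]].
    """
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this (m, tau)")
    return np.stack([x[k * tau : k * tau + n_vectors] for k in range(m)])

# Toy series 0..9 with embedding dimension 3 and lag 2.
X = delay_embed(np.arange(10), m=3, tau=2)
```

Each row of `X` can then be treated as one channel of the pseudo multichannel data.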

Any approach to state space reconstruction uses the information in delay coordinates as a starting point. Takens’ theorem allows the unknown dynamical system that generates the measured time series to be reconstructed by building a new state space from successive observations of the time series. The RPS of the nonlinear time series is essentially the projection of the strange attractor onto the axes of the space spanned by the delay vectors. Therefore, each time series constructed from a delay coordinate can be regarded as a mixture of source signals. As shown in [23], the RPS-based method can be used to convert the single-channel data into a multidimensional time series. ICA can then be used to decompose the embedding matrix into ICs and to extract the features.
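A rough sketch of applying ICA to an embedding matrix is given below. It uses a small NumPy implementation of symmetric FastICA with a tanh nonlinearity, which is a choice made here for illustration; the text does not state which ICA algorithm is used, and the helper names, test signal, embedding dimension, and number of retained components are all assumptions.

```python
import numpy as np

def whiten(X, n_components):
    """Center X (channels x samples) and whiten its leading components."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    K = eigvecs[:, order].T / np.sqrt(eigvals[order])[:, None]
    return K @ Xc

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which has orthonormal rows."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on whitened data Z."""
    n, T = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.normal(size=(n, n)))
    for _ in range(n_iter):
        Y = W @ Z
        g = np.tanh(Y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] W, then re-orthogonalize.
        W_new = sym_decorrelate(g @ Z.T / T - g_prime.mean(axis=1)[:, None] * W)
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z

rng = np.random.default_rng(0)
t = np.arange(2000)
x = (np.sin(2 * np.pi * t / 50)
     + 0.5 * np.sign(np.sin(2 * np.pi * t / 17))
     + 0.1 * rng.normal(size=t.size))
m = 8  # assumed embedding dimension, unit lag
X = np.stack([x[k : k + x.size - m + 1] for k in range(m)])
ics = fastica(whiten(X, n_components=3))
```

The rows of `ics` are estimated components with unit variance and zero mutual correlation; how well they match the true sources depends on the conditions discussed in the next subsection.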

##### 2.3. Problem Statement

SCICA can separate a single-channel time series successfully if and only if the following conditions are satisfied [23]:

(1) The measured signal is stationary.

(2) The phase space can be reconstructed perfectly.

(3) Each time series constructed by RPS can be considered as a single-channel instantaneous linear mixture (SCILM) of source signals.

(4) All the independent random processes are bandlimited with disjoint spectral support.

Unfortunately, SCICA algorithms cannot be used directly for source separation or extraction when the signal is nonstationary. The actual signal is expected to have a nontrivial nonstationary structure, with statistical properties such as the mean and the variance varying over time. The problem addressed in this paper is to segment a nonstationary time series, which consists of many segments with different statistical properties, in such a way as to maximize the differences in statistical properties between adjacent segments. In [25], the BG algorithm of [26] is applied to divide the nonstationary data. However, the BG algorithm relies on an important assumption, namely that the variances of two adjacent segments are constant and that nonstationarity is reflected only by the difference between their means, and this assumption does not hold in general. Therefore, higher-order moments are used for nonstationarity detection in this paper.
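To make the mean-based criterion concrete, the sketch below searches for the single cut point that maximizes a Student t-statistic between the left and right segment means, which is the core step of mean-based heuristic segmentation. It is a simplified illustration, not the modified higher-order algorithm of this paper: the significance test and the recursive splitting are omitted, and `best_cut`, `min_len`, and the test signal are assumptions.

```python
import numpy as np

def best_cut(x, min_len=10):
    """Return (index, t) for the cut that maximizes the two-sample t-statistic.

    Uses a pooled-variance estimate, so it is sensitive to a change in the
    mean only, which is exactly the limitation discussed in the text.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    best_i, best_t = None, 0.0
    for i in range(min_len, n - min_len + 1):
        left, right = x[:i], x[i:]
        n1, n2 = left.size, right.size
        pooled = ((n1 - 1) * left.var(ddof=1)
                  + (n2 - 1) * right.var(ddof=1)) / (n1 + n2 - 2)
        s_d = np.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
        if s_d == 0:
            continue  # degenerate zero-variance case is skipped in this sketch
        t_stat = abs(left.mean() - right.mean()) / s_d
        if t_stat > best_t:
            best_i, best_t = i, t_stat
    return best_i, best_t

rng = np.random.default_rng(1)
# Two segments with equal variance and a mean shift at sample 200.
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
cut, t_max = best_cut(x)
```

A series whose segments differ only in variance or in higher-order moments would produce a weak t-statistic here, which motivates replacing the mean with higher-order statistics.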

Takens’ theorem [22] shows that the unknown dynamical system can be reconstructed by building a new state space only when the Euclidean embedding dimension $m$ satisfies $m \ge 2d + 1$ (where $d$ is the dimension of the attractor dynamics). As shown in the proof in [22], the embedding dimension and the time lag can then be selected arbitrarily, and any such choice yields arbitrarily precise states and is as good as any other. However, Takens’ theorem assumes that the recorded data is noise-free and infinitely long, which is rarely true in practice. Actual data is always finite and is contaminated by strong broadband noise, which can obscure the states and degrade the good properties of RPS. Simulation results [28] show that RPS does not work when the embedding dimension is smaller than required, while the computational complexity grows as the embedding dimension increases. Although several methods, such as false nearest neighbors [29], singular value decomposition (SVD) [30], autocorrelation [31], and mutual information [32], can be used to determine the embedding dimension and the time lag of the reconstruction, these methods are mainly empirical. Therefore, the selection of the reconstruction parameters is essential to solving the problem addressed in this paper.

Assuming instantaneous linear mixing of the sources at the sensors, ICA performs a blind separation of statistically independent sources with techniques involving higher-order statistics. However, RPS, which reconstructs the nonlinear time series in the phase space from delay coordinates, is essentially a nonlinear transform and cannot convert single-channel data into multiple instantaneous linear mixtures. Therefore, SSA [27] is used to transform the decomposed signals from the delay coordinates into the original coordinates. In [25], SSA based on an eigenvalue decomposition (EVD) of the so-called lagged-covariance matrix, used to determine the optimal dimension of reconstruction, is applied to decompose the short and noisy time series into a pseudo-MIMO model before BSS is used. However, SSA is essentially a linear method and cannot reflect the structure of nonlinear dependence. Furthermore, SSA is not robust to the reconstruction lag, the embedding dimension, or additive noise. Based on the above analysis, HOS-based methods are employed instead, because higher-order cumulants are robust to Gaussian noise and capture nonlinear properties.
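For comparison, the linear SSA step criticized above can be sketched as an EVD of the lagged-covariance matrix followed by a projection onto the leading eigenvectors. This is a generic illustration of basic SSA under assumed names (`ssa_spectrum`, `ssa_project`) and an assumed window length; it is not the specific procedure of [25].

```python
import numpy as np

def ssa_spectrum(x, m):
    """EVD of the lagged-covariance matrix of a 1-D series with window m."""
    x = np.asarray(x, dtype=float)
    k = x.size - m + 1
    X = np.stack([x[i : i + k] for i in range(m)])  # (m, k) trajectory matrix
    C = X @ X.T / k                                 # lagged covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]               # descending eigenvalues
    return eigvals[order], eigvecs[:, order], X

def ssa_project(X, eigvecs, r):
    """Project the trajectory matrix onto its r leading eigenvectors."""
    U = eigvecs[:, :r]
    return U @ (U.T @ X)

t = np.arange(400)
x = np.sin(2 * np.pi * t / 20)
eigvals, eigvecs, X = ssa_spectrum(x, m=10)
X_r = ssa_project(X, eigvecs, r=2)
```

A noise-free sinusoid occupies exactly two eigen-directions, so two components reproduce the trajectory matrix. With low SNR the eigenvalue spectrum flattens and this rank selection becomes ambiguous, which is the weakness noted in the text.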

#### 3. Source Separation Using HOS-Based Single-Channel ICA

In this section, the actual data is assumed to be a stochastic process , where is deterministic, , and is an independent and identically distributed (IID) process. As shown in [33, 34] any process that satisfies the following assumptions can be referred to as a th-order quasistationary process:(1),(2),(3) is linear, time-invariant, and stable; that is, ,(4) exists and is finite, ,

where () is the time lag. Then, the fourth-order cumulants for a zero-mean, quasistationary (up to the fourth-order) signal is defined aswhere .
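As a hedged illustration, the sketch below estimates a fourth-order cumulant from samples using the standard identity for zero-mean variables, cum(x1, x2, x3, x4) = E[x1 x2 x3 x4] − E[x1 x2]E[x3 x4] − E[x1 x3]E[x2 x4] − E[x1 x4]E[x2 x3], with x1 = x(t) and the other three taken at lags t1, t2, t3. The function name `cum4`, the restriction to non-negative integer lags, and the use of plain sample averages in place of the expectation are assumptions made here.

```python
import numpy as np

def cum4(x, t1, t2, t3):
    """Sample fourth-order cumulant of a series at non-negative lags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # enforce the zero-mean assumption
    n = x.size - max(t1, t2, t3)          # number of usable samples

    def lagged_mean(*lags):
        prod = np.ones(n)
        for lag in lags:
            prod = prod * x[lag : lag + n]
        return prod.mean()

    m4 = lagged_mean(0, t1, t2, t3)
    return (m4
            - lagged_mean(0, t1) * lagged_mean(t2, t3)
            - lagged_mean(0, t2) * lagged_mean(t1, t3)
            - lagged_mean(0, t3) * lagged_mean(t1, t2))

# A +/-1 alternating sequence has E[x^4] = 1 and E[x^2] = 1, so c4(0,0,0) = -2.
binary = np.tile([1.0, -1.0], 500)
c4_binary = cum4(binary, 0, 0, 0)

# For Gaussian noise the fourth-order cumulant is close to zero.
gauss = np.random.default_rng(0).normal(size=100_000)
c4_gauss = cum4(gauss, 0, 0, 0)
```

The near-zero value for Gaussian input is the property that makes fourth-order cumulants attractive under additive Gaussian noise.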

Furthermore, the actual data is considered in this paper to be a nonstationary signal composed of many zero-mean, quasistationary (up to fourth order) segments with different higher-order statistical properties. These differing statistical properties are then used to segment the time series into several subsets by means of the BG algorithm [26], which is based on heuristic segmentation at different scales and is effective in detecting abrupt changes in nonlinear time series.

Consider a zero-mean, quasistationary (up to fourth order) subset of the actual data, generated by a nonlinear dynamical system. Information about the underlying generators is uncovered by an RPS-based method. Using the mean time between peaks (MTBP) as the time window, the reconstruction parameters (the lag $\tau$ and the embedding dimension $m$) are estimated simultaneously, and the nonlinear time series is reconstructed according to Takens’ embedding theorem [22]. Then, decomposition and reconstruction based on HOS are applied to reduce the dimension, weaken the noise, eliminate the nonlinear factors, and transform the phase space based on delay coordinates into multiple instantaneous linear mixtures. Finally, ICA is used to separate the decomposed sequences and extract the information from the short and noisy time series. The modified SCICA strategy is illustrated in Figure 1 and is explained in detail in the following sections.
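The mean time between peaks can be estimated with a few lines of code. The sketch below is an assumption about one simple way to compute it: peaks are taken as strict local maxima of the raw series, with no smoothing or amplitude threshold, and the name `mean_time_between_peaks` is hypothetical.

```python
import numpy as np

def mean_time_between_peaks(x, dt=1.0):
    """Average spacing between strict local maxima, in units of dt."""
    x = np.asarray(x, dtype=float)
    # A sample is a peak if it is greater than both of its neighbours.
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    peaks = np.flatnonzero(is_peak) + 1
    if peaks.size < 2:
        raise ValueError("need at least two peaks")
    return np.diff(peaks).mean() * dt

t = np.arange(200)
mtbp = mean_time_between_peaks(np.sin(2 * np.pi * t / 20))
```

For this sinusoid with a period of 20 samples the estimate equals the period. On noisy data, raw local maxima would be dominated by noise, so some smoothing or a minimum peak height would likely be needed before the estimate is usable as a time window.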