Variable Step-Size Method Based on a Reference Separation System for Source Separation
Traditional variable step-size methods are effective in addressing the choice of step-size in adaptive blind source separation. However, the initial setting of the learning rate remains critical, and convergence is still slow. This paper proposes a novel variable step-size method based on a reference separation system for online blind source separation. The correlation between the estimated source signals and the original source signals increases as the iterations proceed. Therefore, we introduce a reference separation system to approximately estimate this correlation in terms of the mean square error (MSE), which is then used to update the step-size. Computing the MSE over “minibatches” reduces the complexity of the algorithm to some extent. Simulations demonstrate that the proposed method converges faster and achieves better steady-state performance than the fixed step-size method in the noise-free case, and converges faster than classical variable step-size methods in both stationary and nonstationary environments.
1. Introduction
Blind source separation (BSS) aims at extracting latent, unknown source signals from mixtures observed by an array of sensors, without a priori knowledge of the original source signals or the mixing coefficients. In the separating process, nothing can be used except the observation sequences and the assumptions on the statistical characteristics of the sources. This makes BSS a versatile tool in many multisensor systems, such as antenna arrays in acoustics or electromagnetism, chemical sensor arrays, and electrode arrays in electroencephalography.
Several optimization algorithms have been proposed for BSS; they can generally be categorized into batch-based algorithms and adaptive (sequential) algorithms. Batch-based algorithms, such as the fast fixed-point algorithm, operate block-wise and cannot start until a block of data samples has been received. In this paper, we consider adaptive algorithms, which have particular practical advantages owing to their computational simplicity and their potential to track a nonstationary environment.
However, traditional adaptive BSS algorithms, such as the equivariant adaptive separation via independence (EASI) algorithm and the natural gradient algorithm (NGA), usually assume that the step-size is a small positive constant. This leads to an inevitable conflict between learning rate and stability, that is, either slow convergence or a large steady-state error. A simple way to resolve the conflict is to reduce the learning rate as the iterations proceed [7, 8], but this introduces a new problem: if the learning rate becomes too small before the source components are extracted, the separation system will fail to separate the sources properly. To improve both the learning rate and the stability, variable step-size algorithms have been proposed. These algorithms exploit online measurements of the state of the separation system obtained from the outputs and the parameter updates. In [4, 9, 10], variable step-size algorithms have been derived from the gradients of different contrasts for the NGA, EASI, and S-NGA algorithms. Zhang et al. put forward a grading learning algorithm based on measurements of the correlation of the separated signals, in which the learning rate is updated according to the state of separation. Hsieh et al. proposed an effective learning rate adjustment method based on an improved particle swarm optimizer. However, the separating performance of these variable step-size algorithms is usually sensitive to the initial parameter settings. As a result, convergence is still slow, and an improper initial value of the learning rate results in a large steady-state error or even divergence. Ou et al. proposed a variable step-size algorithm based on an auxiliary separation system. The step-size is updated by estimating a pseudo-performance index, on the assumption that the index decays exponentially. Compared with classical variable step-size methods, the separation performance of Ou’s method is less sensitive to the initial settings.
In order to improve the initial convergence and the stability, we consider using a reference separation system, based on the MSE of the instantaneous outputs, to update the step-size. This technique is shown to improve both the convergence speed and the steady-state performance. Moreover, the use of “minibatches” can reduce the overall computational load of the algorithm. The remainder of this paper is organized as follows. In Section 2, the principle of adaptive source separation methods is briefly summarized. Our algorithm is proposed in Section 3. Numerical simulation results and discussion are provided in Section 4. A concise conclusion is given at the end of the paper. In addition, this paper can be regarded as an important complement to Ou’s method.
2. Adaptive Algorithms for BSS
In the noise-free instantaneous case, we assume that $n$ unknown, statistically independent, zero-mean source signals, at most one of which has a Gaussian distribution, are contained in $\mathbf{s}(t) = [s_1(t), \ldots, s_n(t)]^T$ and pass through an unknown mixing system $\mathbf{A}$; the $m$ mixed signals $\mathbf{x}(t) = [x_1(t), \ldots, x_m(t)]^T$ can therefore be modeled as
$$\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t), \tag{1}$$
where $t$ is the time index and $(\cdot)^T$ is the vector transpose operator. To simplify the problem, we further assume that the number of sources matches the number of mixtures, that is, $n = m$, an exactly determined problem.
The blind separation problem is then to recover the original source signals $\mathbf{s}(t)$ from the observations $\mathbf{x}(t)$, which is equivalent to estimating an $n \times n$ separating matrix $\mathbf{W}$ that performs the inverse operation of the mixing process, as subsequently used in the separation model. Figure 1 shows a block diagram of the adaptive BSS model. The output signal vector is then obtained as
$$\mathbf{y}(t) = \mathbf{W}(t)\mathbf{x}(t), \tag{2}$$
where $\mathbf{y}(t)$ is an estimate of $\mathbf{s}(t)$ to within the well-known permutation and scaling ambiguities.
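The mixing and separation models can be illustrated with a small numerical sketch. The 2x2 matrix values and the single source sample below are hypothetical, and the separating matrix is taken as the exact inverse of the mixing matrix only for illustration; in a blind setting the mixing matrix is unknown and the sources are recovered only up to permutation and scaling.

```python
def matvec(M, v):
    """Multiply a matrix M (list of rows) by a vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical 2x2 mixing matrix A (illustrative values, not from the paper).
A = [[1.0, 0.5],
     [0.3, 1.0]]

# Ideal separating matrix W = A^{-1}, computed in closed form for the 2x2 case.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
W = [[ A[1][1] / det, -A[0][1] / det],
     [-A[1][0] / det,  A[0][0] / det]]

s = [0.7, -0.2]      # one sample of the source vector
x = matvec(A, s)     # observed mixture: x = A s
y = matvec(W, x)     # separated output: y = W x, equal to s for this ideal W
```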
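Based on classical contrasts such as the mutual information contrast, the maximum likelihood contrast, and the infomax principle, many adaptive algorithms have been proposed to estimate $\mathbf{W}$. Amari proved that the NGA is the fastest least-mean-square (LMS) type BSS algorithm. The natural gradient BSS algorithms based on the mutual information contrast, the maximum likelihood contrast, and the infomax principle all have the same form:
$$\mathbf{W}(t+1) = \mathbf{W}(t) + \mu\left[\mathbf{I} - \mathbf{f}(\mathbf{y}(t))\mathbf{y}^T(t)\right]\mathbf{W}(t), \tag{3}$$
where $\mu$ is the step-size, $\mathbf{I}$ is the identity matrix, $\mathbf{f}(\mathbf{y}) = [f_1(y_1), \ldots, f_n(y_n)]^T$, and $f_i(\cdot)$, $i = 1, \ldots, n$, are increasing odd functions, usually called activation functions.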
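As a rough illustration of a single natural gradient update, the sketch below assumes the commonly cited form W <- W + mu (I - f(y) y^T) W with a cubic activation. The function name `nga_step`, the step-size value, and the test sample are hypothetical choices made for this example.

```python
def nga_step(W, x, mu, f=lambda u: u ** 3):
    """One natural-gradient update W <- W + mu * (I - f(y) y^T) W for one sample x."""
    n = len(W)
    # Current output y = W x
    y = [sum(W[i][k] * x[k] for k in range(n)) for i in range(n)]
    fy = [f(v) for v in y]
    # H = I - f(y) y^T
    H = [[(1.0 if i == j else 0.0) - fy[i] * y[j] for j in range(n)]
         for i in range(n)]
    # W_new = W + mu * H W
    return [[W[i][j] + mu * sum(H[i][k] * W[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

W0 = [[1.0, 0.0], [0.0, 1.0]]             # identity initialization (assumed)
W1 = nga_step(W0, x=[1.0, 0.0], mu=0.1)   # illustrative step-size
```

For this sample the first output already satisfies f(y_1) y_1 = 1, so only the second diagonal entry of the matrix changes.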
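Based on the fact that the separating matrix can be factorized into the product of an orthogonal matrix and the prewhitening matrix, the EASI algorithm is derived by combining the LMS-type updating formulas of these two matrices and applying some reasonable approximations:
$$\mathbf{W}(t+1) = \mathbf{W}(t) + \mu\left[\mathbf{I} - \mathbf{y}(t)\mathbf{y}^T(t) - \mathbf{f}(\mathbf{y}(t))\mathbf{y}^T(t) + \mathbf{y}(t)\mathbf{f}^T(\mathbf{y}(t))\right]\mathbf{W}(t). \tag{4}$$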
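A single EASI update can be sketched in the same spirit. The code assumes the standard (unnormalized) EASI form W <- W + mu (I - y y^T - f(y) y^T + y f(y)^T) W with a cubic activation; the name `easi_step` and all numerical values are hypothetical.

```python
def easi_step(W, x, mu, f=lambda u: u ** 3):
    """One EASI update W <- W + mu * (I - y y^T - f(y) y^T + y f(y)^T) W."""
    n = len(W)
    y = [sum(W[i][k] * x[k] for k in range(n)) for i in range(n)]
    fy = [f(v) for v in y]
    # Symmetric whitening part (I - y y^T) plus skew-symmetric part (y f^T - f y^T)
    H = [[(1.0 if i == j else 0.0) - y[i] * y[j] - fy[i] * y[j] + y[i] * fy[j]
          for j in range(n)] for i in range(n)]
    return [[W[i][j] + mu * sum(H[i][k] * W[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

W0 = [[1.0, 0.0], [0.0, 1.0]]              # identity initialization (assumed)
W1 = easi_step(W0, x=[1.0, 2.0], mu=0.1)   # illustrative sample and step-size
```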
It has been shown that an algorithm with a variable step-size converges faster than one with a fixed step-size. Yuan et al. derived a gradient variable step-size algorithm for the NGA, which adapts the step-size in the form of
$$\mu(t) = \mu(t-1) - \rho\,\frac{\partial J(t)}{\partial \mu(t-1)}, \tag{5}$$
where $\rho$ is a small constant and $J(t)$ is an instantaneous estimate of the cost function from which the NGA is derived.
It should be noted that the activation functions can be identical when the sources are all sub-Gaussian or all super-Gaussian. The source distributions determine the appropriate activation function: the separation of sub-Gaussian sources usually uses the cubic function, while the hyperbolic tangent function is the proper choice for super-Gaussian sources.
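The choice between the two activation functions can be sketched as follows. Selecting by the sign of the sample excess kurtosis is an illustrative heuristic assumed for this example, not a step described in the paper; the helper names and the two test signals are hypothetical.

```python
import math

def excess_kurtosis(samples):
    """Sample excess kurtosis: negative for sub-Gaussian, positive for super-Gaussian data."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((v - mean) ** 2 for v in samples) / n
    m4 = sum((v - mean) ** 4 for v in samples) / n
    return m4 / (m2 ** 2) - 3.0

def cubic(u):
    return u ** 3

def choose_activation(samples):
    """Return the cubic function for sub-Gaussian data and tanh otherwise."""
    return cubic if excess_kurtosis(samples) < 0 else math.tanh

# Evenly spread values mimic a uniform (sub-Gaussian) source.
uniform_like = [-1.0 + 2.0 * i / 999 for i in range(1000)]
# Mostly zeros with rare large spikes mimic a sparse (super-Gaussian) source.
spiky = [0.0] * 98 + [10.0, -10.0]

f_sub = choose_activation(uniform_like)
f_super = choose_activation(spiky)
```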
3. The Proposed Algorithm
It is well known that, in adaptive BSS, the estimated signals approach the source signals as the iterations proceed, provided that the permutation and scaling ambiguities of the estimated signals can be eliminated. The correlation between the estimated signals and the source signals can be evaluated by the mean-square error, which is defined as
$$\mathrm{MSE}_{ij} = \frac{1}{N}\sum_{t=1}^{N}\left[y_i(t) - s_j(t)\right]^2, \tag{6}$$
where $N$ is the sample size, and $y_i(t)$ as well as $s_j(t)$ is normalized before the evaluation of $\mathrm{MSE}_{ij}$. When the separation system is steady, the mean-square-error matrix MSE, whose $(i,j)$th element is $\mathrm{MSE}_{ij}$, has one, and only one, zero entry in each row and column.
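The structure of the mean-square-error matrix at a perfect separation can be checked with a small sketch. It assumes that each entry is the mean squared difference between one estimated signal and one source, and that normalization means division by the root-mean-square value; the two test sources and the scaling factors are hypothetical, and sign flips are not handled here.

```python
import math

def rms_normalize(sig):
    """Scale a signal to unit root-mean-square value."""
    rms = math.sqrt(sum(v * v for v in sig) / len(sig))
    return [v / rms for v in sig]

def mse_matrix(Y, S):
    """Entry (i, j) is the mean squared difference of normalized Y[i] and S[j]."""
    Yn = [rms_normalize(y) for y in Y]
    Sn = [rms_normalize(s) for s in S]
    N = len(S[0])
    return [[sum((a - b) ** 2 for a, b in zip(y, s)) / N for s in Sn] for y in Yn]

# Two hypothetical sources: a sinusoid and a square wave.
s1 = [math.sin(0.1 * t) for t in range(200)]
s2 = [1.0 if (t // 10) % 2 == 0 else -1.0 for t in range(200)]

# A perfect separation up to permutation and positive scaling.
Y = [[3.0 * v for v in s2], [0.5 * v for v in s1]]
M = mse_matrix(Y, [s1, s2])
```

Here the zero entries of `M` sit at positions (0, 1) and (1, 0), one per row and column, which mirrors the permutation between the outputs and the sources.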
If we could calculate the matrix MSE at each update of the separating matrix, a natural rule for a variable step-size algorithm would be to adjust the step-size adaptively in terms of MSE. However, since the source signals are unknown, the matrix MSE is not accessible in practice.
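In this section, we propose to estimate MSE approximately by introducing a reference separation system $\tilde{\mathbf{W}}$, which follows the same optimization criterion and the same NGA-based updating principle as $\mathbf{W}$, except for the initialization. Hence, we obtain
$$\tilde{\mathbf{y}}(t) = \tilde{\mathbf{W}}(t)\mathbf{x}(t), \tag{7}$$
where $\tilde{\mathbf{y}}(t)$ represents the reference signal. The correlation between $\mathbf{y}(t)$ from the primary separation system and $\tilde{\mathbf{y}}(t)$ from the reference system should increase as the iterations proceed, regardless of the ambiguities. Therefore, at every iteration, we replace the mean-square error in (6) by
$$\mathrm{MSE}_{ij} = \frac{1}{N}\sum_{t=1}^{N}\left[\bar{y}_i(t) - \bar{\tilde{y}}_j(t)\right]^2, \tag{8}$$
where
$$\bar{y}_i(t) = \frac{|y_i(t)|}{\|y_i\|}, \qquad \bar{\tilde{y}}_j(t) = \frac{|\tilde{y}_j(t)|}{\|\tilde{y}_j\|}, \tag{9}$$
in which the norm $\|\cdot\|$ denotes the root-mean-square value of the output vectors, and the operator $|\cdot|$ takes the absolute value of the normalized vector. In this way, the scaling ambiguity can be removed.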
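The effect of the normalization can be seen in a short sketch. It assumes that each output is divided by its root-mean-square value over the block and that absolute values are then compared sample by sample; the function names and test signals are hypothetical, and the exact expression used in the paper may differ.

```python
import math

def normalized_abs(sig):
    """Absolute value of a signal after division by its root-mean-square value."""
    rms = math.sqrt(sum(v * v for v in sig) / len(sig))
    return [abs(v) / rms for v in sig]

def reference_mse(y, y_ref):
    """Mean squared difference between two normalized, rectified signals."""
    a, b = normalized_abs(y), normalized_abs(y_ref)
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

y = [math.sin(0.3 * t) for t in range(50)]         # a primary output over one block
y_same = [-2.0 * v for v in y]                     # same waveform, flipped and rescaled
y_other = [math.cos(0.3 * t) for t in range(50)]   # a different waveform

mse_same = reference_mse(y, y_same)
mse_other = reference_mse(y, y_other)
```

Because of the rectification and scaling, `mse_same` is numerically zero even though the two signals differ in sign and amplitude, while `mse_other` stays clearly positive.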
Online procedures use a single sample at each time instant, whereas an appropriate evaluation of the mean-square error requires a number of samples, as (9) indicates. Therefore, we update the separating matrix once per “minibatch,” that is, a small block of signal samples, while the observation window slides [15, 16]. The online updating equation of the separating matrix is accordingly indexed by the iteration number (or minibatch index), and the step-size parameter is updated by a nonlinear function, which is a widely used rule in adaptive filtering algorithms. The primary separation system follows the same updating rule, in which two positive constants control the shape of the function curve and the initial step-size, respectively. The effects of these two parameters on the performance of the algorithm are investigated in the next section. The correlation function is defined in terms of the mean-square error of each minibatch, a weighting factor, and the number of weighting functions. Thus we introduce exponential weighting of past data, which is appropriate especially when the channel characteristics are time-variant [18, 19].
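The interplay between exponentially weighted minibatch errors and the step-size can be sketched as follows. Both functions are assumptions made for illustration: the weighted average over the last few minibatches and the saturating rule beta (1 - exp(-alpha c)) are hypothetical stand-ins, not the expressions of the paper, and the MSE trace and parameter values are made up.

```python
import math

def weighted_mse(mse_history, lam, L):
    """Exponentially weighted average of the last L minibatch MSE values."""
    recent = mse_history[-L:][::-1]                    # most recent minibatch first
    weights = [lam ** i for i in range(len(recent))]   # older minibatches count less
    return sum(w * m for w, m in zip(weights, recent)) / sum(weights)

def step_size(c, alpha, beta):
    """Hypothetical saturating rule: near beta for large c, near zero for small c."""
    return beta * (1.0 - math.exp(-alpha * c))

alpha, beta = 50.0, 0.06                              # illustrative values only
mse_per_minibatch = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]    # made-up decreasing trace

# Step-size after each minibatch, using only the history available at that point.
mus = [step_size(weighted_mse(mse_per_minibatch[:k + 1], lam=0.9, L=5), alpha, beta)
       for k in range(len(mse_per_minibatch))]
```

With such a rule, one constant sets the ceiling of the step-size and the other sets how quickly it falls once the weighted error becomes small.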
Regarding the computational load, each iteration of the proposed algorithm requires a larger number of products, but the total number of iterations is reduced because the separating matrix is updated only once per minibatch. Therefore, the number of products calculated over a whole separation procedure is of the same order of magnitude as in the other algorithms in [5, 6, 10, 13]. If the sample size of the “minibatches” is large enough, the operation count will be much smaller than that of the others. However, considering the tracking performance, especially in a nonstationary environment, the sample size of the “minibatches” should be chosen moderately.
4. Simulation Results and Discussion
Here, several sets of simulation results are provided to demonstrate the performance of the proposed algorithm. Specifically, the proposed algorithm is compared with fixed step-size algorithms and classical variable step-size algorithms in both stationary and nonstationary environments.
Experiment 1. Comparison between the proposed algorithm and fixed step-size algorithms.
In this experiment, we consider the separation of three zero-mean sub-Gaussian sources in a stationary environment, one of which is a random source signal with a uniform distribution. The mixing matrix is randomly generated from the normal distribution with mean 0 and standard deviation 1, and three receivers are used. The sampling period is set to 0.0001 s.
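To evaluate the performance of the BSS algorithms, we use the cross-talking error as the performance index [5, 20–22]:
$$\mathrm{PI}(\mathbf{G}) = \sum_{i=1}^{n}\left(\sum_{j=1}^{n}\frac{|g_{ij}|}{\max_{k}|g_{ik}|} - 1\right) + \sum_{j=1}^{n}\left(\sum_{i=1}^{n}\frac{|g_{ij}|}{\max_{k}|g_{kj}|} - 1\right),$$
where the matrix $\mathbf{G} = \mathbf{W}\mathbf{A}$, with elements $g_{ij}$, is the combined mixing-separating matrix. As $\mathbf{W}$ converges to $\mathbf{P}\mathbf{D}\mathbf{A}^{-1}$, the combined mixing-separating matrix $\mathbf{G}$ converges to $\mathbf{P}\mathbf{D}$, a generalized permutation matrix, and PI converges to zero.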
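The behavior of the cross-talking error can be checked numerically. The sketch below assumes the widely used unnormalized Amari form of the index (normalized variants also exist in the literature); the function name and both test matrices are hypothetical.

```python
def cross_talking_error(G):
    """Unnormalized cross-talking error of a combined mixing-separating matrix G."""
    absG = [[abs(v) for v in row] for row in G]
    # Row term: how far each row is from having a single dominant entry.
    row_term = sum(sum(row) / max(row) - 1.0 for row in absG)
    # Column term: the same measure applied to each column.
    col_term = sum(sum(col) / max(col) - 1.0 for col in zip(*absG))
    return row_term + col_term

# A generalized permutation matrix (perfect separation up to order and scale).
pi_perfect = cross_talking_error([[0.0, 2.0], [-3.0, 0.0]])

# A matrix with residual cross-talk between the two outputs.
pi_leaky = cross_talking_error([[1.0, 0.5], [0.5, 1.0]])
```

The index is zero for the generalized permutation matrix and grows with the amount of residual mixing.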
In all the algorithms, the same activation function is applied. A fixed step-size is used in the natural gradient algorithm, and a step-size of 0.01 is used in the optimized EASI algorithm. The parameters of the proposed algorithm are set to , . Considering the balance between tracking performance and evaluation accuracy of the mean-square-error matrix MSE, the sample size of the “minibatches” is kept the same in all the experiments. The effects of the two crucial parameters on the performance of the proposed algorithm are investigated in Figure 3. Note that larger values of the two parameters lead, respectively, to a faster initial learning rate and better convergence performance, so the results provide a reference for choosing appropriate values. Hence, we set the two parameters to 0.06 and 104, respectively.
In addition, if the sources include both sub-Gaussian and super-Gaussian signals, the activation functions should not all be the same increasing odd function. The activation functions might be initialized as polynomials or kernel functions with adjustable parameters, so that the optimal vector of activation functions can be estimated adaptively during the iterations. However, further investigation of the activation functions is beyond the scope of this paper.
Figure 4 plots the average PI value obtained from the simulations of the three adaptive algorithms over 500 Monte Carlo trials. The plots show that the proposed algorithm converges fastest while achieving a lower steady-state error than both the NGA and the optimized EASI approaches. The step-size, whose evolution is shown in Figure 5, decreases roughly exponentially with the iterations. We observe that the step-size stays at a constant 0.06 for about the first 100 iterations. This is attributed to the parameter setting, which provides a high initial learning rate while remaining sensitive to the state of separation. As a result, the separating performance is more robust to the setting of the initial learning rate.
Experiment 2. Comparison between the proposed algorithm and variable step-size algorithms.
In this experiment, we first define the function in (11). To allow a fair comparison, the same function in  is used for the proposed algorithm; it involves the operations of taking the diagonal elements and the off-diagonal elements of a matrix, respectively. Two zero-mean sub-Gaussian sources are mixed by a mixing matrix ; that is,
Zero-mean independent white Gaussian noise is added to the mixture at a signal-to-noise ratio of 20 dB. The parameters, including the initial step-size in the LMS-type algorithms, are manually tuned so that each algorithm has nearly the same steady-state performance. The initial step-size for the classical variable step-size algorithms, that is, VS-NGA and VS-S-NGA, is set to 0.004, and 500 Monte Carlo trials are run to obtain the averaged performance. The parameters of Ou’s method in  are set to , . For the proposed algorithm, the parameters are, respectively, , , , and . The parametric settings imply that the proposed algorithm can use a higher learning rate while maintaining appropriate steady-state performance. The average PI values of the approaches are compared in Figure 6. The proposed algorithm requires only approximately 500 samples to converge, whereas the other three algorithms need at least 600 samples. Clearly, the proposed algorithm performs considerably better than the classical variable step-size algorithms in the noisy case.
Figure 7 plots the average PI value of the approaches in a nonstationary environment. The time-varying mixing matrix is generated with the help of a MATLAB built-in function, with the initial value of the time-varying term set to a null matrix. The initial parameter for the classical variable step-size algorithms is the same as in the noisy-case experiment. The parameters of the proposed algorithm are reset to , , , . Likewise, the results are obtained over 500 Monte Carlo runs. The figure shows that the proposed algorithm converges faster than the VS-NGA and VS-S-NGA algorithms in the nonstationary environment.
Finally, we checked the computation time and separation performance of the different separation methods in the noisy case. The Fast-ICA algorithm, a classical batch-based method, is also included for comparison. The sources, mixing matrix, and initial parameters are set as in Experiment 2. The data length is set to 10000 samples, which is enough to achieve convergence. The iteration number of Fast-ICA was set to 100. The results are provided in Table 1. It can be seen that Fast-ICA generally has better separation performance at high signal-to-noise ratio (SNR). However, it requires considerable computation time since it is a batch-based algorithm, and it cannot start until a large number of data samples have been received. In contrast, although the proposed algorithm performs slightly worse than Fast-ICA at high SNR, it behaves better when the noise power is increased (SNR = 0 dB, 5 dB, 10 dB). As an adaptive online algorithm, the proposed algorithm has a particular advantage owing to its computational simplicity and its ability to track noisy and nonstationary environments. The results also show that the proposed algorithm performs better than the optimized EASI, NGA, and VS-NGA in terms of average PI. Compared with VS-S-NGA and Ou’s method, it obtains similar separation performance with a lower computational load. This supports the complexity analysis in Section 3; that is, the use of “minibatches” is likely to reduce the computational cost.
5. Conclusion
In this paper, we propose a new variable step-size algorithm for blind source separation. A reference separation system is used to obtain the mean-square-error matrix, which serves as the metric for updating the step-size. Fixed step-size algorithms, classical variable step-size algorithms, and the proposed algorithm have been compared in both stationary and nonstationary environments, and their performance has been analyzed in terms of the cross-talking error. The results show that the proposed scheme improves the learning rate and stability over the fixed step-size algorithms and converges faster than classical variable step-size algorithms.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors would like to thank the anonymous referees for their constructive comments, which were valuable for improving the paper. This work was supported by the National Natural Science Foundation of China under Grant 61172061 and the Natural Science Foundation of Jiangsu Province in China under Grant BK2011117. This work was also supported by the National Natural Science Foundation of China under Grant 61201242 and Grant 60772083.
References
P. Comon and C. Jutten, Handbook of Blind Source Separation: Independent Component Analysis and Applications, chapter 1, Elsevier Press, New York, NY, USA, 2010.
S. Amari, A. Cichocki, and H. H. Yang, “A new learning algorithm for blind signal separation,” in Proceedings of the Advances in Neural Information Processing Systems (NIPS '96), vol. 8, pp. 757–763, 1996.
S. C. Douglas and A. Cichocki, “Adaptive step size techniques for decorrelation and blind source separation,” in Proceedings of the 32nd Asilomar Conference on Signals, Systems & Computers, vol. 2, pp. 1191–1195, November 1998.
X. D. Zhang, X. L. Zhu, and Z. Bao, “Grading learning for blind source separation,” Science in China E, vol. 32, no. 5, pp. 693–703, 2002.
S. F. Ou, X. H. Zhao, and Y. Gao, “Variable step-size blind source separation algorithm with an auxiliary separation system,” Acta Electronica Sinica, vol. 37, no. 7, pp. 1588–1593, 2009 (Chinese).
F. Nesta, T. S. Wada, S. Miyabe, and B.-H. Juang, “On the non-uniqueness problem and the semi-blind source separation,” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '09), pp. 101–104, New Paltz, NY, USA, October 2009.