Abstract
We propose a novel family of score functions for blind signal separation (BSS), based on the mixture generalized gamma density (MGΓD), a family that includes the generalized gamma, Weibull, gamma, Laplace, and Gaussian probability density functions. To blindly extract the independent source signals, we resort to the FastICA approach; to adaptively estimate the parameters of the score functions, we use the Nelder-Mead (NM) method to optimize the maximum likelihood (ML) objective function without relying on any derivative information. Experimental results with sources drawn from a wide range of statistical distributions show that the Nelder-Mead technique produces good estimates of the score function parameters.
1. Introduction
Independent component analysis (ICA) is a statistical method that searches for a linear transformation that minimizes the statistical dependence between its components [1]. Under the physically plausible assumption of mutual statistical independence between these components, the most common application of ICA is blind signal separation (BSS). In its simplest form, BSS aims to recover a set of unknown signals, the so-called original sources $\mathbf{s} = [s_1, \dots, s_n]^T$, by relying exclusively on information that can be extracted from their linear and instantaneous mixtures $\mathbf{x} = [x_1, \dots, x_n]^T$, given by
$$\mathbf{x} = \mathbf{A}\mathbf{s}, \qquad (1.1)$$
where $\mathbf{A} \in \mathbb{R}^{n \times n}$ is an unknown mixing matrix of full rank and $\mathbf{x}, \mathbf{s} \in \mathbb{R}^{n}$. In doing so, BSS remains truly "blind" in the sense that little or nothing is known a priori about the mixing matrix or the original source signals.
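The mixing model can be illustrated with a minimal Python sketch. The two source distributions, the 2×2 mixing matrix `A`, and the helper names are hypothetical choices made for illustration only. The unmixing matrix is obtained here by inverting the known `A`, which is exactly the information a blind method does not have.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Closed-form inverse of a full-rank 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix must be full rank")
    return [[d / det, -b / det], [-c / det, a / det]]

random.seed(0)
N = 5  # number of time samples (kept tiny for readability)

# Two hypothetical independent sources per sample:
# a uniform (sub-Gaussian) one and a Laplacian (super-Gaussian) one.
sources = [
    [random.uniform(-1.0, 1.0),
     random.choice((-1.0, 1.0)) * random.expovariate(1.0)]
    for _ in range(N)
]

A = [[1.0, 0.6], [0.4, 1.0]]                  # hypothetical mixing matrix
mixtures = [matvec(A, s) for s in sources]    # x = A s

# Ideal unmixing matrix; available only because A is known in this toy setup.
W = inverse_2x2(A)
recovered = [matvec(W, x) for x in mixtures]  # y = W x
```

In a real BSS problem only `mixtures` is observed, and the unmixing matrix can be recovered at best up to a permutation and scaling of its rows.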
The sources are often assumed to be zero-mean, unit-variance signals, with at most one having a Gaussian distribution. The problem of source estimation then boils down to determining the unmixing matrix $\mathbf{W}$ such that the linear transformation of the sensor observations is
$$\mathbf{y} = \mathbf{W}\mathbf{x}, \qquad (1.2)$$
where $\mathbf{y}$ yields an estimate of the vector $\mathbf{s}$ of original, or true, sources. The majority of BSS approaches perform ICA by optimizing the negative log-likelihood (objective) function with respect to the unmixing matrix $\mathbf{W}$ such that
$$\mathcal{L}(\mathbf{y}, \mathbf{W}) = -\log\lvert\det\mathbf{W}\rvert - \sum_{i=1}^{n} E[\log p_i(y_i)], \qquad (1.3)$$
where $E[\cdot]$ represents the expectation operator and $p_i(y_i)$ is the model for the marginal probability density function (pdf) of $y_i$, for all $i = 1, \dots, n$. Normally, the matrix $\mathbf{W}$ is regarded as the parameter of interest and the pdfs of the sources are considered to be nuisance parameters. In effect, when the distribution of the sources is hypothesized correctly, the maximum likelihood (ML) principle leads to estimating functions, which are in fact the score functions of the sources [2]:
$$\varphi_i(y_i) = -\frac{d}{dy_i}\log p_i(y_i) = -\frac{p_i'(y_i)}{p_i(y_i)}. \qquad (1.4)$$
In principle, the separation criterion in (1.3) can be optimized by any suitable ICA algorithm where contrasts are utilized (see, e.g., [2]). A popular choice of such a contrast-based algorithm is the so-called fast (cubic) converging Newton-type (fixed-point) algorithm, normally referred to as FastICA [3], based on
$$\mathbf{W}_{k+1} = \mathbf{W}_k + \mathbf{D}\left(E[\boldsymbol{\varphi}(\mathbf{y})\mathbf{y}^T] - \operatorname{diag}\left(E[\varphi_i(y_i)\,y_i]\right)\right)\mathbf{W}_k, \qquad (1.5)$$
where, as defined in [4],
$$\mathbf{D} = \operatorname{diag}\left(\frac{1}{E[\varphi_i(y_i)\,y_i] - E[\varphi_i'(y_i)]}\right), \qquad (1.6)$$
with $\boldsymbol{\varphi}(\mathbf{y}) = [\varphi_1(y_1), \dots, \varphi_n(y_n)]^T$ being valid for all $i = 1, \dots, n$. In the ICA framework, accurately estimating the statistical model of the sources at hand is still an open and challenging problem [2]. Practical BSS scenarios involve difficult source distributions and even situations where many sources with very different pdfs are mixed together. To this end, a large number of parametric density models have been made available in the recent literature.
Examples of such models include the generalized Gaussian density (GGD) [5], the generalized lambda distribution (GLD), and the generalized beta distribution (GBD), as well as combinations and generalizations such as the super- and generalized Gaussian mixture model (GMM) [6], the generalized gamma density (GΓD) [7], the Pearson family of distributions [4], and the so-called extended generalized lambda distribution (EGLD), which is an extended parameterization of the aforementioned GLD and GBD models [8]. In the following section, we propose the mixture generalized gamma density (MGΓD) for signal modeling in blind signal separation.
2. Mixture Generalized Gamma Density (MGΓD)
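A mixture generalized gamma density (MGΓD) is a parametric statistical model which assumes that the data originate from a weighted sum of several generalized gamma sources [9]. More specifically, an MGΓD is defined as
$$p(x \mid \Theta) = \sum_{j=1}^{M} \pi_j\, p_j(x \mid \theta_j), \qquad (2.1)$$
where (i) $\Theta = (\pi_1, \dots, \pi_M, \theta_1, \dots, \theta_M)$ collects all the parameters, (ii) $M$ is the number of mixture density components, (iii) $\pi_j$ is the $j$th mixture weight and satisfies $\pi_j \ge 0$ and $\sum_{j=1}^{M} \pi_j = 1$, and (iv) $p_j(x \mid \theta_j)$ is an individual generalized gamma density, which is characterized by [7]
$$p_j(x \mid \theta_j) = \frac{\gamma_j\, \beta_j^{\alpha_j}}{2\,\Gamma(\alpha_j)}\, \lvert x - m_j \rvert^{\alpha_j \gamma_j - 1} \exp\left(-\beta_j \lvert x - m_j \rvert^{\gamma_j}\right), \qquad (2.2)$$
where $\theta_j = (m_j, \beta_j, \gamma_j, \alpha_j)$, $m_j$ is the location parameter, $\beta_j$ is the scale parameter, $\gamma_j$ is the shape/power parameter, and $\alpha_j$ is the shape parameter. $\Gamma(\cdot)$ is the gamma function, defined by
$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t}\, dt. \qquad (2.3)$$
By varying the parameters, it is possible to characterize a large class of distributions, such as Gaussian, super-Gaussian (more peaked than Gaussian, with heavier tails), and sub-Gaussian (flatter, more uniform). For $\gamma = 1$, the GΓD reduces to the gamma density as a special case. Furthermore, if $\gamma = 2$ and $\alpha = 1/2$, it becomes the Gaussian pdf, and if $\gamma = 1$ and $\alpha = 1$, it represents the Laplacian pdf.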
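The density can be evaluated with a short Python sketch. It assumes that each component is parameterized as $\frac{\gamma\beta^{\alpha}}{2\Gamma(\alpha)}|x-m|^{\alpha\gamma-1}\exp(-\beta|x-m|^{\gamma})$ with location `m`, scale `beta`, power `gamma`, and shape `alpha`. The function names and the example parameter values are hypothetical.

```python
import math

def ggd_pdf(x, m, beta, gamma, alpha):
    """Two-sided generalized gamma density (assumed parameterization)."""
    d = abs(x - m)
    k = alpha * gamma - 1.0
    norm = gamma * beta ** alpha / (2.0 * math.gamma(alpha))
    if d == 0.0:
        # At the location the density is finite only when alpha*gamma >= 1.
        return norm if k == 0.0 else (0.0 if k > 0.0 else math.inf)
    return norm * d ** k * math.exp(-beta * d ** gamma)

def mggd_pdf(x, weights, thetas):
    """Weighted sum of generalized gamma components."""
    return sum(w * ggd_pdf(x, *theta) for w, theta in zip(weights, thetas))

# Special cases under the assumed parameterization:
# gamma=2, alpha=1/2, beta=1/2 gives the standard Gaussian;
# gamma=1, alpha=1 gives a Laplacian with rate beta.
gauss_value = ggd_pdf(0.7, 0.0, 0.5, 2.0, 0.5)
laplace_value = ggd_pdf(-1.3, 0.0, 2.0, 1.0, 1.0)

# Hypothetical two-component mixture: a Gaussian-like component
# centred at -1 and a Laplacian-like component centred at 2.
weights = [0.4, 0.6]
thetas = [(-1.0, 0.5, 2.0, 0.5), (2.0, 1.0, 1.0, 1.0)]
density_at_zero = mggd_pdf(0.0, weights, thetas)
```

Under this parameterization a component with `alpha * gamma < 1` is unbounded at its location, which is why the sketch returns `math.inf` there.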
Figures 1 and 2 show some examples of the MGΓD pdf for and . Thanks to its shape parameters, the MGΓD is flexible and can approximate a large class of statistical distributions. This distribution requires the estimation of parameters, . We discuss the estimation of these parameters in detail in the following section.
3. Numerical Optimization of the Log-Likelihood Function to Estimate MGΓD Parameters
In this section we propose a generalization of the method proposed in [9], which addresses only the case of 2 components, by setting the derivatives of the log-likelihood function to zero. The log-likelihood function of (2.2) is given by [10] where is the sample size and represents the conditional expectation of given the observation , that is, the posterior probability that belongs to the th component. In the case of the generalized gamma distribution, if we substitute (2.2) into (3.1), after some manipulation we obtain the following form of Accordingly, by differentiating the log-likelihood function with respect to , and and setting these derivatives to zero, we obtain the following nonlinear equations for the estimated parameters: where is the digamma function . After a little mathematical manipulation, the ML estimate of is obtained: Given the estimate of , it is straightforward to derive the estimates of , and . Let be the estimate of . Then, where where and are the resulting estimates of and , respectively. To estimate the location parameter, we solve (3.4) by gradient ascent. The estimate of the weight coefficient is obtained directly from as follows [10]:
However, (3.5) cannot be solved easily, so we adopt the gradient ascent algorithm to obtain the estimate of and then determine the estimates of , , and uniquely from this value of .
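The posterior probabilities and the weight update described above follow the standard mixture-model recipe, sketched below in Python. The sketch assumes the component parameterization $\frac{\gamma\beta^{\alpha}}{2\Gamma(\alpha)}|x-m|^{\alpha\gamma-1}\exp(-\beta|x-m|^{\gamma})$. The function names, the data, and the parameter values are hypothetical, and the updates of the remaining parameters are not shown.

```python
import math

def ggd_pdf(x, m, beta, gamma, alpha):
    """Two-sided generalized gamma density (assumed parameterization), x != m."""
    d = abs(x - m)
    norm = gamma * beta ** alpha / (2.0 * math.gamma(alpha))
    return norm * d ** (alpha * gamma - 1.0) * math.exp(-beta * d ** gamma)

def responsibilities(data, weights, thetas):
    """Posterior probability that each sample belongs to each component."""
    z = []
    for x in data:
        joint = [w * ggd_pdf(x, *theta) for w, theta in zip(weights, thetas)]
        total = sum(joint)
        z.append([value / total for value in joint])
    return z

def update_weights(z):
    """New mixture weights: the average responsibility of each component."""
    n = len(z)
    return [sum(row[j] for row in z) / n for j in range(len(z[0]))]

# Hypothetical data with one cluster near -1 and another near 2.
data = [-1.2, -0.8, -1.0, 2.1, 1.9, 2.4]
weights = [0.5, 0.5]
thetas = [(-1.0, 0.5, 2.0, 0.5),   # Gaussian-like component
          (2.0, 1.0, 1.0, 1.0)]    # Laplacian-like component

z = responsibilities(data, weights, thetas)
new_weights = update_weights(z)
```

Each row of `z` sums to one, so the updated weights automatically satisfy the constraint that the mixture weights sum to one.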
An alternative numerical method for estimating the parameters is the Nelder-Mead (NM) simplex method. The appeal of the NM optimization technique lies in the fact that it can minimize the negative of the log-likelihood objective function given in (3.2) without relying on any derivative information. Despite the danger of unreliable performance (especially in high dimensions), numerical experiments have shown that the NM method can converge to an acceptably accurate solution with substantially fewer function evaluations than multidirectional search or steepest descent methods. Given this good numerical performance and the significant reduction in computational complexity, optimization with the NM technique produces good estimates of the MGΓD parameters. To show the performance of NM, we consider the next example.
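As a rough illustration of derivative-free ML estimation, the following Python sketch implements a basic Nelder-Mead iteration and uses it to fit a single zero-location GΓD component with the power parameter fixed at 1. This restriction, the log-parameterization, the simulated Laplacian data, and all names are assumptions made to keep the sketch short. It is not the estimation setup used for the results in this paper.

```python
import math
import random

def nelder_mead(f, x0, step=0.5, tol=1e-10, max_iter=2000):
    """Basic Nelder-Mead simplex minimization (no derivatives needed)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if i == j else 0.0) for j, x in enumerate(x0)]
        for i in range(n)
    ]
    fvals = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: fvals[k])
        simplex = [simplex[k] for k in order]
        fvals = [fvals[k] for k in order]
        if fvals[-1] - fvals[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(1.0)                       # reflection
        fr = f(xr)
        if fr < fvals[0]:
            xe = along(2.0)                   # expansion
            fe = f(xe)
            simplex[-1], fvals[-1] = (xe, fe) if fe < fr else (xr, fr)
        elif fr < fvals[-2]:
            simplex[-1], fvals[-1] = xr, fr
        else:
            # Outside contraction if the reflection improved on the worst
            # vertex, inside contraction otherwise.
            xc = along(0.5) if fr < fvals[-1] else along(-0.5)
            fc = f(xc)
            if fc < min(fr, fvals[-1]):
                simplex[-1], fvals[-1] = xc, fc
            else:                             # shrink towards the best vertex
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
                fvals = [fvals[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=lambda i: fvals[i])
    return simplex[k], fvals[k]

def mean_nll(params, data, gamma=1.0):
    """Mean negative log-likelihood; params = (log beta, log alpha)."""
    beta, alpha = math.exp(params[0]), math.exp(params[1])
    log_norm = (math.log(gamma) + alpha * math.log(beta)
                - math.log(2.0) - math.lgamma(alpha))
    total = 0.0
    for x in data:
        d = max(abs(x), 1e-300)               # guard against log(0)
        total += log_norm + (alpha * gamma - 1.0) * math.log(d) - beta * d ** gamma
    return -total / len(data)

random.seed(1)
# Simulated Laplacian data with rate 2 (a GΓD with gamma=1, alpha=1, beta=2).
data = [random.choice((-1.0, 1.0)) * random.expovariate(2.0) for _ in range(2000)]

theta, fmin = nelder_mead(lambda p: mean_nll(p, data), [0.0, 0.0])
beta_hat, alpha_hat = math.exp(theta[0]), math.exp(theta[1])
```

Optimizing over the logarithms of `beta` and `alpha` keeps both parameters positive without adding explicit constraints to the simplex search.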
3.1. Example
We generate random numbers from an MGΓD with parameters , , , , , , , , , , and . By performing NM, we obtain the best estimates of the parameters. Table 1 shows the first five sets of estimated parameters after sorting according to the value of the function. In the following section, we resort to the FastICA algorithm for blind signal separation (BSS); this algorithm depends on the estimated parameters and on an unmixing matrix , which is estimated by the FastICA algorithm.
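One possible way to draw random numbers from an MGΓD is sketched below in Python. It assumes the component parameterization $\frac{\gamma\beta^{\alpha}}{2\Gamma(\alpha)}|x-m|^{\alpha\gamma-1}\exp(-\beta|x-m|^{\gamma})$ and uses the fact that, under this form, $\beta|x-m|^{\gamma}$ follows a unit-scale gamma distribution with shape $\alpha$. The function names and parameter values are hypothetical and are not the values used in this example.

```python
import random

def sample_ggd(rng, m, beta, gamma, alpha):
    """Draw one sample from a two-sided GΓD (assumed parameterization)."""
    g = rng.gammavariate(alpha, 1.0)          # beta * |x - m|**gamma ~ Gamma(alpha, 1)
    magnitude = (g / beta) ** (1.0 / gamma)
    return m + rng.choice((-1.0, 1.0)) * magnitude

def sample_mggd(rng, weights, thetas, n):
    """Draw n samples: pick a component by its weight, then sample from it."""
    picks = rng.choices(range(len(weights)), weights=weights, k=n)
    return [sample_ggd(rng, *thetas[j]) for j in picks]

rng = random.Random(0)
n = 20000

# Special cases: gamma=2, alpha=1/2, beta=1/2 is the standard Gaussian;
# gamma=1, alpha=1, beta=2 is a Laplacian with rate 2.
gaussian_like = [sample_ggd(rng, 0.0, 0.5, 2.0, 0.5) for _ in range(n)]
laplace_like = [sample_ggd(rng, 0.0, 2.0, 1.0, 1.0) for _ in range(n)]

# Hypothetical two-component mixture.
mixture = sample_mggd(rng, [0.3, 0.7],
                      [(-2.0, 0.5, 2.0, 0.5), (3.0, 2.0, 1.0, 1.0)], n)

gaussian_variance = sum(x * x for x in gaussian_like) / n
laplace_mean_abs = sum(abs(x) for x in laplace_like) / n
mixture_mean = sum(mixture) / n
```

The sample statistics at the end give a quick sanity check: they should be close to 1, 0.5, and 1.5, respectively, for the parameter values chosen here.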
4. Application of MGΓD in Blind Signal Separation
A novel flexible score function is obtained by substituting (2.1) into (1.4) for the source estimates , . It quickly becomes obvious that the proposed score function inherits a generalized parametric structure, which in turn can be attributed to the highly flexible MGΓD parent model. In this case, simple calculus yields the flexible BSS score function In principle, is capable of modeling a large number of signals, such as speech or communication signals, as well as various other types of challenging heavy- and light-tailed distributions.
This is due to the fact that its characterization depends explicitly on all parameters , . Other commonly used score functions can be obtained by substituting appropriate values for , and in (4.1). For instance, when , we have the score function When and , we have a scaled form of the GGD-based score function, which constitutes a special case of (4.2). The function could become singular in some special cases, essentially those corresponding to heavy-tailed (or sparse) distributions defined for with and . In practice, to circumvent this deficiency, the denominator in (4.1) can be modified slightly to read where is a small positive parameter (typically around $10^{-4}$) which, when put to use, almost always guarantees that the discontinuity of (4.1) for values in or approaching the region is completely avoided.
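The score function and its regularization can be sketched in Python as follows. The sketch assumes the component parameterization $p_j(y) = \frac{\gamma_j\beta_j^{\alpha_j}}{2\Gamma(\alpha_j)}|y-m_j|^{\alpha_j\gamma_j-1}\exp(-\beta_j|y-m_j|^{\gamma_j})$, for which $-p'(y)/p(y)$ becomes a weighted average of the per-component terms $\operatorname{sign}(y-m_j)\left[\beta_j\gamma_j|y-m_j|^{\gamma_j-1} - (\alpha_j\gamma_j-1)/|y-m_j|\right]$. Adding `eps` to every distance $|y-m_j|$ is one possible reading of the modification described above, not necessarily the exact form used in this paper. All names are hypothetical.

```python
import math

def ggd_pdf_at_distance(d, beta, gamma, alpha):
    """GΓD value at distance d > 0 from the location (assumed parameterization)."""
    norm = gamma * beta ** alpha / (2.0 * math.gamma(alpha))
    return norm * d ** (alpha * gamma - 1.0) * math.exp(-beta * d ** gamma)

def mggd_score(y, weights, thetas, eps=1e-4):
    """Score function -p'(y)/p(y) of the mixture, regularized by eps.

    With eps = 0 the function is exact but undefined at y == m.
    """
    numerator = 0.0
    denominator = 0.0
    for w, (m, beta, gamma, alpha) in zip(weights, thetas):
        d = abs(y - m) + eps                  # regularized distance, never zero
        sign = math.copysign(1.0, y - m)
        p = w * ggd_pdf_at_distance(d, beta, gamma, alpha)
        slope = beta * gamma * d ** (gamma - 1.0) - (alpha * gamma - 1.0) / d
        numerator += p * sign * slope
        denominator += p
    return numerator / denominator

# Single-component special cases (eps = 0 gives the exact score away from m):
gauss_score = mggd_score(1.5, [1.0], [(0.0, 0.5, 2.0, 0.5)], eps=0.0)     # equals y
laplace_score = mggd_score(-0.3, [1.0], [(0.0, 2.0, 1.0, 1.0)], eps=0.0)  # equals -beta

# A sparse component (alpha * gamma < 1) evaluated exactly at its location:
# without eps this would divide by zero.
sparse_score = mggd_score(0.0, [1.0], [(0.0, 1.0, 1.0, 0.5)])
```

For the Gaussian special case the score is linear in `y`, and for the Laplacian case it is a scaled sign function, which matches the familiar score functions of those densities.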
5. Numerical Experiments
To investigate the separation performance of the proposed MGΓD-based FastICA BSS method, a set of numerical experiments is performed, in which we consider only two cases when , , and we illustrate them in the following two examples.
5.1. Example 1
In this example, and the data set used consists of different realizations of independent signals, with distributions shown in Table 2. Note that this is a large-scale and substantially difficult separation problem, since it involves a Gaussian, various super- and sub-Gaussian symmetric pdfs, as well as asymmetric distributions. In all cases, the number of data samples has been chosen to be relatively small; for example, . The source signals are mixed (noiselessly) with randomly generated full-rank mixing matrices . The FastICA method is implemented in the so-called simultaneous separation mode, and the stopping criterion is set to . FastICA is executed with the flexible MGΓD model used to model the distribution of the unknown sources, while (3.3)–(3.5) are employed to adaptively calculate the necessary parameters of the MGΓD-based score function defined in (4.4).
To show the performance in this case, we consider three source signals (source), which are generated randomly from Weibull, gamma, and exponential distributions as follows: Let the mixing matrix and unmixing matrix be defined as follows: By using the equation , we obtain the mixed signals shown in Figure 3, where the mixed signals are on the left and the source signals are on the right.
After applying FastICA, we recover the sources. Figure 4 shows the estimated signals on the left and the original signals on the right; the two differ in scale only.
5.2. Example 2
In this example, the data set used consists of different realizations of independent signals. Note that each signal is not simply a Gaussian, super-Gaussian, or sub-Gaussian pdf, but a mixture of these pdfs, as shown in Table 3.
To show the performance in this case, we consider three source signals (source), which are generated randomly from Gamma_Laplace (), Weibull (), and Gaussian_Laplace () distributions. Let the mixing matrix and unmixing matrix be defined as follows: By using the equation we obtain the mixed signals shown in Figure 5, where the mixed signals are on the left and the source signals are on the right.
After applying FastICA, we recover the sources. Figure 6 shows the estimated signals (up to scale) on the left and the original signals on the right.
6. Algorithm Performance
The separation performance of the ICA algorithm is evaluated with the crosstalk error measure Note that here, represents the elements of the permutation matrix , which, assuming that all sources have been successfully separated, should ideally reduce to a permuted and scaled version of the identity matrix. The separation performance for the first example is and for the second example is .
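A crosstalk measure of this kind can be computed as in the following Python sketch. It assumes the widely used Amari-style form, in which each row and column of the absolute values of the product of the unmixing and mixing matrices is normalized by its largest entry. The exact form and normalization used for the results reported here may differ. The matrices and function names are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def crosstalk_error(P):
    """Amari-style crosstalk error (assumed form, no normalization).

    Equals zero exactly when P is a permuted and scaled identity matrix.
    """
    abs_p = [[abs(v) for v in row] for row in P]
    row_term = sum(sum(row) / max(row) - 1.0 for row in abs_p)
    col_term = sum(sum(col) / max(col) - 1.0 for col in zip(*abs_p))
    return row_term + col_term

A = [[1.0, 0.6], [0.4, 1.0]]          # hypothetical mixing matrix

# Unmixing matrix that inverts A up to a permutation and scaling of the rows.
W_good = [[-0.4, 1.0], [1.0, -0.6]]
# Unmixing matrix that does nothing (identity), so all crosstalk remains.
W_bad = [[1.0, 0.0], [0.0, 1.0]]

error_good = crosstalk_error(matmul(W_good, A))
error_bad = crosstalk_error(matmul(W_bad, A))
```

The measure is insensitive to the ordering and scaling of the recovered sources, which is appropriate because BSS can only identify the sources up to these ambiguities.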
7. Conclusions
We have derived a novel parametric family of flexible score functions, based exclusively on the MGΓD model. To calculate the parameters of these functions in an adaptive BSS setup, we have chosen to maximize the likelihood function with the NM optimization method. This avoids excessive computational cost and allows for a fast practical implementation of FastICA. Simulation results show that the proposed approach is capable of separating mixtures of signals.