Mixture of Generalized Gamma Density-Based Score Function for Fastica

Waheed, M. EL-Sayed; Mohamed, Osama Abdo; Abd El-aziz, M. E.

doi:https://doi.org/10.1155/2011/150294

Mathematical Problems in Engineering

Special Issue

Propagation Phenomena and Transitions in Complex Systems: Efficient Mathematical Models

Research Article | Open Access

Volume 2011 | Article ID 150294 | https://doi.org/10.1155/2011/150294

M. EL-Sayed Waheed,¹ Osama Abdo Mohamed,¹ and M. E. Abd El-aziz¹

Academic Editor: Ezzat G. Bakhoum

Received 14 Sept 2010

Accepted 21 Sept 2010

Published 30 Nov 2010

Abstract

We propose an entirely novel family of score functions for blind signal separation (BSS), based on the family of mixture generalized gamma densities, which includes the generalized gamma, Weibull, gamma, Laplace, and Gaussian probability density functions. To blindly extract the independent source signals, we resort to the FastICA approach, whilst to adaptively estimate the parameters of such score functions, we use the Nelder-Mead method to optimize the maximum likelihood (ML) objective function without relying on any derivative information. Our experimental results, with sources covering a wide range of statistical distributions, show that the Nelder-Mead technique produces good estimates of the parameters of the score functions.

1. Introduction

By definition, independent component analysis (ICA) is the statistical method that searches for a linear transformation that minimizes the statistical dependence between its components [1]. Under the physically plausible assumption of mutual statistical independence between these components, the most common application of ICA is blind signal separation (BSS). In its simplest form, BSS aims to recover a set of unknown signals, the so-called original sources $s = [s_1, \ldots, s_n]^T$, by relying exclusively on information that can be extracted from their linear and instantaneous mixtures $x = [x_1, \ldots, x_m]^T$, given by

$$x = As, \tag{1.1}$$

where $A$ is an unknown $m \times n$ mixing matrix of full rank and $m \geq n$. In doing so, BSS remains truly "blind" in the sense that very little to almost nothing is known a priori about the mixing matrix or the original source signals.

Often, sources are assumed to be zero-mean and unit-variance signals with at most one having a Gaussian distribution. The problem of source estimation then boils down to determining the unmixing matrix $W$ such that the linear transformation of the sensor observations is

$$y = Wx, \tag{1.2}$$

where $y = [y_1, \ldots, y_n]^T$ yields an estimate of the vector $s$ corresponding to the original or true sources. In general, the majority of BSS approaches perform ICA by essentially optimizing the negative log-likelihood (objective) function with respect to the unmixing matrix $W$ such that

$$\mathcal{L}(W) = -\log\lvert\det W\rvert - \sum_{i=1}^{n} E\left[\log p_{y_i}(y_i)\right], \tag{1.3}$$

where $E[\cdot]$ represents the expectation operator and $p_{y_i}(y_i)$ is the model for the marginal probability density function (pdf) of $y_i$, for all $i = 1, \ldots, n$. Normally, matrix $W$ is regarded as the parameter of interest and the pdfs of the sources are considered to be nuisance parameters. In effect, when correctly hypothesizing upon the distribution of the sources, the maximum likelihood (ML) principle leads to estimating functions, which in fact are the score functions of the sources [2]:

$$\varphi_i(y_i) = -\frac{d}{dy_i}\log p_{y_i}(y_i). \tag{1.4}$$

In principle, the separation criterion in (1.3) can be optimized by any suitable ICA algorithm where contrasts are utilized (see, e.g., [2]). A popular choice of such a contrast-based algorithm is the so-called fast (cubic) converging Newton-type (fixed-point) algorithm, normally referred to as FastICA [3], based on

$$W_{k+1} = W_k + D\left(E\left[\varphi(y)y^T\right] - \operatorname{diag}\left(E\left[\varphi_i(y_i)y_i\right]\right)\right)W_k, \tag{1.5}$$

where, as defined in [4],

$$D = \operatorname{diag}\left(\frac{1}{E\left[\varphi_i(y_i)y_i\right] - E\left[\varphi_i'(y_i)\right]}\right), \tag{1.6}$$

with $\varphi(y) = [\varphi_1(y_1), \ldots, \varphi_n(y_n)]^T$ being valid for all $i = 1, \ldots, n$.

In the ICA framework, accurately estimating the statistical model of the sources at hand is still an open and challenging problem [2]. Practical BSS scenarios involve difficult source distributions and even situations where many sources with very different pdfs are mixed together. Towards this direction, a large number of parametric density models have been made available in recent literature. Examples of such models include the generalized Gaussian density (GGD) [5], the generalized lambda density (GLD), and the generalized beta distribution (GBD), or even combinations and generalizations such as the super-Gaussian and generalized Gaussian mixture models (GMM) [6], the generalized gamma density (GΓD) [7], the Pearson family of distributions [4], and even the so-called extended generalized lambda distribution (EGLD), which is an extended parameterization of the aforementioned GLD and GBD models [8]. In the following section, we propose the Mixture Generalized Gamma Density (MGΓD) for signal modeling in blind signal separation.
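As a rough illustration of the Newton-type fixed-point step behind FastICA, the sketch below assumes the update takes the form W ← W + D(E[φ(y)yᵀ] − diag(E[φ_i(y_i)y_i]))W with D = diag(1/(E[φ_i(y_i)y_i] − E[φ_i′(y_i)])), as commonly given in the ML-based FastICA literature. The function names, the toy samples, and the tanh score are assumptions made for the example only; they are not taken from this article.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def fastica_ml_step(W, X, phi, dphi):
    """One update W + D (E[phi(y) y^T] - diag(E[phi_i(y_i) y_i])) W.

    X is a list of observation vectors; phi and dphi are a score function and
    its derivative (assumed here to be the same for every source).
    """
    n = len(W)
    N = len(X)
    # Current source estimates y = W x for every sample.
    Y = [[sum(W[i][k] * x[k] for k in range(n)) for i in range(n)] for x in X]
    # Sample estimate of E[phi(y) y^T].
    C = [[sum(phi(y[i]) * y[j] for y in Y) / N for j in range(n)] for i in range(n)]
    # Sample estimate of E[phi'(y_i)].
    d = [sum(dphi(y[i]) for y in Y) / N for i in range(n)]
    # Diagonal of D = diag(1 / (E[phi_i(y_i) y_i] - E[phi_i'(y_i)])).
    D = [1.0 / (C[i][i] - d[i]) for i in range(n)]
    # G = D (C - diag(C)): zero diagonal, rescaled off-diagonal terms.
    G = [[0.0 if i == j else D[i] * C[i][j] for j in range(n)] for i in range(n)]
    GW = matmul(G, W)
    return [[W[i][j] + GW[i][j] for j in range(n)] for i in range(n)]

# Toy usage with a tanh score (a common choice for super-Gaussian sources).
phi = math.tanh
dphi = lambda y: 1.0 - math.tanh(y) ** 2
X = [(0.5, 2.0), (0.5, -2.0), (-0.5, 2.0), (-0.5, -2.0)]  # symmetric toy samples
W = [[1.0, 0.0], [0.0, 1.0]]
W_next = fastica_ml_step(W, X, phi, dphi)
print(W_next)
```

For these symmetric samples the off-diagonal terms of E[φ(y)yᵀ] cancel, so the step leaves W unchanged, which is the fixed-point condition. A practical implementation would also whiten the data and re-orthogonalize W after each step; both are omitted here for brevity.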

2. Mixture Generalized Gamma Density (MGΓD)

A Mixture Generalized Gamma Density (MGΓD) is a parametric statistical model which assumes that the data originate from a weighted sum of several generalized gamma sources [9]. More specifically, an MGΓD is defined as where (i) , (ii) is the number of mixture density components, (iii) is the th mixture weight and satisfies , and (iv) is an individual generalized gamma density (GΓD), which is characterized by [7], where is the location parameter, is the scale parameter, is the shape/power parameter, and is the shape parameter. is the gamma function, defined by By varying the parameters, it is possible to characterize a large class of distributions such as Gaussian, super-Gaussian (more peaked than Gaussian, with heavier tails), and sub-Gaussian (flatter, more uniform). Note that for , the GΓD reduces to the gamma density as a special case. Furthermore, if and , it becomes the Gaussian pdf, and if and , it represents the Laplacian pdf.
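The following sketch evaluates one possible form of the mixture density. It assumes the two-sided parameterization p(y) = γβ^α / (2Γ(α)) · |y − c|^(αγ−1) · exp(−β|y − c|^γ), which is a common form in the GΓD-based ICA literature and reproduces the gamma, Gaussian, and Laplacian special cases mentioned above. The symbol names (c, alpha, beta, gamma) and the example parameter values are assumptions, not the article's own notation or data.

```python
import math

def ggamma_pdf(y, c, alpha, beta, gamma):
    """Two-sided generalized gamma density (assumed parameterization).

    c: location, beta: scale, gamma: shape/power, alpha: shape.
    Note: for alpha * gamma < 1 the density is unbounded at y == c and this
    function raises ZeroDivisionError there.
    """
    r = abs(y - c)
    norm = gamma * beta ** alpha / (2.0 * math.gamma(alpha))
    return norm * r ** (alpha * gamma - 1.0) * math.exp(-beta * r ** gamma)

def mixture_pdf(y, weights, params):
    """Weighted sum of generalized gamma components; weights must sum to 1."""
    return sum(w * ggamma_pdf(y, *p) for w, p in zip(weights, params))

# Illustrative two-component mixture: a unit-variance Gaussian component
# (gamma = 2, alpha = 1/2) and a Laplacian component (gamma = 1, alpha = 1).
weights = [0.4, 0.6]
params = [(0.0, 0.5, 0.5, 2.0), (1.0, 1.0, 1.0, 1.0)]
print(mixture_pdf(0.5, weights, params))
```

Under this parameterization, setting gamma = 2, alpha = 1/2, and beta = 1/(2σ²) gives the Gaussian pdf with variance σ², and gamma = alpha = 1 gives the Laplacian pdf with rate beta.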

Figures 1 and 2 show some examples of the pdf of the MGΓD for and . Thanks to the shape parameters, the MGΓD is more flexible and can approximate a large class of statistical distributions. This distribution requires the estimation of parameters, . We discuss the estimation of these parameters in detail in the following section.

3. Numerical Optimization of the Log-Likelihood Function to Estimate MGΓD Parameters

In this section, we propose a generalization of the method proposed in [9], which addresses only the case of 2 components, by setting the derivatives of the log-likelihood function to zero. The log-likelihood function of (2.2) is given by [10] where is the sample size and represents the conditional expectation of given the observation ; that is, the posterior probability that belongs to the th component. In the case of the generalized gamma distribution, if we substitute (2.2) into (3.1), after some manipulation we obtain the following form of Accordingly, by taking the derivatives of the log-likelihood function with respect to , and and setting these derivatives to zero, we obtain the following nonlinear equations related to the estimated parameters where is the digamma function . After a little mathematical manipulation, the ML estimate of is obtained Given the estimate of , it is straightforward to derive the estimates of , and . Let be the estimate of . Then, where where and are the resulting estimates for and , respectively. To estimate the location parameter, we solve (3.4) by gradient ascent. The estimate of the weight coefficient is obtained directly from as follows [10]

However, (3.5) cannot be easily solved, so we adopt the gradient ascent algorithm to obtain the estimate of and determine the estimates of , , and uniquely using this value of .
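The posterior probabilities and the weight update described above can be sketched as follows. This assumes the standard finite-mixture expressions (posterior proportional to weight times component density, and each new weight equal to the average posterior of its component) and the same assumed GΓD parameterization p(y) = γβ^α / (2Γ(α)) · |y − c|^(αγ−1) · exp(−β|y − c|^γ). All names and data values are illustrative.

```python
import math

def ggamma_pdf(y, c, alpha, beta, gamma):
    """Two-sided generalized gamma density (assumed parameterization)."""
    r = abs(y - c)
    norm = gamma * beta ** alpha / (2.0 * math.gamma(alpha))
    return norm * r ** (alpha * gamma - 1.0) * math.exp(-beta * r ** gamma)

def responsibilities(data, weights, params):
    """Posterior probability z[n][j] that sample n belongs to component j."""
    z = []
    for y in data:
        joint = [w * ggamma_pdf(y, *p) for w, p in zip(weights, params)]
        total = sum(joint)
        z.append([v / total for v in joint])
    return z

def update_weights(z):
    """New mixture weights: the average posterior of each component."""
    n = len(z)
    return [sum(row[j] for row in z) / n for j in range(len(z[0]))]

# Illustrative data and a two-component model (values are made up).
data = [-2.0, -0.5, 0.1, 1.5, 3.0]
weights = [0.5, 0.5]
params = [(-1.0, 0.5, 0.5, 2.0), (2.0, 1.0, 1.0, 1.0)]
z = responsibilities(data, weights, params)
new_weights = update_weights(z)
print(new_weights)
```

Each row of `z` sums to one, so the updated weights also sum to one without any extra normalization.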

An alternative numerical method for estimating the parameters is the Nelder-Mead (NM) simplex method. The appeal of the NM optimization technique lies in the fact that it can minimize the negative of the log-likelihood objective function given in (3.2), essentially without relying on any derivative information. Despite the danger of unreliable performance (especially in high dimensions), numerical experiments have shown that the NM method can converge to an acceptably accurate solution with substantially fewer function evaluations than multidirectional search descent methods. Optimization with the NM technique therefore offers good numerical performance and a significant reduction in computational complexity for our estimation method, and it produces good estimates of the MGΓD parameters. To show the performance of NM, we consider the following example.
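A minimal sketch of derivative-free likelihood fitting is given below. It uses a simplified Nelder-Mead variant (reflection, expansion, inside contraction, shrink) written from scratch, and, to keep the example small, fits a single zero-location GΓD component rather than the full mixture. The parameterization p(y) = γβ^α / (2Γ(α)) · |y|^(αγ−1) · exp(−β|y|^γ), the log-parameter trick, the starting point, and the synthetic Laplacian data are all assumptions for illustration.

```python
import math
import random

def nelder_mead(f, x0, step=0.5, tol=1e-8, max_iter=500):
    """Minimize f with a simplified Nelder-Mead simplex search (no derivatives)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each coordinate axis.
    simplex = [list(x0)] + [[x0[j] + (step if j == i else 0.0) for j in range(n)]
                            for i in range(n)]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        # Centroid of all vertices except the worst one.
        cen = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        refl = [cen[j] + (cen[j] - worst[j]) for j in range(n)]
        f_refl = f(refl)
        if f_refl < values[0]:
            # Reflection is the new best: try expanding further.
            expd = [cen[j] + 2.0 * (cen[j] - worst[j]) for j in range(n)]
            f_expd = f(expd)
            if f_expd < f_refl:
                simplex[-1], values[-1] = expd, f_expd
            else:
                simplex[-1], values[-1] = refl, f_refl
        elif f_refl < values[-2]:
            simplex[-1], values[-1] = refl, f_refl
        else:
            # Contract the worst vertex toward the centroid; shrink if that fails.
            con = [cen[j] + 0.5 * (worst[j] - cen[j]) for j in range(n)]
            f_con = f(con)
            if f_con < values[-1]:
                simplex[-1], values[-1] = con, f_con
            else:
                best = simplex[0]
                simplex = [best] + [[best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]

def neg_log_likelihood(theta, data):
    """Negative log-likelihood of a zero-location GΓD (assumed parameterization).

    theta holds log(alpha), log(beta), log(gamma) so all three stay positive.
    """
    try:
        alpha, beta, gamma = (math.exp(t) for t in theta)
        total = 0.0
        for y in data:
            r = max(abs(y), 1e-12)  # guard against log(0)
            total += (math.log(gamma) + alpha * math.log(beta) - math.log(2.0)
                      - math.lgamma(alpha) + (alpha * gamma - 1.0) * math.log(r)
                      - beta * r ** gamma)
        return -total
    except (OverflowError, ValueError):
        return float("inf")

# Synthetic Laplacian data (alpha = beta = gamma = 1 under this parameterization).
random.seed(1)
data = [random.choice((-1.0, 1.0)) * random.expovariate(1.0) for _ in range(200)]
f = lambda th: neg_log_likelihood(th, data)
theta0 = [0.5, -0.5, 0.3]
theta_hat, nll_hat = nelder_mead(f, theta0)
alpha_hat, beta_hat, gamma_hat = (math.exp(t) for t in theta_hat)
print(alpha_hat, beta_hat, gamma_hat, nll_hat)
```

Because the starting point is a vertex of the initial simplex and the best vertex is never discarded, the returned objective value cannot exceed the starting value. The estimates themselves can vary noticeably for small samples, since the two shape parameters of the GΓD are strongly correlated.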

3.1. Example

We generate random numbers from the MGΓD with parameters , , , , , , , , , , and . By performing NM, we obtain the best estimates of the parameters. Table 1 shows the first five sets of estimated parameters after sorting according to the value of the objective function. In the following section, we resort to the FastICA algorithm for blind signal separation (BSS); this algorithm depends on the estimated parameters and on an unmixing matrix , which is estimated by the FastICA algorithm.

4. Application of MGΓD in Blind Signal Separation

A novel flexible score function is obtained by substituting (2.1) into (1.4) for the source estimates , . It quickly becomes obvious that our proposed score function inherits a generalized parametric structure, which in turn can be attributed to the highly flexible MGΓD parent model. In this case, simple calculus yields the flexible BSS score function In principle, is capable of modeling a large number of signals, such as speech or communication signals, as well as various other types of challenging heavy- and light-tailed distributions.

This is due to the fact that its characterization depends explicitly on all parameters , . Other commonly used score functions can be obtained by substituting appropriate values for , and in (4.1). For instance, when , we have the score function When and , we obtain, as a special case of (4.2), a scaled form of the GGD-based score function The function could become singular in some special cases, essentially those corresponding to heavy-tailed (or sparse) distributions defined for with and . In practice, to circumvent this deficiency, the denominator in (4.1) can be modified slightly to read where is a small positive parameter (typically around 10⁻⁴) which, when put to use, can almost always guarantee that the discontinuity of (4.1) for values in or approaching the region is completely avoided.

5. Numerical Experiments

To investigate the separation performance of the proposed MGΓD-based FastICA BSS method, a set of numerical experiments is performed, in which we consider only two cases, and , illustrated in the following two examples.

5.1. Example 1

In this example, and the data set used consists of different realizations of independent signals, with distributions shown in Table 2. Note that this is a large-scale and substantially difficult separation problem, since it involves a Gaussian, various super- and sub-Gaussian symmetric PDFs, as well as asymmetric distributions. In all cases, the number of data samples has also been designed to be relatively small, for example, . The source signals are mixed (noiselessly) with randomly generated full-rank mixing matrices . The FastICA method is implemented in the so-called simultaneous separation mode, and the stopping criterion is set to . FastICA is executed with the flexible MGΓD model used to model the distribution of the unknown sources, while (3.3)–(3.5) are employed to adaptively calculate the necessary parameters of the MGΓD-based score function defined in (4.4).

Now, to show the performance in this case, we consider three source signals (source), where these signals are generated randomly from Weibull, gamma, and exponential distributions as follows: Let the mixing matrix and unmixing matrix be defined as follows: By using the equation , we obtain the mixed signals shown in Figure 3, where the mixed signals are on the left and the source signals are on the right.
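A setup of this kind can be sketched with the standard library alone. The distribution parameters, the sample size, and the mixing matrix below are hypothetical stand-ins, since the values used in the experiment are not reproduced here. The sources are standardized to zero mean and unit variance, in line with the usual ICA assumption.

```python
import math
import random

def standardize(signal):
    """Shift and scale a signal to zero mean and unit variance."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in signal) / n)
    return [(v - mean) / std for v in signal]

def mix(A, S):
    """Return X = A S for sources S given as a list of rows (one per source)."""
    n_samples = len(S[0])
    return [[sum(a * s[t] for a, s in zip(row, S)) for t in range(n_samples)]
            for row in A]

random.seed(0)
N = 1000  # hypothetical sample size
S = [
    standardize([random.weibullvariate(1.0, 1.5) for _ in range(N)]),  # Weibull
    standardize([random.gammavariate(2.0, 1.0) for _ in range(N)]),    # gamma
    standardize([random.expovariate(1.0) for _ in range(N)]),          # exponential
]
# Hypothetical full-rank (diagonally dominant) mixing matrix.
A = [[1.0, 0.5, 0.3],
     [0.4, 1.0, 0.2],
     [0.3, 0.6, 1.0]]
X = mix(A, S)
print(len(X), len(X[0]))
```

Each row of `X` is one observed mixture; a separation algorithm would receive only `X` and attempt to recover the rows of `S` up to permutation and scale.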

After applying FastICA, we recover the sources. Figure 4 shows the estimated signals on the left and the original signals on the right; the two differ in scale only.

5.2. Example 2

In this example, , and the data set used consists of different realizations of independent signals. Note that each signal is not simply a Gaussian, super-Gaussian, or sub-Gaussian PDF, but a mixture of these PDFs, as shown in Table 3.

To show the performance in this case, we consider three source signals (source), where these signals are generated randomly from Gamma_Laplace (), Weibull (), and Gaussian_Laplace () distributions. Let the mixing matrix and unmixing matrix be defined as follows: By using the equation , we obtain the mixed signals shown in Figure 5, where the mixed signals are on the left and the source signals are on the right.

After applying FastICA, we recover the sources. Figure 6 shows the estimated signals (up to scale) on the left and the original signals on the right.

6. Algorithm Performance

The separation performance of the ICA algorithm is evaluated with the crosstalk error measure Note that here, represents the elements of the permutation matrix , which, assuming that all sources have been successfully separated, should ideally reduce to a permuted and scaled version of the identity matrix. The separation performance for the first example is and for the second example is .
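One widely used crosstalk measure of this type can be sketched as follows. It assumes the permutation matrix is formed as P = WA and uses the unnormalized row-plus-column form E = Σ_i (Σ_j |p_ij| / max_k |p_ik| − 1) + Σ_j (Σ_i |p_ij| / max_k |p_kj| − 1). The exact normalization used in the experiments may differ, and the matrices below are illustrative only.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def crosstalk_error(P):
    """Row-plus-column crosstalk error of P (assumed unnormalized form).

    Returns 0 when P is a permuted and scaled identity matrix.
    Assumes every row and column of P has at least one nonzero entry.
    """
    absP = [[abs(v) for v in row] for row in P]
    rows = sum(sum(row) / max(row) - 1.0 for row in absP)
    cols = sum(sum(col) / max(col) - 1.0 for col in zip(*absP))
    return rows + cols

# Illustrative mixing matrix and two candidate unmixing matrices.
A = [[1.0, 0.5], [0.0, 1.0]]
W_exact = [[1.0, -0.5], [0.0, 1.0]]   # exact inverse of A
W_none = [[1.0, 0.0], [0.0, 1.0]]     # no unmixing at all
err_perfect = crosstalk_error(matmul(W_exact, A))
err_none = crosstalk_error(matmul(W_none, A))
print(err_perfect, err_none)
```

A value of zero indicates perfect separation up to permutation and scaling; larger values indicate more residual mixing between the recovered sources.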

7. Conclusions

We have derived a novel parametric family of flexible score functions, based exclusively on the MGΓD model. To calculate the parameters of these functions in an adaptive BSS setup, we have chosen to maximize the likelihood with the NM optimization method. This avoids excessive computational cost and allows for a fast practical implementation of FastICA. Simulation results show that the proposed approach is capable of separating mixtures of signals.

References

[1] P. Comon, “Independent component analysis: a new concept?” Signal Processing, vol. 36, no. 3, pp. 287–314, 1994.
[2] J.-F. Cardoso, “Blind signal separation: statistical principles,” Proceedings of the IEEE, vol. 86, no. 10, pp. 2009–2025, 1998.
[3] A. Hyvärinen and E. Oja, “A fast fixed-point algorithm for independent component analysis,” Neural Computation, vol. 9, no. 7, pp. 1483–1492, 1997.
[4] J. Karvanen and V. Koivunen, “Blind separation methods based on Pearson system and its extensions,” Signal Processing, vol. 82, no. 4, pp. 663–673, 2002.
[5] K. Kokkinakis and A. K. Nandi, “Exponent parameter estimation for generalized Gaussian probability density functions with application to speech modeling,” Signal Processing, vol. 85, no. 9, pp. 1852–1858, 2005.
[6] J. A. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, pp. 854–861, Charleston, SC, USA, March 2006.
[7] E. W. Stacy, “A generalization of the gamma distribution,” Annals of Mathematical Statistics, vol. 33, pp. 1187–1192, 1962.
[8] J. Eriksson, J. Karvanen, and V. Koivunen, “Source distribution adaptive maximum likelihood estimation of ICA model,” in Proceedings of the 2nd International Conference on ICA and BSS, pp. 227–232, Helsinki, Finland, June 2000.
[9] D. G. Boerrigter, “Parameter estimation of the mixed generalized gamma distribution using maximum likelihood estimation and minimum distance estimation,” presented to the Faculty of the Graduate School of Engineering of the Air Force Institute of Technology, AFIT/GOR/ENS/98M-3.
[10] M. A. T. Figueiredo and A. K. Jain, “Unsupervised learning of finite mixture models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 381–396, 2002.

Copyright

Copyright © 2011 M. EL-Sayed Waheed et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

