Research Article | Open Access
Generalized Information for the γ-Order Normal Distribution
This paper investigates a generalization of Fisher’s entropy type information measure under the multivariate γ-order normal distribution, which is related to this measure, as well as the corresponding Shannon entropy. Certain bounds for this information measure are also proved and discussed.
The Poincaré inequality is one of the best-known results in the theory of Sobolev spaces: bounds on a function belonging to a Sobolev space can be obtained from bounds on its derivatives, although the domain still matters. The energy of a local -integrable function with is defined as The corresponding Poincaré constant, , can be easily evaluated when the domain is convex. It holds that where and are, respectively, the expected value and the variance of , corresponding to the probability measure ; that is, and . Under some regularity conditions for the measure , there exists a constant such that the Poincaré inequality as in (2) can be written as with being a differentiable function having compact support. That is, we need to evaluate bounds for the variance and therefore for the information, either of the parametric or of the entropy type.
The Logarithmic Sobolev Inequalities (LSI) attempt to estimate the lower-order derivatives of a given function in terms of higher-order derivatives. The well-known LSI was introduced in 1938; see [1, 2] for details. The introductory and well-known Sobolev Inequality is or, using the norm notation, The constant is known as Sobolev constant. Since then, various attempts were made to generalize (4). The first optimal Sobolev Inequality was of the form with .
The usual normal distribution, also known as Gaussian, plays an important role in all statistical problems, as do the information measures related to it. New entropy type information measures were introduced in , generalizing the known Fisher’s entropy type information measure, while an exponential-power generalization of the usual normal distribution was introduced and studied in [3–5]. This generalized normal, called the γ-order normal distribution and defined in Section 2, emerges as an extremal function for the Logarithmic Sobolev Inequality.
In particular, following , the Gross Logarithm Inequality with respect to the Gaussian weight is of the form where is the norm in with . Inequality (7) is equivalent to the (Euclidean) LSI: for any function with , and being the Sobolev space; see for details. This inequality is optimal, in the sense that see . Extremal functions for inequality (8) are precisely the Gaussian distribution with and ; see [7, 8] for details. Now, consider the extension by del Pino et al. in [9] for the LSI as in (8). For any with , the -LSI holds; that is, with the optimal constant being equal to where is the usual gamma function. Inequality (11) is optimal and the equality holds when with normalizing factor and -quadratic form , . The function with corresponds to the extremal function for the LSI due to . The essential result is that the defined can be considered as a probability density function (p.d.f.) of an r.v. and works as an extremal function to a generalized form of the Logarithmic Sobolev Inequality; see Section 2.
Various attempts were made in the past to generalize the normal distribution; see [10, 11]. The advantage of the introduced family of γ-order normal distributions is that it rests on a theoretical insight, through the generalization of Fisher’s information and the use of the LSI, and is not just a technical extension as in .
This is why there is interest in establishing at least inequalities among the various statistical-analytical information measures concerning the Gaussian as well as the γ-order normal distribution: they make it possible to compare the “information” obtained from an experiment, which is usually assumed to follow the Gaussian distribution (see Section 3).
In principle, information measures are divided into three main categories: parametric (a typical example being Fisher’s information), nonparametric (with the Shannon information measure being the best known), and entropy type (see ), which are adopted in this paper. The introduced new entropy type measure of information is a function of the density of the p-variate random variable (see ) defined as where is the usual norm. Notice that , with being the known Fisher entropy type information measure.
Moreover, the known entropy power , defined through the Shannon entropy , has been extended to where see for details. Notice that and , where is the known Shannon entropy power for the normal distribution. It is interesting that which extends the well-known Information Inequality, obtained for .
The so-called Information Inequality is generalized through the introduced information measures . The Generalized Information Inequality (GII) is given by When we have , and, therefore, the Cramér-Rao inequality ([12], Theorem 11.10.1) holds. The lower bound for the introduced generalized information is
2. The γ-Order Normal Distribution and the Generalized Fisher Information
Let denote the Shannon (or differential) entropy. For any multivariate random variable with zero mean and covariance matrix , it holds that while the equality in (21) holds if and only if is a normally distributed variable; that is, ; see . Moreover, the normal distribution, according to Information Measures Theory, is adopted for the noise, acting additively on the input variable when an input-output discrete-time channel is formed. Therefore, the Gaussian distribution needs special treatment when evaluating the LSI.
Kitsos and Tavoularis [3, 4] introduced and studied the multivariate and spherically contoured γ-order normal distribution, denoted by with . See also [5, 13] for further reading. Recall the definition of the family of distributions.
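Definition 1. The p-dimensional random variable $X$ follows the γ-order normal distribution $\mathcal{N}_\gamma^p(\mu, \Sigma)$ with location vector $\mu \in \mathbb{R}^p$, shape parameter $\gamma \in \mathbb{R} \setminus [0, 1]$, and positive definite scale matrix $\Sigma \in \mathbb{R}^{p \times p}$, when the density function $f_X$ is of the form
$$f_X(x; \mu, \Sigma, \gamma) = C(p, \gamma)\, |\det \Sigma|^{-1/2} \exp\left\{ -\frac{\gamma - 1}{\gamma}\, Q_\theta(x)^{\frac{\gamma}{2(\gamma - 1)}} \right\}, \quad x \in \mathbb{R}^p, \tag{22}$$
where $Q_\theta$, $\theta = (\mu, \Sigma)$, is the quadratic form $Q_\theta(x) = (x - \mu)\, \Sigma^{-1} (x - \mu)^{\mathrm{T}}$, $x \in \mathbb{R}^p$. The normalizing factor is defined by
$$C(p, \gamma) = \pi^{-p/2}\, \frac{\Gamma\left(\frac{p}{2} + 1\right)}{\Gamma\left(p\,\frac{\gamma - 1}{\gamma} + 1\right)} \left(\frac{\gamma - 1}{\gamma}\right)^{p\,\frac{\gamma - 1}{\gamma}}. \tag{23}$$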
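Definition 1 can be made concrete for the univariate case. The following is a minimal sketch, not taken from the paper, assuming the density has the form used in [3, 5]: f(x) = C σ^{-1} exp{-g |z|^{1/g}} with z = (x - μ)/σ, g = (γ - 1)/γ, and C = π^{-1/2} Γ(3/2) g^g / Γ(g + 1). The function names and the trapezoidal helper are illustrative only.

```python
import math


def gamma_order_normal_pdf(x, mu=0.0, sigma=1.0, gamma=2.0):
    """Univariate gamma-order normal density, in the form assumed in the lead-in."""
    if 0.0 <= gamma <= 1.0:
        raise ValueError("the shape parameter must lie outside [0, 1]")
    g = (gamma - 1.0) / gamma  # positive for every gamma outside [0, 1]
    # Assumed normalizing factor for dimension p = 1.
    c = math.pi ** -0.5 * math.gamma(1.5) / math.gamma(g + 1.0) * g ** g
    z = abs(x - mu) / sigma  # square root of the quadratic form
    try:
        power = z ** (1.0 / g)  # Q^(gamma / (2 (gamma - 1)))
    except OverflowError:
        power = math.inf  # far in the tail the density is numerically zero
    return c / sigma * math.exp(-g * power)


def trapezoid(f, a, b, n=100_000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


if __name__ == "__main__":
    # The total mass should be close to 1 for each admissible shape parameter.
    for gam in (2.0, 3.0, 10.0):
        mass = trapezoid(lambda x: gamma_order_normal_pdf(x, 1.0, 1.5, gam), -80.0, 82.0)
        print(gam, round(mass, 6))
```

For γ = 2 the sketch reproduces the usual normal density, which is a quick sanity check on the assumed normalizing factor.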
From the p.d.f. as above, notice that the location vector of is essentially the mean vector of ; that is, . Moreover, for the shape parameter value γ = 2, is reduced to the well-known multivariate normal distribution. We also comment that the function , with , corresponds to an extremal function for the inequality extending the LSI due to . The essential result is that the defined γ-order normal distribution works as an extremal function to a generalized form of the Logarithmic Sobolev Inequality.
Various attempts to generalize the usual normal distribution are known. The introduced univariate γ-order normal coincides with the existing generalized normal distribution introduced in , with density function when and . The multivariate case of the γ-order normal coincides with the existing multivariate power exponential distribution , as introduced in , where and . See also . These exponential-power generalizations are technically obtained (involving an extra parameter ). On the contrary, the form of the γ-order normal is obtained as an extremal of a generalized LSI and therefore has a strong mathematical background. Moreover, the family of distributions acts as the usual normal distribution for the Information Inequality by attaining the equality. Recall that for . In fact, the generalized form of the Information Inequality, as in (18), reduces to an equality for the γ-order normally distributed random variable, as it holds that for ; see for details.
Denote with the area included by the p-ellipsoid defined by the quadratic form ; that is, . The family of , that is, the family of the elliptically contoured γ-order normals, provides a smooth bridging between the multivariate (and elliptically contoured) uniform, normal, Laplace, and the degenerate Dirac distributions, that is, between the r.v. , , , and (with pole at the point ), with probability density functions , respectively. That is, the family of distributions generalizes not only the usual normal but also two other very significant distributions, namely the uniform and the Laplace distributions.
Theorem 2. The multivariate γ-order normal distribution, , for order values of , coincides with
Proof. The p.d.f. definition (22) of depends on the real-valued shape parameter γ defined outside the closed interval [0, 1]. We let and denote . We distinguish the following cases:
(i) The uniform case γ = 1. From (22) with , that is, or , we get while, for , that is, , we obtain due to the fact that as for all . Therefore, from (25), the third branch of (29) holds true as , or through (25). That is, the multivariate first-ordered normal distribution coincides with the elliptically contoured uniform distribution.
(ii) The Gaussian case γ = 2. It is clear that , as coincides with the multivariate (and elliptically contoured) normal density function as in (26). That is, the multivariate second-ordered normal distribution coincides with the usual elliptically contoured normal distribution.
(iii) The Laplace case γ = ±∞. For the limiting case of (as ), we derive that as clearly coincides with the multivariate (and elliptically contoured) Laplace density as in (27). That is, the multivariate infinite-ordered normal distribution coincides with the elliptically contoured Laplace distribution.
(iv) The degenerate Dirac case γ = 0. Firstly, we assume that ; that is, , and hence, from definition (22), Given that where is the integer value of , we obtain Utilizing now Stirling’s asymptotic formula, as , (34) implies and, thus, for , (35) implies , while for or it implies . Assuming now that and using (34), we have and hence, for , (36) implies that (due to the fact that for all as ) while, for or , applying (35) into (36) we obtain That is, . Therefore, it is clear that for , as p.d.f. coincides with the multivariate Dirac density given in (25). Therefore, the univariate and bivariate zero-ordered normal distributions with mean at coincide with the (univariate and bivariate) degenerate Dirac distributions with poles at the point . Moreover, the p-variate, , zero-ordered normal has a degenerate (zero) density function.
Considering the above cases (i), (iii), and (iv), we can now extend the defining values of the shape parameter γ of the distribution to include the limiting values γ = 1, ±∞, 0, respectively; that is, γ can now be defined outside the real open interval (0, 1). Eventually, the uniform, normal, Laplace, and also the degenerate distributions, as the Dirac or the flat one, can be considered as members of the family of distributions.
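The limiting cases of Theorem 2 can be checked numerically in one dimension. The sketch below is illustrative and rests on an assumption: the univariate density is taken as f(x) = C σ^{-1} exp{-g |z|^{1/g}}, with g = (γ - 1)/γ and C = π^{-1/2} Γ(3/2) g^g / Γ(g + 1), as in [3, 5]. The comparison densities and all function names are hypothetical helpers, and the values γ = 1.001 and γ = 10^6 merely stand in for the limits γ → 1+ and γ → ∞.

```python
import math


def gamma_order_normal_pdf(x, mu=0.0, sigma=1.0, gamma=2.0):
    """Univariate gamma-order normal density (assumed form, see lead-in)."""
    if 0.0 <= gamma <= 1.0:
        raise ValueError("the shape parameter must lie outside [0, 1]")
    g = (gamma - 1.0) / gamma
    c = math.pi ** -0.5 * math.gamma(1.5) / math.gamma(g + 1.0) * g ** g
    z = abs(x - mu) / sigma
    try:
        power = z ** (1.0 / g)
    except OverflowError:
        power = math.inf  # the exponential below then evaluates to zero
    return c / sigma * math.exp(-g * power)


def uniform_pdf(x, mu=0.0, sigma=1.0):
    """Uniform density on [mu - sigma, mu + sigma]."""
    return 0.5 / sigma if abs(x - mu) <= sigma else 0.0


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Usual normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, mu=0.0, sigma=1.0):
    """Laplace density with location mu and scale sigma."""
    return 0.5 / sigma * math.exp(-abs(x - mu) / sigma)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.5):
        print(
            x,
            gamma_order_normal_pdf(x, gamma=1.001) - uniform_pdf(x),  # near-uniform
            gamma_order_normal_pdf(x, gamma=2.0) - normal_pdf(x),     # Gaussian
            gamma_order_normal_pdf(x, gamma=1e6) - laplace_pdf(x),    # near-Laplace
        )
```

The printed differences shrink as γ is pushed closer to the corresponding limit; convergence to the uniform case is comparatively slow near the boundary of the support.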
The following proposition calculates the Shannon entropy for the γ-order normally distributed random variable. Recall that the well-known Shannon (or differential) entropy measure is defined by where the usual minus sign is omitted, for a p-variate continuous random variable with density function .
Proposition 3. The Shannon entropy of a random variable is of the form
Proof. Consider the p.d.f. as in (22). From the definition (38) we have that the Shannon entropy of is where denotes the normalized coefficient of the generalized Gaussian kernel with of the p.d.f . Applying the linear transformation with , the above is reduced to where denotes the identity matrix. Switching to hyperspherical coordinates, we get where is the volume of the -sphere. Applying the variable change we obtain successively where . Finally, by substitution of the volume and the normalizing factors and , relation (39) is obtained.
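The steps of the proof can be mirrored numerically in the univariate case. The sketch below is not the paper's computation. It assumes the density f(x) = C σ^{-1} exp{-g |z|^{1/g}} with g = (γ - 1)/γ, and it assumes the closed form H = log(σ / C) + g for p = 1, which is what the change of variables in the proof gives under that density, with the standard convention H = -∫ f log f. The quadrature result is compared against this assumed closed form; all names are illustrative.

```python
import math


def normalizing_factor(gamma):
    """Assumed normalizing factor C(1, gamma) of the univariate density."""
    g = (gamma - 1.0) / gamma
    return math.pi ** -0.5 * math.gamma(1.5) / math.gamma(g + 1.0) * g ** g


def gamma_order_normal_pdf(x, mu=0.0, sigma=1.0, gamma=2.0):
    """Univariate gamma-order normal density (assumed form, see lead-in)."""
    if 0.0 <= gamma <= 1.0:
        raise ValueError("the shape parameter must lie outside [0, 1]")
    g = (gamma - 1.0) / gamma
    z = abs(x - mu) / sigma
    try:
        power = z ** (1.0 / g)
    except OverflowError:
        power = math.inf
    return normalizing_factor(gamma) / sigma * math.exp(-g * power)


def shannon_entropy_closed_form(sigma, gamma):
    """Assumed closed form for p = 1: log(sigma / C) + (gamma - 1) / gamma."""
    g = (gamma - 1.0) / gamma
    return math.log(sigma / normalizing_factor(gamma)) + g


def shannon_entropy_numeric(sigma, gamma, half_width=60.0, n=120_000):
    """Trapezoidal estimate of -integral f log f over [-half_width*sigma, half_width*sigma]."""
    a = -half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n + 1):
        f = gamma_order_normal_pdf(a + i * h, 0.0, sigma, gamma)
        if f > 0.0:  # skip points where the density has underflowed to zero
            weight = 0.5 if i in (0, n) else 1.0
            total -= weight * f * math.log(f)
    return total * h


if __name__ == "__main__":
    for gam in (2.0, 3.0, 5.0):
        print(gam, shannon_entropy_closed_form(1.5, gam), shannon_entropy_numeric(1.5, gam))
```

For γ = 2 the assumed closed form reduces to the familiar (1/2) log(2πe σ²) of the normal distribution.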
Corollary 4. The usual Shannon entropy for the multivariate (and elliptically contoured) uniform, normal, and Laplace distributed random variables is given by For the univariate case , we are reduced to where , , and are the usual notations for the univariate uniform, normal, and Laplace distributions, respectively.
Proof. For the normal case of and the Laplace case of , or (in limit), the second and the third branch of (44) follow easily from (39).
For the uniform case of we obtain that and hence the first branch of (44) holds. Moreover, the corresponding univariate case yields for or, through (25), which is the known (continuous) uniform distribution where and . Thus, using the usual notation of .
Consider now the degenerate case of . We can write (39) in the form where and . We then have and using Stirling’s asymptotic formula, as , (47) yields which proves the corollary.
The generalized Fisher information measure has been calculated for the spherically contoured random variable . We extend this with the following theorem.
Theorem 5. The generalized Fisher information of an r.v. , where is a real matrix consisting of orthogonal vectors (columns) with the same norm, is given by
Proof. From (15) we have while, from the definition of the density function as in (22), we have where and . From the assumption there exists a positive real such that , where is an orthogonal matrix with , and thus . For the gradient of the quadratic form it holds that while, from the fact that is an orthogonal matrix, we have . Therefore, (51) can be written as Applying the linear transformation into the above integral, we get ; the quadratic form is reduced to Hence, Switching to hyperspherical coordinates with radius , we get where is the volume of the -sphere , and hence From the fact that and the definition of the gamma function, we obtain successively and, finally, applying the normalizing factor as in (23), we derive (58) and the theorem has been proved.
Therefore, for the spherically contoured case, the following holds.
Theorem 6. The generalized Fisher information of a spherically contoured r.v. is given by
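A numerical cross-check of the spherically contoured case is sketched below for p = 1. It is illustrative only and assumes three things not quoted from the source: the density f(x) = C σ^{-1} exp{-g |z|^{1/g}} with g = (γ - 1)/γ; the definition J_α(X) = E|∇ log f(X)|^α of the generalized Fisher information; and the closed form J_α = (γ/(γ - 1))^{α/γ} σ^{-α} Γ((α + p(γ - 1))/γ) / Γ(p(γ - 1)/γ), which is what the hyperspherical computation gives under that density. The numeric routine is restricted to γ > 1, where the score is bounded near the mean.

```python
import math


def gamma_order_normal_pdf(x, mu=0.0, sigma=1.0, gamma=2.0):
    """Univariate gamma-order normal density (assumed form, see lead-in)."""
    if 0.0 <= gamma <= 1.0:
        raise ValueError("the shape parameter must lie outside [0, 1]")
    g = (gamma - 1.0) / gamma
    c = math.pi ** -0.5 * math.gamma(1.5) / math.gamma(g + 1.0) * g ** g
    z = abs(x - mu) / sigma
    try:
        power = z ** (1.0 / g)
    except OverflowError:
        power = math.inf
    return c / sigma * math.exp(-g * power)


def generalized_fisher_closed_form(alpha, gamma, p=1, sigma=1.0):
    """Assumed closed form of J_alpha for a spherically contoured r.v."""
    g = (gamma - 1.0) / gamma
    ratio = math.gamma(g * p + alpha / gamma) / math.gamma(g * p)
    return (1.0 / g) ** (alpha / gamma) * sigma ** -alpha * ratio


def generalized_fisher_numeric(alpha, gamma, sigma=1.0, half_width=60.0, n=120_000):
    """Trapezoidal estimate of E|d/dx log f(X)|^alpha for p = 1 and gamma > 1."""
    if gamma <= 1.0:
        raise ValueError("this numeric sketch only covers gamma > 1")
    a = -half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        z = abs(x) / sigma
        score = z ** (1.0 / (gamma - 1.0)) / sigma  # |d/dx log f(x)| under the assumed density
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gamma_order_normal_pdf(x, 0.0, sigma, gamma) * score ** alpha
    return total * h


if __name__ == "__main__":
    print(generalized_fisher_closed_form(2.5, 3.0, 1, 1.5))
    print(generalized_fisher_numeric(2.5, 3.0, 1.5))
```

Under these assumptions, α = γ = 2 recovers the usual Fisher information p/σ² of the normal distribution, and α = γ gives p σ^{-α} in general.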
3. Bounds for the Generalized Fisher Information
For the defined generalized Fisher information measure and the γ-order normal distribution, it is clear that the values of as in (50) depend on the two parameters and . In this section we will investigate certain bounds for the under these parameters. For dimension , we obtain bounds for parameter which are greater or lower than the shape parameter under some restrictions (see Proposition 7) while for dimension , the restrictions are removed; see Corollary 8.
In the following proposition we provide some inequalities for the generalized Fisher entropy type information measure for the γ-order normally distributed r.v. with positive order , that is, for . We denote with the point of minimum for the positive gamma function, , ; that is, .
Proposition 7. The generalized Fisher information measures with where the vectors of the orthogonal scale matrix have the same norm satisfy the inequalities for values of where .
Proof. For the proof of the first branch of (59) we assume that ; that is, . Then, we have . This implies when . That is, if the inequality holds, then , as the gamma function is an increasing function for . Inequality is equivalent to . As a result, (60) holds indeed, provided that , and thus Our main assumption of together with the fact that for all (defined) parameter values leads us to . Then, inequality (61) provides and, through (58), we derive that for , with ; that is, the first branch of (59) holds. The order of inequalities, , is valid, as is valid. This is true, because implies ; that is, . The values of are decreasing and for all . Moreover, .
For the proof of the third branch of (59) we assume now that ; that is, . We then have and hence provided that . That is, if the inequality holds, then , as the gamma function is an increasing function for . Inequality is equivalent to . As a result, (63) holds indeed, for orders such that , and so The assumption , together with the fact that for all defined order values , leads to . Then, inequality (64) provides and, using (58), we derive that for ; that is, the third branch of (59) holds. These inequalities have a valid order when is valid, that is, when is assumed. Therefore, , where as .
Finally, for the case where , from (58) we easily get ; that is, the middle branch of (59) holds. In this case, the restriction is not binding.
Notice that, as the quantity with is the known Fisher information measure with respect to the multivariate normal distribution, Proposition 7 shows that the generalized Fisher information for the family of distributions is greater than this quantity when and lower when .
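The comparison described in this remark can be tabulated for sample parameter values. The sketch below is hedged accordingly: it assumes the closed form J_α = (γ/(γ - 1))^{α/γ} σ^{-α} Γ((α + p(γ - 1))/γ) / Γ(p(γ - 1)/γ) for the spherically contoured case, which is not quoted from the source, and it reports the ratio of J_α to the reference quantity p σ^{-α} that this form yields at α = γ. The ratio does not depend on σ. The function name is illustrative, and the sample values do not replace the parameter restrictions stated in Proposition 7.

```python
import math


def fisher_ratio(alpha, gamma, p=1):
    """Ratio J_alpha / (p * sigma**-alpha) under the assumed closed form."""
    g = (gamma - 1.0) / gamma
    j_scaled = (1.0 / g) ** (alpha / gamma) * math.gamma(g * p + alpha / gamma) / math.gamma(g * p)
    return j_scaled / p


if __name__ == "__main__":
    # Fix the shape parameter and let alpha move on both sides of gamma.
    gam = 2.0
    for p in (1, 2, 5):
        for alpha in (1.5, 2.0, 3.0, 4.0):
            print(p, alpha, fisher_ratio(alpha, gam, p))
```

For the sample values with p = 1 and γ = 2, the ratio is below 1 at α = 1.5, equal to 1 at α = 2, and above 1 at α = 4.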
Also notice that as the dimension of the involved variable increases then ; for example, . Moreover, as rises. Therefore, Proposition 7 holds without, practically, the restrictions of and , for large enough values of .
Corollary 8. The generalized Fisher entropy type information measure of a random variable , with as in Proposition 7, satisfies the following inequalities for shape parameter and dimension :
Corollary 9. The generalized Fisher information measure of a random variable following the -variate, , elliptically contoured Laplace distribution , as in Proposition 7, is always lower than for all the parameter values of ; that is, For the multivariate normal distribution case, that is, with , we have while is reduced to the known Fisher information for the multivariate ; that is, .
Proof. The normal distribution case is straightforward from Corollary 8. For the Laplace case, as from (29), then for ; that is, the inequality holds for all values of , and the corollary has been proved.
In this paper we considered the generalized form of the multivariate normal distribution, namely, the γ-order normal distribution, or . This generalization is obtained as an extremal of the LSI corresponding to a power generalization of the entropy type Fisher information.
The Shannon entropy of the introduced distribution was evaluated (including the specific cases of the multivariate elliptically contoured uniform and Laplace distributions, resulting from ), while the generalized entropy type information measure , which extends the known entropy type Fisher information , was also evaluated; see Theorem 6. In Proposition 7 and in the following Corollaries 8 and 9, we obtained boundaries for the generalized information measure .
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
References
[1] L. Gross, “Logarithmic Sobolev inequalities,” American Journal of Mathematics, vol. 97, no. 4, pp. 1061–1083, 1975.
[2] P. Federbush, “Partially alternate derivation of a result of Nelson,” Journal of Mathematical Physics, vol. 10, no. 1, pp. 50–52, 1969.
[3] C. P. Kitsos and N. K. Tavoularis, “Logarithmic Sobolev inequalities for information measures,” IEEE Transactions on Information Theory, vol. 55, no. 6, pp. 2554–2561, 2009.
[4] C. P. Kitsos and N. K. Tavoularis, “New entropy type information measures,” in Proceedings of the Information Technology Interfaces International Conference (ITI '09), V. Luzar-Stiffer, I. Jarec, and Z. Bekic, Eds., pp. 255–259, Dubrovnik, Croatia, June 2009.
[5] C. P. Kitsos, T. L. Toulias, and P. C. Trandafir, “On the multivariate γ-ordered normal distribution,” Far East Journal of Theoretical Statistics, vol. 38, no. 1, pp. 49–73, 2012.
[6] F. B. Weissler, “Logarithmic Sobolev inequalities for the heat-diffusion semigroup,” Transactions of the American Mathematical Society, vol. 237, pp. 255–269, 1978.
[7] E. A. Carlen, “Superadditivity of Fisher's information and logarithmic Sobolev inequalities,” Journal of Functional Analysis, vol. 101, no. 1, pp. 194–211, 1991.
[8] A. Cotsiolis and N. K. Tavoularis, “On logarithmic Sobolev inequalities for higher order fractional derivatives,” Comptes Rendus de l'Académie des Sciences I, vol. 340, no. 3, pp. 205–208, 2005.
[9] M. del Pino, J. Dolbeault, and I. Gentil, “Nonlinear diffusions, hypercontractivity and the optimal Lp-Euclidean logarithmic Sobolev inequality,” Journal of Mathematical Analysis and Applications, vol. 293, no. 2, pp. 375–388, 2004.
[10] E. Gómez, M. A. Gómez-Villegas, and J. M. Marín, “A multivariate generalization of the power exponential family of distributions,” Communications in Statistics - Theory and Methods, vol. 27, no. 3, pp. 589–600, 1998.
[11] S. Nadarajah, “A generalized normal distribution,” Journal of Applied Statistics, vol. 32, no. 7, pp. 685–694, 2005.
[12] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, NY, USA, 2nd edition, 2006.
[13] C. P. Kitsos and T. L. Toulias, “New information measures for the generalized normal distribution,” Information, vol. 1, no. 1, pp. 13–27, 2010.
[14] G. Verdoolaege and P. Scheunders, “On the geometry of multivariate generalized Gaussian models,” Journal of Mathematical Imaging and Vision, vol. 43, no. 3, pp. 180–193, 2012.
[15] C. P. Kitsos and T. L. Toulias, “Entropy inequalities for the generalized Gaussian,” Journal of Communication and Computer, vol. 9, no. 1, pp. 56–64, 2012.
Copyright © 2015 Thomas L. Toulias. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.