Recent Advances in Univariate and Multivariate ModelsView this Special Issue
Research Article | Open Access
Masses of Negative Multinomial Distributions: Application to Polarimetric Image Processing
This paper derives new closed-form expressions for the masses of negative multinomial distributions. These masses can be maximized to determine the maximum likelihood estimator of its unknown parameters. An application to polarimetric image processing is investigated. We study the maximum likelihood estimators of the polarization degree of polarimetric images using different combinations of images.
The univariate negative binomial distribution is uniquely defined in many statistical textbooks. However, extensions defining multivariate negative multinomial distributions (NMDs) are more controversial. Most definitions are based on the probability generating function (PGF) of these distributions. Doss  proposed to define the PGF of an NMD as the inverse th power of a polynomial linear in each of its variables. This definition can also be found in the famous textbook [2, page 93] and the computation of its modes has been investigated in . A more general class of NMDs introduced in  was characterized by PGFs of the form , where , is an matrix, and . In particular, matrices yielding infinitely divisible PGFs were derived. Finally, Bar-Lev et al.  introduced NMDs whose PGFs are defined as the inverse th power of any affine polynomial. Necessary and sufficient conditions on the coefficients of this affine polynomial were derived to obtain the PGF of a multivariate distribution defined on (where is the set of nonnegative integers) . These very general multivariate NMDs were recently used for image processing applications in .
The family of NMDs introduced in  can be defined as follows. Let us denote the set of the first nonzero integers. We denote as the monomial obtained by multiplying all the entries of the vector whose indexes belong to , where stands for any subset of the indexes. Let be an affine polynomial with respect to the variables such that . The NMD distribution defined at pair is represented by its PFG which is given by Such laws are denoted as . However, as explained in , all couples do not provide a valid NMD. More specifically, Bernardoff has derived a finite number of conditions over such that is the PGF of an NMD for all positive integer . The corresponding expression of the coefficient of in the Taylor expansion of is given by the formula where and is the set of all subsets of . However, this expression of does not allow us to explicitly compute the masses of NMDs in the general case.
As a first goal of this paper, we propose a way of computing the masses of multivariate NMDs defined above. A specific attention is devoted to bivariate and trivariate cases. In particular, it allows us to retrieve the results of  obtained for bivariate NMDs. The second part of the paper is devoted to the application of NMDs to image processing, more specifically to polarimetric image processing [8, 9]. Polarimetric image processing has received a considerable attention in the image processing and optical communities (see for instance [10–12] and references therein). The state of polarization of a polarimetric image is classically characterized by the degree of polarization (DoP) whose estimation is of major importance [13, 14]. The DoP of polarimetric images can be classically estimated by using four images associated to four different polarizations . However, estimating the DoP using less than four images is interesting since it allows one to reduce the acquisition time and the resulting cost of the imaging system. As a consequence, there has been recently an increasing interest in deriving estimators of the DoP based on a reduced number of polarimetric images. Depending on the intensity of the acquired images, polarimetric images are referred to as low flux or high flux images (low flux corresponding to a small intensity and high flux to a larger intensity). DoP estimation based on a single polarimetric image was considered in [16, 17] under high flux and low flux assumptions. DoP estimators derived from two intensity images degraded by fully developed speckle noise were studied in [18, 19]. Finally, imaging systems using three polarimetric images were studied in [20, 21], under high flux and low flux assumptions.
This paper studies the maximum likelihood estimators (MLEs) of the square DoP based on two or three polarimetric images. These estimators are computed by maximizing the masses of bivariate or trivariate NMDs derived in the first part of this work.
The paper is organized as follows. Section 2 recalls important results on NMDs. Section 3 proposes a new way of computing masses of NMDs. A particular attention is devoted to bivariate and trivariate cases. Section 4 addresses the problem of estimating the square DoP of low flux polarimetric images using the maximum likelihood (ML) principle. Different MLEs are constructed depending on the number of available polarimetric images. Simulation results are presented in Section 5.
2. Negative Multinomial Distributions
An -variate NMD is the distribution of a random vector taking its values in whose PGF is where denotes the mathematical expectation, , and is an affine polynomial of order . (A polynomial with respect to is affine if the one variable polynomial can be written as (for any ), where and are polynomials with respect to the 's with .) These discrete distributions have received much interest in the literature (see for instance  and the references therein). Of course, the affine polynomial has to satisfy appropriate conditions to ensure that is a PGF. These conditions include the trivial equality . However, determining all pairs such that is a PGF is still an open problem (see , for discussions related to this problem). As explained in , the affine polynomial can be rewritten where are positive numbers and is an affine polynomial such that . The Taylor expansions of and in the neighborhood of will be denoted as follows: where and . Equations (3) and (4) clearly show that the masses of multivariate NMDs denoted as can be expressed as follows:
3. Masses of Negative Multinomial Distributions
In this section, we derive new expressions for the coefficients that will be used to compute the masses of NMDs. The particular cases of bivariate and trivariate NMDs will play an important role for the estimation of the DoP on polarimetric images. In order to compute the , we derive several results summarized in this section whereas all demonstrations are reported in the appendix.
Theorem 1. Denote as the set of nonempty subsets of . Any affine polynomial such that denoted as can be expressed as follows: where is the cardinal of the set . Moreover where is the polynomial defined by and is related to the variables as follows:
Remark 2. In the trivariate case defined by , the polynomial can be expressed as where the coefficients of the polynomial can be determined using the relations The next theorem provides a relation between the coefficients of the polynomials and introduced above.
Theorem 3. Let , , and be the affine polynomial defined in (10) and (11). For any and in , denote as the coefficient of in the Taylor expansion of and as the coefficient of in the Taylor expansion . The following relation can be obtained:
The masses of NMDs can be directly obtained from this theorem. The particular cases of bivariate and trivariate NMDs are considered in the following subsections since the corresponding masses will be useful in the application considered in the second part of this paper.
3.1. Bivariate NMDs
Theorem 4. Consider the affine polynomial of order 2 with variables defined by The coefficient of in the Taylor expansion of , can be computed as follows
3.2. Trivariate NMDs
Theorem 6. Consider the affine polynomial with the three variables defined by The coefficient of in the Taylor expansion of are When , , , , , and , an equivalent expression is
4. Estimating the Polarization Degree of Low Flux Polarimetric Images Using Maximum Likelihood Methods
4.1. Low Flux Polarimetric Images
The state of the polarization of the light can be described by the random behavior of a complex vector , called the Jones vector, whose covariance matrix, called the polarization matrix, is where* denotes the complex conjugate. The covariance matrix is a nonnegative Hermitian matrix whose diagonal terms are the intensity components in the and directions. The cross terms of are the correlations between the Jones components. If we assume a fully developed speckle, the Jones vector is distributed according to a complex Gaussian distribution whose probability density function (pdf) is : where is the determinant of the matrix and denotes the conjugate transpose operator. As a consequence, the statistical properties of are fully characterized by the covariance matrix . The different components of can be classically estimated by using four intensity images that are related to the components of the Jones vector as follows (see , for more details): The state of the polarization of the light is classically characterized by the square DoP defined by [15, pages 134–136] where is the trace of the matrix . The light is totally depolarized for , totally polarized for , and partially polarized when . As a consequence, estimating the square DoP of a polarimetric image is important in many practical applications. Different estimation methods of using several combinations of intensity images were studied in . Since only one realization of the random vector was available for a given pixel of a polarimetric image, the image was supposed to be locally stationary and ergodic. These assumptions were used to derive square DoP estimators using several neighbor pixels belonging to a so-called estimation window.
This paper considers practical applications where the intensity level of the reflected light is very low (low flux assumption), which leads to an additional source of fluctuations on the detected signal. Under the low flux assumption, the quantum nature of the light leads to a Poisson-distributed noise which can become very important relatively to the mean value of the signal at a low photon level. As a consequence, the observed pixels of the low flux polarimetric image are discrete random variables contained in the vector such that the conditional distributions of the random variables , for are independent and distributed according to Poisson distributions with means , for . The resulting joint distribution of is a multivariate mixed Poisson distribution : where , , and is the joint pdf of the intensity vector. This section studies estimators of the square DoP defined in (28) based on several vectors belonging to the estimation window. These estimators are constructed from estimates of the covariance matrix elements , . As explained in the introduction, several studies have been recently devoted to the estimation of the square DoP using less than four polarimetric images. This paper goes into this direction by deriving estimators based on the observation of 2 or 3 polarimetric images.
The joint distribution of the intensity vector is known to be a multivariate gamma distribution whose Laplace transform is  where the affine polynomial is with and As a consequence, the distribution of is an NMD whose PGF can be written as (the interested reader is invited to consult  for more details) The results of Section 2 allow us to compute the masses of that will be useful for studying the maximum likelihood estimator (MLE) of the square DoP.
4.2. MLE Using Three Polarimetric Images
The ML estimator of based on several vectors belonging to the estimation window (where and is the number of pixels of the observation window) is obtained by maximizing the log-likelihood with respect to (note again that is the mass of a trivariate NMD that has been computed in Section 3.2). The practical determination of the ML estimator of is achieved by using a Newton-Raphson procedure. The ML estimators of the vector , denoted as , are then plugged into (28) to provide the ML estimator of the square DoP based on three polarimetric images:
4.3. MLE Using Two Polarimetric Images
The MLE of based on several vectors belonging to the estimation window is obtained by maximizing the log-likelihood with respect to (note that is the mass of a bivariate NMD that has been computed in Section 3.1). The practical determination of the ML estimator of is achieved by using a Newton-Raphson procedure. The ML estimators of the vector elements, denoted as , are then plugged into (28) to provide the MLE of the square DoP based on two polarimetric images:
5. Simulation Results
5.1. Estimation Performance
The performance of the ML estimators of the square DoP based on two or three polarimetric images has been evaluated via several experiments. The first simulations compare the log Mean Square Errors (MSEs) of the square DoP estimators constructed from two or three images. Eleven different covariance matrices of the Jones vector have been considered in order to define typical values of the DoP. The values of , for , (defining the covariance matrix elements of the Jones vector) and the corresponding values of the square DoPs are reported in Table 1. Note that all the covariance matrices , , have been normalized so that the mean of the total intensity is equal to 4. Thus, the average number of photons collected on each pixel equals 4 for each matrix of the considered set. This point is in agreement with the low flux assumption.
Figure 1(a) display the empirical log MSEs of the square DoP estimators for the set of covariance matrices defined in Table 1 as a function of the true square DoP value. The red plus markers + correspond to the estimators obtained for two polarimetric images (2D MLE), whereas the blue cross markers × correspond to the MLE obtained for three polarimetric images (3D MLE). Note that these empirical MSEs have been computed for a square observation window of size pixels, based on Monte-Carlo runs. The theoretical asymptotic log MSEs of the MLE are also displayed in Figure 1(a) with continuous lines. These asymptotic values correspond to the Cramer-Rao Lower Bound (CRLB) for the parameter . The MLE is known to be asymptotically unbiased and efficient under mild regularity conditions (that are satisfied for ). Thus, the MSE of the estimates can be approximated a for large sample by the CRLB. More details about the way of computing the square DoP CLRBs can be found in . Figure 1(a) indicates that the empirical MSEs are in good agreement with the corresponding CRLBs, except for the matrices and . Indeed, the CRLBs for these two matrices cannot be computed since the true value of the parameters belongs to the boundary of its definition domain. The empirical bias, standard deviations (“std”), MSEs, and asymptotic variances (“avar”) of the estimators of are also reported in Table 2. It is interesting to note that the MLE obtained using 3 images is slightly more biased than the one obtained using 2 images. However, the MLE based on 3 images provides lower MSEs than the estimator based on 2 images, as expected.
(a) Low flux
(b) High flux
In order to appreciate the influence of the Poisson noise due to the low flux assumption, experiments have been conducted using the high flux assumption. In this case, the intensity vector is assumed to be known. Thus, the high flux MLEs using two and three images can be derived from the intensity vectors and . The results are depicted in Figure 1(b). A comparison between Figures 1(a) and 1(b) allows one to appreciate a similar global behaviour for all the estimators, with a maximum MSE near and decreasing MSEs as goes to or . The degradation of the estimation performance due to the presence of Poisson noise (due to the low flux assumption) can also be clearly noticed.
The next set of simulations studies the performance of the different estimators as a function of the sample size . Figures 2(a) and 2(b) show the log MSEs of the square DoP estimates obtained for and images (for the two particular matrices and ). The empirical bias, standard deviations (“std”), MSEs, and asymptotic variances (“avar”) are also reported in Tables 3 and 4. One can see that the empirical MSEs are in good agreement with their theoretical asymptotic values for a large enough sample size, that is, . Moreover, the gain of a performance using images instead of is more significant for small values of (indeed, the difference between the different curves is more pronounced in the left figure corresponding to than in the right figure corresponding to .)