Journal of Probability and Statistics
Volume 2017 (2017), Article ID 6579537, 24 pages
Research Article

Numerical Reconstruction of the Covariance Matrix of a Spherically Truncated Multinormal Distribution

1Istituto Nazionale di Statistica (ISTAT), Via Cesare Balbo 16, 00184 Rome, Italy
2Italian Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Via Enrico Fermi 45, 00044 Frascati, Italy

Correspondence should be addressed to Filippo Palombi; filippo.palombi@enea.it

Received 7 July 2016; Revised 7 October 2016; Accepted 12 October 2016; Published 10 January 2017

Academic Editor: Ramón M. Rodríguez-Dagnino

Copyright © 2017 Filippo Palombi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


We relate the matrix of the second moments of a spherically truncated normal multivariate to its full covariance matrix Σ and present an algorithm to invert the relation and reconstruct Σ from the truncated moments. While the eigenvectors of Σ are left invariant by the truncation, its eigenvalues are nonuniformly damped. We show that the eigenvalues of Σ can be reconstructed from their truncated counterparts via a fixed point iteration, whose convergence we prove analytically. The procedure requires the computation of multidimensional Gaussian integrals over a Euclidean ball, for which we extend a numerical technique, originally proposed by Ruben in 1962, based on a series expansion in chi-square distributions. In order to study the feasibility of our approach, we examine the convergence rate of some iterative schemes on suitably chosen ensembles of Wishart matrices. We finally discuss the practical difficulties arising in sample space and outline a regularization of the problem based on perturbation theory.

1. Introduction

It has been more than forty years since Tallis [1] worked out the moment-generating function of a normal multivariate X, subject to the conditional event that X falls within a centered ellipsoid aligned with its covariance matrix Σ, as in (1). The perfect match between the symmetries of the ellipsoid and those of the distribution allows for an exact analytic result, from which the complete set of multivariate truncated moments can be extracted upon differentiation. Consider, for instance, the matrix of the second truncated moments, expressing the covariances among the components of X under the constraint. From Tallis' paper it turns out that this matrix is just a rescaling of Σ, as stated by (2), with the scale factor being a ratio of cumulative distribution functions of chi-square variables with shifted numbers of degrees of freedom. Inverting (2), so as to express Σ as a function of the truncated matrix, is trivial, since the scale factor is a scalar damping factor independent of Σ. In this paper, we shall refer to such inverse relation as the reconstruction of Σ from its truncated counterpart. Unfortunately, life is not always so easy. In general, the effects produced on the expectation of functions of X by cutting off the probability density outside a generic domain can hardly be calculated in closed form, especially if the boundary of the domain is shaped in a way that breaks the ellipsoidal symmetry of the distribution. Thus, for instance, unlike (2), the matrix of the second truncated moments is expected in general to display a nonlinear and nontrivial dependence upon Σ.
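For concreteness, Tallis' scalar correction can be evaluated in a few lines of code. The sketch below (function names are ours; it is restricted to even numbers of degrees of freedom, for which the chi-square CDF has an elementary closed form) computes the damping factor of (2) as a ratio of chi-square CDFs whose degrees of freedom differ by two:

```python
import math

def chi2_cdf_even(x, k):
    """CDF of a chi-square variable with an even number k of d.o.f.
    (closed form: 1 - exp(-x/2) * sum_{j < k/2} (x/2)^j / j!)."""
    s = sum((x / 2.0) ** j / math.factorial(j) for j in range(k // 2))
    return 1.0 - math.exp(-x / 2.0) * s

def tallis_factor(r2, k):
    """Scalar damping factor of the ellipsoidally truncated covariance:
    ratio of chi-square CDFs with k+2 and k degrees of freedom."""
    return chi2_cdf_even(r2, k + 2) / chi2_cdf_even(r2, k)
```

For instance, `tallis_factor(2.0, 2)` is about 0.418, so a bivariate normal truncated to the aligned ellipsoid of square radius 2 retains roughly 42% of each covariance eigenvalue; as the radius grows, the factor approaches one and the truncation becomes immaterial.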

In the present paper, we consider the case where the truncation domain is a Euclidean ball centered at the origin with square radius r². Specifically, we discuss the reconstruction of Σ from the matrix of the spherically truncated second moments. To this aim, we need to mimic Tallis' calculation, with (1) being replaced by the conditional event that X falls within the ball, as in (3). This is precisely an example of the situation described in the previous paragraph: although the ball has a higher degree of symmetry than the ellipsoid, still there is no possibility of obtaining a closed-form relation between Σ and the truncated matrix, since the spherical boundary breaks the ellipsoidal symmetry of the distribution: technically speaking, in this case we cannot perform any change of variable under the defining integral of the moment-generating function which would reduce the dimensionality of the problem, as in Tallis' paper.

In spite of that, the residual symmetries characterizing the truncated distribution help simplify the problem in the following respects: (i) the reflection invariance of the whole setup still yields vanishing truncated means and (ii) the rotational invariance of the ball preserves the possibility of defining the principal components of the distribution just as in the unconstrained case. In particular, the latter property means that Σ and the matrix of the truncated second moments share the same orthonormal eigenvectors. In fact, the reconstruction of Σ from the truncated matrix amounts to solving a system of nonlinear integral equations, having the eigenvalues of Σ as unknown variables and the eigenvalues of the truncated matrix as input parameters. In the absence of analytic techniques to evaluate the integrals involved exactly, we resort to a numerical algorithm, of which we investigate feasibility, performance, and optimization.

The paper is organized as follows. In Section 2, we describe a few examples illustrating the occurrence of spherical truncations in practical situations. In Section 3, we show that the aforementioned integral equations have the analytic structure of a fixed point vector equation for the full eigenvalue spectrum; that is to say, the spectrum is left invariant by a suitable nonlinear map built from the truncated moments. This suggests achieving the reconstruction numerically via suitably chosen iterative schemes. In Section 4, we prove the convergence of the simplest of them by inductive arguments, the validity of which relies upon the monotonicity properties of ratios of Gaussian integrals over the Euclidean ball. In Section 5, we review some numerical techniques for the computation of Gaussian integrals over the ball with controlled systematic error. These are based on and extend a classic work by Ruben [2] on the distribution of quadratic forms of normal variables. For the sake of readability, we defer proofs of statements made in this section to the appendix. In Section 6, we report on our numerical experiences: since the simplest iterative scheme, namely, the Gauss–Jacobi iteration, is too slow for practical purposes, we investigate the performance of its improved version based on overrelaxation; as expected, we find that the latter has a higher convergence rate; yet it still slows down polynomially as the truncation strengthens and exponentially as the dimension grows; in order to reduce the slowing down, we propose an acceleration technique, which boosts the higher components of the eigenvalue spectrum. A series of Monte Carlo simulations enables us to quantify the speedup. In Section 7 we discuss the problems arising when the input matrix is affected by statistical uncertainty and propose a regularization technique based on perturbation theory. To conclude, we summarize our findings in Section 8.

2. Motivating Examples

Spherical truncations of multinormal distributions may characterize different kinds of experimental devices and may occur in various problems of statistical and convex analysis. In this section, we discuss two motivating examples.

2.1. A Two-Dimensional Gedanken Experiment in Classical Particle Physics

Consider the following ideal situation. An accelerator physicist prepares an elliptical beam of classical particles with Gaussian transversal profile. The experimenter knows a priori the spatial distribution of the beam, that is, the covariance matrix of the two-dimensional coordinates of the particles on a plane orthogonal to their flight direction. We can assume with no loss of generality that the transversal coordinate system has origin at the maximum of the beam intensity and axes along the principal components of the beam; thus the covariance matrix is diagonal in these coordinates. The beam travels straight until it enters a linear coaxial pipeline with circular profile, schematically depicted in Figure 1, where the beam is longitudinally accelerated. While the outer part of the beam is stopped by an absorbing wall, the inner part propagates within the pipeline. At the end of the beam flight the physicist wants to know whether the transversal distribution of the particles has changed, owing to unknown disturbance factors arisen within the pipeline. Accordingly, he measures again the spatial covariance matrix of the beam. Unfortunately, the absorbing wall has cut off the Gaussian tail, thus damping the covariance matrix and making it no longer comparable to the original one. To perform such a comparison in the general case of an anisotropic beam, the covariance matrix of the truncated circular beam has to go through the reconstruction procedure described in the next sections.

Figure 1: A classical particle beam with elliptical transversal profile is cut off upon entering a circular coaxial pipeline.
2.2. A Multivariate Example: Connections to Compositional Data Analysis

Compositional Data Analysis (CoDA) has been the subject of a number of papers, pioneered by Aitchison [3] over the past forty years. As a methodology of statistical investigation, it finds application in all cases where the main object of interest is a multivariate with strictly positive continuous components to be regarded as portions of a total amount (the unit normalization is conventionally adopted in the mathematical literature). In other words, compositional variates belong to the simplex defined in (4), with the taxi-cab norm entering the normalization constraint, while compositions with different norms can always be projected onto the simplex via the closure operator, which rescales a vector by its taxi-cab norm. There are countless types of compositional data, whose analysis raises problems of interest for statistics [4], for example, geochemical data, balance sheet data, and election data. The simplex constraint induces a kind of dependency among the parts of a composition that goes beyond the standard concept of covariance. This invalidates many ordinary techniques of statistical analysis.

In order to measure distances on the simplex, Aitchison introduced a positive symmetric function, explicitly defined in (5). The Aitchison distance is a key tool in CoDA. It is scale invariant in both its first and second arguments; that is, it is left invariant by rescalings of either argument by a positive constant. Accordingly, its support can be extended to the whole positive orthant by imposing (6). It was proved in [5] that the Aitchison distance is a norm-induced metric on the simplex, provided the latter is given an appropriate normed vector space structure. Owing to the compositional constraint, the components of a composition sum to a fixed total. Accordingly, the description of a composition in terms of all its components is redundant: an essential representation requires compositions to be properly mapped onto tuples with one component less. Among various possibilities, the Isometric Log-Ratio (ILR) transform introduced in [5] is the only known map of this kind leaving the Aitchison distance invariant. More precisely, the ILR fulfills (7).
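Both the scale invariance of the Aitchison distance and the isometry property (7) of the ILR map can be checked numerically. A minimal sketch, using the standard equivalent form of (5) as the Euclidean distance between centered log-ratio (clr) images, and assuming the common pivot choice of ILR basis (function names are ours):

```python
import math

def clr(x):
    """Centered log-ratio transform: ln x_i minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    m = sum(logs) / len(logs)
    return [l - m for l in logs]

def ilr(x):
    """Pivot-type ILR coordinates (one standard orthonormal basis choice):
    z_i = sqrt(i/(i+1)) * ln(gmean(x_1..x_i) / x_{i+1}), i = 1..D-1."""
    logs = [math.log(v) for v in x]
    z = []
    for i in range(1, len(x)):
        gm = sum(logs[:i]) / i  # log of the geometric mean of the first i parts
        z.append(math.sqrt(i / (i + 1.0)) * (gm - logs[i]))
    return z

def aitchison_dist(x, y):
    """Aitchison distance as the Euclidean distance between clr images."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))
```

The Euclidean distance between ILR images coincides with the Aitchison distance, and rescaling either argument leaves the distance unchanged, in line with (6).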

It is known from [6] that if a variate is log-normal, then its closure is a logistic-normal variate (the reader should notice that in [6] the simplex is defined with unit normalization; thus the relevant property of [6] is here reformulated so as to take such difference into account), with a known relation between the parameters of the two distributions. Analogously, it is not difficult to show that the ILR transform of the closure is a normal variate, with mean and covariance related to those of the original variate via the change of basis matrices derived in [5]. To sum up, the correspondence is given by (8). Now, suppose that a variate fulfills (8) and has a natural interpretation as a composition, a representative set of observations is given, and we wish to select those units which are compositionally closer to the center of the distribution, according to the Aitchison distance. To see that the problem is well posed, we first turn the observations into a set of compositional observations via the closure operator. Then, we consider the special point representing the center of the distribution in a compositional sense: it minimizes the expected square Aitchison distance from the composition, as shown in [7]. By virtue of (8) this center has a counterpart in the Gaussian representation. In order to select the observations which are closer to the center, we set a threshold and consider only those elements whose Aitchison distance from a sample estimate of the center lies below it. Such a selection rule operates a well-defined truncation of the distribution. Moreover, in view of (7) and (8), we have the correspondence (9), relating the Aitchison distance of a composition from the center to the Euclidean distance of its ILR image from the mean. As a consequence, we see that a compositional selection rule based on the Aitchison distance and (8) is equivalent to a spherical truncation of a multinormal distribution. Obviously, once the distribution has been spherically truncated, the covariance matrix of the remaining data is damped; thus an estimate of the full covariance matrix requires the reconstruction procedure described in the next sections.

2.3. Compositional Outlier Detection via Forward Search Techniques

Outlier detection in CoDA is a practical problem where the considerations made in Section 2.2 find concrete application. Most of the established methods for outlier detection make use of the Mahalanobis metric. This is however unfit to describe the distance between compositions. Log-ratio transforms allow one to get rid of the compositional constraint and make it possible to apply standard statistical methods [8]. Here we consider an innovative approach, namely, the Forward Search Algorithm (FSA), introduced in [9] and thoroughly discussed in [10]. The FSA admits an elegant extension to CoDA, of which the covariance reconstruction is a key step. In the next few lines we sketch the original algorithm and outline its extension to CoDA.

2.3.1. Construction of the Signal

In its standard formulation the FSA applies to normal data. It assumes a dataset of independent multivariate observations. The null hypothesis is that all the elements of the dataset are independent realizations of a normal variable. The FSA consists of a sequence of steps where data are recursively sorted. Along the recursion a signal is built and tested.

As a preliminary step, a small number of observations is randomly chosen from the bulk of the dataset. This initial subset is used to provide initial estimates of the true mean and the true covariance matrix. At the generic step of the algorithm, with a fit subset of a given size at hand, one proceeds as follows:
(i) sort the elements of the dataset according to the increasing values of the square Mahalanobis distance (10), computed with the current estimates;
(ii) take the first element of the newly sorted dataset beyond the current subset size and regard its distance as the new component of the signal;
(iii) let the new fit subset collect the corresponding number of leading observations of the newly sorted dataset;
(iv) use the new fit subset to provide new estimates of the true mean and the true covariance matrix.
Notice that the fit subset is a truncated dataset. Therefore, the covariance estimate must include Tallis' correction factor, (2) with an appropriate truncation radius. While the recursion proceeds, the inliers progressively populate the fit subset. The recursion stops when the first outlier entering the subset produces a statistically relevant discontinuity in the signal.
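A single step of the recursion can be sketched as follows (a simplified two-dimensional illustration under our own conventions: the current subset is taken as the leading points of the dataset rather than a robustly chosen one, and the Tallis correction of (2) is omitted):

```python
import math, random

def mean_cov(pts):
    """Sample mean and covariance of a list of 2-D points."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def maha2(p, m, S):
    """Square Mahalanobis distance of p for a 2x2 covariance S."""
    det = S[0][0] * S[1][1] - S[0][1] ** 2
    dx, dy = p[0] - m[0], p[1] - m[1]
    return (S[1][1] * dx * dx - 2.0 * S[0][1] * dx * dy + S[0][0] * dy * dy) / det

def forward_step(data, subset_size):
    """One forward step: rank all points by distance from the current
    subset's estimates, read off the signal, and enlarge the subset."""
    m, S = mean_cov(data[:subset_size])
    ranked = sorted(data, key=lambda p: maha2(p, m, S))
    signal = maha2(ranked[subset_size], m, S)  # first point beyond the subset
    return ranked[:subset_size + 1], signal
```

In a full implementation the covariance returned by `mean_cov` would be divided by the Tallis factor before the next ranking, and the signal would be compared against its theoretical envelopes.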

2.3.2. Statistical Test of the Signal

Statistical tests are needed to assess the relevance of discontinuities observed in the signal. At each step of the algorithm a new test is actually performed. Specifically, at each step, the signal component is computed together with the lowest and highest values which are theoretically admissible for it under the null hypothesis at a given significance level. These values are nothing but suitable percentage points of the distribution of the relevant order statistic of the square Mahalanobis distance. Their envelopes along the recursion generate two curves that surround the signal when plotted versus the subset size. More precisely, one curve lies below it and the other lies above, provided the null hypothesis is not broken. The violation of one of the curves by the signal is interpreted as the result of the entrance of an outlier into the fit subset. Although the distribution of the signal cannot be calculated in closed form, its percentage points are obtained from a general formula, first derived in [11], yielding (11), with the corresponding percentage point of the Fisher distribution entering the right-hand side. Equation (11) holds to a decent approximation, as confirmed by numerical simulations.

2.3.3. Extension of the Forward Search to CoDA

When the dataset is compositional, it is unnatural to assume that its elements are realizations of a normal multivariate. In this case the use of the FSA as outlined above does not make sense at all. Sometimes it is reasonable to assume a logistic-normal model, as first argued in [6]. In this case we can use the FSA to find outliers, provided that we first modify the algorithm in two respects:
(i) we replace the Mahalanobis distance by the Aitchison one;
(ii) we perform statistical tests consistently with the change of null hypothesis.
Specifically, at the generic step of the algorithm, we sort the dataset according to the increasing values of the square Aitchison distance (12), where the compositional center of the current subset plays the role of the sample mean. Analogously, given the first element of the newly sorted dataset beyond the current subset size, we regard its distance as the new component of the signal. The percentage points of the Aitchison distance are obtained from those of the Mahalanobis distance by using the probability correspondence established in (9). Since the fit subset is a spherically truncated dataset, the estimate of the covariance matrix derived from it must undergo the reconstruction procedure described in the next sections.

2.4. General Covariance Reconstruction Problem

The examples discussed in the previous subsections are special cases of a more general inverse problem, namely, the reconstruction of the covariance matrix of a normal multivariate on the basis of the covariance matrix of the same multivariate truncated to some (convex) region. This is the simplest yet nontrivial inverse problem which can be naturally associated with the normal distribution. The spherical case corresponds to a setup where theoretical and practical aspects of the problem can be investigated with relatively modest mathematical effort. It is certainly a well-defined framework in which to study regularization techniques for nonlinear inverse problems in statistics, for which there is still much room for interesting work [12, 13].

3. Definitions and Setup

Let X be a random vector with jointly normal distribution in v dimensions and covariance matrix Σ. The probability that X falls within the centered Euclidean ball of square radius r² is measured by the Gaussian integral (13). Since Σ is symmetric positive definite, it has orthonormal eigenvectors. Let us denote by R the orthogonal matrix having these vectors as columns and by Λ = diag(λ₁, …, λ_v) the diagonal counterpart of Σ = RΛRᵀ. From the invariance of the ball under rotations, it follows that the integral depends upon Σ just by way of Λ. Accordingly, we rename the Gaussian probability content of the ball as α(r², λ), as in (14). Note that (14) is not sufficient to fully characterize the random vector under the spherical constraint, for which we need to calculate the full distribution law. Alternatively, we can describe X in terms of the complete set of its truncated moments (15). As usual, these can all be obtained from the moment-generating function (16) by differentiating the latter an arbitrary number of times with respect to the components of its argument; namely, (17). It will be observed that the moment-generating function is in general not invariant under rotations of X. Therefore, unlike α, the truncated moments depend effectively on both R and Λ. For instance, for the matrix of the second moments such dependence amounts to (18). By parity, the only nonvanishing terms in the sum of (18) are the diagonal ones. Hence, it follows that Σ and the matrix of the second truncated moments share R as a common diagonalizing matrix. In other words, if M = diag(μ₁, …, μ_v) collects the eigenvalues of the truncated matrix, then the latter equals RMRᵀ. Moreover, μ is related to λ by (19). The ratios μ_k/λ_k are naturally interpreted as adimensional correction factors to the eigenvalues of Σ, so they play the same role as the scalar damping factor in (2). However, the correction factor depends explicitly on the index k; thus each eigenvalue is damped differently from the others as a consequence of the spherical truncation.

Remark 1. In practical terms, (18)-(19) tell us that estimating the covariance matrix of X from a spherically truncated population of independent units via the classical estimator, with the sample mean subtracted, yields a damped result. Nonetheless, the damping affects only the eigenvalues of the estimator, whereas its eigenvectors are left invariant.

3.1. Monotonicity Properties of Ratios of Gaussian Integrals

Equations (14) and (19) suggest introducing a general notation for the Gaussian integrals over the ball, under the assumption of a diagonal covariance matrix. So, we define the family of integrals α(r², λ) with subscripts as in (20), with each subscript on the left-hand side addressing an additional factor of the corresponding square component under the integral sign on the right-hand side (no subscripts means no extra factor). Several analytic properties of such integrals are discussed in [14]. In the following proposition, we lay emphasis on some issues concerning the monotonicity trends of the ratios of these integrals.

Proposition 2 (monotonicities). Let λ_{∖k} denote the set of the full eigenvalues without λ_k. The ratios fulfill the following properties:
(π₁) α_k/α is a monotonic increasing function of λ_k at fixed r² and λ_{∖k};
(π₂) α_k/α is a monotonic decreasing function of λ_l (l ≠ k) at fixed r² and λ_{∖l};
(π₃) α_k/(λ_k α) is a monotonic decreasing function of λ_k at fixed r² and λ_{∖k},
where an innocuous abuse of notation has been made on writing λ_{∖k} in place of the ordered list of the remaining eigenvalues.

Proof. Let the symbol ∂_k denote a derivative with respect to λ_k. In order to prove property (π₁), we apply the chain rule of differentiation to α_k/α and then we pass the derivative under the integral sign in α_k and α. In this way, we obtain (21), whose right-hand side is proportional to the truncated variance of the square component and is therefore nonnegative. Moreover, since the truncated marginal density of the component is positive within a set of nonzero measure, the monotonic trend is strict.
Properties (π₂) and (π₃) are less trivial than (π₁). Indeed, the same reasoning as above now yields on the one hand (22) and on the other (23). Though not a priori evident, the right-hand sides of both (22) and (23) are negative (and vanish in the limit of infinite radius). The inequalities within Euclidean balls were first discussed in [14], while analogous inequalities within generalized Orlicz balls were discussed in [15, 16] for the case where the probability distribution is flat instead of being normal. More recently, a complete proof of both inequalities has been given in [17]. Despite the technical difficulties in proving them, their meaning should be intuitively clear. The variance inequality quantifies the squeezing affecting the fluctuations of the square component as a consequence of the truncation (in the unconstrained case the variance of the square component would be 2λ_k²). The covariance inequality follows from the opposition arising among the square components in proximity of the boundary of the ball. Indeed, if one square component is large, the other square components must be correspondingly small in order for the vector to stay within the ball.

3.2. Definition Domain of the Reconstruction Problem

A consequence of Proposition 2 is represented by the following.

Corollary 3. Given r², v, and k, the truncated eigenvalue μ_k is bounded as in (24), with M denoting the Kummer function; namely, M is defined by the confluent hypergeometric series (25).

Proof. The upper bound of (24) corresponds to the value of α_k/α in the limit where all the eigenvalues in λ_{∖k} tend to zero. This is indeed the maximum possible value allowed for μ_k according to properties (π₁) and (π₂) of Proposition 2. In order to perform this limit, we observe that (26) holds, with the symbol on the right-hand side representing the Dirac delta function (the reader who is not familiar with the theory of distributions may refer, for instance, to [18] for an introduction). Accordingly, we arrive at (27). The lower bound corresponds instead to the value taken by α_k/α as the eigenvalues in λ_{∖k} grow large while λ_k is kept fixed. In this limit, all the Gaussian factors in the probability density function except the kth one flatten to one. Hence, we arrive at (28). Numerator and denominator of the rightmost ratio are easily recognized to be integral representations of Kummer functions (see, e.g., [19, ch. 13]).
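The Kummer function entering the bound can be evaluated directly from its defining hypergeometric series; a minimal sketch (the function name is ours):

```python
import math

def kummer_M(a, b, z, tol=1e-15, max_terms=500):
    """Confluent hypergeometric (Kummer) function M(a, b, z), summed
    term by term from its series: sum_n (a)_n / (b)_n * z^n / n!."""
    term, total = 1.0, 1.0
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        term *= (a + n) / (b + n) * z / (n + 1)  # ratio of consecutive terms
        total += term
        n += 1
    return total
```

Two classical identities, M(a, a, z) = e^z and M(1, 2, z) = (e^z − 1)/z, provide convenient correctness checks.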

The upper bound of (24) can be sharpened, as clarified by the following.

Proposition 4 (bounds on the truncated moments). Let r², v, and λ be given. If the indices are permuted so that the eigenvalues are increasingly ordered, then the upper bounds (29) hold for the truncated eigenvalues.

Proof. The overall upper bound on the sum of the truncated moments follows from (30). At the same time, the sum can be split and bounded from below as in (31). The single upper bounds on the truncated eigenvalues are then obtained by combining (30)-(31). It will be noted that (29) is sharper than the upper bound of (24) only in part of the parameter space.

From now on, we shall assume, with no loss of generality, that the eigenvalues of Σ are increasingly ordered, namely, λ₁ ≤ λ₂ ≤ ⋯ ≤ λ_v (we can always permute the labels of the coordinate axes, so as to let this be the case). An important aspect related to the eigenvalue ordering is provided by the following.

Proposition 5 (eigenvalue ordering). Let r², v, and λ be given. If λ_j ≤ λ_k, then μ_j ≤ μ_k holds as well.

Proof. In order to show that the spherical truncation does not violate the eigenvalue ordering, we make repeated use of the monotonicity properties of Proposition 2. Specifically, if λ_j ≤ λ_k, then the chain of inequalities (32) holds, where the symbol above each inequality sign is used to explain where that inequality comes from and the "exchange symmetry" refers to the formal property of the one-index Gaussian integrals of being mapped into one another upon swapping the corresponding eigenvalues.

Let us now focus on (19). These equations have to be solved in order to reconstruct λ from μ. Formally, if we introduce a family of truncation operators (parametrically depending on r²) mapping λ to μ, then the reconstruction of λ from μ amounts to inverting the relevant operator. One should be aware that the truncation operator is not surjective in view of Corollary 3 and Proposition 4. Therefore, its inverse is only defined within a bounded domain. If we define the candidate domain of admissible truncated spectra from the bounds of Corollary 3, then that set is invariant under permutations of the eigenvalues. From Proposition 4 we conclude that the image of the truncation operator is contained in a smaller set. In fact, there are vectors fulfilling the bounds of Corollary 3 and Proposition 4 which are not images of any λ; thus we conclude that the image is a proper subset of that set. Numerical experiences based on the techniques discussed in the next sections show indeed that the inclusions are strict, as summarized in (36). A graphical representation of (36) in two and three dimensions is depicted in Figure 2. The reader should note that until Section 7 we shall always assume that μ comes from the application of the truncation operator to some λ; thus μ is reconstructable by construction.

Figure 2: Numerical reconstruction of the image of the truncation operator in two (a) and three (b) dimensions.

Now, we observe that (19) can be written in the equivalent form (37), namely, λ_k = μ_k λ_k α(r², λ)/α_k(r², λ). Since μ and r² are (nonindependent) input parameters for the covariance reconstruction problem (and in order to keep the notation light), in the sequel we shall leave the dependence upon μ and r² implicitly understood; that is, we shall write (37) as λ = Φ(λ). Hence, we see that the full eigenvalue spectrum is a fixed point of the map Φ. This suggests obtaining it as the limit of the sequence of iterates λ^(0) = μ, λ^(n+1) = Φ(λ^(n)), provided that this can be shown to converge. Note that since the truncated eigenvalues are damped, that is, α_k/(λ_k α) ≤ 1, it follows that Φ_k(λ) ≥ μ_k; so the sequence is bounded from below by μ. In particular, this holds for the first iterate. Therefore, the sequence moves in the right direction at least at the beginning. A formal proof of convergence, based on the monotonicity properties stated by Proposition 2, is given in the next section.

4. Convergence of the Fixed Point Equation

We split our argument into three propositions, describing different properties of the sequence of iterates. They assert, respectively, that (i) the sequence is componentwise monotonically increasing; (ii) the sequence is componentwise bounded from above by any fixed point of the map; and (iii) if the map has a fixed point, this must be unique. Statements (i) and (ii) are sufficient to guarantee the convergence of the sequence to a finite limit (the unconstrained spectrum λ is a fixed point of the map). In addition, the limit is easily recognized to be a fixed point itself. Hence, statement (iii) guarantees that the sequence converges to the unconstrained eigenvalue spectrum. We remark that all the monotonicities discussed in Proposition 2 are strict; that is, the ratios have no stationary points at finite values of their arguments, which is crucial for the proof.

Proposition 6 (increasing monotonicity). Given μ, r², and λ^(0) = μ, the sequence of iterates is monotonically increasing; namely, λ^(n)_k ≤ λ^(n+1)_k for all k and n.

Proof. The proof is by induction. We first notice that λ^(1)_k = Φ_k(μ) ≥ μ_k = λ^(0)_k, where Φ_k(λ) = μ_k λ_k α(λ)/α_k(λ) is the kth component of the fixed point map; the inequality follows from the damping condition α_k/(λ_k α) ≤ 1, evaluated at λ = μ. Suppose now that the property of increasing monotonicity has been checked off up to the nth element of the sequence. Then λ^(n+1)_k = Φ_k(λ^(n)) ≥ Φ_k(λ^(n−1)) = λ^(n)_k; the inequality follows this time from the inductive hypothesis and from properties (π₂) and (π₃) of Proposition 2, which make Φ_k monotonically increasing in each component of its argument.

Proposition 7 (boundedness). Given μ, r², and λ^(0) = μ, the sequence of iterates is bounded from above; namely, λ^(n)_k ≤ λ*_k for all k and n, with λ* being a fixed point of the map.

Proof. We proceed again by induction. We first notice that λ^(0)_k = μ_k ≤ λ*_k; the inequality follows as previously from the damping condition, this time evaluated at the fixed point. Suppose now that the property of boundedness has been checked off up to the nth element of the sequence. Then λ^(n+1)_k = Φ_k(λ^(n)) ≤ Φ_k(λ*) = λ*_k, with Φ_k the kth component of the fixed point map; the inequality follows for the last time from the inductive hypothesis and from properties (π₂) and (π₃) of Proposition 2.

According to Propositions 6 and 7, the sequence converges. Now, let λ^(∞) be its limit. Effortlessly, we prove that λ^(∞) is a fixed point. Indeed, on letting n → ∞ on both sides of the recursion and passing the limit under the integral signs, we find λ^(∞) = Φ(λ^(∞)). Note that passing the limit under the integral sign is certainly allowed for Gaussian integrals.

Proposition 8 (uniqueness of the fixed point). Let λ′ and λ″ be two fixed points of the map, corresponding to the same choice of μ, r², and v. Then, it must be that λ′ = λ″.

Proof. According to the hypothesis, λ′ and λ″ fulfill the same fixed point equations. On subtracting them and applying the mean value theorem, we arrive at (47), where the Jacobian matrix of the map appears, evaluated at intermediate points. It will be noted that the matrix governing the difference is essentially the covariance matrix of the square components of X under spherical truncation (we have come across its matrix elements in (21)–(23)). As such, it is symmetric positive definite. Indeed, the associated quadratic form can be represented as the truncated expectation of a square, which is strictly positive unless the contracting vector is null. Moreover, the eigenvalues of the other matrix entering (47) fulfill a secular equation whence it follows that this matrix is positive definite as well (though it is not symmetric). Since the sum of positive definite matrices is positive definite, we conclude that the overall matrix multiplying λ′ − λ″ is positive definite too. As such, it is nonsingular. Therefore, from (47), we conclude that λ′ = λ″.

5. Numerical Computation of Gaussian Integrals over the Euclidean Ball

Let us now see how to compute the Gaussian integrals with controlled precision. Most of the relevant work was originally done by Ruben in [2], where the case of α is discussed. We extend Ruben's technique to Gaussian integrals containing powers of the integration variable. Specifically, it is shown in [2] that α(r², λ) can be represented as a series of chi-square cumulative distribution functions, as in (51). The scale factor β appearing there has the same physical dimension as r² and λ. It is introduced in order to factorize the dependence of the integral upon r² and λ at each order of the expansion. The series on the right-hand side of (51) converges uniformly on every finite interval of r². The coefficients are given by (52), having defined the uniform average operator on the sphere as in (53). Unfortunately, (52) is not particularly convenient for numerical computations, since it is only given in integral form. However, it is also shown in [2] that the coefficients can be extracted from the Taylor expansion (at the origin) of the generating function (54). This series converges uniformly within a finite disk. On evaluating the derivatives of the generating function, it is then shown that the coefficients fulfill the recursion (55). Finally, the systematic error produced on considering only the lowest terms of the chi-square series of (51) is estimated by (56).
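Ruben's expansion for α is easy to implement once the coefficient recursion is in place. The sketch below (our own function names) uses the common choice β = min_k λ_k, for which all the mixture coefficients are nonnegative, together with a stdlib-only series evaluation of the chi-square CDF:

```python
import math

def chi2_cdf(k, x):
    """Chi-square CDF with k d.o.f. as the regularized lower incomplete
    gamma function P(k/2, x/2), computed from its power series."""
    s, t = k / 2.0, x / 2.0
    if t <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    n = 0
    while term > 1e-17 * total and n < 10000:
        n += 1
        term *= t / (s + n)
        total += term
    return total * math.exp(s * math.log(t) - t - math.lgamma(s))

def ruben_alpha(lam, r2, n_terms=200):
    """Gaussian mass of the ball {sum x_i^2 <= r2}, x_i ~ N(0, lam_i),
    as Ruben's mixture sum_j c_j F_{d+2j}(r2/beta) with beta = min(lam).
    Recursion: c_0 = prod sqrt(beta/lam_i),
               c_j = (1/2j) sum_{m<j} g_{j-m} c_m,
               g_k = sum_i (1 - beta/lam_i)^k."""
    d, beta = len(lam), min(lam)
    c = [math.exp(0.5 * sum(math.log(beta / l) for l in lam))]
    g = [None] + [sum((1.0 - beta / l) ** k for l in lam)
                  for k in range(1, n_terms)]
    for j in range(1, n_terms):
        c.append(sum(g[j - m] * c[m] for m in range(j)) / (2.0 * j))
    return sum(c[j] * chi2_cdf(d + 2 * j, r2 / beta) for j in range(n_terms))
```

When all eigenvalues coincide, the series collapses to a single chi-square term, and in one dimension it reduces to an error function; both limits serve as sanity checks on the implementation.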

Now, as mentioned, it is possible to extend the above expansion to all the Gaussian integrals introduced in (20). Here, we are interested only in the one- and two-index integrals, since these are needed in order to implement the fixed point iteration and to compute the Jacobian matrix of the fixed point map.

Theorem 9 (Ruben’s expansions). The integrals and admit the series representations:with being an arbitrary positive constant. The series coefficients are given, respectively, by with denoting the Kronecker symbol. The series on the right-hand side of (57)-(58) converge uniformly on every finite interval of . The functionsare generating functions, respectively, for the coefficients , and (); that is, they fulfillfor . Finally, the coefficients ,  , and () can be obtained iteratively from the recursionswhere the auxiliary coefficients and are defined by

It is not difficult to further generalize this theorem, so as to provide a chi-square expansion for any Gaussian integral of the family (20). The proof follows closely the original one given by Ruben. We reproduce it in the appendix for the one-index integrals, just to highlight the differences arising when the Gaussian integral contains powers of the integration variable.

Analogously to (56), it is possible to estimate the systematic error produced when considering only the lowest terms of the chi-square series of the one- and two-index integrals. In order to evaluate all Ruben series with controlled uncertainty, we first set β (see once more [2] for an exhaustive discussion on how to choose it); then we choose a unique threshold representing the maximum tolerable systematic error, for example, a value roughly corresponding to double floating-point precision, the same for all the integrals; and finally for each integral we compute the minimum number of chi-square terms for which the upper bound to the residual sum lies below the threshold. Of course, this procedure overshoots the minimum number of terms really required, since we actually operate on the upper bounds instead of the residual sums themselves. Nevertheless, the computational overhead is acceptable, as will be shown in the next section. For the sake of completeness, it must be said that typically the numbers of terms required for the various integrals at the same λ, r², and β are not much different from each other.
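As a cruder alternative to the residual bounds mentioned above, one may exploit the fact that for β = min_k λ_k the Ruben coefficients are nonnegative and sum to one: since each chi-square CDF factor is at most one, the neglected tail is bounded by the unassigned mixture weight. A sketch of the resulting truncation rule (our own names; this is not the paper's bound):

```python
import math

def ruben_terms_needed(lam, eps=1e-12, max_terms=10000):
    """Smallest q with 1 - sum_{j<q} c_j < eps, where the c_j are Ruben's
    mixture coefficients for beta = min(lam); the neglected tail of the
    chi-square series is then below eps because every CDF term is <= 1."""
    beta = min(lam)
    c0 = math.exp(0.5 * sum(math.log(beta / l) for l in lam))
    c, acc = [c0], c0
    g = [0.0]  # g[k] = sum_i (1 - beta/lam_i)^k, filled on demand
    for j in range(1, max_terms):
        if 1.0 - acc < eps:
            return j
        g.append(sum((1.0 - beta / l) ** j for l in lam))
        cj = sum(g[j - m] * c[m] for m in range(j)) / (2.0 * j)
        c.append(cj)
        acc += cj
    return max_terms
```

The rule is conservative but cheap: the coefficient recursion is reused as is, and the stopping test costs one subtraction per term.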

To conclude, we notice that the minimum number of terms depends nontrivially upon λ. By contrast, since the chi-square cumulative distribution functions are monotonically increasing in r², the error bound, and with it the required number of terms, is monotonically increasing in r². Now, should one evaluate the integrals for a given λ at several values of r², it is advisable to save computing resources and work out Ruben's coefficients just once, up to the order required by the largest of those values, since the coefficients themselves do not depend on r². We made use of this trick throughout our numerical experiences, as reported in the sequel.

6. Numerical Analysis of the Reconstruction Process

The fixed point iteration (39) represents the simplest iterative scheme that can be used in order to reconstruct the solution λ. In the literature on numerical methods, this scheme is known as a nonlinear Gauss–Jacobi (GJ) iteration (see, e.g., [20]). As we have seen, the sequence converges with no exception as the number of iterations grows, provided μ is the image of some λ. Given a relative precision ε, the number of steps n_ε needed for an approximate convergence with that precision depends not only upon ε, but also on μ and r² (note that the stopping rule is well conditioned, since the iterates are strictly positive and increasing). In order to characterize statistically the convergence rate of the reconstruction process, we must integrate out the fluctuations of n_ε due to changes of λ; that is, we must average n_ε by letting λ fluctuate across its own probability space. In this way, we obtain a mean iteration count, which better synthesizes the cost of the reconstruction for a given dimension and truncation radius. It should be evident that carrying out this idea analytically is hard, for on the one hand n_ε depends upon λ nonlinearly and on the other hand λ has a complicated distribution, as we briefly explain below.

6.1. Choice of the Eigenvalue Ensemble

Since λ is the eigenvalue spectrum of a full covariance matrix, it is reasonable to assume that the latter follows a Wishart distribution for some scale matrix and some number of degrees of freedom. In the sequel, we shall make the ideal assumption of a scale matrix proportional to the identity, so that the probability measure of the covariance matrix is the one reported in (72) (see, e.g., [21]). Under this assumption, the probability measure of λ is obtained by performing the change of variable to eigenvalues and eigenvectors in (72). Unfortunately, we have no analytic representation of the truncation operator. Thus, we have no expression for the distribution of μ either. However, μ can be extracted numerically as follows:
(i) generate a Wishart matrix randomly by means of the Bartlett decomposition [22];
(ii) take its ordered eigenvalue spectrum λ;
(iii) obtain μ by applying the truncation operator to λ.
Note that since the typical scale of the spectrum grows with the dimension and the number of degrees of freedom, we need to rescale r² as the dimension increases. The simplest choice is to keep the ratio of r² to the expected trace fixed. The larger this ratio, the closer μ fluctuates around λ (recall that as the ball inflates the truncation becomes immaterial and the correction factors approach one). In view of this, large values of the ratio are to be avoided, since they reduce the probability of testing the fixed point iteration on eigenvalue spectra characterized by large condition numbers. For this reason, we have set a moderate value of the ratio in our numerical study.
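Step (i) above can be sketched as follows for an identity scale matrix (our own names; the diagonal chi variates of the Bartlett factor are emulated here as norms of Gaussian vectors, and the eigenvalue extraction of step (ii), which requires a symmetric eigensolver, is omitted):

```python
import math, random

def chi_draw(k):
    """Square root of a chi-square draw with k d.o.f., obtained
    as the norm of k independent standard normals."""
    return math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k)))

def bartlett_factor(v, n):
    """Lower-triangular Bartlett factor A: A_ii = sqrt(chi^2_{n-i})
    (0-based row index i) and N(0,1) entries below the diagonal."""
    A = [[0.0] * v for _ in range(v)]
    for i in range(v):
        A[i][i] = chi_draw(n - i)
        for j in range(i):
            A[i][j] = random.gauss(0.0, 1.0)
    return A

def wishart_sample(v, n):
    """One draw W = A A^T from the Wishart ensemble with identity scale."""
    A = bartlett_factor(v, n)
    return [[sum(A[i][k] * A[j][k] for k in range(min(i, j) + 1))
             for j in range(v)] for i in range(v)]
```

The expected value of such a draw is the number of degrees of freedom times the identity matrix, which provides a quick Monte Carlo check of the sampler.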

Having specified an ensemble of matrices from which the eigenvalue spectra are extracted, we are now ready to perform numerical simulations. To begin with, we report in Figure 3 the marginal probability density function of the ordered eigenvalues and their truncated counterparts for the Wishart ensemble, as obtained numerically from a rather large sample of matrices (10⁶ units). It will be noted that (i) the effect of the truncation is severe on the largest eigenvalues, as a consequence of the analytic bounds of Corollary 3 and Proposition 4; (ii) while the skewness of the lowest truncated eigenvalues is negative, it becomes positive for the largest ones. This is due to a change of relative effectiveness of (29)