Abstract
Marginal probability density and cumulative distribution functions are presented for multidimensional variables defined by nonsingular affine transformations of vectors of independent two-piece normal variables, the most important subclass of Ferreira and Steel's general multivariate skewed distributions. The marginal functions are obtained by first expressing the joint density as a mixture of Arellano-Valle and Azzalini's unified skew-normal densities and then using the property of closure under marginalization of the latter class.
1. Introduction
In the literature on probability distributions, there are several approaches for extending the multivariate normal distribution with the introduction of some sort of skewness. Arellano-Valle et al. [1] provide a unified view of this literature. The largest group of contributions was initiated by Azzalini and Dalla Valle [2] and Azzalini and Capitanio [3] and generalizes the univariate skew-normal (SN) distribution studied by Azzalini [4, 5]. These “multivariate skew-normal distributions” are generated from a normal distribution either by conditioning on a truncated variable or by a convolution mechanism.
An alternative approach was proposed by Ferreira and Steel [6–8] and is based on nonsingular affine transformations of random vectors with independent components, each having a skewed distribution with probability density function (pdf) constructed from a symmetric distribution using the inverse scaling factor method introduced by Fernández and Steel [9]. (Arellano-Valle et al. [10] consider a general class of asymmetric univariate distributions that includes the distributions generated according to the procedure proposed by Fernández and Steel [9] as a special case.) If the univariate symmetric distribution is the standard normal, then the corresponding univariate skewed distribution becomes (with a different parameterization) the two-piece normal (tpn) analyzed by John [11] (see also Johnson et al. [12]). To overcome an issue of overparameterization, Ferreira and Steel [7, 8] pay particular attention to the subclass associated with transformation matrices that can be factorized as the product of an orthogonal matrix and a diagonal positive definite matrix. Villani and Larsson [13] studied this subclass when the basic univariate skewed distribution is the tpn and named these distributions “multivariate split normal.”
Under the acronym SUN (standing for “unified skew-normal”), Arellano-Valle and Azzalini [14] suggested a formulation for the first approach that encompasses the most relevant coexisting variants of multivariate skew-normal distributions. Like the multivariate normal and SN distributions, the class of SUN distributions is closed under affine transformations, marginalization, and conditioning on given values of some components. Besides these important properties, the SUN class is also closed under sums of independent components. However, one limitation of the SUN distributions is that the vector of location parameters does not have a direct interpretation as the mean or the mode of the distribution, which are rather complicated functions of all the parameters. Even in the simplest case of the basic SN, both the mean and the mode (for which there is no closed-form expression) depend on the parameters regulating dispersion and skewness.
Ferreira and Steel’s independent components approach to the construction of multivariate skewed normal distributions (henceforth FS-SN) provides an alternative to the SUN class in applications for which it is important to have some location measure that does not depend on the dispersion and skewness parameters. Indeed, the FS-SN distributions have the convenient feature that the mode is one of the distribution parameters and is therefore invariant to dispersion and skewness. In addition, the FS-SN distributions are closed under nonsingular affine transformations. However, unlike the SUN class, the FS-SN distributions are not closed under marginalization (nor under conditioning) and, to my knowledge, general closed-form expressions of their marginal pdf and cumulative distribution function (cdf) are not available in the literature.
This paper aims to fill this gap by proposing expressions for the marginal density and cumulative distribution functions of an FS-SN distribution. The expressions also apply to the subclass of multivariate split normal distributions studied by Villani and Larsson [13]. The technique used to derive the marginal distributions is simple: the joint FS-SN distribution is expressed as a finite mixture of singular SUN distributions, and the closure of the SUN class under marginalization is then applied to each component of the mixture.
An area of application of the results presented in this paper is macroeconomic density forecasting. Many institutions that publish macroeconomic forecasts complement their point forecasts with information on the dispersion and skewness of the probability distributions of the forecasting errors. Fan charts are one of the most popular tools to convey the predictive densities, and they gained prominence through their use in inflation reports released by many central banks, with the Bank of England and the Sveriges Riksbank (the Swedish central bank) featuring as pioneers in this respect [15, 16] (see also Wallis [17, 18] and Tay and Wallis [19]). The characterization of the forecast densities is complicated by the fact that institutional forecasts are typically not based on a single model but stem from different competing models combined with judgements by experts (the latter regarding, in particular, the skewness, i.e., the balance of upward and downward risks to the forecasts). Most of the procedures used to generate the fan charts take the point baseline forecasts as given and assume that the sources of uncertainty and asymmetry have univariate tpn distributions. These sources of forecasting error are then aggregated according to a linear mapping, envisaged as an approximation around the baseline to the underlying unknown data generating process. In the absence of closed-form expressions for the exact distribution of a linear combination of tpn variables, some aggregation procedures resort to informal approximations based on the first moments, while other procedures are based on numerical simulation. Examples of the first approach are Blix and Sellin [16, 20, 21] and Elekdag and Kannan [22], while Pinheiro and Esteves [23] opted to simulate the distribution. The results presented in Section 3 make it possible to overcome this aggregation difficulty.
2. The SUN and the FS-SN Distributions
If the M-dimensional random vector , then its pdf and cdf are, respectively, for any point where and denote, respectively, the pdf and the cdf at point of a normal distribution , and are vectors of parameters, is a positive definite covariance matrix, is the diagonal matrix formed by the standard deviations of , is the correlation matrix associated with (hence ), with , is a positive definite correlation matrix, and is such that is also a (positive semidefinite) correlation matrix. (Arellano-Valle and Azzalini [14, Appendix C] consider three cases of singular SUN distributions: (i) singular; (ii) singular; (iii) singular with nonsingular and . For our purposes, only the latter case is relevant.) The SUN distribution collapses to the multivariate normal when , being the matrix of parameters that regulate skewness. It collapses to the basic multivariate SN distribution suggested by Azzalini and Dalla Valle [2] when and (implying that ).
Now let the scalar random variable be tpn distributed with zero mode. Its pdf may be parameterized as follows: where denotes the pdf, is a scale parameter, and is a shape parameter. When , the density becomes the normal pdf with zero mean and standard deviation (so that when the latter parameter is 1 the pdf collapses to ). Values of above (below) unity correspond to densities skewed to the right (left). Let be an N-dimensional random vector of independent tpn components with zero mode and unitary scale . Its pdf is where is as in (2.4) (with ) and . An N-dimensional random vector is said to be distributed if there is a random vector with density (2.5) and two vectors (the joint mode) and (the “shape vector”) and a nonsingular matrix (the “scale matrix”) such that . Vector has pdf It is straightforward to confirm that (i) when , this density collapses to the pdf of a distribution, (ii) the distribution is unimodal with mode , invariant with respect to and , and (iii) by construction, the FS-SN class is closed under nonsingular affine transformations.
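As a hedged illustration of the univariate building block, the following sketch evaluates a zero-mode tpn density and cdf using only the Python standard library. It assumes the inverse scale factor form of Fernández and Steel [9], f(x) = 2 / (omega (gamma + 1/gamma)) [phi(x / (omega gamma)) 1{x >= 0} + phi(gamma x / omega) 1{x < 0}], which may differ from the parameterization of (2.4) by a relabelling of the parameters. The names `tpn_pdf`, `tpn_cdf`, `mass_right_of_mode`, `omega`, and `gamma` are illustrative and do not come from the paper.

```python
import math


def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tpn_pdf(x, omega=1.0, gamma=1.0):
    """Zero-mode tpn pdf (assumed inverse scale factor parameterization).

    The right piece has standard deviation omega * gamma and the left
    piece omega / gamma; both pieces share the same height at x = 0.
    """
    const = 2.0 / (omega * (gamma + 1.0 / gamma))
    scale = omega * gamma if x >= 0 else omega / gamma
    return const * norm_pdf(x / scale)


def mass_right_of_mode(gamma):
    """P(X >= 0) under the assumed parameterization."""
    return gamma ** 2 / (1.0 + gamma ** 2)


def tpn_cdf(x, omega=1.0, gamma=1.0):
    """Zero-mode tpn cdf, obtained by integrating each piece separately."""
    w_neg = 1.0 - mass_right_of_mode(gamma)  # probability mass below the mode
    if x < 0:
        return 2.0 * w_neg * norm_cdf(x * gamma / omega)
    return w_neg + 2.0 * (1.0 - w_neg) * (norm_cdf(x / (omega * gamma)) - 0.5)
```

With `gamma = 1` both pieces have the same standard deviation and the functions reduce to the normal pdf and cdf with standard deviation `omega`; with `gamma` above (below) one, more than half of the probability mass lies to the right (left) of the mode, in line with the direction of skewness described above.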
3. The Marginal FS-SN Distributions
To establish additional notation, let denote the identity matrix of order N, and let the number of zero elements in vector , one if all elements of vector are nonnegative and zero otherwise, and the generic element of the Nth Cartesian power of , , , , , , and .
Proposition 3.1. The pdf and the cdf of the N-dimensional random vector with nonsingular scale matrix can be expressed, respectively, as where and are pdfs and cdfs of singular distributions, with . The latter functions may be written as
Note that Hence, the distribution can be envisaged as a finite mixture of singular distributions.
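The mixture idea can be checked numerically for the vector of independent unit-scale tpn components before any affine transformation is applied. The sketch below is only an illustration under stated assumptions: it uses the Fernández and Steel inverse scale factor form of the tpn density, it indexes the mixture components by sign patterns in {-1, 1}^N, and it writes each component as a product of half-normal densities supported on one orthant rather than in the SUN notation of Proposition 3.1. Points with a zero coordinate are assigned to the nonnegative side, which is a simplification of how such points are handled in the paper. All function names are hypothetical.

```python
import itertools
import math


def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def tpn_pdf_unit(z, gamma):
    """Unit-scale, zero-mode tpn pdf (assumed inverse scale factor form)."""
    const = 2.0 / (gamma + 1.0 / gamma)
    return const * norm_pdf(z / gamma if z >= 0 else z * gamma)


def joint_pdf(z, gammas):
    """Joint pdf of independent tpn components: product of the marginals."""
    value = 1.0
    for zi, g in zip(z, gammas):
        value *= tpn_pdf_unit(zi, g)
    return value


def sign_patterns(n):
    """All 2**n sign patterns in {-1, 1}**n."""
    return list(itertools.product((-1, 1), repeat=n))


def component(z, gammas, signs):
    """Return (weight, density) of the orthant component for one pattern."""
    weight, density = 1.0, 1.0
    for zi, g, s in zip(z, gammas, signs):
        if s > 0:
            weight *= g * g / (1.0 + g * g)   # P(Z_i >= 0)
            scale, inside = g, zi >= 0
        else:
            weight *= 1.0 / (1.0 + g * g)     # P(Z_i < 0)
            scale, inside = 1.0 / g, zi < 0
        # Half-normal density with the given scale on the relevant half-line.
        density *= 2.0 * norm_pdf(zi / scale) / scale if inside else 0.0
    return weight, density


def mixture_pdf(z, gammas):
    """Joint pdf rebuilt as a weighted sum over the 2**N orthant components."""
    total = 0.0
    for signs in sign_patterns(len(z)):
        weight, density = component(z, gammas, signs)
        total += weight * density
    return total
```

At any point, exactly one sign pattern contributes a nonzero term in this sketch, and the weights sum to one, so `mixture_pdf` and `joint_pdf` agree up to rounding.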
As pointed out by Arellano-Valle and Azzalini [14, Appendix C], the rank deficiency of does not affect the properties of the SUN distributions and its only impact is of a computational nature. In our case, it actually simplifies the computation of the pdf values because the evaluation of a normal cdf is not required anymore, unlike when computing (2.1), the general expression of a SUN pdf.
In order to derive the marginal pdfs and cdfs of , one needs to consider its partition with and of dimensions and , respectively, and the corresponding partitions with , , , , , and . Proposition 3.2 follows directly from Proposition 3.1 and from the result of Arellano-Valle and Azzalini [14, Appendix A] on the marginal distributions of members of the SUN class.
Proposition 3.2. Let . Then, the marginal pdf and the cdf of the -dimensional subvector are, respectively, where and are pdfs and cdfs of singular distributions, with . The latter functions may be written as
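For the aggregation problem mentioned in the Introduction, the simplest marginal is the cdf of a single linear combination of independent tpn variables. The sketch below computes it for two components by the same mixture logic, but it does not reproduce the closed-form SUN expressions of Proposition 3.2: each mixture component is the cdf of a linear combination of two half-normal variables, evaluated here by a one-dimensional trapezoidal rule. It assumes zero mode, unit scales, nonzero coefficients, and the Fernández and Steel inverse scale factor form of the tpn density; the names `marginal_cdf`, `component_cdf`, `scaled_halfnormal_cdf`, and `sample_tpn`, and the quadrature settings `upper` and `n`, are hypothetical.

```python
import itertools
import math
import random


def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scaled_halfnormal_cdf(t, b):
    """P(b * |U| <= t) for U ~ N(0, 1) and a nonzero coefficient b."""
    if b > 0:
        return 2.0 * norm_cdf(t / b) - 1.0 if t > 0 else 0.0
    # b < 0: the event b * |U| <= t is the event |U| >= t / b.
    return 1.0 if t >= 0 else 2.0 * (1.0 - norm_cdf(t / b))


def component_cdf(x, b1, b2, upper=10.0, n=4000):
    """P(b1 |U1| + b2 |U2| <= x), trapezoidal rule over u = |U2| in [0, upper]."""
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        u = i * h
        end_weight = 0.5 if i in (0, n) else 1.0
        total += end_weight * 2.0 * norm_pdf(u) * scaled_halfnormal_cdf(x - b2 * u, b1)
    return total * h


def marginal_cdf(x, a, gammas):
    """cdf of a[0] * Z1 + a[1] * Z2 for independent unit-scale tpn Z1, Z2."""
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=2):
        weight, b = 1.0, []
        for ai, g, s in zip(a, gammas, signs):
            weight *= g * g / (1.0 + g * g) if s > 0 else 1.0 / (1.0 + g * g)
            b.append(ai * s * (g if s > 0 else 1.0 / g))
        total += weight * component_cdf(x, b[0], b[1])
    return total


def sample_tpn(rng, gamma):
    """Draw one unit-scale tpn variate: pick a side, then a scaled half-normal."""
    s = 1 if rng.random() < gamma * gamma / (1.0 + gamma * gamma) else -1
    return s * (gamma if s > 0 else 1.0 / gamma) * abs(rng.gauss(0.0, 1.0))
```

A simulation with `sample_tpn` gives an independent check of `marginal_cdf`, and with both shape parameters equal to one the result should agree with the normal cdf with variance `a[0]**2 + a[1]**2`. A general implementation would replace the quadrature with the normal cdf evaluations implied by the SUN representation and would handle more than two components.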
Appendix
Proof of Proposition 3.1
When , the pdf of the univariate tpn (2.4) can be written as where Hence, from (2.5), Note that whenever . Hence, the nonzero terms in the latter summation are those associated with N-tuples for which . If there is only one such term. If includes zero elements, there are nonzero identical terms in the previous summation. In both cases, the density of may be expressed as follows: where is the sign function and is the pdf of the univariate SN distribution with zero location parameter, scale parameter , and shape parameter :
From the above expression of , by considering the change of variable with nonsingular, one obtains the pdf of : with In order to show that is the pdf of a , note that where is the density of a distribution with Thus, as , The simplified expression for presented in Proposition 3.1 is obtained from (A) simply by taking into account that As regards the cdf of , Moreover, one gets from (2.1) The latter equality follows from the singularity of , which for given allows one to write the probability of where , as the probability of for .