Abstract

This paper introduces a standard logistic L-moment-based system of distributions. The proposed system is an analog to the standard normal conventional moment-based Tukey g-h, g, h, and h-h system of distributions. The system also consists of four classes of distributions and is referred to as (i) asymmetric -, (ii) log-logistic , (iii) symmetric , and (iv) asymmetric -. The system can be used in a variety of settings such as simulation or modeling events—most notably when heavy-tailed distributions are of interest. A procedure is also described for simulating -, , , and - distributions with specified L-moments and L-correlations. The Monte Carlo results presented in this study indicate that estimates of L-skew, L-kurtosis, and L-correlation associated with the -, , , and - distributions are substantially superior to their corresponding conventional product-moment estimators in terms of relative bias and relative standard error.

1. Introduction

Simulating or modeling phenomena that involve heavy-tailed distributions have increasingly become of interest in many areas. Some examples include investigations associated with modeling stream flow and flood frequency [13], regional frequency analysis [4], aircraft landing processes [5], disaster analysis [6], finance [7], signal and image processing [8], and latent traits in the social and behavioral sciences [9]. Further, it is also common practice for methodologists to investigate the type I error and power properties associated with inferential statistics using heavy-tailed distributions [1014]. In many cases, these investigations may only require an elementary transformation to produce heavy-tailed distributions with specified values of conventional skew and kurtosis. A system of transformations that is particularly well suited for this task is the Tukey -, , , and - classes, which can be used for simulating or modeling univariate and multivariate heavy-tailed distributions [1420].

The quantile function associated with - distributions can be succinctly described as where , the parameters and control the skew and kurtosis of , and the quantile functions for the one-parameter log-normal or symmetric distributions can be obtained by taking the or , respectively. The class of asymmetric - distributions can be obtained by a doubling technique of distributions that is described in [18, 21]. Hence, an attractive feature of the Tukey system (-, , , - classes) is that it is computationally efficient because each class of distributions only requires the knowledge of one or two parameters and an algorithm that generates standard normal random deviates.

One problem that arises in the context of heavy-tailed distributions is that conventional moment-based estimators of skew and kurtosis have unfavorable attributes insofar as they can be substantially biased, have high variance, or can be influenced by outliers [4, 2225]. To obviate this problem, -moment-based estimators such as -skew and -kurtosis were introduced to address the limitations associated with conventional estimators [22]. Specifically, some of the advantages that -moments have over conventional moments are that they (i) exist whenever the mean of the distribution exists, (ii) are nearly unbiased for all sample sizes and distributions, and (iii) are more robust in the presence of outliers [4, 2225].

Thus, it would be advantageous to have a -moment-based Tukey system of transformations for the purpose of simulating or modeling univariate and multivariate heavy-tailed distributions. However, because of the complexities associated with (1.1) and the fact that the standard normal distribution function is not available in closed form, the derivation of a -moment-based Tukey system would be problematic. We would note that the Tukey and - classes of distributions have been characterized in the context of -moments and the -correlation (see [21]).

In view of the above, the present aim is to introduce an -moment Tukey system analog that is based on the standard logistic distribution. More specifically, the system consists of four quantile functions that produce continuous symmetric and asymmetric distributions with specified values of -skew, -kurtosis, and -correlation. The four classes of distributions are referred to as (i) asymmetric -, (ii) log-logistic , (iii) symmetric , and (iv) asymmetric -. Some of the advantages that the new system has is that the class of log-logistic () distributions has demonstrated to be more efficient than its log-normal () counterpart in terms of computing its hazard function or when censored data are encountered in the context of survival analysis [26]. Note that the and - classes of distributions were recently introduced in an article by Headrick and Pant [27].

The rest of the paper is outlined as follows. In Section 2, the cumulative distribution function and probability density function as well as other properties associated with the four classes of distributions are derived. A summary of univariate -moment theory is also provided, and the derivations of the systems of equations for specifying values of -skew and -kurtosis for the four classes of distributions are subsequently provided. In Section 3, the coefficient of -correlation is introduced, and the equations are developed for determining intermediate correlations for specified -correlations associated with the four classes of distributions. In Section 4, the steps for implementing a simulation procedure are described. Numerical examples and the results of a simulation are also provided to confirm the derivations and compare the new methodology with its conventional moment-based counterpart. In Section 5, the results of the simulation are discussed.

2. Methodology

2.1. Definitions and Properties for the System of -, , , and - Distributions

The derivation of the cumulative distribution function (cdf), probability density function (pdf), and other properties associated with the system of -, , , and - distributions begins with the following definitions.

Definition 2.1. Let be a random variable that has a standard logistic distribution with cdf and pdf expressed as

Definition 2.2. Let the quantile function associated with the system of -, , , and - distributions be defined as where in (2.3)–(2.6) is a strictly monotone increasing function with real-valued parameters , , , . Equations (2.3)–(2.6) produce distributions defined as (i) asymmetric - (, ), (ii) log-logistic (), (iii) symmetric (), and (iv) asymmetric - (, , ). The parameter in (2.3) and (2.4) controls the degree of skew associated with a distribution. Taking the negative of will change the direction of the skew but not its magnitude that is, . The parameter in (2.3) and (2.5) controls the tail weight of a distribution where the function (i) preserves symmetry, (ii) is increasing for and , and (iii) produces increased tail weight as the value of becomes larger. Analogous to , the real-valued parameter () in (2.6) controls the left (right) tail weight of a - distribution.

Theorem 2.3. The cdf and pdf associated with the -, , , and - classes of distributions in (2.3)–(2.6) are with derivatives expressed as where and for (2.9).

Proof. The requirement that in (2.3)–(2.6) is a strictly monotone increasing function implies that an inverse function () exists and thus . Differentiating both sides with respect to yields . Hence, . Whence, the pdf in (2.8) integrates to one because is the logistic pdf in (2.2) and will be nonnegative on its support for ) given from Definition 2.2 that , , , , and where as and .

Definition 2.4. If the monotonicity assumption () holds for the pdf in (2.8) for all , then (2.8) is defined to be a global pdf.

Remark 2.5. Inspection of (2.2) and (2.3)–(2.9) indicates that the height of a global pdf in (2.8) for any -, , , or - distribution at will be .

Remark 2.6. The mode associated with a global pdf in (2.8) is located at where is the critical number that solves and globally maximizes at the mode . It is noted that a global pdf in (2.8) will have a global maximum because the standard logistic pdf in (2.2) has a global maximum, and the transformations in (2.3)–(2.6) are assumed to be strictly monotone increasing functions.

Remark 2.7. The median associated with a global pdf in (2.8) is located at . This can be shown by letting denote the 50th percentile. In general, we must have such that holds in (2.7) for the standard logistic distribution. As such, the limit of the quantile function locates the median at for the system of -, , , and - distributions.

Remark 2.8. In terms of the - (or ) class of distributions, the monotonicity assumption () holds locally in (2.8) for cases where and ( or ). This leads to Definition 2.9.

Definition 2.9. A - (or ) distribution is defined to have a local pdf if the monotonicity assumption holds in accordance with Remark 2.8 and , where is a user specified threshold probability based on the cdf in (2.7) (e.g., ).
Examples of -, , , and - distributions based on the cdf and pdf in (2.7) and (2.8) are presented in Figure 1 through Figure 4, respectively. The heavy-tailed distributions in Figures 1(a), 2(a), 3(b), and 4(a) are used in the simulation portion of this study in Section 4. In the next section, univariate -moments are introduced, and the -moments for the system of -, , , and - distributions are subsequently derived, and other properties are discussed.

2.2. Preliminaries on Univariate -Moments

Let be independent and identically distributed random variables each with continuous pdf , cdf , order statistics denoted as , and -moments defined in terms of either linear combinations of (i) expectations of order statistics or (ii) probability weighted moments (). For the purposes considered herein, the first four -moments associated with are expressed as [4, pages 20–22] where the are determined from where . The coefficients associated with in (2.14) are obtained from shifted orthogonal Legendre polynomials and are computed as shown in [4, page 20] or in [22].

The -moments and in (2.10) and (2.11) are measures of location and scale and are the arithmetic mean and one-half the coefficient of mean difference, respectively. Higher order -moments are transformed to dimensionless quantities referred to as -moment ratios defined as for , and where and are the analogs to the conventional measures of skew and kurtosis. In general, -moment ratios are bounded in the interval as is the index of -skew () where a symmetric distribution implies that all -moment ratios with odd subscripts are zero. Other smaller boundaries can be found for more specific cases. For example, the index of -kurtosis () has the boundary condition for continuous distributions of [28]

2.3. -Moments for the System of -, , , and - Distributions

The derivation of the first four -moments associated with the - class of distributions begins by defining the probability weighted moments based on (2.14) in terms of (2.1)–(2.3) as

In general, to obtain finite values of , , , based on (2.16), we must have (i) , (ii) , and (iii) . As such, integrating (2.16) for and using (2.10)–(2.13) gives , , , and as with harmonic number functions of , , , , , and . As such, given user specied values of and , (2.19) and (2.20) can be numerically solved to obtain the parameters for and .

Taking the limit of (2.17)–(2.20) as gives the class associated with the log-logistic distributions in (2.4) as

Analogously, taking the limit of (2.17)–(2.20) as gives the system associated with the symmetric family of distributions in (2.5) as in [27] with polygamma functions of , , , and .

The -moments for the class of asymmetric - distributions in (2.6) can be determined by separately evaluating and summing two integrals of the form in (2.16) as

For an asymmetric - distribution with a global pdf to have finite values of , , , based on (2.29) will require and . As such, integrating (2.29) to obtain and subsequently substituting these terms into (2.10)–(2.13) yield as in [27] with polygamma functions of , , , , , , , .

As with the - class of distributions, given specified values of and , (2.32) and (2.33) can be numerically solved to obtain the corresponding values of and . Inspection of (2.32) and (2.33) reveals that interchanging values for the parameters of and reverses the direction of and has no effect on . As such, a graph of the region for feasible combinations of and for (2.32) and (2.33) is provided in Figure 5. Feasible combinations of -skew and -kurtosis will lie in the region above the curve graphed in the , plane of Figure 5. Note that the curve in Figure 5 was graphed by setting with in (2.32) and (2.33).

The conventional moment-based systems for the - and - classes of distributions are given in Appendices A and B, respectively. These systems were used to determine the values of skew and kurtosis associated with the distributions given in Figures 14. It is worthy to point out that the conventional moment-based systems have a disadvantage in terms of moment existence. That is, the integral in (A.1) of Appendix A reveals that for the th moment to exist we must have in general (i) , (ii) and (iii) . For example, if the mean, variance, skew, and kurtosis exist, then we must have . And, analogously for the - class of distributions in Appendix B, the parameters are bounded in the range of and . The advantage that the -moment system has in this context is attributed to Hosking’s Theorem 1 [22] which states that if the mean () exists, then all other -moments will have finite expectations. We note that the conventional moment-based systems for the log-logistic and symmetric classes of distributions can be obtained by simplifying the system as described in Appendix A. In the next section we first introduce the topic of the -correlation and subsequently develop the methodology for simulating -, , , and - distributions with specified -correlations.

3. -Correlations for the System of -, , , and - Distributions

The coefficient of -correlation [29] is introduced by considering two random variables and with distribution functions and , respectively. The second -moments of and can alternatively be expressed as

The second -comoments of toward and toward are

As such, the -correlations of toward and toward are expressed as

The -correlation in (3.5) (or (3.6)) is bounded such that where a value of () indicates a strictly increasing (decreasing) monotone relationship between the two variables. In general, we would also note that .

In the context of the -moment-based -, , , and - classes of distributions, suppose that it is desired to simulate a -variate distribution based on quantile functions of the forms in (2.3)–(2.6) with a specified -correlation matrix and where each distribution has its own specified values of and . Let denote standard normal variables where the distribution functions and bivariate density function associated with and are expressed as

Using (3.7), it follows that the th distribution associated with (2.3)–(2.6) can be expressed as , where is standard logistic because . As such, using (3.5), the -correlation of (())) toward (())) can be evaluated using the solved value(s) of the parameter(s) (i.e., -; ; ; -) for (())), a specified intermediate correlation (IC) in (3.9), and the following integral generally expressed as

We would point out that the purpose of the IC () in (3.9) and (3.10) is to adjust for the effect of the transformation , which is induced by the parameters, such that has its specified -correlation () toward (())). Further, to simplify the computation, the quantile function in (3.10) is standardized by a linear transformation such that it has a mean of zero and one-half the coefficient of mean difference equal to that of the unit-normal distribution as where is a mean from (2.17), (2.21), (2.25), or (2.30) and is a constant that scales in (2.18), (2.22), (2.26), or (2.31) and in the denominator of (3.5) to as

Analogously, the -correlation of toward is expressed as

Note also for the special case that if in (3.10) and (())) in (3.13) have the same parameters, that is, ; ; ; , then . Source code written in Mathematica [30] that implements the computation of an IC () based on (3.10) is provided in Algorithm 1. The details for simulating -, , , and - distributions with specified values of -skew, -kurtosis, and -correlations are described in the next section.

( Intermediate Correlation for Distributions 1 and 2. See Table 3. )
0.380206;
Needs[“MultivariateStatistics”]
= PDF[MultinormalDistribution ;
= CDF[NormalDistribution 0, 1 ;
= CDF[NormalDistribution 0, 1 ];
( Parameters for Figure 1(a) )
;
;
= Log /(1 ) ;
( Quantile function from (2.3)*)
;
( Standardizing constants from (2.17) and from (3.12)*)
;
( Compute the specified -correlation in Table 1 for Distributions 1 and 2 )
0.40

4. The Procedure for Simulation and Monte Carlo Study

To implement the method for simulating -, , , and - distributions with specified -moments and -correlations we suggest the following six steps.(1)Specify the -moments for transformations of the forms in (2.3)–(2.6) that is, and obtain the solutions for the parameters of -, , , or - distributions by solving (2.19), (2.20); (2.23), (2.24); (2.27), (2.28); (2.32), (2.33) using the specified values of -skew () and -kurtosis () for each distribution. Specify a matrix of -correlations () for toward , where .(2)Compute the (Pearson) intermediate correlations (ICs) by substituting the solutions of the parameters from Step (1) into (3.10) and then numerically integrate to solve for (see Algorithm 1 for an example). Repeat this step separately for all pairwise combinations of correlations.(3)Assemble the ICs into a matrix and decompose this matrix using a Cholesky factorization. Note that this step requires the IC matrix to be positive definite.(4)Use the results of the Cholesky factorization from Step (3) to generate standard normal variables correlated at the intermediate levels as follows: where are independent standard normal random variables and where represents the element in the th row and the th column of the matrix associated with the Cholesky factorization performed in Step (3).(5)Substitute from Step (4) into the following Taylor series-based expansion for the standard normal cdf [30]: where denotes the standard normal pdf and where the absolute error associated with (4.2) is less than .(6)Substitute the zero-one uniform deviates, , generated from Step (5) into the equations of the form of (())), where is standard logistic to generate the -, , , and - distributions with the specified -moments and -correlations.

To demonstrate the steps above and evaluate the proposed method, a comparison between the proposed -moment and conventional product-moment-based procedures is subsequently described. Specifically, the heavy-tailed distributions in Figures 1(a), 2(a), 3(b), and 4(a) are used as a basis for a comparison using the specified correlation matrix in Table 1. Tables 2 and 3 give the solved IC matrices for the conventional moment and -moment-based methods, respectively. See Algorithm 2 for an example of computing ICs for the conventional method. Tables 4 and 5 give the results of the Cholesky decompositions on the IC matrices, which are then used to create with the specified ICs by making use of the formulae given in (4.1) of Step (4) with . The values of are subsequently transformed to using (4.2) and then substituted into equations of the forms in (2.3)–(2.6) to produce for both methods.

Intermediate Correlation for Distributions 1 and 4. See Table 2. )
0.835142;
Needs[“MultivariateStatistics”]
DF ;
DF ;
DF ;
( Parameters for Figure 1(a) and Figure 4(a) in Figure 4 )
;
;
.0475;
.1358;
og[ /(1 )];
og[ /(1 )];
( Quantile functions from (2.3) and (2.6)*)
;
;
;
( Standardizing constants , ((2.17), (2.30)) and , ((A.2), (B.2)) from
Appendices A and B )
;
;
;
( Compute the specified Pearson correlation in Table 1 for Distributions 1 and 4. )
= NIntegrate[ [ , , , , , −10, 10}, ,
−10, 10}, Method MultiDimensional
0.80

In terms of the simulation, a Fortran algorithm was written for both methods to generate 25,000 independent sample estimates for the specified parameters of (i) conventional skew (), kurtosis (), and Pearson correlation () and (ii) -skew (), -kurtosis (), and -correlation (). All estimates were based on sample sizes of and . The formulae used for computing estimates of were based on Fisher’s -statistics that is, the formulae currently used by most commercial software packages such as SAS, SPSS, and Minitab, for computing indices of skew and kurtosis (where for the standard normal distribution). The formulae used for computing estimates of were Headrick’s Equations (2.4) and (2.6) [25]. The estimate for was based on the usual formula for the Pearson product moment of correlation statistic, and the estimate for was computed based on (3.5) using the empirical forms of the cdfs in (3.1) and (3.3). The estimates for and were both transformed using Fisher’s transformation. Bias-corrected accelerated bootstrapped average (mean) estimates, confidence intervals (CIs), and standard errors were subsequently obtained for the estimates associated with the parameters (, , , ) using 10,000 resamples via the commercial software package Spotfire S+ [31]. The bootstrap results for the estimates of the means and CIs associated with and were transformed back to their original metrics (i.e., estimates for and ). Further, if a parameter () was outside its associated bootstrap CI, then an index of relative bias (RB) was computed for the estimate () as . Note that the small amount of bias associated with any bootstrap CI containing a parameter was considered negligible and thus not reported. The results of the simulation are reported in Tables 613 and are discussed in the next section.

5. Discussion and Conclusion

One of the primary advantages that -moments have over conventional moment-based estimators is that they can be far less biased when sampling is from distributions with more severe departures from normality (e.g., [21, 25]). Inspection of the simulation results in Tables 6 and 7 of this study clearly indicates that this is also the case for the system of -, , , and - distributions. Specifically, the superiority that estimates of -moment ratios () have over their corresponding conventional moment-based counterparts () is obvious. For example, with samples of size the estimates of skew and kurtosis for Distribution 1 (Tables 6 and 8) were, on average, only 26.80% and 2.38% of their associated population parameters whereas the estimates of -skew and -kurtosis were 87.80% and 94.81% of their respective parameters. It is also evident from Tables 6 and 8 that -skew and -kurtosis are more efficient estimators as their relative standard errors RSE = (standard error/estimate) × 100 are substantially smaller than the conventional estimators of skew and kurtosis. For example, in terms of Distribution 1, inspection of Tables 7 and 9 indicates RSE measures of and compared with and . This demonstrates that -skew and -kurtosis have more precision because they have less variance around their estimates.

The results associated with the conventional Pearson and -correlations are presented in Tables 1013. Overall inspection of these tables indicates that the -correlation is substantially superior to the Pearson correlation in terms of relative bias. For example, in terms of a moderate correlation (Table 10, , ) the relative bias for Distributions 1 and 2 was 9.73% for the Pearson correlation compared to only 1.85% for the -correlation (Table 12, , ). Further, for large sample sizes (Table 13, ), the -correlation bootstrap CIs contained all of the population parameters whereas the Pearson correlation CIs contained none of the parameters (Table 11).

In summary, the proposed -moment-based system of -, , , and - distributions is an attractive alternative to the traditional conventional-moment-based system. In particular, the -moment-based system has distinct advantages when heavy-tailed distributions are of concern. Finally, we would note that Mathematica Version 8.0 [30] source code is available from the authors for implementing the -moment-based method.

Appendices

A. System of Conventional Moment-Based Equations for -Distributions

The moments () associated with the class of - distributions in (2.3) can be determined from

In general, to obtain defined values of based on (A.1) we must have (i) , (ii) , and (iii) .

The mean, variance, skew, and kurtosis are defined in general as in [17]

The moments associated with the location and scale parameters in (A.2) are with harmonic number functions of , , , , , , and with harmonic number functions of , , , , , , , , , and .

The moments associated with the shape parameters of skew and kurtosis in (A.2) are with harmonic number functions of , , , , , , , , , , , , , , and with harmonic number functions of , , , , , , , , , , , , , , , , and .

The moments for the asymmetric class of distributions in (2.4) can be obtained by taking the limit of (A.3)–(A.6) as . Analogously, the moments for the symmetric class of distributions in (2.5) can be obtained by taking the limit of (A.3)–(A.6) as .

B. System of Conventional Moment-Based Equations for - Distributions

The moments () associated with the asymmetric - class of distributions in (2.6) can be determined from

Using the expressions in (A.2), the moments for the location and scale parameters are where the polygamma functions are: , , , , , , , , and . The notation in (B.3) is the Hurwitz zeta function. The moments related to skew and kurtosis are as follows: where the polygamma functions are , , , , , , , , , , , , and . The notation in (B.4) is the zeta function.