Abstract

This paper derives a procedure for determining the expectations of order statistics associated with the standard normal distribution ($Z$) and its powers of order three and five ($Z^3$ and $Z^5$). The procedure is demonstrated for sample sizes of $n \le 9$. It is shown that $Z^3$ and $Z^5$ have expectations of order statistics that are functions of the expectations for $Z$ and can be expressed in terms of explicit elementary functions for sample sizes of $n \le 5$. For sample sizes of $n = 6, 7$ the expectations of the order statistics for $Z$, $Z^3$, and $Z^5$ only require a single remainder term.

1. Introduction

Order statistics have played an important role in the development of techniques associated with estimation [1, 2], hypothesis testing [3, 4], and describing data in the context of L-moments [5, 6]. In terms of the latter, L-moments are based on the expectations of linear combinations of order statistics associated with a random variable $X$. Specifically, the first four L-moments are expressed as

$$\lambda_1 = E[X_{1:1}], \quad \lambda_2 = \tfrac{1}{2}E[X_{2:2} - X_{1:2}], \quad \lambda_3 = \tfrac{1}{3}E[X_{3:3} - 2X_{2:3} + X_{1:3}], \quad \lambda_4 = \tfrac{1}{4}E[X_{4:4} - 3X_{3:4} + 3X_{2:4} - X_{1:4}] \tag{1.1}$$

or more generally as

$$\lambda_r = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}E[X_{r-j:r}], \tag{1.2}$$

where the order statistics $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ are drawn from the random variable $X$. The values of $\lambda_1$ and $\lambda_2$ are measures of location and scale and are the arithmetic mean and one-half the coefficient of mean difference (or Gini's index of spread), respectively. Higher-order L-moments are transformed to dimensionless quantities referred to as L-moment ratios defined as $\tau_r = \lambda_r/\lambda_2$ for $r \ge 3$, where $\tau_3$ and $\tau_4$ are the analogs to the conventional measures of skew and kurtosis. In general, L-moment ratios are bounded in the interval $-1 < \tau_r < 1$, as is the index of L-skew ($\tau_3$), and a symmetric distribution implies that all L-moment ratios with odd subscripts are zero. Other smaller boundaries can be found for more specific cases. For example, the index of L-kurtosis ($\tau_4$) has the boundary condition for continuous distributions of [7]

$$\frac{5\tau_3^2 - 1}{4} < \tau_4 < 1. \tag{1.3}$$
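As a small illustration of (1.2), the following sketch assembles the first four L-moments from expected order statistics. The uniform distribution on (0, 1), for which $E[X_{j:n}] = j/(n+1)$, is used only as a convenient test case and is not part of this paper; the function names `l_moment` and `uniform_os` are likewise hypothetical.

```python
from fractions import Fraction
from math import comb


def l_moment(r, expected_os):
    """lambda_r from (1.2); expected_os(j, n) must return E[X_{j:n}]."""
    total = sum((-1) ** j * comb(r - 1, j) * expected_os(r - j, r)
                for j in range(r))
    return total / r


def uniform_os(j, n):
    """Assumed test case: E[X_{j:n}] = j / (n + 1) for the uniform(0, 1) law."""
    return Fraction(j, n + 1)


# First four L-moments of the uniform(0, 1) test case, as exact fractions.
lams = [l_moment(r, uniform_os) for r in range(1, 5)]
print(lams)
```

Because the test case is symmetric, the third L-moment vanishes, in line with the remark that symmetric distributions have zero L-moment ratios with odd subscripts.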

Headrick [8] derived classes of standard normal-L-moment-based power method distributions using the polynomial transformation

$$p(Z) = \sum_{i=1}^{m} c_i Z^{i-1}, \tag{1.4}$$

where $Z \sim \text{i.i.d. } N(0,1)$. Setting $m = 4$ ($m = 6$) gives the third- (fifth-) order class of power method distributions. The shape of $p(Z)$ in (1.4) is contingent on the values of the constant coefficients $c_i$. For the larger class of nonnormal distributions associated with $m = 6$, the coefficients are computed from the system of equations given in Headrick [8, Equations (2.8)–(2.13)] for specified values of L-moment ratios ($\tau_{3,\ldots,6}$). In general, $\lambda_1$ and $\lambda_2$ are standardized to the unit normal distribution as

$$\lambda_1 = c_1 + c_3 + 3c_5 = 0, \qquad \lambda_2 = \frac{4c_2 + 10c_4 + 43c_6}{4\sqrt{\pi}} = \frac{1}{\sqrt{\pi}}. \tag{1.5}$$

The pdf and cdf associated with (1.4) are given in parametric form as in [8, Equations (1.3) and (1.4)]

$$f_{p(Z)}(p(z)) = f(z) = \left(p(z), \frac{\phi(z)}{p'(z)}\right), \qquad F_{p(Z)}(p(z)) = F(z) = (p(z), \Phi(z)), \tag{1.6}$$

where $f: \mathbb{R} \to \mathbb{R}^2$ and $F: \mathbb{R} \to \mathbb{R}^2$ are the parametric forms of the pdf and cdf with the mappings $z \mapsto (x, y)$ and $z \mapsto (x, v)$ with $x = p(z)$, $y = \phi(z)/p'(z)$, $v = \Phi(z)$, and where $\phi(z)$ and $\Phi(z)$ are the standard normal pdf and cdf, respectively. For further details on the distributional properties associated with power method transformations see [9, pages 9–30] and [8] in terms of conventional moment and L-moment theory, respectively.

Of concern in this study are three power method distributions related to (1.4) and (1.5) as

$$p_t(Z) = c_{2t}Z^{2t-1}, \quad \text{where} \quad \begin{cases} t = 1: & c_2 = 1,\ c_4 = 0,\ c_6 = 0, \\ t = 2: & c_2 = 0,\ c_4 = 2/5,\ c_6 = 0, \\ t = 3: & c_2 = 0,\ c_4 = 0,\ c_6 = 4/43, \end{cases} \tag{1.7}$$

and thus $p_1(Z) = Z$, $p_2(Z) = (2/5)Z^3$, and $p_3(Z) = (4/43)Z^5$. Note that these power method distributions are symmetric, which implies that $c_{1,3,5} = 0$ in (1.4). The graphs of the pdfs associated with the distributions in (1.7) are given in Figure 1 along with their values of L-skew and L-kurtosis. We would point out that the importance of these distributions was noted by Stoyanov [10, page 281], “…power transformations [such as $p_2(Z)$ and $p_3(Z)$] can be considered as functional transformations on random data, usually called Box-Cox transformations. Their importance in the area of statistics and its applications is well known.”
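The scaling in (1.7) can be checked directly against (1.5). The sketch below simply substitutes each coefficient set into the expression for $\lambda_2$; the function name `lambda_2` and the dictionary layout are illustrative choices, not part of the original method.

```python
import math


def lambda_2(c2, c4, c6):
    """lambda_2 as given in (1.5)."""
    return (4 * c2 + 10 * c4 + 43 * c6) / (4 * math.sqrt(math.pi))


# (c2, c4, c6) for t = 1, 2, 3 as listed in (1.7).
coefficients = {1: (1.0, 0.0, 0.0), 2: (0.0, 2 / 5, 0.0), 3: (0.0, 0.0, 4 / 43)}

scales = {t: lambda_2(*c) for t, c in coefficients.items()}
target = 1 / math.sqrt(math.pi)
for t, value in scales.items():
    print(t, value, math.isclose(value, target))
```

Each of the three distributions is therefore scaled so that its second L-moment matches that of the unit normal distribution.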

The standard normal distribution $p_1(Z)$ in (1.7) is the only case of the three distributions considered that is moment determinate. That is, $p_2(Z)$ and $p_3(Z)$ have the so-called classical problem of moments insofar as their respective cdfs have nonunique solutions (i.e., they are moment indeterminate, see [10–12]). However, as pointed out by Huang [12], $p_2(Z)$ and $p_3(Z)$ are determinate in the context of order statistics moments.

The derivation of the expected values of single order statistics associated with $p_1(Z)$ in terms of explicit elementary functions has been attempted by numerous authors (see [13–17]). As indicated by Johnson et al. [18, pages 93-94] these attempts fail to give explicit expressions in terms of elementary functions for the expected values of order statistics with sample sizes of $n > 5$. However, Renner [19] provides a technique for expressing the expected values of order statistics associated with $p_1(Z)$ for $n = 6, 7$ based on a single power series.

There is a paucity of research on the expectations of order statistics associated with $p_2(Z)$ and $p_3(Z)$ in the context of explicit elementary functions. Thus, what follows in Section 2 is the development of an approach for determining the expected values of the order statistics for $p_2(Z)$ and $p_3(Z)$, which is based on a generalization of Renner’s [19] discussion in the context of $p_1(Z)$. In Section 3, some specific evaluations of the generalization are provided to demonstrate the methodology.

2. Methodology

The expected values of the order statistics associated with (1.7) can be determined based on the following expression [20, page 34]:

$$E[p(Z)_{j:n}] = \frac{n}{2^n}\binom{n-1}{j-1}\int_0^{\infty} p_t(z)\,\varphi(z)\left\{[1+\Psi(z)]^{j-1}[1-\Psi(z)]^{n-j} - [1-\Psi(z)]^{j-1}[1+\Psi(z)]^{n-j}\right\}dz, \tag{2.1}$$

where $p_t(z)$ is defined as in (1.7) and $\varphi(z) = 2\phi(z)$ and $\Psi(z) = 2\Phi(z) - 1$ are the pdf and cdf of the folded unit normal distribution at $z = 0$. Table 1 gives a summary of some specific expansions of the polynomial in (2.1) for sample sizes of $n = 1, \ldots, 9$, which are applicable to all three distributions related to $p_t(z)$. Inspection of Table 1 indicates that we have in general (a) $E[p(Z)_{j:n}] = -E[p(Z)_{n+1-j:n}]$, (b) the median satisfies $E[p(Z)_{j:n}] = -E[p(Z)_{j:n}] = 0$ when $n$ is odd and $j = (n+1)/2$, and (c) the $E[p(Z)_{j:n}]$ are linear combinations of the integrals $I_{2r-1}$ for $r = 1, 2, \ldots$, with only odd subscripts appearing because only odd powers of $\Psi(z)$ survive in the polynomial expansions associated with (2.1). As such, $I_{2r-1}$ in (2.1) can be expressed as

$$I_{2r-1} = \int_0^{\infty} p_t(z)\,\varphi(z)\,[\Psi(z)]^{2r-1}\,dz. \tag{2.2}$$
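Expression (2.1) can also be evaluated numerically, which is useful for checking the symmetry properties (a) and (b). The following is a minimal sketch, assuming a composite Simpson rule and truncation of the upper limit at $z = 12$; these numerical choices and the names `expected_os` and `simpson` are assumptions of the sketch, not the procedure derived in this paper.

```python
import math

COEF = {1: 1.0, 2: 2 / 5, 3: 4 / 43}  # c_{2t} from (1.7)


def folded_pdf(z):
    """varphi(z) = 2 * phi(z), the folded unit normal pdf."""
    return 2.0 * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def folded_cdf(z):
    """Psi(z) = 2 * Phi(z) - 1, which equals erf(z / sqrt(2))."""
    return math.erf(z / math.sqrt(2.0))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def expected_os(j, n, t=1, upper=12.0):
    """E[p_t(Z)_{j:n}] from (2.1), with the integral truncated at `upper`."""
    def integrand(z):
        psi = folded_cdf(z)
        poly = ((1 + psi) ** (j - 1) * (1 - psi) ** (n - j)
                - (1 - psi) ** (j - 1) * (1 + psi) ** (n - j))
        return COEF[t] * z ** (2 * t - 1) * folded_pdf(z) * poly
    return n / 2 ** n * math.comb(n - 1, j - 1) * simpson(integrand, 0.0, upper)


# Symmetry (a) and the zero median (b) for n = 3 and p_2(Z).
print(expected_os(1, 3, t=2), expected_os(2, 3, t=2), expected_os(3, 3, t=2))
```

For $n = 2$ the largest order statistic has expectation $\lambda_2 = 1/\sqrt{\pi}$ for all three distributions, which gives a quick sanity check on the quadrature.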

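Equation (2.2) may be integrated by parts as

$$I_{2r-1} = (2r-1)\int_0^{\infty} q_t(z)\,[\varphi(z)]^2\,[\Psi(z)]^{2r-2}\,dz, \tag{2.3}$$

where $q_1(z) = 1$, $q_2(z) = (2/5)(z^2 + 2)$, and $q_3(z) = (4/43)(z^4 + 4z^2 + 8)$, for $p_1(z)$, $p_2(z)$, and $p_3(z)$, respectively. Note that the boundary terms vanish because $\Psi(0) = 0$ and $\lim_{z \to +\infty}\varphi(z) = 0$. Evaluating (2.3) for $r = 1$ gives

$$I_1 = \int_0^{\infty} q_t(z)\,[\varphi(z)]^2\,dz = \frac{1}{\sqrt{\pi}} \tag{2.4}$$

for all $p_t(z)$ in (1.7), which is one-half the coefficient of mean difference (i.e., $\lambda_2$), consistent with the specification in (1.5) and as given in Table 1.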
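The integration by parts leading to (2.3) can be confirmed numerically by evaluating (2.2) and (2.3) side by side. The sketch below assumes Simpson quadrature on $[0, 12]$; the helper names `I_direct` and `I_by_parts` are hypothetical.

```python
import math

COEF = {1: 1.0, 2: 2 / 5, 3: 4 / 43}  # c_{2t} from (1.7)


def folded_pdf(z):
    return 2.0 * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def folded_cdf(z):
    return math.erf(z / math.sqrt(2.0))


def q(t, z):
    """q_t(z) as listed after (2.3)."""
    if t == 1:
        return 1.0
    if t == 2:
        return (2 / 5) * (z * z + 2)
    return (4 / 43) * (z ** 4 + 4 * z * z + 8)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def I_direct(r, t, upper=12.0):
    """I_{2r-1} from (2.2)."""
    return simpson(lambda z: COEF[t] * z ** (2 * t - 1) * folded_pdf(z)
                   * folded_cdf(z) ** (2 * r - 1), 0.0, upper)


def I_by_parts(r, t, upper=12.0):
    """I_{2r-1} from (2.3)."""
    return (2 * r - 1) * simpson(lambda z: q(t, z) * folded_pdf(z) ** 2
                                 * folded_cdf(z) ** (2 * r - 2), 0.0, upper)


for t in (1, 2, 3):
    print(t, I_direct(2, t), I_by_parts(2, t))
```

Setting `r = 1` in `I_by_parts` reproduces (2.4) for each value of `t`.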

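The expression $[\Psi(z)]^{2r-2}$ in (2.3) can be expressed as

$$[\Psi(z)]^{2r-2} = \left(\frac{2}{\pi}\right)^{r-1}\left\{\int_0^z \exp\left(-\tfrac{1}{2}u^2\right)du\right\}^{2r-2} \tag{2.5}$$

or analogously as a double integral over $\mathbb{R}^2$ as

$$[\Psi(z)]^{2r-2} = \left(\frac{2}{\pi}\right)^{r-1}\left\{\int_0^z\!\!\int_0^z \exp\left(-\tfrac{1}{2}\left(z_1^2 + z_2^2\right)\right)dz_1\,dz_2\right\}^{r-1}. \tag{2.6}$$

Using (2.6), let $z_2 = z_1\tan\theta_1$ and thus $dz_2 = z_1\sec^2\theta_1\,d\theta_1$. Further, let $z_1^2 + z_2^2 = z_1^2\sec^2\theta_1$. As such, the region of integration will be reduced to one-half of the area of the original square associated with (2.6), that is, the triangle $0 \le z_2 \le z_1 \le z$; the factor of 2 below compensates for this because the integrand is symmetric in $z_1$ and $z_2$. Thus, we have

$$[\Psi(z)]^{2r-2} = \left(\frac{2}{\pi}\right)^{r-1}\left\{2\int_0^{\pi/4}\!\!\int_0^z \exp\left(-\tfrac{1}{2}z_1^2\sec^2\theta_1\right)dz_1\,z_1\sec^2\theta_1\,d\theta_1\right\}^{r-1} = \left(\frac{4}{\pi}\right)^{r-1}\left\{\int_0^{\pi/4}\!\!\int_0^z \exp\left(-\tfrac{1}{2}z_1^2\sec^2\theta_1\right)z_1\,dz_1\,\sec^2\theta_1\,d\theta_1\right\}^{r-1}. \tag{2.7}$$

Subsequently, setting $z_1^2 = w$ in (2.7), where $z_1\,dz_1 = dw/2$, gives

$$[\Psi(z)]^{2r-2} = \left(\frac{4}{\pi}\right)^{r-1}\left\{\int_0^{\pi/4}\!\!\int_0^{z^2} \exp\left(-\tfrac{1}{2}w\sec^2\theta_1\right)\frac{dw}{2}\,\sec^2\theta_1\,d\theta_1\right\}^{r-1} = \left(\frac{4}{\pi}\right)^{r-1}\left\{\int_0^{\pi/4}\frac{1}{2}\left[\frac{\exp\left(-(1/2)\,w\sec^2\theta_1\right)}{-(1/2)\sec^2\theta_1}\right]_0^{z^2}\sec^2\theta_1\,d\theta_1\right\}^{r-1}, \tag{2.8}$$

and hence

$$[\Psi(z)]^{2r-2} = \left(\frac{4}{\pi}\right)^{r-1}\left\{\int_0^{\pi/4}\left[1 - \exp\left(-\tfrac{1}{2}z^2\sec^2\theta_1\right)\right]d\theta_1\right\}^{r-1}. \tag{2.9}$$

Expanding (2.9) yields

$$[\Psi(z)]^{2r-2} = 1 + \sum_{k=1}^{r-1}(-1)^k\binom{r-1}{k}\left(\frac{4}{\pi}\right)^k\int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4}\exp\left(-\tfrac{1}{2}z^2\sum_{i=1}^{k}\sec^2\theta_i\right)d\theta_1\cdots d\theta_k, \tag{2.10}$$

where the subscript $i$ runs faster than $k$. For example, if $r = 4$, then (2.10) would appear more specifically as

$$[\Psi(z)]^{2r-2} = 1 - \binom{r-1}{1}\frac{4}{\pi}\int_0^{\pi/4}\exp\left(-\tfrac{1}{2}z^2\sec^2\theta_1\right)d\theta_1 + \binom{r-1}{2}\left(\frac{4}{\pi}\right)^2\int_0^{\pi/4}\!\!\int_0^{\pi/4}\exp\left(-\tfrac{1}{2}z^2\left(\sec^2\theta_1 + \sec^2\theta_2\right)\right)d\theta_1\,d\theta_2 - \binom{r-1}{3}\left(\frac{4}{\pi}\right)^3\int_0^{\pi/4}\!\!\int_0^{\pi/4}\!\!\int_0^{\pi/4}\exp\left(-\tfrac{1}{2}z^2\left(\sec^2\theta_1 + \sec^2\theta_2 + \sec^2\theta_3\right)\right)d\theta_1\,d\theta_2\,d\theta_3. \tag{2.11}$$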
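The single-angle representation in (2.9) can be compared against a direct evaluation of $\Psi(z)$. The sketch below is only a numerical check, assuming Simpson quadrature over $[0, \pi/4]$ and the identity $\Psi(z) = \mathrm{erf}(z/\sqrt{2})$; the function name `psi_power_polar` is hypothetical.

```python
import math


def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def psi(z):
    """Psi(z) = 2 * Phi(z) - 1 evaluated with the error function."""
    return math.erf(z / math.sqrt(2.0))


def psi_power_polar(z, r):
    """[Psi(z)]^(2r-2) evaluated through the angular integral in (2.9)."""
    inner = simpson(
        lambda th: 1.0 - math.exp(-0.5 * z * z / math.cos(th) ** 2),
        0.0, math.pi / 4)
    return (4.0 / math.pi * inner) ** (r - 1)


for z in (0.3, 1.0, 2.5):
    print(z, psi(z) ** 2, psi_power_polar(z, 2))
```

The two columns should agree to quadrature accuracy, which supports the change of variables used between (2.6) and (2.9).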

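Substituting (2.10) into (2.3) and initially integrating with respect to $z$ (Lichtenstein, [21]) yields

$$\sqrt{\pi}\int_0^{\infty} q_t(z)\,[\varphi(z)]^2\exp\left(-\tfrac{1}{2}z^2\sum_{i=1}^{k}\sec^2\theta_i\right)dz = g_t\left(\sec^2\theta_i\right), \tag{2.12}$$

where the specific forms of $g_t(\sec^2\theta_i)$, which are associated with $p_t(z)$, are

$$g_1\left(\sec^2\theta_i\right) = \frac{\sqrt{2}}{\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^{1/2}}, \qquad g_2\left(\sec^2\theta_i\right) = \frac{2\sqrt{2}\left(5 + 2\sum_{i=1}^{k}\sec^2\theta_i\right)}{5\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^{3/2}}, \qquad g_3\left(\sec^2\theta_i\right) = \frac{4\sqrt{2}\left(3 + 4\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right) + 8\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^2\right)}{43\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^{5/2}}. \tag{2.13}$$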
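The closed forms in (2.13) can be compared with a numerical evaluation of the left-hand side of (2.12) at fixed angles. The following sketch assumes Simpson quadrature on $[0, 12]$ and arbitrarily chosen test angles; `g_closed`, `g_numeric`, and `sec2` are hypothetical helper names.

```python
import math

SQRT2 = math.sqrt(2.0)


def folded_pdf(z):
    return 2.0 * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def q(t, z):
    """q_t(z) as listed after (2.3)."""
    if t == 1:
        return 1.0
    if t == 2:
        return (2 / 5) * (z * z + 2)
    return (4 / 43) * (z ** 4 + 4 * z * z + 8)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def sec2(angles):
    """sec^2(theta_i) for each angle in the list."""
    return [1.0 / math.cos(th) ** 2 for th in angles]


def g_closed(t, s):
    """g_t from (2.13); s is the list of sec^2(theta_i) values."""
    total = sum(s)
    a = 2.0 + total
    if t == 1:
        return SQRT2 / math.sqrt(a)
    if t == 2:
        return 2 * SQRT2 * (5 + 2 * total) / (5 * a ** 1.5)
    return 4 * SQRT2 * (3 + 4 * a + 8 * a * a) / (43 * a ** 2.5)


def g_numeric(t, s, upper=12.0):
    """Left-hand side of (2.12) by quadrature in z."""
    total = sum(s)
    return math.sqrt(math.pi) * simpson(
        lambda z: q(t, z) * folded_pdf(z) ** 2 * math.exp(-0.5 * z * z * total),
        0.0, upper)


s = sec2([0.1, 0.7])  # arbitrary test angles in [0, pi/4], k = 2
for t in (1, 2, 3):
    print(t, g_closed(t, s), g_numeric(t, s))
```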

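Equations (2.13) can be more conveniently expressed as

$$g_t\left(\sec^2\theta_i\right) = g_1\left(\sec^2\theta_i\right) - h_t\left(\sec^2\theta_i\right), \tag{2.14}$$

where the specific forms of $h_t(\sec^2\theta_i)$ are

$$h_1\left(\sec^2\theta_i\right) = 0, \tag{2.15}$$

$$h_2\left(\sec^2\theta_i\right) = \frac{\sqrt{2}\sum_{i=1}^{k}\sec^2\theta_i}{5\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^{3/2}}, \tag{2.16}$$

$$h_3\left(\sec^2\theta_i\right) = \frac{\sqrt{2}\left(11\sum_{i=1}^{k}\sec^4\theta_i + 28\sum_{i=1}^{k}\sec^2\theta_i + 22\sum_{i<j}\sec^2\theta_i\sec^2\theta_j\right)}{43\left(2 + \sum_{i=1}^{k}\sec^2\theta_i\right)^{5/2}} \tag{2.17}$$

and where $i < j$ in (2.17) indicates summing over all $k(k-1)/2$ pairwise combinations. Hence, the integral in (2.3) can be expressed as

$$I_{2r-1} = \frac{2r-1}{\sqrt{\pi}}\left\{1 + \sum_{k=1}^{r-1}(-1)^k\binom{r-1}{k}\left(\frac{4}{\pi}\right)^k\int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4} g_t\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k\right\}, \tag{2.18}$$

and subsequently substituting (2.14) into (2.18) gives

$$I_{2r-1} = \frac{2r-1}{\sqrt{\pi}}\left\{1 + \sum_{k=1}^{r-1}(-1)^k\binom{r-1}{k}\left(\frac{4}{\pi}\right)^k\int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4}\left[g_1\left(\sec^2\theta_i\right) - h_t\left(\sec^2\theta_i\right)\right]d\theta_1\cdots d\theta_k\right\}. \tag{2.19}$$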
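The decomposition in (2.14) is purely algebraic and can be checked pointwise. In the sketch below the correction terms are written as a function named `h`, a label chosen here for illustration, and the test angles are arbitrary.

```python
import math
from itertools import combinations

SQRT2 = math.sqrt(2.0)


def g(t, s):
    """g_t from (2.13); s is the list of sec^2(theta_i) values."""
    total = sum(s)
    a = 2.0 + total
    if t == 1:
        return SQRT2 / math.sqrt(a)
    if t == 2:
        return 2 * SQRT2 * (5 + 2 * total) / (5 * a ** 1.5)
    return 4 * SQRT2 * (3 + 4 * a + 8 * a * a) / (43 * a ** 2.5)


def h(t, s):
    """Correction terms from (2.15)-(2.17), so that g_t = g_1 - h_t."""
    total = sum(s)
    a = 2.0 + total
    if t == 1:
        return 0.0
    if t == 2:
        return SQRT2 * total / (5 * a ** 1.5)
    sec4 = sum(x * x for x in s)                       # sum of sec^4(theta_i)
    pairs = sum(x * y for x, y in combinations(s, 2))  # sum over i < j
    return SQRT2 * (11 * sec4 + 28 * total + 22 * pairs) / (43 * a ** 2.5)


s = [1.0 / math.cos(th) ** 2 for th in (0.0, 0.4, 0.78)]  # k = 3 test point
for t in (1, 2, 3):
    print(t, g(t, s), g(1, s) - h(t, s))
```

Writing the correction this way isolates the only non-elementary piece, the integral of the first function, so that it is shared by all three distributions.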

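The integral associated with $g_1(\sec^2\theta_i)$ in (2.19) cannot be expressed in terms of explicit elementary functions for $k > 1$, which also implies $r > 2$ and sample sizes of $n > 5$ in Table 1. As such, we will consider the approximating function $\tilde{g}_1(\sec^2\theta_i)$ as

$$\tilde{g}_1\left(\sec^2\theta_i\right) = 2^{k/2}\prod_{i=1}^{k}\frac{1}{\left(2 + \sec^2\theta_i\right)^{1/2}}, \tag{2.20}$$

where

$$\int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4} g_1\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k = \int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4}\tilde{g}_1\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k = \begin{cases}\tan^{-1}\left(1/\sqrt{2}\right), & k = 1, \\ 0, & k \to \infty.\end{cases} \tag{2.21}$$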

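Thus, for finite $k > 1$ we have

$$\int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4} g_1\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k = \int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4}\tilde{g}_1\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k + \varepsilon_k = \left[\tan^{-1}\left(\frac{1}{\sqrt{2}}\right)\right]^k + \varepsilon_k, \tag{2.22}$$

where $\tilde{g}_1$ is the approximating function in (2.20), $\varepsilon_k$ is the remainder term required for $k > 1$, and $\varepsilon_1 = 0$ for $r = 1, 2$ and $n \le 5$. Thus, using (2.22), (2.19) can be expressed as

$$I_{2r-1} = \frac{2r-1}{\sqrt{\pi}}\left\{1 + \sum_{k=1}^{r-1}(-1)^k\binom{r-1}{k}\left(\frac{4}{\pi}\right)^k \times \left(\left[\tan^{-1}\left(\frac{1}{\sqrt{2}}\right)\right]^k + \varepsilon_k - \int_0^{\pi/4}\!\!\cdots\!\int_0^{\pi/4} h_t\left(\sec^2\theta_i\right)d\theta_1\cdots d\theta_k\right)\right\}, \tag{2.23}$$

where $h_t$ is the correction term defined in (2.14).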

The remainder terms $\varepsilon_k$ for $k > 1$ in (2.23) can be solved by using (2.3), (2.15), (2.23), and the error function Erf [22], where Erf is used to evaluate $\Phi(z)$, and hence $\Psi(z) = 2\Phi(z) - 1$, in (2.3). More specifically, Table 2 gives the values of $\varepsilon_k$ for $k = 1, \ldots, 12$, 25, and 50 with 40-digit precision. Inspection of Table 2 indicates that the (positive) remainder term achieves a maximum at $\varepsilon_4$ and thereafter tends to zero as $k$ increases (i.e., $\varepsilon_k \to 0$ for $k > 4$).
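For $k = 2$ the remainder term can be approximated in two ways: directly from its definition in (2.22) as a double integral, and by solving (2.23) with $r = 3$ and $t = 1$ for the remainder after evaluating $I_5$ through the error function. The sketch below assumes Simpson quadrature at double precision, which is far below the 40-digit precision of Table 2, and the function names are hypothetical.

```python
import math

A = math.atan(1.0 / math.sqrt(2.0))  # k = 1 integral from (2.21)
C = 4.0 / math.pi


def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def g1_pair(th1, th2):
    """g_1 from (2.13) with k = 2."""
    s = 1.0 / math.cos(th1) ** 2 + 1.0 / math.cos(th2) ** 2
    return math.sqrt(2.0) / math.sqrt(2.0 + s)


def epsilon_2_by_definition():
    """Remainder from (2.22): double integral of g_1 minus atan(1/sqrt(2))^2."""
    def inner(th1):
        return simpson(lambda th2: g1_pair(th1, th2), 0.0, math.pi / 4)
    return simpson(inner, 0.0, math.pi / 4) - A ** 2


def I5_direct(upper=12.0):
    """I_5 for p_1(Z) from (2.2), using Psi(z) = erf(z / sqrt(2))."""
    def integrand(z):
        pdf = 2.0 * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        return z * pdf * math.erf(z / math.sqrt(2.0)) ** 5
    return simpson(integrand, 0.0, upper, n=2000)


def epsilon_2_from_I5():
    """Solve (2.23) with r = 3, t = 1 (zero correction term) for the remainder."""
    bracket = I5_direct() * math.sqrt(math.pi) / 5.0  # 1 - 2CA + C^2 (A^2 + eps)
    return (bracket - 1.0 + 2.0 * C * A) / C ** 2 - A ** 2


print(epsilon_2_by_definition(), epsilon_2_from_I5())
```

Agreement between the two values is a check on (2.23) itself; for larger $k$ the nested quadrature grows quickly in cost, so this direct approach is only practical for small $k$.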

We would note that the approach taken here to determine $\varepsilon_2$ is analogous to Renner’s [19] approach of developing a power series for this value. That is, the remainder term $\varepsilon_2$ in Table 2 is also the value approximated in [19] for $p_1(Z)$. Further, we would note that extending the approach in [19] for computing the remainder terms for $k > 2$ would become computationally burdensome.

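To demonstrate (2.23) more specifically, if $r = 4$ and $t = 2$ in (1.7), then the integral $I_7$ associated with $p_2(Z)$ would appear as

$$I_7 = \frac{2r-1}{\sqrt{\pi}}\left\{1 - \binom{r-1}{1}\frac{4}{\pi}\left(\tan^{-1}\left(\frac{1}{\sqrt{2}}\right) - \int_0^{\pi/4} h_2\left(\sec^2\theta_i\right)d\theta_1\right) + \binom{r-1}{2}\left(\frac{4}{\pi}\right)^2\left(\left[\tan^{-1}\left(\frac{1}{\sqrt{2}}\right)\right]^2 + \varepsilon_2 - \int_0^{\pi/4}\!\!\int_0^{\pi/4} h_2\left(\sec^2\theta_i\right)d\theta_1\,d\theta_2\right) - \binom{r-1}{3}\left(\frac{4}{\pi}\right)^3\left(\left[\tan^{-1}\left(\frac{1}{\sqrt{2}}\right)\right]^3 + \varepsilon_3 - \int_0^{\pi/4}\!\!\int_0^{\pi/4}\!\!\int_0^{\pi/4} h_2\left(\sec^2\theta_i\right)d\theta_1\,d\theta_2\,d\theta_3\right)\right\}. \tag{2.24}$$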

3. Evaluations

Tables 3–5 give evaluations for the expected values of the order statistics for $p_1(Z)$, $p_2(Z)$, and $p_3(Z)$ in (1.7), which are based on (2.23) and the general formulae given in Table 1 for sample sizes of $n = 4, 5$. Inspection of Tables 4 and 5 indicates that the expected values for $p_2(Z)$ and $p_3(Z)$ are all expressed in terms of elementary functions and are also functions of the expectations associated with $p_1(Z)$ in Table 3.
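To illustrate how (2.23) produces elementary expressions for $n = 4, 5$, the sketch below evaluates the largest order statistic in closed form and compares it with a numerical evaluation of (2.1). Several ingredients are assumptions of this sketch rather than quotations from Tables 1 and 3–5: the expansions $(3/2)I_1 + (1/2)I_3$ for $n = 4$ and $(5/4)(I_1 + I_3)$ for $n = 5$ were obtained here by expanding the polynomial in (2.1), and the single-angle integrals of the correction terms, $\sqrt{2}/30$ for $t = 2$ and $77\sqrt{2}/1548$ for $t = 3$, were obtained here with the substitution $u = \tan\theta$.

```python
import math

SQRT_PI = math.sqrt(math.pi)
A = math.atan(1.0 / math.sqrt(2.0))   # k = 1 integral from (2.21)
COEF = {1: 1.0, 2: 2 / 5, 3: 4 / 43}  # c_{2t} from (1.7)

# Assumed closed forms of the k = 1 correction integrals (derived for this sketch).
H_INT = {1: 0.0, 2: math.sqrt(2.0) / 30.0, 3: 77.0 * math.sqrt(2.0) / 1548.0}


def I1():
    """I_1 from (2.4)."""
    return 1.0 / SQRT_PI


def I3(t):
    """I_3 from (2.23) with r = 2; no remainder term is needed since k = 1."""
    return 3.0 / SQRT_PI * (1.0 - 4.0 / math.pi * (A - H_INT[t]))


def largest_of_four(t):
    """E[p_t(Z)_{4:4}] under the assumed n = 4 expansion of (2.1)."""
    return 1.5 * I1() + 0.5 * I3(t)


def largest_of_five(t):
    """E[p_t(Z)_{5:5}] under the assumed n = 5 expansion of (2.1)."""
    return 1.25 * (I1() + I3(t))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def expected_os_numeric(j, n, t, upper=12.0):
    """E[p_t(Z)_{j:n}] by direct quadrature of (2.1)."""
    def integrand(z):
        pdf = 2.0 * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        psi = math.erf(z / math.sqrt(2.0))
        poly = ((1 + psi) ** (j - 1) * (1 - psi) ** (n - j)
                - (1 - psi) ** (j - 1) * (1 + psi) ** (n - j))
        return COEF[t] * z ** (2 * t - 1) * pdf * poly
    return n / 2 ** n * math.comb(n - 1, j - 1) * simpson(integrand, 0.0, upper)


for t in (1, 2, 3):
    print(t, largest_of_four(t), expected_os_numeric(4, 4, t))
```

The closed forms for $t = 2, 3$ differ from the $t = 1$ case only through the correction integral, which reflects the statement that the expectations for $p_2(Z)$ and $p_3(Z)$ are functions of those for $p_1(Z)$.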

Presented in Tables 6, 7, and 8 are the evaluations for all three distributions in (1.7) for samples of sizes $n = 6, 7$, where the expectations of the order statistics for $p_1(Z)$, $p_2(Z)$, and $p_3(Z)$ are all expressed in terms of explicit elementary functions and a single remainder term. Tables 9 and 10 give the expected values of the order statistics associated with the standard normal distribution $p_1(Z)$ for sample sizes of $n = 8$ and $n = 9$, respectively. We would also note that Mathematica [22] software is available from the authors for implementing the methodology.