Abstract

We consider the asymptotic behavior of the probability density function of the sum of two lognormally distributed random variables that are nontrivially correlated. We show that both the left and right tails can be approximated by simple closed-form functions. The same techniques are then applied to determine the tail probability density function of a ratio statistic, and of a sum of more than two lognormally distributed random variables under stricter conditions. The results yield new insight into the characterization problem for a sum of lognormally distributed random variables and demonstrate the need to revisit many existing approximation methods.

1. Introduction

The lognormal distribution has applications in many fields, such as survival analysis [1], genetic studies [2, 3], financial modelling [4, 5], and telecommunication studies [6, 7], among others. Many types of data can be modeled by lognormal distributions, including human blood pressure, microarray data, stock options, survival rates for different groups of people, and long-term fluctuations of received power. In these settings, we wish to make inferences based on collected data that involve the addition of a few lognormally distributed random variables (RVs). Deriving the statistical properties of a sum of lognormally distributed RVs is therefore desirable [6, 8]. Note also that in practice the number of summands is often too small for the central limit theorem to be applicable.

Many research works assume that all the summands are independent, either justified by practical considerations or for the sake of simplicity. However, there are some applications (e.g., the Asian option pricing model [4]) in which correlations among the summands are inevitable. Our study will address the correlation problem.

In some cellular mobile systems (see [9]), signal quality is largely dictated by the signal-to-interference ratio (SIR). On a large scale, both the useful signal and the interfering signals experience lognormal shadow fading. That is to say, the SIR can be modeled by where all the RVs are lognormally distributed. The SIR only characterizes the instantaneous quality. For ordinary users and network operators, one important figure of merit is the outage probability with respect to a certain SIR threshold. For example, for users of a data transfer service, it may be required that the outage probability, defined as the probability that the bit error rate (BER) exceeds a given tolerance, be less than 0.01. The BER usually depends on the SIR and other factors. In this paper, Theorem 2.8 provides an approximation to when , .
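As a concrete illustration of the outage computation described above, the following sketch estimates the probability that the SIR falls below a threshold by Monte Carlo under lognormal shadowing. The shadowing standard deviation, number of interferers, and thresholds are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def outage_probability(threshold_db, n_interferers=3, sigma_db=8.0, n_samples=200_000):
    """Monte Carlo estimate of P(SIR < threshold) under lognormal shadowing.

    Powers in dB are normal, so linear-scale powers are lognormal; the SIR
    is then a ratio of a lognormal to a sum of lognormals.
    """
    s_db = rng.normal(0.0, sigma_db, n_samples)                   # useful signal, dB
    i_db = rng.normal(0.0, sigma_db, (n_samples, n_interferers))  # interferers, dB
    s = 10.0 ** (s_db / 10.0)
    i_total = (10.0 ** (i_db / 10.0)).sum(axis=1)                 # sum of lognormals
    sir_db = 10.0 * np.log10(s / i_total)
    return float(np.mean(sir_db < threshold_db))
```

Raising the threshold can only increase the estimated outage probability, which gives a quick sanity check on the estimator.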

In addition, Theorems 2.6 and 2.7 characterize the left and right tails of a sum of lognormally distributed RVs. These theorems are useful for constructing a Padé approximation to the probability density function (PDF) of such a sum. For example, if that approximation is available, and if the useful and interfering signals are independent, the outage probability can be numerically estimated as follows: Let and , then

Since Fenton [10] first addressed the problem, many methods have been developed, but none has succeeded in finding a closed-form representation for the PDF of a sum of multiple lognormally distributed RVs. These methods can be divided into three categories.

(i) The first type of method attempts to characterize the PDF by calculating the moment generating function [11, 12] or the characteristic function [13, 14]. The results obtained can be used in the numerical computation of a PDF or a cumulative distribution function (CDF). To our knowledge, no work has succeeded in using results of this category to describe the shape of a PDF or CDF.

(ii) The second type of method [15–18] uses bounding techniques for the CDF of the underlying statistic.

(iii) The third type of method focuses on finding a good approximation to either the PDF or the CDF of the underlying statistic. Most published works belong to this category. The approach can often be described as follows: first, assume a specific distribution for the sum (or the ratio of sums) of the lognormally distributed RVs; then use one of a variety of methods to identify the parameters of that distribution. The specific distributions in the literature include the lognormal [15, 19], reciprocal Gamma [4], log shifted Gamma [20], and user-defined PDFs [21, 22]. In some works [23], only a CDF approximation is defined. Moment matching [10, 24, 25], moment generating function matching [19], and least squares fitting [26, 27] are popular methods for determining the parameters associated with the distribution.
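To illustrate the moment-matching idea in category (iii), the following sketch implements the classical Fenton-style fit: approximate the sum of independent lognormals by a single lognormal whose first two moments match those of the sum. This is a generic textbook construction, not code from any of the cited works.

```python
import numpy as np

def fenton_wilkinson(mus, sigmas):
    """Fit a single lognormal LN(mu_z, sigma_z^2) to the sum of independent
    LN(mu_i, sigma_i^2) RVs by matching the first two moments of the sum.

    Returns (mu_z, sigma_z)."""
    mus, sigmas = np.asarray(mus, float), np.asarray(sigmas, float)
    m = np.exp(mus + sigmas ** 2 / 2).sum()                               # E[sum]
    v = (np.exp(2 * mus + sigmas ** 2) * (np.exp(sigmas ** 2) - 1)).sum()  # Var[sum]
    sigma_z2 = np.log(1 + v / m ** 2)
    mu_z = np.log(m) - sigma_z2 / 2
    return mu_z, np.sqrt(sigma_z2)
```

For a single summand the fit returns the summand's own parameters, which is a convenient consistency check. As the paper's tail results show, however, no single-lognormal fit can be accurate in both tail regions.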

In this paper, we will rigorously characterize the right- and left-tail behavior of the PDF of a random variable , where are jointly distributed with as distribution. This is our first step towards understanding the more general problem: the characterization of the PDF of , where are jointly distributed as . Note that we do not assume that the are independent (except in Theorem 2.7), nor do we assume that the have the same marginal distribution. We hope that our study can lead to better solutions to the problems presented in [28] or [29].

Janos [15] was the first to study the right tail probability of a sum of lognormals. More advanced and more general studies can be found in [30]. We have not found any theoretical results regarding the left tail. In addition, the right-tail results we show cannot be deduced from the results in [5, 30–32].

Our results show that it is possible to find some elementary functions and such that The explicit forms of and enable us to assess the performance of existing approximation methods and to determine how to improve them. By Theorem 2.3 (see also the subsequent remark and Corollary 2.5), we can show that in the left tail region, even under the independence assumption, given any function within families of PDFs such as the lognormal, reciprocal Gamma, or log shifted Gamma, the limit either does not exist or can only be zero or . No previous work has led to this observation. Szyszkowicz [18] pointed out that some earlier models are wrong in the tail region, but that work was based on a hypothesis justified only by numerical results, and it still focused on finding the best lognormal-type approximation. In view of our results, such efforts are unlikely to succeed.

Our characterization of the tail behavior of the PDF of a sum of two lognormals is complete in the sense that our results cover all nondegenerate covariance matrices. Our work on the ratio RV is obtained under more stringent conditions. This new result shows that the ratio RV is neither lognormal nor log Gamma, which indicates that caution is warranted with the method in [9] despite the successful examples demonstrated therein.

When the number of summands exceeds two, the situation for a PDF approximation becomes much more complicated. We are able to show some left-tail and right-tail results by imposing conditions on the covariance matrix that cover the independent case. The result of Theorem 2.7 may be well known to experts working with the subexponential class of distributions. Unfortunately, we did not find any reference that explicitly states the result, so we provide a short proof in Appendix F. Further along this line, [5] has presented a complete CDF approximation for the right tail with an arbitrary covariance matrix. However, our results cannot be deduced trivially from the CDF behavior and are interesting in themselves. For example, for any continuous function of polynomial growth, we can say that , whereas such an approximation is not a direct consequence of the result in [5].

In the following sections, we first present our results, followed by a numerical validation. We then discuss some future studies that this paper does not cover. The proofs are presented in the appendices.

2. Main Results

Let be a jointly normally distributed random vector. Let be the correlation coefficient, and , for . Then the joint PDF of is given by where . We wish to study the left and right tail probabilities of , which has PDF with . We hope to understand the asymptotic behavior of as . A direct calculation yields where the exponent . We rewrite in three terms , where the are defined as

Hence, by changing the variable to ,

We regroup the integrand in (2.7) in the form

with

Remark 2.1. Without loss of generality, in this paper we always assume We also use the following notation.

Definition 2.2. We say that two functions and are equivalent near some point, denoted by , if we have .

For the left tail, we have the following result.

Theorem 2.3. Let be defined as above for and let be the correlation coefficient. Let be the PDF of . Then the following hold.

(i) If , one has Here the functions , , are defined by (2.9) and (2.6).

(ii) If , one has

(iii) If , one has

In particular, when , we find that , which means that no lognormal, reciprocal Gamma, or log shifted Gamma distribution can be used to fit the left tail, even under the independence hypothesis. The situation for the right tail of is simpler. It is interesting to remark that the result does not depend on the correlation coefficient (see also [5]). Here and later on, we employ the lexicographical order on the couple .

Theorem 2.4. Let and be defined as above. Then , where is defined as follows.

(i) If , one has

(ii) If , one has .

The following corollary is an immediate consequence of Theorems 2.3 and 2.4.

Corollary 2.5. Let be the PDF of , where are i.i.d. RVs following distributions. Then, one has

The results in Corollary 2.5 confirm those reported in [18]. Furthermore, we can easily show that the models in [4, 21] will also fail in the tail regions.

Next we show that our left-tail and right-tail study can be extended to some special cases in higher dimension by using the Laplace method.

Theorem 2.6. Let () be a jointly normally distributed random vector with distribution . Let Let be the PDF of the random variable . If satisfies for all , then the left tail of satisfies where , , and denotes the Hessian matrix of . Here, and is given by

Theorem 2.7. Let be independent normally distributed RVs, that is, . Let Define for the lexicographical order and let be the number of maximum points, that is, . Then the PDF of , , satisfies

Finally, we show a result for the quotient of sums of i.i.d. lognormal variables.

Theorem 2.8. Let be i.i.d. random variables, each following a distribution. Let and let be the PDF of . Then This result can be generalized to the case where follow and follow with any positive constants and . Indeed, we can prove that For the sake of brevity, we only present the proof for the special case where both variances are equal.
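Theorem 2.8 concerns the quotient of sums of i.i.d. lognormals. One structural fact exploited in the numerical validation below is that the log of this ratio is symmetric about 0: swapping numerator and denominator maps the ratio to its reciprocal, and the two sums are exchangeable. A quick Monte Carlo sketch, with illustrative parameter choices of our own:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_ratio_samples(n_terms=4, sigma=1.0, n_samples=400_000):
    """Sample log R, where R = (X_1+...+X_n)/(Y_1+...+Y_n) and all X_i, Y_j
    are i.i.d. LN(0, sigma^2).

    By exchangeability of the two sums, log R is symmetric about 0.
    """
    x = np.exp(rng.normal(0.0, sigma, (n_samples, n_terms))).sum(axis=1)
    y = np.exp(rng.normal(0.0, sigma, (n_samples, n_terms))).sum(axis=1)
    return np.log(x / y)
```

The empirical mean and median of the samples should both be close to 0, consistent with the symmetry used in Section 3 to study only one tail of the ratio statistic.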

3. Numerical Validation

We have validated the two-dimensional theoretical results by Monte Carlo simulations. The curve generated by the Monte Carlo method is obtained through bin-based density estimation. In all of the presented cases, our approximations match the numerical results closely.

For the simulation parameters of the statistic, in order to test our results in the extreme cases, we chose . The mean values were set arbitrarily. The values of were chosen to be 9.6 and 12 so that their ratio is 0.8.

For the parameters of the ratio statistic (denoted by ), we used for the two groups of normally distributed RVs. The means of these RVs were set to 0. Due to the symmetry properties of , it is sufficient to show the verification results for the left tail of .
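The bin-based density estimation used above can be sketched as follows for the sum of two correlated lognormals; the correlation and variances here are illustrative defaults, not the paper's simulation settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_log_density(rho=0.5, sigma1=1.0, sigma2=1.0, n=500_000, bins=200):
    """Bin-based estimate of the density of log S, where S = e^{X1} + e^{X2}
    and (X1, X2) is zero-mean jointly normal with correlation rho.

    Working on the log scale spreads out both tails, which is convenient
    when comparing a histogram against tail approximations.
    """
    cov = [[sigma1 ** 2, rho * sigma1 * sigma2],
           [rho * sigma1 * sigma2, sigma2 ** 2]]
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    s = np.exp(x[:, 0]) + np.exp(x[:, 1])
    dens, edges = np.histogram(np.log(s), bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, dens
```

Since `density=True` normalizes by the total count and bin width, the histogram integrates to 1, and the estimated left and right tails can be read off directly at the extreme bins.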

4. Further Remarks

We have seen that our tail density approximations (for a sum of lognormal RVs) do not handle an arbitrary covariance matrix. In the two-dimensional proofs, we used the classical approximation technique for integrals known as the Laplace method (see Lemmas A.1 and E.2). We then divided the study into a few subcases and proceeded in different ways for each. Comparing Theorems 2.3 and 2.6 with Theorems 2.4 and 2.7, it appears that the left tail behavior is generally more involved than the right tail case. We hope to adapt our approach to higher-dimensional spaces, especially for the right tail behavior, which would lead to the result in [5]. It would also be useful to derive higher order approximations for both tails so that an efficient Padé approximation can be developed.

In view of the work in [32], it should be worthwhile to extend our lognormal results (at least for the right tail) to a more general family such as the subexponential class. The importance of this distribution family is discussed in [5, 32, 33]. Perhaps the future work on sums of lognormals mentioned above may shed some light on the subexponential class problem.

Appendices

A. Preliminaries

Here we list some basic lemmas useful in the subsequent discussion. Their proofs use standard techniques and hence are omitted.

Lemma A.1. Let be a positive, integrable function on an interval . Let be concave such that verifies , , then

This result is the so-called Laplace method in modern analysis (see [34]); it is more often cited as the saddle point approximation in other fields such as statistics and physics. Later on, we also give a higher dimensional version (see Lemma E.2).
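As a numerical sanity check of Lemma A.1, the sketch below compares a brute-force quadrature of an integral of the form \(\int_a^b f(x)e^{\lambda h(x)}\,dx\) with the Laplace approximation \(f(x_0)e^{\lambda h(x_0)}\sqrt{2\pi/(\lambda|h''(x_0)|)}\). The test functions \(f=\cos\) and \(h(x)=-x^2\) are our own illustrative choices, not functions from the paper.

```python
import numpy as np

def laplace_ratio(lam, f, h, x0, hpp_x0, a, b, n=200_001):
    """Ratio of a trapezoidal quadrature of int_a^b f(x) e^{lam*h(x)} dx to the
    Laplace-method approximation f(x0) e^{lam*h(x0)} sqrt(2*pi/(lam*|h''(x0)|)).

    Should approach 1 as lam grows, when h has a unique interior maximum
    at x0 with h''(x0) = hpp_x0 < 0.
    """
    x = np.linspace(a, b, n)
    y = f(x) * np.exp(lam * h(x))
    quad = float(np.sum((y[:-1] + y[1:]) / 2) * (x[1] - x[0]))  # trapezoid rule
    approx = f(x0) * np.exp(lam * h(x0)) * np.sqrt(2 * np.pi / (lam * abs(hpp_x0)))
    return quad / approx

# h(x) = -x^2 has its maximum at x0 = 0 with h''(0) = -2.
ratios = [laplace_ratio(lam, np.cos, lambda x: -x ** 2, 0.0, -2.0, -1.0, 1.0)
          for lam in (10.0, 100.0, 1000.0)]
```

The ratios move toward 1 as the large parameter grows, matching the first-order asymptotics the lemma provides.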

Lemma A.2. Let be a nonnegative function defined on , and Assume that is a nontrivial, nonnegative function such that for any fixed near , one has and for any , If moreover is a bounded uniformly continuous function over , such that , then

In fact, we need only a special case of Lemma A.2. When exists, we can simply require that be a continuous function and replace the term in (A.3) by .

Lemma A.3. Let be a bounded measurable function defined on with , such that exists. Let be a positive constant. Then

Using usual developments, we also have the following asymptotic expansion.

Lemma A.4. Fix if satisfies , then as , is uniquely determined and

B. The Left Tail Behavior

In this section, we will prove Theorem 2.3. We discuss the cases , and respectively. Recall that and .

B.1.

Case 1. . Using the formulas (2.7) and (2.8), we need only to understand the behavior of as . Here and are defined by (2.9). Since Thus, has a unique solution We also have in and , integrable over . Hence, Lemma A.1 allows us to conclude

B.2.

Case 2. . Here we can simplify and as Rewrite with Clearly Thus, is uniformly bounded and uniformly continuous over Furthermore, we have If , , for any . Let be the unique solution of . Obviously, satisfies the equation According to Lemma A.4, we obtain and where .
The situation here is more delicate than in Lemma A.1, but we can follow the same idea. For any fixed, there exists such that , for any . Choosing now near such that and

We will decompose the integral into three parts: First we consider the integral of over . Using Taylor expansion and the monotonicity of , we get
By (B.12), for small enough, Using , we also have (for small )
Consider now the integral of on . Since is strictly concave in ,
Moreover, , , so we get Similarly, Combining all these estimates, we deduce As can be arbitrarily small, by (2.7), Applying (B.5), (B.9) and (B.11), we complete the proof.

B.3.

Case 3. . Rewrite with two positive constants Thus, with given by (B.8) and Notice that is a diffeomorphism from into , with in . Let be the inverse function of , namely . We have the following properties: The change of variables yields For , as , in and for , we obtain Furthermore, where is a bounded function in by the properties of and . On the other hand, using (B.9) and (B.25) Applying Lemma A.3, we get Finally, combining (B.27), (B.31), (B.22) and (2.7) yields which is just the claimed result.

C. The Right Tail Behavior

Here we prove Theorem 2.4. We begin with formulas (2.7) and (2.6) and divide the study into two cases: and . Since the arguments are often similar to the previous considerations and the situation is simpler, we proceed in less detail.

C.1.

Case 1. . We have with given by (B.22). Since and , it is clear that in and is the unique solution of Hence is decreasing on , increasing on , and . Thus for any , there exist exactly two solutions of . Before proceeding, we list some properties of

Note that where , and are defined by (B.8), (B.24), (B.29) respectively. Then we can repeat the above proof for the left tail (the third case), substituting the function by , using the properties in (C.2), we conclude that For the integral over , we have with

On the other hand, where

Using the properties (C.1), Using Lemma A.3, we get the behavior of as , and a simplification leads to

Since , the dominant term of is clearly given by if , However when , we need to compare and . Finally,

C.2.

Case 2. . We always have , but now . Thus, and for , so is a diffeomorphism from to . Denote as the inverse function of . satisfies (C.1) and

Let and be defined in (C.8), (C.6) respectively. Replace by . It is not difficult to prove that we always have and Moreover, when tends to , the behavior of is dominated by the integral over . By (C.1) and (C.12), is uniformly bounded in . Using Lemma A.3 again, as for in Case 1, we come to the conclusion .

D. Quotient of Sums of Lognormals

Here is the proof of Theorem 2.8. Let , as before, Denoting , we get A direct calculation yields with Thus, The last equality follows from the symmetry of our integrand between , and , . Since is increasing in , clearly for any , Denote by the inverse function of ; then Moreover, by the change of variables , , where . Making a new change of variables , , we then get with Obviously,

Consider we claim that

Indeed, for any We use for the last equality. Using dominated convergence and (D.11), the claim (D.13) follows immediately. Defining , we finally get As , we need to understand the behavior of as tends to . More precisely, for , Here, is not concave in , but notice that has a unique global maximum point at in and , so the spirit of the Laplace method still applies. This implies Hence by (D.13), Finally, using we conclude that where The proof is complete, since is an even function.

E. A Higher Dimension Left Tail Result

We will prove Theorem 2.6 in this section. Let be a jointly normal random vector with distribution , where is the mean vector and is a non-singular covariance matrix. Let We wish to study the left tail of and the PDF of . Let be the PDF of , that is, Let . By a change of variables, it is clear that where , and is a vector defined by with the vector given by (2.18). Consequently, with in Theorem 2.6. Therefore, if has only one saddle point, we can expect to apply the Laplace method in the higher dimensional case. A necessary and sufficient condition to ensure this is summarized in the following lemma.

Lemma E.1. The function has a unique critical point within the set if and only if , for any . Moreover, the Hessian of is negative definite over when .

Indeed, it is easy to see that where Since , the solution to the system would satisfy for all . Thus a solution exists in if and only if all ’s are positive. As with the Kronecker symbol, for any , It is easy to conclude as long as and . The following lemma is our key argument.

Lemma E.2. Let be a convex domain. Let be two continuous functions defined on We further assume is a positive integrable function over ; is , strictly concave and has a unique critical point in . Then where denotes the Hessian matrix of .

This lemma is an extension of Lemma A.1 to higher dimensions and the proof is very similar, so we leave the details to the interested reader. Returning to our proof, since all ’s are positive, we can verify the following.

(i) is convex, since is a convex function.

(ii) The function has only one critical point and .

(iii) is negative definite at any point .

(iv) The function is clearly positive in . The integrability of over is ensured by the fact that the matrix is positive definite. Because , is bounded over .

Finally, a straightforward application of Lemma E.2 results in Theorem 2.6.

F. A Higher Dimension Right Tail Result

We will prove Theorem 2.7 by induction on the number . It is trivially true for , since . Suppose that the result holds for . Consider now , where is a normally distributed random vector whose covariance matrix is diagonal. Without loss of generality, we can assume that the sequence is nonincreasing. There are three possible cases:

(i)(ii)(iii)

By translation, we can assume that .

Let us denote the PDF of by ; then is the PDF of . Consider the asymptotic behavior of as . First, For any fixed (to be chosen later), since the function is decreasing near , there exists such that on for all . Hence for such , Noting that in case (i), we immediately see For case (ii), since , there exists such that for all , With such , we have Recall that So the estimate (F.6) still holds. Consider now . For any , by the induction hypothesis, we can fix large enough that Consequently, for , where Using exactly the same proof as for Theorem 2.4 (the case ), we get Finally,

(F.6) implies that as tends to . As , we conclude Since is arbitrary, the proof is complete for cases (i) and (ii), because .

We now consider case (iii). First we observe that (iii) simply means that are i.i.d. following , so that and . The estimate (F.12) remains true for any satisfying (F.9). However, we need to estimate differently. For any , we fix such that (F.9) holds and Because in case (iii), with , which converges uniformly to in , we obtain

Combining this with (F.12), it is easy to conclude.

Acknowledgment

The authors would like to thank the Editor and the anonymous referee for their valuable comments and suggestions.