Asymptotic Behavior of Tail Density for Sum of Correlated Lognormal Variables
We consider the asymptotic behavior of a probability density function for the sum of any two lognormally distributed random variables that are nontrivially correlated. We show that both the left and right tails can be approximated by some simple functions. Furthermore, the same techniques are applied to determine the tail probability density function for a ratio statistic, and for a sum with more than two lognormally distributed random variables under some stricter conditions. The results yield new insights into the problem of characterization for a sum of lognormally distributed random variables and demonstrate that there is a need to revisit many existing approximation methods.
The lognormal distribution has applications in many fields such as survival analysis , genetic studies [2, 3], financial modelling [4, 5], and telecommunication studies [6, 7], amongst others. It has been found that many types of data can be modeled by lognormal distributions, including human blood pressure, microarray data, stock options, survival rates for different groups of human beings, and the long-term fluctuation of received power. In such settings, we wish to make inferences based on collected data that involve the addition of a few lognormally distributed random variables (RVs). Deriving the statistical properties of a sum of lognormally distributed RVs is therefore desirable [6, 8]. Note also that in practice the number of summands is too small for the central limit theorem to apply.
Many research works assume that all the summands are independent, either justified by practical considerations or for the sake of simplicity. However, there are some applications (e.g., the Asian option pricing model ) in which correlations among the summands are inevitable. Our study will address the correlation problem.
In some cellular mobile systems (see ), the signal quality is largely dictated by the signal-to-interference ratio (SIR). On a large scale, both useful signals and interfering signals experience lognormal shadow fading. That is to say, SIR can be modeled by where all the RVs are lognormally distributed. SIR only characterizes the instantaneous quality. For ordinary users and network operators, one important factor to consider is the outage probability relative to a given SIR threshold. For example, for the users of a data transfer service, it is required that the outage probability such that BER needs to be less than 0.01. Here BER stands for bit error rate, which usually depends on SIR and other factors. In this paper, Theorem 2.8 provides an approximation to when , .
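The outage probability described above can also be estimated by direct simulation. The following is a minimal sketch, not the method of this paper: it assumes that the useful signal and the interferers are independent lognormal RVs, and the function names, parameter names, and parameter values are hypothetical choices made for illustration. With a single interferer the log-SIR is normal, which gives a closed form to check the simulation against.

```python
import math
import numpy as np


def outage_probability(mu_s, sigma_s, mu_i, sigma_i, n_interferers,
                       threshold, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(SIR < threshold).

    Assumption (for illustration only): the useful signal exp(X) and the
    interferers exp(Y_k) are independent, X ~ N(mu_s, sigma_s^2) and
    Y_k ~ N(mu_i, sigma_i^2).
    """
    rng = np.random.default_rng(seed)
    signal = np.exp(rng.normal(mu_s, sigma_s, size=n_samples))
    interferers = np.exp(
        rng.normal(mu_i, sigma_i, size=(n_samples, n_interferers)))
    sir = signal / interferers.sum(axis=1)
    return float(np.mean(sir < threshold))


def single_interferer_exact(mu_s, sigma_s, mu_i, sigma_i, threshold):
    """Closed form for one interferer: log(SIR) = X - Y is normal."""
    z = (math.log(threshold) - (mu_s - mu_i)) / math.hypot(sigma_s, sigma_i)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    print(outage_probability(1.0, 1.0, 0.0, 1.0, 1, 1.0))
    print(single_interferer_exact(1.0, 1.0, 0.0, 1.0, 1.0))
    print(outage_probability(1.0, 1.0, 0.0, 1.0, 3, 1.0))
```

Plain Monte Carlo of this kind becomes inefficient when the outage probability is very small, which is the regime where tail approximations are most useful.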
In addition, Theorems 2.6 and 2.7 try to characterize the left and right tails of a sum of lognormally distributed RVs. The theorems are useful to construct a Padé approximation to the probability density function (PDF) of a sum of lognormally distributed RVs. For example, if that approximation is available, and if useful signals and interfering signals are independent, the outage probability can be numerically estimated as follows: Let and , then
Since Fenton  addressed the problem, many methods have been developed, but none has succeeded in finding a closed-form representation for the PDF of a sum of multiple lognormally distributed RVs. These methods can be divided into three categories.
(i) The first type of method attempts to characterize the PDF by calculating the moment generating function [11, 12] or the characteristic function [13, 14]. The results obtained can be used in the numerical computation of a PDF or a cumulative distribution function (CDF). To our knowledge, no work has succeeded in using the results of this category to describe the shape of a PDF or CDF.
(ii) The second type of method [15–18] uses bounding techniques for the CDF of an underlying statistic.
(iii) The third type of method focuses on finding a good approximation to either the PDF or CDF of the underlying statistic. Most published works belong to this category. The way to find the approximation can often be described as follows: first, assume a specific distribution that the sum (or the ratio of sums) of the lognormally distributed RVs follows; then use a variety of methods to identify the parameters for that specific distribution. The specific distributions in the literature include lognormal [15, 19], reciprocal Gamma , log shifted Gamma , and user-defined PDF [21, 22]. In some works , only the CDF approximation is defined. Moment matching [10, 24, 25], moment generating function matching , and least squares fitting [26, 27] are a few popular methods used to determine the parameters associated with the distribution.
In this paper, we will rigorously characterize the right and left tails behavior of a PDF for a random variable , where are jointly distributed with as distribution. This is our first step towards understanding the more general problem: the characterization of the PDF of where are jointly distributed as . Note here we do not assume that the are independent (except for Theorem 2.7), nor do we assume that have the same marginal distribution. We hope that our study can lead to a better solution to the works presented in  or .
Janos  was the first to study the right tail probability of a sum of lognormals. More advanced and more general studies can be found in . We have not found any theoretical results regarding the left tail. In addition, the right tail results that we present cannot be deduced from the results in [5, 30–32].
Our results show that it is possible to find some elementary functions such that The explicit forms of and enable us to assess the performance of the existing approximation methods and to determine how to improve these methods. By Theorem 2.3 (see also the subsequent remark and Corollary 2.5), we can determine that at the left tail region, even under the independence assumption, given any function within the families of PDFs such as lognormal, reciprocal Gamma or log shifted Gamma, either does not exist or can be only zero or . No previous works have led to this discovery. Szyszkowicz  has pointed out that some precedent models are wrong in the tail region, but this work was based on a hypothesis that was only justified by the numerical results, and it still focused on finding the best lognormal type approximation. In view of our results, such efforts are unlikely to succeed.
Our characterization of the behavior of the tail of the PDF of a sum of two lognormals is complete in the sense that our results cover all nondegenerate covariance matrices. Our work regarding the ratio RV is obtained under more stringent conditions. This new result shows that the ratio RV is neither lognormal nor log Gamma. This indicates that others should be cautious with the method in  despite the successful examples demonstrated therein.
When the number of summands exceeds two, the situation for a PDF approximation becomes much more complicated. We are able to show some left tail and right tail results by imposing conditions on the covariance matrix that cover the independent case. The result of Theorem 2.7 may be well known to experts working with functions from the subexponential class. Unfortunately, we did not find any references that explicitly state the result, so we provide a short proof in Appendix F. Further in this line,  has presented the complete CDF approximation for the right tail with an arbitrary covariance matrix. However, our results cannot be deduced trivially from the CDF behavior and are interesting in themselves. For example, for any polynomial growth continuous function , we can say that whereas such an approximation cannot be a direct consequence of the result in .
In the following sections, we first present our results, followed by a numerical validation. We then discuss some future directions that this paper does not cover. The proofs are presented in the appendices.
2. Main Results
Let be a jointly normally distributed random vector. Let be the correlation coefficient, and , for Then the joint PDF of is given by where We wish to study the left and right tail probabilities of which has PDF as with We hope to understand the asymptotic behavior of when . Direct calculus yields where the exponent Rewrite in three terms , where the are defined as
Hence, by changing the variable to ,
We regroup the integrand in (2.7) in the form
with
Remark 2.1. Without loss of generality, in this paper, we always assume We also use the following notation.
Definition 2.2. We say that two functions and are equivalent near some point, denoted by , if we have.
For the left tail, we have the following result.
Theorem 2.3. Let be defined as above for and be the correlation coefficient. Let be the PDF of , then as follows.
(i) If , one has Here the functions , , are defined by (2.9) and (2.6).
(ii) If , one has
(iii) If , one has
In particular, when , we find that , which means that any lognormal, reciprocal Gamma or log shifted Gamma cannot be used to fit the left tail, under the independence hypothesis. The situation for the right tail of is simpler. It is interesting to remark that the result does not depend on the correlation coefficient (see also ). Here and later on, we employ the lexicographical order to the couple .
Theorem 2.4. Let and be defined as above, then , where is defined as follows.
(i) If , one has
(ii) If , one has .
Corollary 2.5. Let be the PDF of where are i.i.d. RVs following distributions. Then, one has
The results in Corollary 2.5 confirm those results reported in . Furthermore, we can easily show that the models in [4, 21] will also fail in the tail regions.
Next we show that our left tail and right tail study can be extended to some special cases in higher dimensions by using the Laplace method.
Theorem 2.6. Let () be a jointly normally distributed random variable with distribution . Let Let be the PDF of random variable . If satisfies for all , then the left tail of satisfies where , , denotes the Hessian matrix of . Here, and is given by
Theorem 2.7. Let be independent normally distributed RVs, that is, . Let Define for the lexicographical order and the number of maximum points, that is . Then the PDF of , satisfies
Finally, we show a result for the quotient of sums of i.i.d. lognormal variables.
Theorem 2.8. Let be i.i.d. random variables. Each of them follows distribution. Let and be the PDF of , then
This result can be generalized to the case where follow follow with any positive constants and Indeed, we can prove that For the sake of brevity, we only present the proof for the special case where both variances are equal.
3. Numerical Validation
We have validated the two-dimensional theoretical results by performing Monte-Carlo simulations. The curve generated by Monte Carlo method is obtained through bin-based density estimation. In all of the presented cases, we can see that our approximations match the numerical results closely.
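A bin-based density estimate of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the authors' simulation code: the function names, the parameter values, and the bin layout are hypothetical. As a reference curve the sketch integrates the bivariate lognormal density numerically, using the substitution y = s t with t = 1/(1 + e^{-u}), which turns the density of the sum into (1/s) times the integral over u of the bivariate normal density evaluated at (ln s + ln t, ln s + ln(1 - t)).

```python
import numpy as np


def sample_lognormal_sum(mu, sigma, rho, n_samples, seed=0):
    """Draw samples of exp(X1) + exp(X2) for correlated normals (X1, X2)."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n_samples)
    z2 = rng.standard_normal(n_samples)
    x1 = mu[0] + sigma[0] * z1
    x2 = mu[1] + sigma[1] * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
    return np.exp(x1) + np.exp(x2)


def histogram_density(samples, edges):
    """Bin-based density estimate: count / (total samples * bin width)."""
    counts, _ = np.histogram(samples, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts / (samples.size * np.diff(edges))


def quadrature_density(s, mu, sigma, rho, half_width=40.0, n_grid=4001):
    """Density of exp(X1) + exp(X2) at s > 0 by the trapezoid rule in u."""
    u = np.linspace(-half_width, half_width, n_grid)
    log_t = -np.logaddexp(0.0, -u)    # ln t with t = 1 / (1 + exp(-u))
    log_1mt = -np.logaddexp(0.0, u)   # ln(1 - t)
    z1 = (np.log(s) + log_t - mu[0]) / sigma[0]
    z2 = (np.log(s) + log_1mt - mu[1]) / sigma[1]
    q = (z1**2 - 2.0 * rho * z1 * z2 + z2**2) / (1.0 - rho**2)
    norm = 2.0 * np.pi * sigma[0] * sigma[1] * np.sqrt(1.0 - rho**2)
    phi = np.exp(-0.5 * q) / norm
    integral = np.sum(0.5 * (phi[1:] + phi[:-1]) * np.diff(u))
    return float(integral / s)


if __name__ == "__main__":
    # Hypothetical parameters, not the values used in the paper.
    mu, sigma, rho = (0.0, 0.0), (1.0, 1.0), 0.5
    samples = sample_lognormal_sum(mu, sigma, rho, 400_000, seed=1)
    centers, estimate = histogram_density(samples, np.linspace(1.0, 6.0, 21))
    for c, e in zip(centers, estimate):
        print(c, e, quadrature_density(c, mu, sigma, rho))
```

The histogram is normalized by the total number of samples rather than by the number falling inside the plotted range, so the estimate remains a density even when the bins cover only part of the support. Far in the tails the bin counts become too small for this estimator to be reliable.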
For the simulation parameters of the statistic, in order to test our results in the extreme cases, we have chosen . The mean values were arbitrarily set. The values of were chosen to be 9.6 and 12 so that their ratio is 0.8.
For the parameters of the ratio statistic (denoted by ), we used for two groups of normally distributed RVs. The means of these RVs were set to 0. Due to the symmetry properties that has, it is sufficient to show the verification results for the left tail of
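The symmetry argument can be illustrated numerically. The sketch below assumes, as a labeled simplification, that both groups contain the same number of i.i.d. zero-mean normal RVs with a common standard deviation; the group size, the standard deviation, and the function name are hypothetical. Under that assumption the ratio and its reciprocal have the same distribution, so the log-ratio is symmetric about 0 and its lower and upper quantiles should mirror each other.

```python
import numpy as np


def sample_log_ratio(n_terms, sigma, n_samples, seed=0):
    """Samples of log(sum_i exp(X_i) / sum_j exp(Y_j)).

    Assumption (for illustration): X_i and Y_j are i.i.d. N(0, sigma^2)
    and both groups contain n_terms variables.
    """
    rng = np.random.default_rng(seed)
    numerator = np.exp(
        rng.normal(0.0, sigma, size=(n_samples, n_terms))).sum(axis=1)
    denominator = np.exp(
        rng.normal(0.0, sigma, size=(n_samples, n_terms))).sum(axis=1)
    return np.log(numerator / denominator)


if __name__ == "__main__":
    log_ratio = sample_log_ratio(n_terms=2, sigma=1.0, n_samples=200_000)
    # For a distribution symmetric about 0, q(p) is close to -q(1 - p).
    print(np.quantile(log_ratio, [0.05, 0.5, 0.95]))
```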
4. Further Remarks
We have seen that our tail density approximations (for a sum of lognormal RVs) do not deal with an arbitrary covariance matrix. In the 2D proofs, we used the classical approximation technique for integrals called the Laplace method (see Lemmas A.1, E.2). We then divided the study into a few subcases and treated each in a different way. Comparing Theorems 2.3 and 2.6 with Theorems 2.4 and 2.7, it seems that in general the left tail behavior is more involved than the right tail behavior. We hope to adapt our approach to the higher-dimensional space, especially for the right tail behavior, which will lead to the result in . It is also useful to perform a higher order approximation for both tails so that an efficient Padé approximation can be developed accordingly.
In view of the work in , it should be worthwhile to extend our lognormal work (at least for the right tail) to a more general family such as the subexponential class. The importance of this distribution family can be found in [5, 32, 33]. Perhaps future work on the sum of lognormals mentioned above may shed some light on the subexponential class problem.
Here we list some basic lemmas useful in the subsequent discussion. Their proofs use standard techniques and hence are omitted.
Lemma A.1. Let be a positive, integrable function on an interval . Let be concave such that verifies , , then
This result is the so-called Laplace method in modern analysis (see ), which is more often cited as saddle point approximation in other fields such as statistics or physics. Later on, we also give a higher dimensional version (see Lemma E.2).
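The classical one-dimensional form of the Laplace method can be checked numerically. The sketch below does not reproduce the exact statement of Lemma A.1; it uses the textbook version, in which the integral of h(x) exp(λ g(x)) over an interval is approximated by h(x0) exp(λ g(x0)) sqrt(2π / (λ |g''(x0)|)) when g has a unique interior maximum at x0 with g''(x0) < 0. The test functions g and h and the interval are hypothetical choices made for illustration.

```python
import numpy as np


def laplace_approximation(h_x0, g_x0, g2_x0, lam):
    """Leading-order Laplace approximation at an interior maximum x0."""
    return h_x0 * np.exp(lam * g_x0) * np.sqrt(2.0 * np.pi / (lam * abs(g2_x0)))


def trapezoid_integral(h, g, lam, a, b, n_grid=200_001):
    """Reference value of the integral by the trapezoid rule on [a, b]."""
    x = np.linspace(a, b, n_grid)
    y = h(x) * np.exp(lam * g(x))
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def g(x):
    # Concave test exponent with maximum at x0 = 0, g(0) = 0, g''(0) = -1.
    return -(0.5 * x**2 + x**4)


def h(x):
    # Positive, integrable test weight with h(0) = 1.
    return 1.0 / (1.0 + x**2)


if __name__ == "__main__":
    for lam in (10.0, 100.0, 1000.0):
        exact = trapezoid_integral(h, g, lam, -2.0, 2.0)
        approx = laplace_approximation(1.0, 0.0, -1.0, lam)
        print(lam, exact / approx)
```

The ratio of the two values approaches 1 as λ grows, with a relative error of order 1/λ for smooth g and h.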
Lemma A.2. Let be a nonnegative function defined on , and Assume that is a nontrivial, nonnegative function such that for any fixed near , one has and for any , If moreover is a bounded uniformly continuous function over , such that , then
In fact, we need a special case of Lemma A.2. When exists, we can simply require that is a continuous function and replace the term in (A.3) by .
Lemma A.3. Let be a bounded measurable function defined on with , such that exists. Let be a positive constant. Then
Using standard expansions, we also have the following asymptotic expansion.
Lemma A.4. Fix if satisfies , then as , is uniquely determined and
B. The Left Tail Behavior
In this section, we will prove Theorem 2.3. We discuss the cases , and respectively. Recall that and .
Case 1. . Using the formulas (2.7) and (2.8), we need only to understand the behavior of as . Here and are defined by (2.9). Since Thus, has a unique solution We also have in and , integrable over . Hence, Lemma A.1 allows us to conclude
Case 2. . Here we can simplify and as
Thus, is uniformly bounded and uniformly continuous over Furthermore, we have
If , , for any . Let be the unique solution of . Obviously, satisfies the equation According to Lemma A.4, we obtain and
The situation here is more delicate than in Lemma A.1, but we can follow the same idea. For any fixed, there exists such that , for any . Choosing now near such that and
We will decompose the integral into three parts: First we consider the integral of over . Using Taylor expansion and the monotonicity of , we get
By (B.12), for small enough, Using , we also have (for small )
Consider now the integral of on . Since is strictly concave in ,
Moreover, , , so we get Similarly, Combining all these estimates, we deduce As can be arbitrarily small, by (2.7), Applying (B.5), (B.9) and (B.11), we complete the proof.
Case 3. . Rewrite with two positive constants Thus, with given by (B.8) and Notice that is a diffeomorphism from into , with in . Let be the inverse function of , namely . We have the following properties: The change of variable yields For , as , in and for , we obtain Furthermore, where is a bounded function in by properties of and . Otherwise, using (B.9) and (B.25) Applying Lemma A.3, we get Finally, combining (B.27), (B.31), (B.22) and (2.7) which is just the claimed result.
C. The Right Tail Behavior
Here we prove Theorem 2.4. We begin with the formulae (2.7), (2.6) and divide the study into two cases: and . Since the arguments are often similar to the previous consideration and the situation is simpler, we will proceed in less detail.
Case 1. . We have with given by (B.22). Since and , it's clear that in and is the unique solution of Hence is decreasing on , increasing on and . Thus for any , there exist exactly two solutions for . Before proceeding, we list some properties of
Note that where , and are defined by (B.8), (B.24), (B.29) respectively. Then we can repeat the above proof for the left tail (the third case), substituting the function by , using the properties in (C.2), we conclude that For the integral over , we have with
On the other hand, where
Since , the dominant term of is clearly given by if , However when , we need to compare and . Finally,
Case 2. . We always have , but now . Thus, and for , so is a diffeomorphism from to . Denote as the inverse function of . satisfies (C.1) and
Let and be defined in (C.8), (C.6) respectively. Replace by . It is not difficult to prove that we always have and Moreover, when tends to , the behavior of is dominated by the integral over . By (C.1) and (C.12), is uniformly bounded in . Using Lemma A.3 again, as for in Case 1, we come to the conclusion .
D. Quotient of Sum of Lognormal
Here is the proof for Theorem 2.8. Let , as before, Denoting , we get Direct calculus yields