A. Wong, L. Jiang, "Improved Small Sample Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions", Journal of Probability and Statistics, vol. 2019, Article ID 7173416, 7 pages, 2019. https://doi.org/10.1155/2019/7173416
Improved Small Sample Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions
Abstract
Without the ability to use research tools and procedures that yield consistent measurements, researchers would be unable to draw conclusions, formulate theories, or make claims about the generalizability of their results. In statistics, the coefficient of variation is commonly used as the index of reliability of measurements. Thus, comparing coefficients of variation is of special interest. Moreover, the lognormal distribution has been frequently used for modeling data from many fields such as health and medical research. In this paper, we propose a simulated Bartlett corrected likelihood ratio approach to obtain inference concerning the ratio of two coefficients of variation for lognormal distributions. Simulation studies show that the proposed method is extremely accurate even when the sample size is small.
1. Introduction
In health and medical research, it is common that the variable of interest, $X$, such as the survival time, takes only positive values and the underlying distribution of this variable is highly skewed to the right. In this case, the frequently assumed normal distribution for $X$ is not suitable. A standard approach is to first transform $X$ such that the transformed variable is normally distributed. Then the existing statistical theories developed for the normal distribution can be applied. For $X > 0$ with a distribution that is highly skewed to the right, the most common transformation is the logarithmic transformation. In other words, $Y = \log X$ is normally distributed. Hence, $X$ is lognormally distributed. A detailed review of the theory of the lognormal distribution can be found in Aitchison and Brown [1] and Crow and Shimizu [2]. In practice, Fears et al. [3] investigated the variability and reproducibility of hormone assays used by laboratories with the capability of performing large numbers of tests. They assumed that the hormone samples used in the laboratories are independent and lognormally distributed. In this case, it is of special interest to know whether each sample yields consistent measurements.
The coefficient of variation ($CV$) is defined as the ratio of the standard deviation to the mean, where the mean is assumed to be nonzero. It is an important index for assessing the reliability of a measuring procedure. Hence, the problem considered in Fears et al. [3] can be viewed as testing whether the coefficients of variation in each laboratory are the same.
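Mathematically, if a random variable $X$ is distributed as lognormal, then $Y = \log X$ is distributed as normal with mean $\mu$ and variance $\sigma^2$. It is well-known that
$$E(X) = \exp\left(\mu + \frac{\sigma^2}{2}\right), \qquad \mathrm{Var}(X) = \exp(2\mu + \sigma^2)\left[\exp(\sigma^2) - 1\right]. \tag{1}$$
Hence, the coefficient of variation, $CV$, is
$$CV = \frac{\sqrt{\mathrm{Var}(X)}}{E(X)} = \sqrt{\exp(\sigma^2) - 1}. \tag{2}$$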
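The relationship between the lognormal moments and the coefficient of variation can be checked numerically. The following is a minimal sketch in plain Python; the function names and the parameter values are illustrative choices and do not come from the paper.

```python
import math

def lognormal_mean(mu, sigma2):
    """E(X) when log X ~ N(mu, sigma2)."""
    return math.exp(mu + sigma2 / 2.0)

def lognormal_variance(mu, sigma2):
    """Var(X) when log X ~ N(mu, sigma2)."""
    return math.exp(2.0 * mu + sigma2) * math.expm1(sigma2)

def lognormal_cv(sigma2):
    """Coefficient of variation sqrt(exp(sigma2) - 1); it does not depend on mu."""
    return math.sqrt(math.expm1(sigma2))

# Illustrative (hypothetical) parameter values.
mu, sigma2 = 1.3, 0.25
cv_from_moments = math.sqrt(lognormal_variance(mu, sigma2)) / lognormal_mean(mu, sigma2)
print(cv_from_moments, lognormal_cv(sigma2))
```

Because $\mu$ cancels in the ratio, comparing coefficients of variation of lognormal samples amounts to comparing the variances on the log scale.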
Nam and Kwon [4] compared various approximate interval estimation methods for the ratio of two coefficients of variation for independent lognormal distributions. Their simulation results showed that the empirical coverage rates of these methods are satisfactorily close to the nominal coverage rate for medium sample sizes. The aim of this paper is to develop a more accurate method to obtain inference for the ratio of two coefficients of variation for independent lognormal distributions. Moreover, the proposed method can be generalized to test whether the coefficients of variation from independent lognormal distributions are heterogeneous.
The rest of the paper is organized as follows. Section 2 reviews the existing methods for obtaining inference concerning the ratio of two coefficients of variation from independent lognormal distributions. The simulated Bartlett corrected likelihood ratio method is proposed in Section 3. A real data example is presented in Section 4 to illustrate the application of the methods discussed in this paper. Simulation studies comparing the accuracy of these methods are presented in Section 5. An extension to testing for homogeneity of coefficients of variation from independent lognormal distributions is discussed in Section 6. Some concluding remarks are recorded in Section 7.
2. Existing Methods for Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions
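Let $(x_{i1}, \ldots, x_{in_i})$ be the sample from the $i$th lognormal distribution, where $i = 1, 2$. Then $(y_{i1}, \ldots, y_{in_i})$, with $y_{ij} = \log x_{ij}$, is the sample from the normal distribution with mean $\mu_i$ and variance $\sigma_i^2$. From (2), the coefficient of variation is $CV_i = \sqrt{\exp(\sigma_i^2) - 1}$. Nam and Kwon [4] compared four methods in obtaining confidence intervals for $\psi = CV_1/CV_2$. The following is a summary of the methods discussed in Nam and Kwon [4], where $\widehat{CV}_i$ denotes the estimate of $CV_i$, $\hat\psi = \widehat{CV}_1/\widehat{CV}_2$, and $\widehat{\mathrm{var}}(\cdot)$ denotes the estimated asymptotic variance given there.
(1) Wald type method. The observed test statistic standardizes $\hat\psi - \psi$ by the estimated standard error of $\hat\psi$, that is, $z = (\hat\psi - \psi)/\sqrt{\widehat{\mathrm{var}}(\hat\psi)}$. Then $z$ is asymptotically distributed as standard normal. The significance function of $\psi$ is obtained from $\Phi(z)$, where $\Phi$ is the cumulative distribution function of the standard normal distribution.
(2) Fieller type method. The observed test statistic standardizes $\widehat{CV}_1 - \psi\,\widehat{CV}_2$ by its estimated standard error. This statistic is also asymptotically distributed as standard normal, and the significance function of $\psi$ is obtained from $\Phi$ in the same way.
(3) Log method. The observed test statistic standardizes $\log\hat\psi - \log\psi$ by the estimated standard error of $\log\hat\psi$. This statistic is also asymptotically distributed as standard normal, and the significance function of $\psi$ is obtained from $\Phi$ in the same way.
(4) Method of variance estimates recovery (MOVER). This method directly yields an approximate confidence interval for $\psi$ only. It combines separate confidence limits $(l_i, u_i)$, $i = 1, 2$, one pair for each sample, into an approximate $100(1-\alpha)\%$ confidence interval $(L, U)$ on the log scale. Thus, an approximate $100(1-\alpha)\%$ confidence interval for $\psi$ is $(e^{L}, e^{U})$. If $(l_i, u_i)$, for $i = 1, 2$, are taken to be the same as those obtained in the Log method, the MOVER method is identical to the Log method. Note that Hasan and Krishnamoorthy [5] proposed an improved version of the MOVER method.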
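To make the Log method concrete, here is a hedged sketch of a log-scale interval for $\psi$. It assumes the plug-in estimate $\widehat{CV}_i = \sqrt{\exp(s_i^2) - 1}$, with $s_i^2$ the unbiased sample variance of the log data, and a delta-method variance based on $\mathrm{Var}(s_i^2) \approx 2\sigma_i^4/(n_i - 1)$. These are assumptions made for illustration and are not necessarily the exact estimators used by Nam and Kwon [4]; all function names are hypothetical.

```python
import math
from statistics import NormalDist

def cv_hat(s2):
    """Plug-in estimate of the lognormal CV from the log-scale sample variance."""
    return math.sqrt(math.expm1(s2))

def var_log_cv_hat(s2, n):
    """Delta-method variance of log(cv_hat), assuming Var(s2) ~ 2*sigma^4/(n-1)."""
    slope = math.exp(s2) / (2.0 * math.expm1(s2))  # d log(CV) / d sigma^2
    return slope ** 2 * 2.0 * s2 ** 2 / (n - 1)

def log_method_interval(s2_1, n1, s2_2, n2, level=0.95):
    """Approximate interval for psi = CV_1 / CV_2, built on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    log_psi_hat = math.log(cv_hat(s2_1)) - math.log(cv_hat(s2_2))
    se = math.sqrt(var_log_cv_hat(s2_1, n1) + var_log_cv_hat(s2_2, n2))
    return math.exp(log_psi_hat - z * se), math.exp(log_psi_hat + z * se)

# Hypothetical summary statistics, for illustration only.
lo, hi = log_method_interval(0.30, 10, 0.30, 10)
print(lo, hi)
```

With equal sample variances the point estimate of $\psi$ is 1, and the interval is symmetric around 1 on the log scale.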
3. Proposed Method
In this section, we first review the likelihood based methods and the Bartlett corrected likelihood ratio method. Since the required Bartlett adjustment for the Bartlett corrected likelihood ratio method is very difficult to obtain, a numerical algorithm is proposed to approximate the Bartlett adjustment. Then the methods are applied to obtain inference for the ratio of two coefficients of variation of two independent lognormal distributions.
3.1. Likelihood Based Methods and Bartlett Corrected Likelihood Ratio Method
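Let $y = (y_1, \ldots, y_n)$ be a sample from a known distribution with probability density function $f(\cdot; \theta)$, where $\theta$ is a $p$-dimensional vector of parameters. Let $\psi = \psi(\theta)$, which has dimension $d$, be the parameter of interest. The log-likelihood function is
$$\ell(\theta) = \sum_{i=1}^{n} \log f(y_i; \theta). \tag{10}$$
Under the regularity conditions stated in Barndorff-Nielsen and Cox [6], the standardized maximum likelihood estimate (MLE) statistic $(\hat\theta - \theta)^{\top}[\widehat{\mathrm{var}}(\hat\theta)]^{-1}(\hat\theta - \theta)$ and the likelihood ratio statistic $2[\ell(\hat\theta) - \ell(\theta)]$ are asymptotically chi-square distributed with $p$ degrees of freedom, $\chi^2_p$, where $\hat\theta$ is the overall MLE, which is the value of $\theta$ that maximizes $\ell(\theta)$, and $\widehat{\mathrm{var}}(\hat\theta)$ is approximately the inverse of the Fisher expected information. When the parameter of interest is $\psi$, Barndorff-Nielsen and Cox [6] showed that similar statistics can be obtained. The standardized MLE statistic becomes
$$Q(\psi) = (\hat\psi - \psi)^{\top}[\widehat{\mathrm{var}}(\hat\psi)]^{-1}(\hat\psi - \psi), \tag{11}$$
where $\hat\psi = \psi(\hat\theta)$, and $\widehat{\mathrm{var}}(\hat\psi)$ can be approximated by the delta method, which takes the form
$$\widehat{\mathrm{var}}(\hat\psi) \approx \psi_\theta(\hat\theta)\,\widehat{\mathrm{var}}(\hat\theta)\,\psi_\theta(\hat\theta)^{\top}, \qquad \psi_\theta(\theta) = \frac{\partial \psi(\theta)}{\partial \theta^{\top}}. \tag{12}$$
The likelihood ratio statistic is
$$W(\psi) = 2[\ell(\hat\theta) - \ell(\tilde\theta)], \tag{13}$$
where $\tilde\theta$ is the constrained MLE, which is obtained by maximizing $\ell(\theta)$ for the given $\psi$ value. Both $Q(\psi)$ and $W(\psi)$ are asymptotically $\chi^2_d$. As defined in Fraser [7], the significance function for $\psi$, obtained by referring $q$ or $w$ to the $\chi^2_d$ distribution, can be used to obtain inference concerning $\psi$, where $q$ and $w$ are the observed values of $Q(\psi)$ and $W(\psi)$, respectively. In particular, the $100(1-\alpha)\%$ confidence regions of $\psi$ are
$$\{\psi : q(\psi) \le \chi^2_{d,1-\alpha}\} \quad \text{and} \quad \{\psi : w(\psi) \le \chi^2_{d,1-\alpha}\}, \tag{14}$$
respectively, where $\chi^2_{d,1-\alpha}$ is the $100(1-\alpha)$th percentile of $\chi^2_d$.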
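As a small illustration of the likelihood ratio statistic and the confidence region built from it, the sketch below uses a toy model that is not from the paper: an exponential sample with a scalar rate parameter and no nuisance parameter, so that $d = 1$. The upper-tail $\chi^2_1$ probability is used as the significance value, which is an assumption about the tail convention. The data values and function names are hypothetical.

```python
import math

CHI2_1_95 = 3.841458820694124  # 95th percentile of the chi-square(1) distribution

def loglik(rate, data):
    """Exponential log-likelihood."""
    return len(data) * math.log(rate) - rate * sum(data)

def lr_statistic(rate0, data):
    """w(rate0) = 2 [l(rate_hat) - l(rate0)], truncated at zero against rounding."""
    rate_hat = len(data) / sum(data)
    return max(0.0, 2.0 * (loglik(rate_hat, data) - loglik(rate0, data)))

def significance(rate0, data):
    """Upper-tail chi-square(1) probability: P(chi2_1 >= w) = erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(lr_statistic(rate0, data) / 2.0))

def confidence_interval(data, lower=1e-6, upper=100.0):
    """Endpoints of {rate : w(rate) is at most CHI2_1_95}, found by bisection."""
    rate_hat = len(data) / sum(data)

    def root(a, b):
        fa = lr_statistic(a, data) - CHI2_1_95
        for _ in range(200):
            m = (a + b) / 2.0
            fm = lr_statistic(m, data) - CHI2_1_95
            if (fm > 0) == (fa > 0):
                a, fa = m, fm
            else:
                b = m
        return (a + b) / 2.0

    return root(lower, rate_hat), root(rate_hat, upper)

data = [0.8, 1.9, 0.4, 2.7, 1.1, 0.6]  # hypothetical observations
ci_low, ci_high = confidence_interval(data)
print(significance(2.0, data), ci_low, ci_high)
```

The same pattern carries over to models with nuisance parameters, except that evaluating $w(\psi)$ then requires a constrained maximization for each value of $\psi$.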
It is well-known that these two asymptotic methods have rate of convergence $O(n^{-1/2})$, and they are referred to as first-order methods. In the statistics literature, there exist various adjustments to improve the accuracy of the above methods. In particular, Barndorff-Nielsen [8, 9] introduced the modified signed log-likelihood ratio statistic, a third-order method. However, this method is restricted to a scalar parameter of interest only. On the other hand, Bartlett [10] proposed a transformation of the likelihood ratio statistic such that the mean of the transformed statistic matches the mean of the asymptotic distribution. More specifically,
$$W^*(\psi) = \frac{W(\psi)}{B}, \tag{15}$$
where $B$ is the Bartlett adjustment such that $E[W^*(\psi)] = d$, and $W^*(\psi)$ is known as the Bartlett corrected likelihood ratio statistic. An obvious choice of $B$ is
$$B = \frac{E[W(\psi)]}{d}. \tag{16}$$
Bartlett [10] showed that the Bartlett corrected likelihood ratio statistic is also asymptotically $\chi^2_d$ distributed and that it has rate of convergence $O(n^{-2})$. Therefore, it is an extremely accurate method. Nevertheless, except in a few well-defined problems, $B$ is very difficult to obtain, which hinders the use of this method in applied statistics. A review of the Bartlett corrected likelihood ratio method can be found in Barndorff-Nielsen and Cox [6].
Although, mathematically, the explicit closed form of $B$, or even an asymptotic expansion of $B$, is difficult to obtain, we propose the following algorithm to obtain $E[W(\psi)]$ numerically and, hence, an estimated $B$.
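Given: $y = (y_1, \ldots, y_n)$ is a sample of size $n$ from a distribution with known probability density function $f(\cdot; \theta)$.
Interest: Inference concerning $\psi = \psi(\theta)$.
Have: Overall maximum likelihood estimate $\hat\theta$, the constrained maximum likelihood estimate $\tilde\theta$, and the observed likelihood ratio statistic $w$.
Step 1: Simulate $N$ samples of data of size $n$ from $f(\cdot; \tilde\theta)$.
Step 2: For each set of simulated data, obtain the simulated observed likelihood ratio statistic. As a result, we have $w_1, \ldots, w_N$.
Step 3: Calculate
$$\bar{w} = \frac{1}{N}\sum_{k=1}^{N} w_k, \tag{17}$$
which is an estimate of the mean of the likelihood ratio statistic. Hence, we have $\hat{B} = \bar{w}/d$.
Step 4: The observed simulated Bartlett corrected likelihood ratio statistic is
$$w^* = \frac{w}{\hat{B}}, \tag{18}$$
which is asymptotically distributed as $\chi^2_d$ with fourth order rate of convergence. Thus, the significance function is obtained by referring $w^*$ to the $\chi^2_d$ distribution, and the $100(1-\alpha)\%$ confidence region of $\psi$ is
$$\{\psi : w^*(\psi) \le \chi^2_{d,1-\alpha}\}. \tag{19}$$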
As a final note on the proposed algorithm, theoretically, $N$ should be as large as possible. However, the larger $N$ is, the more calculations are required to obtain $\hat{B}$. Moreover, the more nuisance parameters exist in the model, the larger $N$ has to be. We recommend choosing $N$ by trial and error until $\hat{B}$ stabilizes.
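The four steps of the algorithm can be sketched generically as follows. This is an illustration under stated assumptions rather than the authors' implementation: the adjustment is taken as $\hat{B} = \bar{w}/d$, the caller supplies a hypothetical `simulate` function that draws one data set from the model fitted under the constraint, and the toy model is an exponential sample with a scalar rate and no nuisance parameter, so the constrained MLE is the hypothesized rate itself.

```python
import math
import random

def simulated_bartlett(w_obs, simulate, lr_stat, d, n_sim, rng):
    """Return (w_star, b_hat) for an observed likelihood ratio statistic w_obs."""
    w_sims = [lr_stat(simulate(rng)) for _ in range(n_sim)]  # Steps 1 and 2
    w_bar = sum(w_sims) / n_sim                              # Step 3
    b_hat = w_bar / d
    return w_obs / b_hat, b_hat                              # Step 4

# Toy model (hypothetical): exponential data, testing rate = RATE0.
N_OBS, RATE0 = 5, 1.0

def lr_stat(data):
    """Likelihood ratio statistic for the exponential rate at RATE0."""
    total = sum(data)
    rate_hat = len(data) / total

    def loglik(rate):
        return len(data) * math.log(rate) - rate * total

    return max(0.0, 2.0 * (loglik(rate_hat) - loglik(RATE0)))

def simulate(rng):
    """Draw one sample of size N_OBS under the hypothesized rate."""
    return [rng.expovariate(RATE0) for _ in range(N_OBS)]

observed = [0.31, 1.42, 0.08, 0.95, 0.57]  # hypothetical observations
w = lr_stat(observed)
w_star, b_hat = simulated_bartlett(w, simulate, lr_stat, d=1, n_sim=4000,
                                   rng=random.Random(2019))
p_value = math.erfc(math.sqrt(w_star / 2.0))  # upper-tail chi-square(1) probability
print(w, b_hat, w_star, p_value)
```

Rerunning the sketch with increasing `n_sim` and watching `b_hat` settle is one way to carry out the trial-and-error choice of $N$ described above.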
3.2. Applying Likelihood Based Methods to Obtain Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions
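Let $Y_1 = \log X_1$ and $Y_2 = \log X_2$. Then $Y_1$ is normally distributed with mean $\mu_1$ and variance $\sigma_1^2$, and $Y_2$ is $N(\mu_2, \sigma_2^2)$. Moreover, $Y_1$ and $Y_2$ are independent. Hence, inference concerning $\psi$ will be based on the two log-transformed samples. Since $\psi$ is a function of $(\sigma_1^2, \sigma_2^2)$ only, inference concerning $\psi$ will be based on $(s_1^2, s_2^2)$, where $s_i^2$ is the unbiased sample variance of the $i$th log-transformed sample. Let $\theta = (\sigma_1^2, \sigma_2^2)$. Then, up to an additive constant, the log-likelihood function for $\theta$ can be written as
$$\ell(\theta) = -\sum_{i=1}^{2} \frac{n_i - 1}{2}\left[\log \sigma_i^2 + \frac{s_i^2}{\sigma_i^2}\right]. \tag{20}$$
It is easy to show that the overall MLE is
$$\hat\theta = (\hat\sigma_1^2, \hat\sigma_2^2) = (s_1^2, s_2^2). \tag{21}$$
Since our parameter of interest is $\psi = CV_1/CV_2$, where $CV_i = \sqrt{\exp(\sigma_i^2) - 1}$, we have
$$\sigma_1^2 = \log\left\{1 + \psi^2\left[\exp(\sigma_2^2) - 1\right]\right\}. \tag{22}$$
For a given $\psi$ value, the log-likelihood function in (20) can be expressed as a function of $\sigma_2^2$ only, and is
$$\ell_\psi(\sigma_2^2) = -\frac{n_1 - 1}{2}\left[\log \sigma_1^2(\psi, \sigma_2^2) + \frac{s_1^2}{\sigma_1^2(\psi, \sigma_2^2)}\right] - \frac{n_2 - 1}{2}\left[\log \sigma_2^2 + \frac{s_2^2}{\sigma_2^2}\right], \tag{23}$$
where $\sigma_1^2(\psi, \sigma_2^2)$ is given by (22). Hence, to solve for the constrained MLE $\tilde\theta = (\tilde\sigma_1^2, \tilde\sigma_2^2)$, we have to find the $\tilde\sigma_2^2$ that maximizes (23), and then $\tilde\sigma_1^2 = \log\{1 + \psi^2[\exp(\tilde\sigma_2^2) - 1]\}$. Once we have both the overall and constrained MLEs, we can obtain the observed likelihood ratio statistic $w$ as given in (13). Therefore, the significance function is obtained by referring $w$ to the $\chi^2_1$ distribution. Moreover, by applying the algorithm given in the previous section, we can also obtain the observed simulated Bartlett corrected likelihood ratio statistic $w^*$, and the corresponding significance function is obtained by referring $w^*$ to the $\chi^2_1$ distribution.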
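The computation described in this section can be sketched in plain Python as follows. The sketch assumes the log-likelihood is the marginal likelihood of the two sample variances (each $(n_i - 1)s_i^2/\sigma_i^2$ being chi-square with $n_i - 1$ degrees of freedom), finds the constrained MLE by a grid scan followed by a golden-section search, and takes $\hat{B} = \bar{w}$ since $d = 1$. The summary statistics, the search settings, and all function names are hypothetical.

```python
import math
import random

def loglik(sig2, s2, n):
    """Log-likelihood of (sigma_1^2, sigma_2^2) given the sample variances."""
    return -sum((n_i - 1) / 2.0 * (math.log(v) + s / v)
                for n_i, s, v in zip(n, s2, sig2))

def sigma2_1_from(psi, sig2_2):
    """Constraint CV_1 = psi * CV_2 solved for sigma_1^2."""
    return math.log1p(psi ** 2 * math.expm1(sig2_2))

def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for a maximizer of f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0

def constrained_mle(psi, s2, n, grid=100):
    """Maximize the constrained log-likelihood over t = log(sigma_2^2)."""
    def profile(t):
        v2 = math.exp(t)
        return loglik((sigma2_1_from(psi, v2), v2), s2, n)
    # Each of the two terms peaks at one of these values of sigma_2^2,
    # so the maximizer lies between them.
    a, b = s2[1], math.log1p(math.expm1(s2[0]) / psi ** 2)
    lo, hi = math.log(min(a, b)), math.log(max(a, b))
    ts = [lo + (hi - lo) * k / grid for k in range(grid + 1)]
    k = max(range(grid + 1), key=lambda j: profile(ts[j]))
    v2 = math.exp(golden_max(profile, ts[max(k - 1, 0)], ts[min(k + 1, grid)]))
    return sigma2_1_from(psi, v2), v2

def lr_stat(psi, s2, n):
    """Observed likelihood ratio statistic w(psi)."""
    return max(0.0, 2.0 * (loglik(s2, s2, n) - loglik(constrained_mle(psi, s2, n), s2, n)))

def bartlett_corrected(psi, s2, n, n_sim, rng):
    """Return (w, w_star, b_hat, p_value) using simulation under the constrained MLE."""
    w, tilde = lr_stat(psi, s2, n), constrained_mle(psi, s2, n)
    total = 0.0
    for _ in range(n_sim):
        sim = [v * rng.gammavariate((n_i - 1) / 2.0, 2.0) / (n_i - 1)
               for v, n_i in zip(tilde, n)]
        total += lr_stat(psi, sim, n)
    b_hat = total / n_sim  # d = 1
    w_star = w / b_hat
    return w, w_star, b_hat, math.erfc(math.sqrt(w_star / 2.0))

s2, n = (0.20, 0.35), (5, 7)  # hypothetical log-scale sample variances and sizes
w, w_star, b_hat, p_value = bartlett_corrected(1.0, s2, n, 300, random.Random(1))
print(w, w_star, b_hat, p_value)
```

A confidence interval for $\psi$ could then be obtained by scanning a grid of $\psi$ values and keeping those whose corrected statistic does not exceed the chi-square percentile, at the cost of one simulation run per grid point.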
4. Real Data Example
To illustrate the application of the methods discussed in this paper, we revisit the example discussed in Nam and Kwon [4]. Faupel-Badger et al. [11] compared concentrations of estrogen metabolites measured by radioimmunoassay (RIA) with the concentrations obtained using a novel high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS) method. The 10% blinded quality control samples were used for assessment of quality control of the laboratory assay. A partial summary of the data was presented in Nam and Kwon [4], where the first sample is taken from RIA and the second sample is taken from LC-MS/MS. Table 1 records the 95% confidence intervals for the ratio of the two coefficients of variation, obtained by the methods discussed in this paper under the assumption that the data come from independent lognormal distributions. Note that the MOVER method is identical to the Log method, and Hasan and Krishnamoorthy [5] showed that results from the improved version of the MOVER method are still similar to those obtained by the Log method. Hence, both the MOVER method and its improved version are not included in the calculations. Except for the Wald type, the intervals obtained by the methods in Nam and Kwon [4] seem to be close to each other. Notice that the results from the Fieller type are different from those reported in Nam and Kwon [4]. Moreover, we observe that the likelihood ratio method and the proposed Bartlett correction method differ from the other methods by having a larger upper confidence limit.
With the above observation, it is of interest to compare the accuracy of the methods discussed in this paper, especially when the sample size is small.
5. Simulation Studies
To compare the accuracy of the methods discussed in this paper, simulation studies are performed. The parameter settings are given in Table 2. Other settings have also been examined but are not reported because the results are very similar to those presented; they are available upon request. Since we are interested in developing a method that is accurate even for small sample sizes, the chosen sample sizes in the simulation studies are relatively small.
For each study, we obtain 10,000 simulated samples. Theoretically, $N$ should be as large as possible because we want to use $\bar{w}$ as the estimate of $E[W(\psi)]$. However, numerically, we have 10,000 simulated samples, and for each simulated sample, we have to do $N$ simulations to obtain $\hat{B}$, so $N$ is held at a fixed value for these simulation studies. For each simulated sample, we compute the 95% confidence interval obtained by the methods discussed in this paper. Table 3 reports the percentage of samples where the true $\psi$ is less than the lower 95% confidence limit (le), is within the 95% confidence interval (cc), and is greater than the upper 95% confidence limit (ue). The nominal values are 2.5%, 95%, and 2.5%, respectively.
(a) Empirical coverage rate for the simulation studies 1 to 8
(b) Empirical coverage rate for the simulation studies 9 to 16
From Table 3, the three methods discussed in Nam and Kwon [4] do not give satisfactory coverage, especially when the sample sizes are small. The coverage of the likelihood ratio method improves as the sample sizes increase and, in general, it has asymmetric errors. Nevertheless, the proposed simulated Bartlett corrected likelihood ratio method is extremely accurate even when the sample sizes are as small as 5.
6. Testing Homogeneity of Coefficients of Variation from Independent Lognormal Distributions
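For $k$ samples from independent lognormal distributions, the required log-likelihood function can be written, up to an additive constant, as
$$\ell(\theta) = -\sum_{i=1}^{k} \frac{n_i - 1}{2}\left[\log \sigma_i^2 + \frac{s_i^2}{\sigma_i^2}\right],$$
where $\theta = (\sigma_1^2, \ldots, \sigma_k^2)$ and $s_i^2$ is the unbiased sample variance estimate of the $i$th sample given in Section 3. It is well-known that the overall MLE is
$$\hat\theta = (s_1^2, \ldots, s_k^2).$$
The aim is to test
$$H_0: CV_1 = \cdots = CV_k \quad \text{versus} \quad H_a: \text{not all } CV_i \text{ are equal},$$
which, in this case, is the same as testing
$$H_0: \sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2 \quad \text{versus} \quad H_a: \text{not all } \sigma_i^2 \text{ are equal}.$$
Therefore, when $H_0$ is true, the log-likelihood function can be rewritten in terms of $\sigma^2$ and is
$$\ell(\sigma^2) = -\sum_{i=1}^{k} \frac{n_i - 1}{2}\left[\log \sigma^2 + \frac{s_i^2}{\sigma^2}\right],$$
and the constrained MLE is
$$\tilde\sigma^2 = \frac{\sum_{i=1}^{k}(n_i - 1)s_i^2}{\sum_{i=1}^{k}(n_i - 1)},$$
which is the usual pooled variance estimate. The observed likelihood ratio statistic is
$$w = 2[\ell(\hat\theta) - \ell(\tilde\sigma^2)] = \sum_{i=1}^{k}(n_i - 1)\log\frac{\tilde\sigma^2}{s_i^2},$$
which is asymptotically distributed as $\chi^2_{k-1}$. Hence, the observed simulated Bartlett corrected likelihood ratio statistic is
$$w^* = \frac{w}{\hat{B}},$$
where $\hat{B}$ is obtained by the algorithm given in Section 3.1.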
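The homogeneity test has a closed-form likelihood ratio statistic, so the simulated Bartlett correction is inexpensive to sketch. The code below assumes the statistic $w = \sum_i (n_i - 1)\log(\tilde\sigma^2/s_i^2)$ with $\tilde\sigma^2$ the pooled variance, simulates sample variances under the pooled estimate, and uses $\hat{B} = \bar{w}/(k - 1)$. It is restricted to $k = 3$ so that the upper-tail $\chi^2_2$ probability has the closed form $\exp(-w/2)$. The summary statistics and function names are hypothetical.

```python
import math
import random

def lr_homogeneity(s2, n):
    """Return (w, pooled variance) for testing equal log-scale variances."""
    df = [n_i - 1 for n_i in n]
    pooled = sum(d_i * s for d_i, s in zip(df, s2)) / sum(df)
    w = sum(d_i * math.log(pooled / s) for d_i, s in zip(df, s2))
    return max(0.0, w), pooled

def simulated_bartlett_homogeneity(s2, n, n_sim, rng):
    """Return (w, w_star, b_hat), with data simulated under the pooled variance."""
    w_obs, pooled = lr_homogeneity(s2, n)
    total = 0.0
    for _ in range(n_sim):
        # (n_i - 1) s_i^2 / sigma^2 is chi-square with n_i - 1 degrees of freedom.
        sim = [pooled * rng.gammavariate((n_i - 1) / 2.0, 2.0) / (n_i - 1) for n_i in n]
        total += lr_homogeneity(sim, n)[0]
    b_hat = (total / n_sim) / (len(n) - 1)
    return w_obs, w_obs / b_hat, b_hat

def chi2_sf_2df(w):
    """Upper-tail probability of the chi-square(2) distribution."""
    return math.exp(-w / 2.0)

s2, n = [0.20, 0.35, 0.28], [5, 5, 5]  # hypothetical summary statistics, k = 3
w, w_star, b_hat = simulated_bartlett_homogeneity(s2, n, 2000, random.Random(7))
print(w, w_star, b_hat, chi2_sf_2df(w), chi2_sf_2df(w_star))
```

The statistic depends on the sample variances only through their ratios to the pooled variance, so in this sketch the simulated adjustment depends on the sample sizes but not on the scale of the data.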
Simulation studies are performed to compare the accuracy of the likelihood ratio method and the simulated Bartlett corrected likelihood ratio method. In particular, three samples of data from lognormal distributions are generated. The statistic $w$ is calculated, and $w^*$ is also calculated using the simulated Bartlett adjustment. This process is repeated, and the proportion of samples that have $p$-values less than the nominal significance level is reported in Table 4 for various sample sizes. The choice of the means is not important because they are not involved in any of the calculations and, hence, they are held at a fixed value. Different choices of the remaining settings give similar results and are not reported, but they are available upon request. Table 4 reports two cases. When sample sizes are small, the likelihood ratio method does not give satisfactory results, but it improves as the sample sizes increase. The simulated Bartlett corrected likelihood ratio method consistently gives extremely accurate results even when the sample sizes are small.
7. Conclusion
The lognormal distribution has been frequently used for modeling positive-valued, right-skewed data, which commonly arise in health and medical research. In this paper, we proposed a simulated Bartlett corrected likelihood ratio approach to obtain inference concerning the ratio of two coefficients of variation for lognormal distributions. Simulation studies show that the proposed Bartlett correction method is extremely accurate even when the sample size is small. Moreover, the proposed Bartlett correction method is extended to test homogeneity of coefficients of variation from independent lognormal distributions.
Data Availability
The data set comparing concentrations of estrogen metabolites measured by RIA with the concentrations obtained using a novel high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was previously reported in Faupel-Badger et al. [11], which has been cited. This data set was further analyzed in Nam and Kwon [4], which is also cited in the manuscript. The other numerical examples in the paper are based on simulation studies, which are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
[1] J. Aitchison and J. A. C. Brown, The Lognormal Distribution, Cambridge University Press, Cambridge, UK, 1957.
[2] E. L. Crow and K. Shimizu, Lognormal Distributions: Theory and Applications, Marcel Dekker, New York, NY, USA, 1988.
[3] T. R. Fears, R. G. Ziegler, J. L. Donaldson et al., “Reproducibility studies and interlaboratory concordance for androgen assays of male plasma hormone levels,” Cancer Epidemiology, Biomarkers & Prevention, vol. 11, no. 8, pp. 785–789, 2002.
[4] J. Nam and D. Kwon, “Inference on the ratio of two coefficients of variation of two lognormal distributions,” Communications in Statistics - Theory and Methods, vol. 46, no. 17, pp. 8575–8587, 2017.
[5] M. S. Hasan and K. Krishnamoorthy, “Improved confidence intervals for the ratio of coefficients of variation of two lognormal distributions,” Journal of Statistical Theory and Applications, vol. 16, no. 3, pp. 345–353, 2017.
[6] O. E. Barndorff-Nielsen and D. R. Cox, Inference and Asymptotics, Chapman and Hall, New York, NY, USA, 1994.
[7] D. A. S. Fraser, “P-values: The insight to modern statistical inference,” Annual Review of Statistics and Its Application, vol. 4, pp. 1–14, 2017.
[8] O. E. Barndorff-Nielsen, “Inference on full or partial parameters based on the standardized signed log likelihood ratio,” Biometrika, vol. 73, no. 2, pp. 307–322, 1986.
[9] O. E. Barndorff-Nielsen, “Modified signed log likelihood ratio,” Biometrika, vol. 78, no. 3, pp. 557–563, 1991.
[10] M. S. Bartlett, “Properties of sufficiency and statistical tests,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 160, no. 901, pp. 268–282, 1937.
[11] J. M. Faupel-Badger, B. J. Fuhrman, X. Xu et al., “Comparison of liquid chromatography-tandem mass spectrometry, RIA, and ELISA methods for measurement of urinary estrogens,” Cancer Epidemiology, Biomarkers & Prevention, vol. 19, no. 1, pp. 292–300, 2010.
Copyright © 2019 A. Wong and L. Jiang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.