Research Article | Open Access
A Mixture of Generalized Tukey’s g Distributions
Mixtures of symmetric distributions, in particular normal mixtures, have been widely studied as a tool in statistical modeling. In recent years, mixtures of asymmetric distributions have emerged as strong candidates for analyzing statistical data. Tukey’s $g$ family of generalized distributions depends on the parameter $g$, which controls the skewness. This paper presents the probability density function (pdf) associated with a mixture of Tukey’s $g$ family of generalized distributions. The mixture of this class of skewed distributions is a generalization of Tukey’s $g$ family of distributions. We derive closed-form expressions for the density and distribution of the mixture of two Tukey’s $g$ families of generalized distributions, which allow us to easily compute probabilities, moments, and related measures. This class of distributions contains the mixture of Log-symmetric distributions as a special case.
The main focus of interest in financial economics is the distribution of stock market returns. Mandelbrot [1] suggested the family of stable Paretian distributions for stock market returns. Fama [2] established that the normality assumption does not hold for the empirical data because the distribution is fat tailed. Kon [3] and Tse [4] used a mixture of normal distributions for stock returns. Fielitz and Rozelle [5] proposed a mixture of nonnormal stable distributions for stock prices. Consequently, greater emphasis has been placed on distributions with asymmetric and leptokurtic properties. Recently, Jiménez et al. [6] proposed option pricing based on a mixture of log-skew-normal distributions. If extreme events tend to occur more frequently than a normal distribution predicts, then the skewness and kurtosis of nonnormal distributions play an essential role in the volatility smile.
The most important and useful characteristic of Tukey’s $g$-$h$ family of distributions, introduced by Tukey [7], is that it covers most of the Pearsonian family of distributions. It can also generate several known distributions, for example, lognormal, Cauchy, exponential, and Chi-squared (see Martínez and Iglewicz [8], page 363). From Tukey’s $g$-$h$ family we obtain the $g$ distribution, which is closely related to the lognormal distribution and whose moments have similar properties. Tukey’s $g$-$h$ family of distributions has been used to study financial markets. Badrinath and Chatterjee [9, 10] and Mills [11] used it to model the return on a stock index, as well as the return on shares in several markets. Dutta and Babbel [12] found that the skewness and leptokurtic behavior of LIBOR were modeled effectively using the $g$-$h$ distribution. Dutta and Babbel [13] used it to model interest rates and options on interest rates, while Tang and Wu [14] proposed a new method for the decomposition of portfolio VaR. Dutta and Perry [15] and, more recently, Jiménez and Arunachalam [16] used the $g$-$h$ distribution to study operational risk for heavy-tailed severity models. Jiménez and Arunachalam [17] provided explicit expressions for VaR and CVaR calculations using Tukey’s $g$-$h$ family of distributions. Jiménez et al. [18] studied a generalization of Tukey’s $g$-$h$ family of distributions in which the standard normal random variable is replaced by a continuous random variable $U$ with mean $0$ and variance $1$.
The $g$ subfamily of distributions exhibits skewness and is of great importance in the study of asymmetric distributions for analyzing data. This kind of distribution allows us to obtain scaled Log-symmetric distributions. Vitiello and Poon [19] considered a simple mixture of two $g$-$h$ distributions for option pricing data. The purpose of this paper is to present a mixture of Tukey’s $g$ distributions and to derive some of its statistical properties, including the pdf and the moment generating function.
The paper is organized as follows. Section 2 presents Tukey’s family of generalized distributions together with its pdf and cumulative distribution function (cdf). Section 3 presents some theoretical results on the mixture of two Tukey’s families of generalized distributions, and Section 4 explains how to estimate the parameters by the method of moments. Section 5 discusses the fit of our proposed model to real Heating-Degree-Days (HDD) index data, and Section 6 concludes.
2. Tukey’s Family of Generalized Distributions
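Tukey [7] introduced the $g$-$h$ family of distributions by means of two nonlinear transformations given by

$$Y = T_{g,h}(Z) = \frac{1}{g}\left(e^{gZ} - 1\right)\exp\left(\frac{hZ^{2}}{2}\right), \qquad (1)$$

with $g \neq 0$ and $h \in \mathbb{R}$, where the distribution of $Z$ is standard normal. When these transformations are applied to a continuous random variable $U$ with mean $0$ and variance $1$, whose pdf $f_U(\cdot)$ is symmetric about the origin and whose cdf is $F_U(\cdot)$, the transformation $T_{g,h}(U)$ is obtained, which henceforth will be termed Tukey’s $g$-$h$ generalized distribution. If $h = 0$, Tukey’s $g$-$h$ generalized distribution reduces to

$$T_{g,0}(U) = \frac{1}{g}\left(e^{gU} - 1\right), \qquad (2)$$

which is known as Tukey’s $g$ generalized distribution.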
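As a minimal sketch of the transformation described above, the following Python snippet applies the usual form of Tukey's transform, $(e^{gu}-1)/g \cdot e^{hu^{2}/2}$, to standard normal draws. The function name `tukey_gh`, the seed, and the fallback used when $g$ is numerically zero are illustrative assumptions, not part of the paper.

```python
import math
import random

def tukey_gh(u, g, h=0.0):
    """Tukey g-h transform of one draw u.

    Assumption: for g numerically zero we fall back to the limiting
    form u * exp(h * u^2 / 2); the text itself requires g != 0.
    """
    skew_part = u if abs(g) < 1e-12 else math.expm1(g * u) / g
    return skew_part * math.exp(h * u * u / 2.0)

# Draw from Tukey's g distribution (h = 0) with a standard normal U.
rng = random.Random(0)
sample = [tukey_gh(rng.gauss(0.0, 1.0), g=0.5) for _ in range(10000)]
```

With $h = 0$ and $g > 0$ every draw lies above $-1/g$, which is the lower bound of the support of the $g$ distribution.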
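In order to model an arbitrary random variable $X$ using the transformation given in (2), Hoaglin and Peters [20] introduced two new parameters, $A$ (location) and $B > 0$ (scale), and proposed the following linear transformation:

$$X = A + B\,T_{g,0}(U). \qquad (3)$$

The following properties for the pdf, cdf, and quantile functions of Tukey’s $g$ generalized distribution were established by Jiménez et al. [18] in terms of the pdf $f_U$ and cdf $F_U$ of $U$:

$$f_X(x) = \frac{1}{B\,z(x)}\, f_U\!\left(\frac{\ln z(x)}{g}\right), \qquad F_X(x) = F_U\!\left(\frac{\ln z(x)}{g}\right), \qquad x_p = A + B\,T_{g,0}(u_p), \qquad (4)$$

where $z(x) = 1 + g(x - A)/B > 0$ and $u_p = F_U^{-1}(p)$ is the $p$th quantile of $U$. We say that the random variable $X$ has a Log-symmetric (LS) distribution (such distributions are all asymmetric; see Johnson et al. [21] and Stuart and Ord [22]) with three parameters: threshold, scale, and shape.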
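The pdf, cdf, and quantile function of $X = A + B\,(e^{gU}-1)/g$ follow from a change of variables, because the map from $U$ to $X$ is increasing when $B > 0$. The sketch below works this out for the assumed case of a standard normal $U$; the function names and the choice of `statistics.NormalDist` are illustrative, not taken from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed choice for the symmetric variable U

def _u_of_x(x, A, B, g):
    """Invert x = A + B*(exp(g*u) - 1)/g; return None outside the support."""
    z = 1.0 + g * (x - A) / B
    return math.log(z) / g if z > 0.0 else None

def g_pdf(x, A, B, g):
    u = _u_of_x(x, A, B, g)
    if u is None:
        return 0.0
    # du/dx = 1 / (B + g*(x - A)), which is positive on the support
    return STD_NORMAL.pdf(u) / (B + g * (x - A))

def g_cdf(x, A, B, g):
    u = _u_of_x(x, A, B, g)
    if u is None:
        # below the threshold when g > 0, above the upper bound when g < 0
        return 0.0 if g > 0 else 1.0
    return STD_NORMAL.cdf(u)

def g_quantile(p, A, B, g):
    return A + B * math.expm1(g * STD_NORMAL.inv_cdf(p)) / g
```

Because $U$ is symmetric about the origin, the median of $X$ equals $A$ for any $g$, which gives a quick sanity check: `g_quantile(0.5, A, B, g)` returns `A`.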
The first expression of (4) gives the pdf associated with Tukey’s $g$ distribution. Table 1 shows the parameters of the pdf of $X$ obtained for a selected set of well-known symmetric distributions (from Jiménez et al. [18]).
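The $n$th moment of the random variable $T_{g,0}(U)$ is given by

$$E\left[T_{g,0}^{\,n}(U)\right] = \frac{1}{g^{n}} \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} M_U(kg), \qquad (5)$$

where $g \neq 0$ and $M_U(\cdot)$ is the moment generating function of the random variable $U$, which is an even function; that is, $M_U(-t) = M_U(t)$. Table 2 shows the parameters of the pdf and the moment generating function for a random variable $U$, using a selected set of well-known symmetric distributions.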
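Expanding $\left((e^{gU}-1)/g\right)^{n}$ with the binomial theorem expresses the $n$th raw moment as a finite sum of values of the moment generating function of $U$ at multiples of $g$. The sketch below evaluates that sum; the default moment generating function is that of a standard normal variable, which is an assumption made only for illustration, as are the function names.

```python
import math

def mgf_std_normal(t):
    """Moment generating function of N(0, 1); an even function of t."""
    return math.exp(t * t / 2.0)

def tukey_g_raw_moment(n, g, mgf=mgf_std_normal):
    """n-th raw moment of (exp(g*U) - 1)/g via the binomial expansion."""
    total = sum((-1) ** (n - k) * math.comb(n, k) * mgf(k * g)
                for k in range(n + 1))
    return total / g ** n
```

Since the moment generating function is even, even-order moments are unchanged when $g$ is replaced by $-g$, while odd-order moments change sign.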
Expression (5) allows us to obtain the moments of Tukey’s $g$ generalized distribution, and from them the moments of the random variable $X$ given by (3). Note that the central moments of $X$ do not depend on the location parameter. The standardized skewness and the standardized excess kurtosis are expressed in terms of $\operatorname{sgn}(\cdot)$, the signum function; these expressions depend only on the parameter $g$ and its sign. Any LS distribution should satisfy the test given in Stuart and Ord [22].
3. The Mixture of Two $g$ Distributions
We assume that $X$ follows a Log-Symmetric Mixture (LSMIX) distribution; that is, the pdf of $X$ is the weighted sum of $K$ component LS densities,

$$f_X(x) = \sum_{k=1}^{K} \omega_k\, f_k(x;\theta_k),$$

where each $\theta_k$ is the parameter vector that defines the $k$th component and the probability weights $\omega_k$ satisfy $\omega_k \geq 0$ and $\sum_{k=1}^{K} \omega_k = 1$. According to Titterington et al. [23], the two-component mixture of known distributions is set by two weights. Let $K = 2$, so that $f_X$ is the weighted sum of two Tukey’s $g$ densities with $\omega_1 + \omega_2 = 1$; this is the density in (12). Vitiello and Poon [19] did not provide the piecewise nature of the mixture density function in (12).

In this case the cdf of $X$ is the corresponding weighted sum of the component cdfs. Because the quantile function is the inverse of the cdf, the quantiles of $X$ are obtained by inverting this cdf.

If we assume that $U$ is standard normal, (12) can be written in terms of $\phi(\cdot)$, the standard normal pdf, which gives (16). This expression matches the pdf of a mixture of three-parameter lognormal distributions. Setting the threshold parameters to zero reduces it to the pdf of a mixture of two-parameter lognormal distributions.
Every normal pdf is a shifted and rescaled version of the standard normal pdf. Hence, if $U$ is normally distributed, (12) can again be written in terms of $\phi(\cdot)$, and after rescaling the parameters the resulting expression matches the pdf of a mixture of three-parameter lognormal distributions, which is a generalization of the pdf given in (16). Similarly, we can obtain the pdf of a mixture of LS distributions for the random variables listed in Table 1.
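The lognormal special case can be sketched directly. The snippet below evaluates a two-component mixture of three-parameter lognormal densities, each of which is zero below its own threshold, so the mixture density is piecewise. The parameterization `(theta, mu, sigma)` (threshold, log-scale, shape) and the function names are assumptions chosen for illustration and need not match the notation of the paper.

```python
import math

def lognormal3_pdf(x, theta, mu, sigma):
    """Three-parameter lognormal pdf: ln(X - theta) ~ N(mu, sigma^2)."""
    if x <= theta:
        return 0.0  # outside the support of this component
    z = (math.log(x - theta) - mu) / sigma
    return math.exp(-0.5 * z * z) / ((x - theta) * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, w, comp1, comp2):
    """Two-component mixture; w is the weight of the first component."""
    return w * lognormal3_pdf(x, *comp1) + (1.0 - w) * lognormal3_pdf(x, *comp2)

# Hypothetical parameter values: between the two thresholds only the
# first component contributes, which is where the piecewise form shows.
comp_a = (0.0, 0.0, 0.25)
comp_b = (1.0, 1.0, 0.25)
density_between = mixture_pdf(0.8, 0.4, comp_a, comp_b)
```

Setting both thresholds to zero recovers the mixture of two-parameter lognormal distributions.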
4. Estimation of the Mixtures of Two Tukey’s $g$ Distributions
In this section, we explain the estimation of the parameters of the mixture of two Tukey’s $g$ distributions by the method of moments. The expected value and the $n$th raw moment of $X$ are expressed through $M_U(\cdot)$, the moment generating function of the random variable $U$, and the central moments of $X$ follow from the raw moments. Writing out the first five central moments gives system (23). Equating the population moments to the corresponding sample moments and rescaling the left-hand side of (23) yields system (26), in which the sample central moments replace their population counterparts. Equations (26) accordingly constitute a system of five equations to be solved simultaneously for the estimates of the five parameters of the mixture.
Note that the first equation of system (26) allows one parameter to be expressed in terms of the others. We eliminate it between the first and the subsequent equations of (26) in turn and thereby reduce the system to four equations in four unknowns, (28). This system is solved numerically with a scientific software package, and we do not need to verify the uniqueness of the solution used as the parameter estimates. We skip further details and numerical illustration owing to space constraints.
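The method of moments needs the first five sample central moments as inputs. A minimal sketch of that step is given below; the function name and the use of the divisor $n$ (rather than a bias-corrected divisor) are assumptions, and solving the resulting nonlinear system is not shown.

```python
def central_moments(data, max_order=5):
    """Sample central moments of orders 1..max_order, using divisor n."""
    n = len(data)
    mean = sum(data) / n
    return [sum((x - mean) ** k for x in data) / n
            for k in range(1, max_order + 1)]

# Hypothetical data, symmetric about 3, so the odd-order moments vanish.
moments = central_moments([1, 2, 3, 4, 5])
```

The first entry is always zero up to rounding, which is why the first moment equation can be used to eliminate one unknown.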
In this section, we illustrate the results derived in Section 3 with two examples. In the first, we discuss the pricing of a call option using a mixture of two Tukey’s $g$-generalized distributions. In the second, we examine empirical Heating-Degree-Day data to demonstrate the usefulness of our mixture of LS distributions.
Jiménez et al. [24] derived the price of a European option assuming that the terminal price follows a generalized Tukey distribution. If we instead use a mixture of two Tukey’s classes of $g$-generalized distributions, the price of a call option with a given strike price and maturity date is given by (29). When $U$ is standard normal, (29) reduces to (31), which is written in terms of $\Phi(\cdot)$, the cdf of a standard univariate normal variable. Under a further restriction on the parameters, (31) reduces to a simpler expression that coincides, in the appropriate special case, with the option pricing formula given in Bahra [25]. The authors also established closed-form formulas for the sensitivity measures of option prices (the Greeks). Here we wish to observe that our mixture model uses fewer unknown parameters for calculating the option price, whereas Vitiello and Poon [19] used nine unknown parameters for the mixture of two $g$-$h$ distributions. Increasing the number of parameters costs degrees of freedom and can undermine the quality of the fit. This is an advantage of our approach based on the mixture of two $g$-generalized distributions.
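For the lognormal special case, the call price is the discounted, weighted sum of two lognormal expected payoffs, which is the form usually attributed to Bahra's mixture formula. The sketch below implements that special case only; it is not the paper's general pricing expression, and the function names and the parameterization of each component as `(a, b)` with $\ln S_T \sim N(a, b^{2})$ are assumptions.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

def lognormal_call(a, b, strike):
    """Undiscounted E[max(S - strike, 0)] when ln S ~ N(a, b^2)."""
    d1 = (a + b * b - math.log(strike)) / b
    d2 = d1 - b
    return math.exp(a + 0.5 * b * b) * PHI(d1) - strike * PHI(d2)

def mixture_call_price(weight, comp1, comp2, strike, rate, maturity):
    """Discounted expected payoff under a two-component lognormal mixture."""
    expected_payoff = (weight * lognormal_call(*comp1, strike)
                       + (1.0 - weight) * lognormal_call(*comp2, strike))
    return math.exp(-rate * maturity) * expected_payoff
```

When both components are identical and risk neutral, that is, $a = \ln S_0 + (r - \sigma^{2}/2)T$ and $b = \sigma\sqrt{T}$, the function returns the Black-Scholes call price, which is a convenient check on the implementation.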
We now present, as an example, the use of Heating-Degree-Days (HDD) in relation to winter temperature risk as a substitute for gas demand. HDD-based contracts are listed on the Chicago Mercantile Exchange (CME). We consider an example that consists of monthly aggregate Heating-Degree-Day (HDD) data values at the Chicago O’Hare International Airport from December to December given in Wang [26] and also explored by Vitiello and Poon [19]. We first describe a method based on a three-parameter LS distribution to infer the implied risk-neutral probability density (RND). In Table 3, we present the estimated values of the three parameters of the lognormal and Log-Logistic distributions; our interest is to compare the risk-neutral densities of Vitiello and Poon [19] with our proposed mixture model.
The smaller value of the Kolmogorov-Smirnov (KS) test statistic supports the hypothesis that the data follow the three-parameter LS distributions. Note that the Anderson-Darling (AD) test is more sensitive to the tails of the LS distributions than the KS test. In this case, we choose the Log-Logistic distribution as the best fit for the HDD data.
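The KS comparison rests on the largest gap between the empirical distribution function and a fitted cdf. A minimal sketch of the one-sample statistic is given below; `cdf` stands for any fitted cdf passed in as a callable, and both the function name and the toy data are hypothetical.

```python
def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # the EDF jumps from i/n to (i+1)/n at x, so check both sides
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Toy check against the uniform cdf on [0, 1].
d_uniform = ks_statistic([0.7, 0.1, 0.4], lambda x: x)
```

A smaller value of the statistic indicates a closer agreement between the fitted cdf and the data.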
The implied risk-neutral densities (RND) of the LS distributions are shown in Figure 1 and compared with the corresponding figure of Vitiello and Poon [19]. We have obtained a similar plot by our method with fewer unknown parameters than the method given by Vitiello and Poon [19]. Furthermore, their KS test value is higher than the KS test values of Table 3, which favors the fit of the LS distributions. Finite mixtures are therefore attractive from the application viewpoint because of their flexibility, which permits us to model distributions of various shapes. In Table 4, we give the estimated values of the parameters of the mixture of LS distributions. These parameters are estimated using (28). The two estimated component densities and the implied risk-neutral densities (RND) are shown in Figure 2.
We observe that the bimodal LS mixture distribution follows the empirical distribution function (EDF) equally well and that the lognormal mixture distribution gives the best goodness of fit under the KS test.
This paper presents a mixture of Tukey’s $g$-generalized distributions and its properties. The methodology for estimating the unknown parameters by the method of moments is also presented. The proposed model has the advantage of flexibility when the skewness, kurtosis, or other moments of the underlying distribution depart from those of a normal distribution. Some well-known distributions are obtained as special cases of the proposed model.
The authors declare that they have no competing interests.
[1] B. Mandelbrot, “The variation of certain speculative prices,” The Journal of Business, vol. 36, no. 4, pp. 394–419, 1963.
[2] E. F. Fama, “The behavior of stock-market prices,” The Journal of Business, vol. 38, no. 1, pp. 34–105, 1965.
[3] S. J. Kon, “Models of stock returns - a comparison,” The Journal of Finance, vol. 39, no. 1, pp. 147–165, 1984.
[4] Y. K. Tse, “Price and volume in the Tokyo stock exchange: an exploratory study,” Working Paper 1573, University of Illinois at Urbana-Champaign, 1989.
[5] B. D. Fielitz and J. P. Rozelle, “Stable distributions and the mixtures of distributions hypotheses for common stock returns,” Journal of the American Statistical Association, vol. 78, no. 381, pp. 28–36, 1983.
[6] J. A. Jiménez, V. Arunachalam, and G. M. Serna, “Option pricing based on a log-skew-normal mixture,” International Journal of Theoretical and Applied Finance, vol. 18, no. 8, Article ID 1550051, 22 pages, 2015.
[7] J. W. Tukey, “Modern techniques in data analysis,” in Proceedings of the NSF Sponsored Regional Research Conference, Southeastern Massachusetts University, North Dartmouth, Mass, USA, 1977.
[8] J. Martínez and B. Iglewicz, “Some properties of the Tukey g and h family of distributions,” Communications in Statistics - Theory and Methods, vol. 13, no. 3, pp. 353–369, 1984.
[9] S. G. Badrinath and S. Chatterjee, “On measuring skewness and elongation in common stock return distributions: the case of the market index,” The Journal of Business, vol. 61, no. 4, pp. 451–472, 1988.
[10] S. G. Badrinath and S. Chatterjee, “A data-analytic look at skewness and elongation in common-stock-return distributions,” Journal of Business & Economic Statistics, vol. 9, no. 2, pp. 223–233, 1991.
[11] T. C. Mills, “Modelling skewness and kurtosis in the London stock exchange index return distributions,” The Statistician, vol. 44, no. 3, pp. 323–332, 1995.
[12] K. K. Dutta and D. F. Babbel, On Measuring Skewness and Kurtosis in Short Rate Distributions: The Case of the US Dollar London Inter Bank Offer Rates, The Wharton School, University of Pennsylvania, Philadelphia, Pa, USA, 2004.
[13] K. K. Dutta and D. F. Babbel, “Extracting probabilistic information from the prices of interest rate options: tests of distributional assumptions,” Journal of Business, vol. 78, no. 3, pp. 841–870, 2005.
[14] X. Tang and X. Wu, “A new method for the decomposition of portfolio VaR,” Journal of Systems Science and Information, vol. 4, no. 4, pp. 721–727, 2006.
[15] K. K. Dutta and J. Perry, “A tale of tails: an empirical analysis of loss distribution models for estimating operational risk capital,” Working Paper 06-13, Federal Reserve Bank of Boston, 2007.
[16] J. A. Jiménez and V. Arunachalam, “Evaluating operational risk by an inhomogeneous counting process based on Panjer recursion,” The Journal of Operational Risk, vol. 11, no. 1, pp. 1–21, 2016.
[17] J. A. Jiménez and V. Arunachalam, “Using Tukey's g and h family of distributions to calculate value at risk and conditional value at risk,” Journal of Risk, vol. 13, no. 4, pp. 95–116, 2011.
[18] J. A. Jiménez, V. Arunachalam, and G. M. Serna, “A generalization of Tukey's g-h family of distributions,” Journal of Statistical Theory and Applications, vol. 14, no. 1, pp. 28–44, 2015.
[19] L. Vitiello and S.-H. Poon, “General equilibrium and risk neutral framework for option pricing with a mixture of distributions,” The Journal of Derivatives, vol. 15, no. 4, pp. 48–60, 2008.
[20] D. C. Hoaglin and S. C. Peters, Software for Exploring Distribution Shape, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, 1979.
[21] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, vol. 1, John Wiley & Sons, New York, NY, USA, 1994.
[22] A. Stuart and J. K. Ord, Kendall's Advanced Theory of Statistics: Distribution Theory, vol. 1, Edward Arnold, London, UK; Halsted Press, New York, NY, USA, 6th edition, 1994.
[23] D. M. Titterington, A. F. Smith, and U. E. Makov, Statistical Analysis of Finite Mixture Distributions, vol. 7, John Wiley & Sons, New York, NY, USA, 1985.
[24] J. A. Jiménez, V. Arunachalam, and G. M. Serna, “Option pricing based on the generalised Tukey distribution,” International Journal of Financial Markets and Derivatives, vol. 3, no. 3, pp. 191–221, 2014.
[25] B. Bahra, “Implied risk-neutral probability density functions from option prices: a central bank perspective,” in Forecasting Volatility in the Financial Markets, J. Knight and S. Satchell, Eds., pp. 201–226, Elsevier, 3rd edition, 1997.
[26] S. S. Wang, “A universal framework for pricing financial and insurance risks,” ASTIN Bulletin, vol. 32, no. 2, pp. 213–234, 2002.
Copyright © 2016 José Alfredo Jiménez and Viswanathan Arunachalam. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.