Abstract

We discuss the calibration of the univariate and multivariate generalized hyperbolic distributions, as well as their hyperbolic, variance gamma, normal inverse Gaussian, and skew Student’s t-distribution subclasses, for the daily log-returns of seven of the most liquid mining stocks listed on the Johannesburg Stock Exchange. To estimate the model parameters from historical data, we use an expectation maximization based algorithm for the univariate case and a multicycle expectation conditional maximization estimation algorithm for the multivariate case. We assess the goodness of fit using the log-likelihood, the Akaike information criterion, and the Kolmogorov-Smirnov distance. Finally, we inspect the temporal stability of parameters and note implications as criteria for distinguishing between models. To better understand the dependence structure of the stocks, we fit the multivariate generalized hyperbolic distribution (MGHD) and its subclasses to both the stock returns and the two leading principal components derived from the price data. While the MGHD could fit both data subsets, we observed that the multivariate normality of the stock return residuals, computed by removing shared components, suggests that the departure from normality can be explained by the structure in the common factors.

1. Introduction

Empirical evidence that stock prices do not generally follow geometric Brownian motion precedes even the Black-Scholes-Merton option pricing model [13]. While numerous models have been investigated to describe both path and distributional behaviour more realistically for portfolio optimisation and hedging risk, comparatively less attention has been devoted to the assessment of more sophisticated models relative to one another.

The hyperbolic Lévy model was first proposed in finance by Eberlein and Keller [4] to model returns of DAX stocks via the generalized hyperbolic distributions (GHD for short) of Barndorff-Nielsen [5]. Around the same time, special cases were investigated: Barndorff-Nielsen proposed the normal inverse Gaussian (NIG) distribution [6], Hansen [7] was the first to propose the skewed Student’s t-distribution, and Madan and Seneta [8], Madan and Milne [9], and Madan et al. [10] proposed the variance gamma process for the dynamics of the log-returns. McNeil et al. [11] review some empirical investigations and applications of the GHD in finance. Fajardo and Farias [12] calibrated the GHD to Brazilian market data and, more recently, Necula [13] fit the GHD to a series of index returns from Romania, Hungary, and the Czech Republic; Fajardo and Farias [12] estimate the multivariate affine GHD for market data from several well-established markets, and Hellmich and Kassberger [14] apply the multivariate generalized hyperbolic distributions (MGHD for short) to portfolio modeling. These empirical studies point out the superior capacity of the univariate and multivariate generalized hyperbolic distributions and their subclasses for realistically describing financial data. In connection with the JSE, some work has been done to study asset prices (see, e.g., [15, 16]) but, to the best of our knowledge, no work has been conducted using the generalized hyperbolic distributions together with the expectation maximization (EM) based or the multicycle expectation conditional maximization (MCECM) [11] estimation algorithms.

In this work we fit the univariate and the multivariate GHD and some of their subclasses, namely, the hyperbolic, the normal inverse Gaussian, the variance gamma, and the skewed Student’s t-distributions, to the daily log-returns of seven liquid mining stocks listed on the Johannesburg Stock Exchange (JSE) from January 2006 to December 2011. To estimate the parameters of the distributions, we use an EM-based estimation algorithm for the univariate case. We then apply goodness-of-fit tests and consider the stability of the parameters, recalibrated on a daily basis, as criteria for discerning between models. For the multivariate case, we apply the MCECM estimation algorithm before and after filtering off common driving factors computed via principal component analysis.

The paper is organized as follows. Section 2 describes our data set. In Section 3, we briefly review the multivariate generalized hyperbolic distributions and focus on some subclasses, namely, the hyperbolic, the normal inverse Gaussian, the skewed Student’s t, and the variance gamma distributions. Section 4 presents the univariate estimation results and inspects the stability of parameters. In Section 5 we test the multivariate GHD hypotheses and find that these models are not ruled out. We then apply principal component analysis to identify common factors driving returns and reconsider the multivariate GHD models after filtering the data to remove these exogenous effects. Section 6 is devoted to the conclusion.

2. Data

The data used in the present study consist of daily closing prices between January 2006 and December 2011 for 7 of the most liquid mining stocks in the J200 Index (representing the JSE TOP 40 companies). Each set of data contains 1500 observations. The seven companies under consideration are the following: Anglo American Plc (AGL), Anglo American Platinum Corporation Limited (AMS), AngloGold Ashanti Limited (ANG), BHP Billiton Plc (BIL), Gold Fields Limited (GFI), Harmony Gold Mining Company Limited (HAR), and Impala Platinum Holdings Limited (IMP).

The daily log-returns are calculated using $r_t = \ln(S_t / S_{t-1})$, where $S_t$ is the stock price on day $t$.
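As a minimal sketch of this calculation, assuming the closing prices are held in a plain Python list, the log-returns can be computed as follows. The function name `log_returns` and the sample prices are illustrative only and are not taken from the data set.

```python
import math

def log_returns(prices):
    """Daily log-returns r_t = ln(S_t / S_{t-1}) from a series of closing prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its predecessor; n prices give n - 1 returns.
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Made-up closing prices, used purely for illustration.
prices = [100.0, 102.0, 101.0, 105.0]
returns = log_returns(prices)
```

Log-returns are additive over time, so the sum of the daily values equals the log-return over the whole period.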

The mean, the standard deviation, the skewness, and the kurtosis are presented in Table 1.

From Table 1 we can see that the returns are skewed and characterized by heavy tails, since the kurtosis values are significantly greater than 3. While heavy tails suggest that it may be meaningful to apply extreme value theory to model the tail distributions, the focus of this work is to investigate models for the distributions as a whole.
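The four summary statistics can be sketched as below. This is an illustration under stated assumptions: the text does not say which estimators were used, so the code assumes the sample standard deviation (divisor n − 1) and moment-based skewness and non-excess kurtosis, for which the normal distribution gives 3. The function name `summary_statistics` is hypothetical.

```python
import math

def summary_statistics(returns):
    """Mean, standard deviation, skewness, and (non-excess) kurtosis of a sample."""
    n = len(returns)
    mean = sum(returns) / n
    # Central moments with divisor n (assumed convention for skewness and kurtosis).
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,            # equals 3 for a normal distribution
    }

# Symmetric toy sample: skewness is zero and kurtosis is below 3.
stats = summary_statistics([-2.0, -1.0, 0.0, 1.0, 2.0])
```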

We normalize the log-returns and assume that the z-scored daily log-returns are independent and identically distributed.
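The normalization step can be sketched as follows, assuming that z-scoring means subtracting the sample mean and dividing by the sample standard deviation (divisor n − 1). The function name `z_score` is illustrative.

```python
import statistics

def z_score(returns):
    """Standardize a series to zero sample mean and unit sample standard deviation."""
    mean = statistics.mean(returns)
    sd = statistics.stdev(returns)  # sample standard deviation, divisor n - 1
    if sd == 0:
        raise ValueError("series has zero variance")
    return [(r - mean) / sd for r in returns]

# Toy input, for illustration only.
z = z_score([1.0, 2.0, 3.0, 4.0])
```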

3. The Generalized Hyperbolic Distributions

The generalized hyperbolic distribution (GHD) was introduced by Barndorff-Nielsen [5] to model the distribution of sand grain sizes and can account for heavy tails. It has since been applied to turbulence theory, geomorphology, financial mathematics (see Eberlein and Keller [4]), and so forth. In this section, we will define the multivariate GHD as a normal mean-variance mixture distribution, where the mixture variable has the generalized inverse Gaussian distribution as in McNeil et al. [11, pp. 78].

Definition 1 (normal mean-variance mixture). The random variable $X$ is said to have a multivariate normal mean-variance mixture distribution if $X = \mu + W\gamma + \sqrt{W}AZ$, where $\mu$ and $\gamma$ are deterministic parameter vectors in $\mathbb{R}^d$, $Z \sim N_k(0, I_k)$ follows a $k$-dimensional normal distribution, $W \ge 0$ is a positive scalar random variable independent of $Z$, and $A \in \mathbb{R}^{d \times k}$ is a matrix.

Letting $\Sigma = AA'$, from the definition of $X$, we can easily see that $X \mid W = w \sim N_d(\mu + w\gamma, w\Sigma)$.

The following definition of the generalized inverse Gaussian distribution together with Definition 1 will help us to define the generalized hyperbolic distributions.

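Definition 2. The random variable $W$ is said to have a generalized inverse Gaussian (GIG) distribution with parameters $\lambda$, $\chi$, and $\psi$ if its probability density function is given by
$$f_{GIG}(w) = \frac{(\psi/\chi)^{\lambda/2}}{2K_\lambda(\sqrt{\chi\psi})}\, w^{\lambda-1} \exp\left(-\frac{1}{2}\left(\frac{\chi}{w} + \psi w\right)\right), \quad w > 0.$$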

Here, $K_\lambda$ is the modified Bessel function of the third kind with index $\lambda$, satisfying the differential equation $x^2 y'' + x y' - (x^2 + \lambda^2) y = 0$. For more details on this function, we refer to Abramowitz and Stegun [17].

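It can be shown that the parameters satisfy $\chi > 0$, $\psi \ge 0$ if $\lambda < 0$; $\chi > 0$, $\psi > 0$ if $\lambda = 0$; and $\chi \ge 0$, $\psi > 0$ if $\lambda > 0$. Note that for nonlimiting cases, when $\chi > 0$ and $\psi > 0$ (see McNeil et al. [11], pp. 497), the following holds:
$$E[W^\alpha] = \left(\frac{\chi}{\psi}\right)^{\alpha/2} \frac{K_{\lambda+\alpha}(\sqrt{\chi\psi})}{K_\lambda(\sqrt{\chi\psi})}, \quad \alpha \in \mathbb{R}.$$
The MGHD can now be obtained from the GIG distribution.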
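To make the GIG moment formula $E[W^\alpha] = (\chi/\psi)^{\alpha/2} K_{\lambda+\alpha}(\sqrt{\chi\psi}) / K_\lambda(\sqrt{\chi\psi})$ concrete, the sketch below evaluates it numerically for the nonlimiting case. It is an illustration only: the helper names `bessel_k` and `gig_moment` are hypothetical, and the Bessel function is approximated from its integral representation rather than taken from a numerical library, which is what one would normally use in practice.

```python
import math

def bessel_k(order, x, upper=12.0, steps=4000):
    """Approximate K_order(x) for x > 0 (not extremely small) and moderate order,
    using K_v(x) = integral_0^inf exp(-x cosh t) cosh(v t) dt and the trapezoidal rule."""
    if x <= 0:
        raise ValueError("x must be positive")
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h

def gig_moment(lam, chi, psi, alpha):
    """E[W^alpha] for W ~ GIG(lam, chi, psi) with chi > 0 and psi > 0."""
    root = math.sqrt(chi * psi)
    return (chi / psi) ** (alpha / 2.0) * bessel_k(lam + alpha, root) / bessel_k(lam, root)

# Illustrative parameter values (not estimates from the paper).
mean_w = gig_moment(-0.5, 2.0, 8.0, 1.0)
var_w = gig_moment(-0.5, 2.0, 8.0, 2.0) - mean_w ** 2
```

For $\lambda = -1/2$ the symmetry $K_{1/2} = K_{-1/2}$ reduces the mean to $\sqrt{\chi/\psi}$, which gives a quick sanity check on the implementation.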

Definition 3. If the mixture variable $W$ in Definition 1 is GIG distributed, then $X$ is said to have a multivariate generalized hyperbolic distribution (MGHD). When $\gamma = 0$, $X$ is said to have a symmetric generalized hyperbolic distribution.

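Theorem 4 (see [11, Section 3.2]). When the mixing variable $W \sim GIG(\lambda, \chi, \psi)$ and $\Sigma$ is nonsingular, it can be shown that the probability density function of the $d$-dimensional MGHD is given for $x \in \mathbb{R}^d$ by
$$f_X(x) = c\, \frac{K_{\lambda - d/2}\left(\sqrt{(\chi + (x-\mu)'\Sigma^{-1}(x-\mu))(\psi + \gamma'\Sigma^{-1}\gamma)}\right) e^{(x-\mu)'\Sigma^{-1}\gamma}}{\left(\sqrt{(\chi + (x-\mu)'\Sigma^{-1}(x-\mu))(\psi + \gamma'\Sigma^{-1}\gamma)}\right)^{d/2 - \lambda}},$$
with the normalizing constant
$$c = \frac{(\sqrt{\chi\psi})^{-\lambda}\, \psi^{\lambda}\, (\psi + \gamma'\Sigma^{-1}\gamma)^{d/2 - \lambda}}{(2\pi)^{d/2}\, |\Sigma|^{1/2}\, K_\lambda(\sqrt{\chi\psi})},$$
where $|\cdot|$ denotes the determinant.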

The following hold true for the MGHD:
(1) $\lambda$ defines the subclasses of the MGHD and is related to the tail flatness.
(2) $\chi$ and $\psi$ determine the distribution shape; in general, the larger those parameters are, the closer the distribution is to the normal distribution.
(3) $\mu$ is the location parameter and can take any real value.
(4) $\Sigma$ is the dispersion matrix.
(5) $\gamma$ is the skewness parameter.

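Proposition 5. If $X \sim GH_d(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$ and $Y = BX + b$, where $B \in \mathbb{R}^{k \times d}$ and $b \in \mathbb{R}^k$, then $Y \sim GH_k(\lambda, \chi, \psi, B\mu + b, B\Sigma B', B\gamma)$.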

(See McNeil et al. [11], pp. 79).

3.1. Parametrizations

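(1) The $(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$-parametrization has the following drawback: the distributions of $GH_d(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$ and $GH_d(\lambda, \chi/k, k\psi, \mu, k\Sigma, k\gamma)$ coincide for any $k > 0$, since $X = \mu + W\gamma + \sqrt{W}AZ = \mu + (W/k)(k\gamma) + \sqrt{W/k}\,(\sqrt{k}A)Z$ with $W/k \sim GIG(\lambda, \chi/k, k\psi)$. Therefore, an identification problem arises when starting to fit the parameters of the MGHD to data. This problem can be addressed in several ways. One possible way is to require the determinant of the dispersion matrix $\Sigma$ to be equal to 1.
(2) The $(\lambda, \bar{\alpha}, \mu, \Sigma, \gamma)$-parametrization is considered to be a more elegant way to eliminate the degree of freedom than requiring the determinant of the dispersion matrix $\Sigma$ to be equal to 1. This parametrization makes the interpretation of the skewness parameter $\gamma$ simpler and, in addition, the fitting procedure becomes faster. It requires the expected value of the generalized inverse Gaussian distributed mixing variable $W$ to be 1. The drawback of the $(\lambda, \bar{\alpha}, \mu, \Sigma, \gamma)$-parametrization is that it does not exist when $\bar{\alpha} = 0$ and $\lambda \in [-1, 0]$, which corresponds to a Student’s t-distribution without variance. If we set $k = \sqrt{\chi/\psi}\, K_{\lambda+1}(\sqrt{\chi\psi}) / K_\lambda(\sqrt{\chi\psi})$, then the following formulas are used to switch from the $(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$-parametrization to the $(\lambda, \bar{\alpha}, \mu, \Sigma, \gamma)$-parametrization: $\bar{\alpha} = \sqrt{\chi\psi}$, $\Sigma \mapsto k\Sigma$, and $\gamma \mapsto k\gamma$.
(3) The following formulas are used to switch from the $(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$-parametrization to the $(\lambda, \alpha, \mu, \Delta, \delta, \beta)$-parametrization: $\Delta = |\Sigma|^{-1/d}\Sigma$, $\beta = \Sigma^{-1}\gamma$, $\delta = \sqrt{\chi |\Sigma|^{1/d}}$, and $\alpha = \sqrt{|\Sigma|^{-1/d}(\psi + \gamma'\Sigma^{-1}\gamma)}$. The $(\lambda, \alpha, \mu, \Delta, \delta, \beta)$-parametrization was introduced by Blæsild [18] for the GHD. Similar to the $(\lambda, \chi, \psi, \mu, \Sigma, \gamma)$-parametrization, there is an identification problem which can be addressed by constraining the determinant of $\Delta$ to 1.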

3.2. Mean and Covariance

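By (3)–(7) the mean and covariance of $X$ are given by $E[X] = \mu + E[W]\gamma$ and $\operatorname{cov}(X) = E[W]\Sigma + \operatorname{var}(W)\gamma\gamma'$. Note that further properties of the MGHD can be found in [19].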
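The mean and covariance of a normal mean-variance mixture follow from conditioning on the mixing variable. As a short sketch, assuming the conditional representation $X \mid W = w \sim N_d(\mu + w\gamma, w\Sigma)$ with $\Sigma = AA'$, the laws of total expectation and total covariance give:

```latex
\begin{aligned}
E[X] &= E\big[E[X \mid W]\big] = E[\mu + W\gamma] = \mu + E[W]\,\gamma, \\
\operatorname{cov}(X) &= E\big[\operatorname{cov}(X \mid W)\big] + \operatorname{cov}\big(E[X \mid W]\big) \\
&= E[W\Sigma] + \operatorname{cov}(\mu + W\gamma) \\
&= E[W]\,\Sigma + \operatorname{var}(W)\,\gamma\gamma'.
\end{aligned}
```

The moments $E[W]$ and $\operatorname{var}(W) = E[W^2] - E[W]^2$ are then obtained from the GIG moment formula with $\alpha = 1$ and $\alpha = 2$.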

3.3. The Univariate Generalized Hyperbolic Distributions

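If we set $d = 1$ and $\Sigma = \sigma^2$ in (8), we obtain the univariate generalized hyperbolic distribution. The probability density function is given by
$$f_X(x) = c\, \frac{K_{\lambda - 1/2}\left(\sqrt{(\chi + (x-\mu)^2/\sigma^2)(\psi + \gamma^2/\sigma^2)}\right) e^{(x-\mu)\gamma/\sigma^2}}{\left(\sqrt{(\chi + (x-\mu)^2/\sigma^2)(\psi + \gamma^2/\sigma^2)}\right)^{1/2 - \lambda}},$$
with the normalizing constant
$$c = \frac{(\sqrt{\chi\psi})^{-\lambda}\, \psi^{\lambda}\, (\psi + \gamma^2/\sigma^2)^{1/2 - \lambda}}{\sqrt{2\pi}\, \sigma\, K_\lambda(\sqrt{\chi\psi})}.$$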
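The univariate density can be evaluated directly from the $d = 1$ case of the MGHD density in the $(\lambda, \chi, \psi, \mu, \sigma, \gamma)$-parametrization. The following is a sketch under stated assumptions, not the implementation used in the ghyp package: `gh_pdf` and `bessel_k` are hypothetical names, the Bessel function is approximated by a simple quadrature, and the parameter values are illustrative.

```python
import math

def bessel_k(order, x, upper=12.0, steps=2000):
    """Approximate K_order(x) for x > 0 via the integral representation
    K_v(x) = integral_0^inf exp(-x cosh t) cosh(v t) dt (trapezoidal rule)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h

def gh_pdf(x, lam, chi, psi, mu=0.0, sigma=1.0, gamma=0.0):
    """Univariate generalized hyperbolic density for chi > 0 and psi > 0."""
    skew_term = psi + (gamma / sigma) ** 2
    arg = math.sqrt((chi + ((x - mu) / sigma) ** 2) * skew_term)
    root = math.sqrt(chi * psi)
    # Normalizing constant c of the d = 1 density.
    const = (root ** (-lam) * psi ** lam * skew_term ** (0.5 - lam)
             / (math.sqrt(2.0 * math.pi) * sigma * bessel_k(lam, root)))
    return (const * bessel_k(lam - 0.5, arg)
            * math.exp((x - mu) * gamma / sigma ** 2) / arg ** (0.5 - lam))

# Symmetric NIG example (lam = -1/2) with illustrative parameters.
density_at_zero = gh_pdf(0.0, lam=-0.5, chi=1.0, psi=1.0)
```

A useful check on any such implementation is that the density integrates to one and, for $\gamma = 0$, is symmetric about $\mu$.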

3.4. Key Subclasses of the GHD

The generalized hyperbolic family of distributions is very flexible; many distributions arise as subclasses or limiting cases, are known by alternative names and have become very popular in financial modeling. We now take a closer look at some of those distributions.

3.4.1. Hyperbolic Distributions (HYP)

(i) When $\lambda = (d+1)/2$, one arrives at the multivariate hyperbolic distribution. However, its marginal distributions are no longer hyperbolic distributions.
(ii) When $\lambda = 1$, one can obtain an MGHD whose univariate marginal distributions are hyperbolic.

3.4.2. Normal Inverse Gaussian (NIG) Distributions

Setting $\lambda = -1/2$ leads to the subclass of normal inverse Gaussian (NIG) distributions. The multivariate NIG distribution is widely used in financial modeling (see, for example, Aas et al. [20] for recent applications). We note that the tails of this subclass are slightly heavier than those of the hyperbolic subclass.

3.4.3. Variance Gamma (VG) Distributions

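When $\lambda > 0$ and $\chi \to 0$, if we use the fact that $K_\lambda(x) \sim \Gamma(\lambda) 2^{\lambda-1} x^{-\lambda}$ as $x \to 0$ for $\lambda > 0$, we obtain the limiting case which is known as the variance gamma (VG) distribution. The mean and the covariance of a variance gamma distributed random vector $X$ are given by $E[X] = \mu + \frac{2\lambda}{\psi}\gamma$ and $\operatorname{cov}(X) = \frac{2\lambda}{\psi}\Sigma + \frac{4\lambda}{\psi^2}\gamma\gamma'$, which reduce to $\mu + \gamma$ and $\Sigma + \frac{1}{\lambda}\gamma\gamma'$ under the normalization $E[W] = 1$ (i.e., $\psi = 2\lambda$).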
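The variance gamma limit can be made explicit at the level of the mixing density. As a brief sketch, assuming the GIG density of Definition 2 in the $(\lambda, \chi, \psi)$ notation and the small-argument behaviour of $K_\lambda$ for $\lambda > 0$, letting $\chi \to 0$ gives a gamma density with shape $\lambda$ and rate $\psi/2$:

```latex
\begin{aligned}
f_{GIG}(w) &= \frac{(\psi/\chi)^{\lambda/2}}{2K_\lambda(\sqrt{\chi\psi})}\,
  w^{\lambda-1} \exp\!\Big(-\tfrac{1}{2}\big(\tfrac{\chi}{w} + \psi w\big)\Big), \\
2K_\lambda(\sqrt{\chi\psi}) &\sim \Gamma(\lambda)\, 2^{\lambda}\, (\chi\psi)^{-\lambda/2}
  \quad \text{as } \chi \to 0, \\
f_{GIG}(w) &\longrightarrow \frac{(\psi/2)^{\lambda}}{\Gamma(\lambda)}\,
  w^{\lambda-1} e^{-\psi w/2}, \qquad w > 0, \\
E[W] &= \frac{2\lambda}{\psi}, \qquad \operatorname{var}(W) = \frac{4\lambda}{\psi^{2}}.
\end{aligned}
```

Substituting these two moments into the general mean and covariance of a normal mean-variance mixture yields the moments of the VG distribution.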

3.5. The Skewed Student’s t-Distribution (St)

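If $\lambda < 0$ and $\psi \to 0$ we obtain another limiting case called the generalized hyperbolic skew Student’s t-distribution, often simply called the skew Student’s t-distribution. If we use the facts that $K_\lambda(x) = K_{-\lambda}(x)$ and $K_\lambda(x) \sim \Gamma(-\lambda) 2^{-\lambda-1} x^{\lambda}$ as $x \to 0$ for $\lambda < 0$, and define $\nu = -2\lambda$, the mean and the covariance of a skew Student’s t-distributed random vector $X$ are given by $E[X] = \mu + \frac{\chi}{\nu - 2}\gamma$ and $\operatorname{cov}(X) = \frac{\chi}{\nu - 2}\Sigma + \frac{2\chi^2}{(\nu-2)^2(\nu-4)}\gamma\gamma'$, where the mean exists only if $\nu > 2$ (i.e., $\lambda < -1$), and the covariance matrix is only defined for $\nu > 4$.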

4. Univariate Estimation Results

Basic properties of the univariate GHD and some of its special cases can be found in [21] and references therein. In this section, we present the univariate estimation results obtained via the EM-based algorithm (details can be found in, e.g., [11, Section 3.2]) which is implemented in the ghyp R package.

To illustrate the superior fit of the GHD, in Figure 1, we plot the empirical density and log-density of the log-returns of AGL together with fitted density functions for the GHD and normal distribution. One can clearly see the better fit of the GHD, particularly with respect to the fits to the tails.

The univariate estimation parameters are presented in Table 2. We obtain , , and from the values of , 4 stocks are left skewed and 3 are right skewed for the period investigated.

4.1. Comparisons of the Estimated Parameter Sets

We analyze and compare the goodness of fit of the univariate generalized hyperbolic distributions under consideration. To this end, the following four criteria are used to compare the candidate distributions.
(i) The log-likelihood (LL): the LL is an overall measure of goodness of fit, with higher values of LL implying a more likely distribution candidate to model the data.
(ii) The Akaike information criterion (AIC): the AIC is a measure of the relative goodness of fit which estimates relative support for a model. Let $k$ be the number of parameters in the calibrated model; then $\mathrm{AIC} = 2k - 2\ln(L)$, where $L$ is the maximized value of the likelihood function of the estimated model. Smaller values of the AIC indicate a preferred model.
(iii) Kolmogorov-Smirnov (KS) test statistics: the KS test uses the Kolmogorov distance between the empirical distribution function and a given continuous distribution function $F$ (the null distribution) to test whether the data was sampled from the distribution $F$. The Kolmogorov distance is the supremum of the absolute differences between two cumulative distribution functions. It is given by $KS = \sup_x |F_{emp}(x) - F_{est}(x)|$, where $F_{emp}$ and $F_{est}$ are the empirical and the estimated CDFs, respectively.
(iv) We simultaneously compute the p values of the Kolmogorov test statistics. The p value is a measure of how much evidence we have against the null hypothesis (that the data is drawn from the theoretical distribution concerned) relative to the alternative hypothesis (that the null hypothesis is false). The smaller the p value, the more evidence we have against the null hypothesis. In this work, a sufficiently large p value is taken to mean that the data appears to be consistent with the null hypothesis, and a very small p value is taken as very strong evidence against the null hypothesis.
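The AIC and the Kolmogorov distance described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the inputs are made up, and the fitted CDF is passed in as a callable, since the generalized hyperbolic CDF itself requires numerical integration of the density.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def ks_distance(sample, cdf):
    """Kolmogorov distance sup_x |F_emp(x) - F_est(x)| for a continuous cdf."""
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance

def normal_cdf(x):
    """Standard normal CDF, used here only as an example null distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Made-up inputs for illustration.
example_aic = aic(-100.0, 3)
example_ks = ks_distance([0.25, 0.75], lambda u: u)  # uniform CDF on [0, 1]
```

Because the empirical CDF is a step function, the supremum is attained at a sample point, which is why it suffices to compare the fitted CDF with the step levels just before and just after each observation.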

From Table 3, we can see that the generalized hyperbolic distribution (GHD) has the highest log-likelihood by a small margin for all the returns analyzed. The largest discrepancy between the log-likelihoods occurs for AGL, where LL(GHD) = −2029.336 compared to LL(VG) = −2038.502. This amounts to a relative difference of about 0.45%.

Amongst the subclasses, the skew Student’s t-distribution has the highest log-likelihood and the smallest AIC for AGL and IMP, while ANG, BIL, and GFI are best modeled by the NIG distribution according to the LL and AIC criteria.

From Table 4, there is evidence against the null hypothesis in 1 case. Specifically, the Hyp is ruled out for ANG stock. For AGL, GFI, and IMP stocks, the generalized hyperbolic distribution has the smallest Kolmogorov distance.

4.2. Temporal Stability of Parameters

We inspect the stability of parameters via plots of the parameters over a daily rolling window for the GH and the VG distributions. For the daily parameter variations, we first calibrate to the daily log-returns from January 3, 2006 to December 31, 2010 (1250 observations). We then remove one observation at the beginning and add one observation at the end until December 2011. We obtain the following figures.
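The rolling-window procedure can be sketched as follows. The sketch is generic: `fit` stands for any calibration routine (in the paper, the GH or VG fit), and the sample mean is used below only as a stand-in so that the example is self-contained. The function names are illustrative.

```python
def rolling_windows(series, window):
    """Yield consecutive windows of fixed length, shifted by one observation."""
    for start in range(len(series) - window + 1):
        yield series[start:start + window]

def rolling_estimates(series, window, fit):
    """Apply a calibration function to every rolling window."""
    return [fit(w) for w in rolling_windows(series, window)]

# Stand-in "calibration": the sample mean of each window (illustration only).
toy_series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
estimates = rolling_estimates(toy_series, 4, lambda w: sum(w) / len(w))

# A series of 1500 observations with a window of 1250 yields 251 calibrations.
n_windows = sum(1 for _ in rolling_windows([0.0] * 1500, 1250))
```

Each window drops the oldest observation and adds the newest, so successive parameter estimates share all but one data point; large jumps between them therefore point to instability of the fit rather than to new information.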

The subplots in Figure 2 suggest that the parameters are not very stable over time for the generalized hyperbolic distribution. However, Figure 3 suggests a more stable parametrization for the variance gamma distribution when is constrained to zero. It can also be noted that the varying of may be consistent with a more general model with time-varying volatility.

Next, we compute the densities of the change in parameters.

From Figures 4 and 5, it is clear that the variation in parameter estimates is significantly diminished for the VG model. Thus, consideration of the temporal stability of parameters provides a further criterion for discerning between models.

5. Multivariate Case

To examine the suitability of the multivariate GHD model for the seven stocks, we fit the z-scored data, where stocks are listed in the same order as in Section 2 for calculations. The MCECM algorithm, implemented in the ghyp R package (a detailed description of the algorithm is documented in [11, Section 3.2]), yielded the following parameter estimates for the joint return distribution for the GHD case: Parameters obtained for the subclasses are given in Appendix A.1.

Table 5 gives the log-likelihood, the AIC, and the KS distance for each of the fitted multivariate distributions. The generalized hyperbolic distribution has the highest log-likelihood and the smallest AIC. However, from the p values of the KS test, we can conclude that the data also appears to be consistent with the null hypothesis for the subclasses MNIG, MVG, and MSt, whereas the MHYP is rejected by the p value. In addition, we fit a multivariate normal distribution; from the last row of Table 5, it is clear that the multivariate normal distribution is ruled out by the log-likelihood, the AIC, and the Kolmogorov-Smirnov test for the z-scored multivariate data.

From Table 5 we also note that the results obtained for the log-likelihood and the AIC of the MSt and the MGHD are very similar, although the MSt has fewer parameters. This was also observed when fitting the MGHD model and its subclasses to Dow Jones daily returns using the MCECM algorithm (see McNeil et al. [11], pp. 83 for a detailed description).

Since the data set comprises stocks from the same market and the same sector, it is possible that there are nontrivial positive correlations. Therefore, we apply principal components analysis (PCA) to identify common statistical factors in order to filter off a reduced-dimension set of shared exogenous price determinants, before fitting multivariate generalized hyperbolic distributions and subclasses [22, 23]. As in Section 2, we use the z-scored data for the PCA.

The PCA is done as follows.
(i) First, we estimate the covariance matrix for the entire data set.
(ii) Second, we calculate the seven eigenvalues of the estimated covariance matrix and find that the two largest eigenvalues account for most of the variation.
(iii) Third, we compute the two eigenvectors corresponding to these two eigenvalues.
(iv) Next, we compute the two leading principal components as common statistical factors and regress the returns data for each stock against the two statistical factors.
(v) Finally, we fit the MGHD both to the pair of principal components derived from the price data and to the set of seven stock return components which are not explained by shared exogenous drivers.
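The PCA filtering steps can be sketched with NumPy as below. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `pca_filter` is hypothetical, the regression includes an intercept (the text does not say whether one was used), and the input is synthetic data generated from two made-up common factors.

```python
import numpy as np

def pca_filter(z, n_factors=2):
    """Extract leading principal components and the regression residuals.

    z is an (n_obs, n_assets) array of standardized returns."""
    cov = np.cov(z, rowvar=False)                 # step (i): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # step (ii): eigenvalues, ascending
    order = np.argsort(eigvals)[::-1]             # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals[:n_factors].sum() / eigvals.sum()
    # Steps (iii)-(iv): project onto the leading eigenvectors to get the factors.
    factors = (z - z.mean(axis=0)) @ eigvecs[:, :n_factors]
    # Regress every stock on the factors (with intercept) by least squares.
    design = np.column_stack([np.ones(len(z)), factors])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    residuals = z - design @ coef
    return explained, factors, residuals

# Synthetic example: seven series driven by two common factors plus noise.
rng = np.random.default_rng(0)
common = rng.standard_normal((500, 2))
loadings = np.array([[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
                     [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]])
raw = common @ loadings + 0.1 * rng.standard_normal((500, 7))
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
explained, factors, residuals = pca_filter(z)
```

The two outputs `factors` and `residuals` correspond to the two data sets fitted in step (v): the pair of common factors and the seven stock-specific components left after the regression.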

We obtain the following parameter estimates for the joint return distribution of the two principal components: The parameters estimated for the other subclasses are given in Appendix A.2.

Table 6 gives the log-likelihood, the AIC, and the KS distance for each of the fitted multivariate distributions, with the last line documenting results for the fit of a multivariate normal distribution. The generalized hyperbolic distribution has the highest log-likelihood and the smallest AIC and, from the p values of the KS test, we can conclude that the data for the two shared statistical factors appears to be consistent with the null hypothesis for the five GHD models considered. However, it is clear from the last row of Table 6 that the bivariate normal distribution is ruled out by the log-likelihood, the AIC, and the Kolmogorov-Smirnov test for the pair of common factors.

We assessed the fit of the MGHD and multivariate normal models to the residuals obtained from the regression of the seven stocks against the two principal components. We found that these stock return components were explained by a multivariate normal distribution and that more complex models were ruled out. The combined outcome for the shared factors and the residuals suggests that the success of the MGHD model in explaining returns, with positive results in Table 5, may be a reflection of the structure in the two principal components, revealed in Table 6.

6. Conclusions

We estimated the parameters of the univariate generalized hyperbolic, hyperbolic, variance gamma, normal inverse Gaussian, and skew Student’s t-distributions for the z-scored daily log-returns of liquid mining stocks listed on the Johannesburg Stock Exchange from January 2006 to December 2011. According to the log-likelihood (LL) and the Akaike information criterion (AIC), the generalized hyperbolic distribution offered the best fit for all seven stocks considered. However, the differences between the models were small, with disagreement of at most 10% in the criteria computed. Moreover, application of the Kolmogorov-Smirnov (KS) test ruled out the hyperbolic distribution for one of the stocks, namely, ANG.

Considering only the proper subclasses, the LL and AIC pointed to the NIG distribution for ANG, BIL, and GFI, but by even narrower margins relative to the alternatives (differences of less than 0.8%). The KS tests suggested that AMS, ANG, BIL, and IMP were best modeled by the NIG distribution.

On inspection of the temporal stability of the parameter fits for the most general case, we observed that the model parameters varied through time, with the suggestion that a VG model would offer a more stable calibration for AGL. We also noted that the volatility parameter varied over the period considered, which is consistent with the literature on time-varying volatility models and suggests a further line of investigation in the context of models with GHD type increments.

We considered the multivariate generalized hyperbolic distribution and its hyperbolic, variance gamma, normal inverse Gaussian, and skew Student’s t subclasses as possible models for the joint distributions of returns. We found that the MGHD could be fitted to the joint returns, with this model narrowly outperforming the multivariate Student’s t-distribution, which gave the next best fit.

Closer analysis of common risk factors via principal component analysis yielded two shared factors which were successfully modeled with a bivariate GHD model. The regression residuals for the seven stocks, which were obtained by removing the common price determinants, were found to be normally distributed. This provided evidence for the view that the GHD structure of the principal components was adequate for explaining the dependence structure of the seven stocks.

Appendix

A. Multivariate and Bivariate Estimation Results

A.1. Multivariate Estimation Results

In this section, we present the 7-dimensional estimation results for the subclasses of the generalized hyperbolic distribution.

(i) We calibrate the MHyp model to the 7-dimensional z-scored daily returns described above and obtain the following result:

(ii) We present the parameters estimated for the MNIG distribution,

(iii) We obtain the following parameters for the MVG,

(iv) The estimated multivariate skewed Student’s t-distribution parameters are

(v) The estimated multivariate normal parameters are

A.2. Bivariate Estimation Result

In this section, we present the bivariate estimation results for the subclasses of the generalized hyperbolic distribution.

(i) We calibrate the bivariate Hyp model to the two principal components of the daily returns described above and obtain the following result:

(ii) We present the parameters estimated for the bivariate NIG distribution as follows:

(iii) We obtain the following parameters for the bivariate VG:

(iv) The estimated bivariate skewed Student’s t-distribution parameters are

(v) The estimated bivariate normal parameters are

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The first author was supported by the NRF (National Research Foundation) Grant no. SFP 1208157898. The authors also would like to thank the anonymous referees for helpful comments in improving this paper.