Abstract

The Value at Risk (VaR) of the gold market is measured and predicted by combining a stochastic volatility (SV) model with extreme value theory. First, to capture the fat tails and volatility persistence of gold market return series, the volatility of gold price returns is modeled with the SV-T-MN model (an SV-T model with a mixture-of-normal distribution) in state space form. Second, out-of-sample volatility is predicted with an approximate filtering algorithm. Finally, extreme value theory based on the generalized Pareto distribution is applied to measure the dynamic VaR of gold market returns. An empirical analysis of gold prices shows that the combined model measures and predicts the VaR of the gold market reasonably and effectively, helping investors to better understand the extreme risk of the gold market and to respond actively.

1. Introduction

Gold is one of the most brilliant and beautiful of metals, and its capacity to preserve and add value has long made its price a concern of investors. With the gradual liberalization of the gold price, the supply and demand of gold itself, the dollar exchange rate, interest rates, and other complex factors cause greater price volatility. O’Connor et al. [1] surveyed physical gold demand and supply, gold mine economics, and analyses of gold as an investment, as well as gold market efficiency, the issue of gold market bubbles, and gold’s relation to inflation and interest rates. Higher gold price volatility increases the risk of losses, so it is all the more important to measure and control the risk of the gold spot market effectively.

VaR (Value at Risk) is currently the main tool of risk management. It indicates the maximum loss of a financial asset at a given confidence level over a certain future period. Return series are commonly assumed to follow a normal distribution with fixed variance, but this assumption does not fit financial markets, whose volatility varies over time. GARCH models [2–4] capture the time variation and serial correlation of price volatility and thus the “peaked, thick-tailed” and “volatility clustering” features of financial return series. However, GARCH models define the conditional variance as a deterministic function of past squared observations and past conditional variances, so the estimated conditional variance depends directly on past observations. As a result, the estimated volatility series is not very stable when there are anomalous observations, and the long-term volatility forecasting ability of GARCH models is relatively poor. An alternative is the class of stochastic volatility (SV) models proposed by Taylor (2003) [5], which introduce an autoregressive equation for the latent volatility and thereby better characterize the essential features of financial markets. Zhen-long and Yi-zhou [6], Jing-dong and Chi [7], and other authors compared GARCH and SV models for describing the volatility of financial time series and found that the SV model fits financial series better. Su-hong and Shi-ying [8] proposed an improvement of the error term of the standard SV model that further enhances its ability to describe financial series. Therefore, this paper chooses the SV-T model to describe the volatility of the gold return series. With the development of the Gibbs sampling technique and of computer technology, Jacquier et al. [9] first used the Markov chain Monte Carlo (MCMC) method to estimate the parameters of the SV model in 1994. Subsequently, Hui-ming et al. [10, 11] further studied the MCMC method based on Gibbs sampling to obtain better parameter estimates for the SV model. Because the SV-T model makes parameter estimation and out-of-sample volatility prediction difficult, it is transformed into a linear state space form without loss of any information [12–17]. The standard Kalman filter yields good state estimates only in a linear Gaussian state space. After linearization, however, the measurement error is no longer Gaussian: it follows a log-t-squared distribution, which is left-skewed with a long tail. Approximating it by a single normal distribution can introduce model error, so a Gaussian mixture approximation is used instead. Anderson and Moore (1979) [18] proved that an arbitrary random variable can be approximated by a finite Gaussian mixture distribution and that the approximation error can be made arbitrarily small when the number of normal factors is large enough. In this paper, we first estimate the parameters of the SV-T model with the MCMC algorithm and then estimate the parameters of the Gaussian mixture distribution with the EM algorithm. Finally, the Approximate Filter (AMF) algorithm of Lemke (2006) [16] is applied to realize out-of-sample prediction of volatility.

VaR measures the maximum possible loss over a specific future period under normal market fluctuations. When the tail behavior of financial markets is not adequately considered, tail risk is underestimated. Extreme value theory (EVT) does not model the whole distribution of the return series; it fits the tail of the data directly, so it can deal with thick tails more effectively and measure the risk of loss under extreme conditions [19–25]. At present, domestic studies of extreme value risk measurement for financial assets rarely build on SV models. Therefore, in this paper, we use the SV-T-MN model to describe the volatility of the financial time series, combine it with extreme value theory to fit the tail distribution of the standardized residuals, and thereby establish a new financial risk measurement model: a dynamic VaR forecasting model based on EVT-SV-T-MN. Finally, we make an empirical analysis of the daily closing price of AU99.99 on the Shanghai Gold Exchange to help investors fully understand the extreme risks of the gold market and take active countermeasures.

The article has five parts. The second part introduces the SV model of gold price volatility, and the third part introduces the VaR model that combines the SV model with extreme value theory. The fourth part analyzes the daily closing price data of AU99.99 on the Shanghai Gold Exchange. The fifth part summarizes the paper.

2. SV-T-MN Model Based on State Space

The core of measuring the VaR of gold market returns is to predict their volatility accurately. In this paper, the SV model is used to capture the “peaked, thick-tailed” characteristics of the gold market return series. Geweke (1994) proposed the thick-tailed SV-T model:

$$
\begin{aligned}
y_t &= \exp(h_t/2)\,\varepsilon_t, & \varepsilon_t &\sim t(\nu),\\
h_t &= \mu + \phi\,(h_{t-1}-\mu) + \eta_t, & \eta_t &\sim N(0,\tau^2),
\end{aligned}
\tag{1}
$$

where $y_t$ is the return of the gold price in period $t$, $h_t$ is the logarithmic volatility, and $\phi$ is the persistence parameter, which reflects the intensity with which current volatility affects future volatility. The disturbance $\eta_t$ follows a normal distribution with mean 0 and variance $\tau^2$, $\varepsilon_t$ follows a standard $t$ distribution with $\nu$ degrees of freedom, and $\mu$ is the drift level of the volatility equation. Because the distribution of the error term in the thick-tailed SV-T model has a degrees-of-freedom parameter, it is difficult to estimate the parameters and to predict the out-of-sample volatility.
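To make the structure of the SV-T model concrete, the following is a minimal simulation sketch, not the estimation code used in the paper. The notation is an assumption made for illustration: `y` for the return, `h` for the log-volatility, `mu` for the level, `phi` for the persistence, `tau` for the standard deviation of the volatility shock, and `nu` for the degrees of freedom; the default parameter values are arbitrary, and the mean-reverting form of the volatility equation is also assumed.

```python
import math
import random


def simulate_sv_t(n, mu=-0.5, phi=0.97, tau=0.15, nu=7.0, seed=0):
    """Simulate returns and log-volatilities from a basic SV-t model.

    Assumed form (illustrative notation):
        y_t = exp(h_t / 2) * eps_t,               eps_t ~ Student-t(nu)
        h_t = mu + phi * (h_{t-1} - mu) + eta_t,  eta_t ~ N(0, tau^2)
    """
    rng = random.Random(seed)
    # Start h at a draw from its stationary law N(mu, tau^2 / (1 - phi^2)).
    h = mu + rng.gauss(0.0, tau / math.sqrt(1.0 - phi * phi))
    returns, log_vols = [], []
    for _ in range(n):
        # Student-t draw: standard normal divided by sqrt(chi-square / nu).
        chi2 = rng.gammavariate(nu / 2.0, 2.0)
        eps = rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)
        returns.append(math.exp(h / 2.0) * eps)
        log_vols.append(h)
        # Move the log-volatility one step ahead.
        h = mu + phi * (h - mu) + rng.gauss(0.0, tau)
    return returns, log_vols
```

With `phi` close to 1, large values of `h` persist for many periods, which reproduces the volatility clustering that motivates the model, while the t-distributed shocks produce the thick tails.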

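In this paper, taking the logarithm of the square of the measurement equation, without loss of any information, turns the model into an equivalent linear state space model. The SV-T-MN model is established in two steps. In the first step, the SV-T model is linearized and written in state space form. In the second step, the log-$t$-squared distribution produced by the linearization is approximated by a Gaussian mixture distribution, and the Gaussian mixture model (SV-T-MN) is obtained as follows:

$$
\begin{aligned}
x_t &= h_t + \zeta_t, & \zeta_t &\sim \sum_{i=1}^{K} p_i\,N(m_i, v_i^2),\\
h_t &= \mu + \phi\,(h_{t-1}-\mu) + \eta_t, & \eta_t &\sim N(0,\tau^2).
\end{aligned}
\tag{2}
$$

In the above model, $x_t$ represents $\ln y_t^2$, where $y_t$ is the return in period $t$; $h_t$ represents the logarithmic volatility; $\zeta_t$ represents $\ln \varepsilon_t^2$, where $\varepsilon_t$ is the $t$-distributed error with $\nu$ degrees of freedom; $p_i$ is the weight of the $i$th normal factor of the mixed normal distribution, with $\sum_{i=1}^{K} p_i = 1$; $m_i$ is the factor mean; $v_i^2$ is the factor variance; and $K$ is the number of factors of the Gaussian mixture distribution. All parameters of the SV-T-MN model are estimated in two steps. In the first step, the parameters $\mu$, $\phi$, $\tau$, and $\nu$ of the SV-T model are estimated by the MCMC method. In the second step, $p_i$, $m_i$, and $v_i^2$ of the SV-T-MN model are estimated by the EM algorithm [26].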
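The second estimation step can be sketched as follows: draw a sample from the log-t-squared distribution for a given degrees-of-freedom value and fit a one-dimensional Gaussian mixture to it by EM. This is an illustrative sketch only; the function names, the quantile-based initialization, and the fixed number of iterations are assumptions and not the authors' implementation.

```python
import math
import random


def sample_log_t_squared(n, nu, seed=0):
    """Draw n values of log(eps^2), where eps follows a Student-t(nu) law."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        chi2 = rng.gammavariate(nu / 2.0, 2.0)
        eps = rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)
        draws.append(2.0 * math.log(abs(eps)))
    return draws


def em_gaussian_mixture(data, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances) as three lists of length k.
    """
    xs = sorted(data)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-6)
    # Initialise: equal weights, means at spaced quantiles, common variance.
    weights = [1.0 / k] * k
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [var_all] * k
    for _ in range(iters):
        s0, s1, s2 = [0.0] * k, [0.0] * k, [0.0] * k
        for x in xs:
            # E-step in the log domain to avoid underflow in the far left tail.
            logs = [
                math.log(w) - 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
                for w, m, v in zip(weights, means, variances)
            ]
            top = max(logs)
            resp = [math.exp(value - top) for value in logs]
            total = sum(resp)
            for j in range(k):
                r = resp[j] / total
                s0[j] += r
                s1[j] += r * x
                s2[j] += r * x * x
        # M-step: update weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = max(s0[j], 1e-12)
            weights[j] = nj / n
            means[j] = s1[j] / nj
            variances[j] = max(s2[j] / nj - means[j] ** 2, 1e-6)
    return weights, means, variances
```

For example, `em_gaussian_mixture(sample_log_t_squared(10000, 7.184), 7)` would mimic the sample size, degrees of freedom, and number of factors reported in the empirical section, although the fitted values will not reproduce the paper's table.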

When the stochastic disturbance term follows a single normal distribution, the Gaussian mixture model (2) degenerates to the general linear Gaussian state space model, for which out-of-sample prediction usually uses the standard Kalman filtering algorithm [14]. For the Gaussian mixture model, the standard Kalman filter can be extended to an exact filtering algorithm, but its cost grows exponentially with time, which makes it hard to apply to long observed series in practice [26]. Therefore, this paper uses the approximate filter of degree $k$, AMF($k$), proposed by Lemke [16] to realize out-of-sample prediction of volatility. The iteration of this algorithm is controlled by the parameter $k$; it can be applied efficiently to medium and long series in practice, and its prediction accuracy is very close to that of exact filtering. The $k$-degree myopic filtering algorithm consists of three steps.

Step 1. When $t \le k$, that is, for the first $k$ values of the observed sequence, the one-step prediction and update of each mixture component are computed with the exact filtering algorithm, and the mixed one-step prediction and update are then obtained.

Step 2. When $t = k + 1$, the exact filtering algorithm is re-initialized from the mixed one-step prediction and update obtained $k$ steps earlier and is run again over the most recent $k$ observations, which gives the mixed one-step prediction and update at $t = k + 1$. The mixed one-step prediction and update at the next time point are obtained in the same way.

Step 3. When $t > k + 1$, Step 2 is repeated. In this way we obtain the volatility of the gold price return at each time point, which allows the VaR to be predicted at each time point and the dynamic VaR to be simulated.
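The idea of mixing component-wise Kalman updates can be illustrated with a deliberately simplified filter that collapses the mixture back to a single Gaussian after every observation. This is not Lemke's AMF algorithm itself, which keeps a longer mixture history controlled by its degree parameter; it is only a sketch of the "predict, update per component, then mix" mechanics for a scalar state, and all names and the scalar model form are assumptions.

```python
import math


def collapsed_mixture_filter(obs, mu, phi, tau, weights, means, variances):
    """Filter a scalar log-volatility h_t from x_t = h_t + noise_t.

    Assumed model: h_t = mu + phi * (h_{t-1} - mu) + eta_t, eta_t ~ N(0, tau^2),
    with noise_t drawn from a Gaussian mixture (weights, means, variances).
    Returns a list of (predicted_mean, filtered_mean, filtered_variance).
    """
    a = mu                                # filtered mean, started at the level
    p = tau * tau / (1.0 - phi * phi)     # filtered variance, stationary start
    out = []
    for x in obs:
        # One-step prediction of the state.
        a_pred = mu + phi * (a - mu)
        p_pred = phi * phi * p + tau * tau
        comps = []
        for w, m, v in zip(weights, means, variances):
            # Kalman update conditional on mixture component (m, v).
            f = p_pred + v
            e = x - a_pred - m
            gain = p_pred / f
            log_lik = math.log(w) - 0.5 * (math.log(2.0 * math.pi * f) + e * e / f)
            comps.append((log_lik, a_pred + gain * e, p_pred * (1.0 - gain)))
        # Posterior component probabilities, normalised in the log domain.
        top = max(c[0] for c in comps)
        raw = [math.exp(c[0] - top) for c in comps]
        total = sum(raw)
        post = [r / total for r in raw]
        # Collapse the mixture to one Gaussian by moment matching.
        a = sum(q * c[1] for q, c in zip(post, comps))
        p = sum(q * (c[2] + (c[1] - a) ** 2) for q, c in zip(post, comps))
        out.append((a_pred, a, p))
    return out
```

Collapsing at every step keeps the cost linear in the series length, at the price of a cruder approximation than a filter that carries several steps of mixture history before mixing.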

3. Dynamic Extreme VaR Modeling Based on SV-T-MN Model

The POT (Peaks Over Threshold) model is a newer approach within extreme value theory (EVT) that focuses not on the extreme values themselves (maxima or minima) but on the excesses over a threshold. By examining the observations of a sample that exceed a certain threshold, the POT model captures the heavy-tail information in a random sequence. It can also be used to extrapolate beyond the sample data to the extremes of the population when the population distribution is unknown, which overcomes the limitation of traditional methods that cannot go beyond the sample data. It has performed well in measuring the tail risk of financial return data [27, 28]. Here the POT model and the stochastic volatility model are combined to predict the VaR of the gold return sequence.

The POT model requires the series to be independent and identically distributed (i.i.d.). McNeil (2000) [29] found that the standardized residuals of financial data satisfy the i.i.d. condition well. Therefore, the data are standardized to obtain i.i.d. standardized residuals, whose tail is fitted with the GPD. The standardized residual of the gold return sequence is $z_t = (r_t - \mu_t)/\sigma_t$, where $r_t$ is the return, $\mu_t$ is the conditional mean of the return sequence, and $\sigma_t^2$ is the conditional variance. Suppose that the distribution function of the standardized residual sequence is $F(z)$ and that $u$ is a sufficiently large threshold. When $z > u$, the excess is $y = z - u$ and

$$
F_u(y) = P(z - u \le y \mid z > u) = \frac{F(u+y) - F(u)}{1 - F(u)}
$$

is the conditional distribution function of the excess over the threshold [30]. It follows that $F(z) = (1 - F(u))\,F_u(y) + F(u)$. By the limit theorem of Pickands [31], $F_u(y)$ is well approximated by the generalized Pareto distribution (GPD), which can be expressed as

$$
G_{\xi,\beta}(y) =
\begin{cases}
1 - (1 + \xi y/\beta)^{-1/\xi}, & \xi \neq 0,\\
1 - \exp(-y/\beta), & \xi = 0.
\end{cases}
$$

The GPD has two unknown parameters, the shape parameter $\xi$ and the scale parameter $\beta$, where $\beta > 0$; when $\xi \ge 0$, $y \ge 0$, and when $\xi < 0$, $0 \le y \le -\beta/\xi$. VaR is an extreme quantile of the loss distribution function, so once $F_u(y)$ is known, $F(z)$ is also available. The density function can be derived from the generalized Pareto distribution, and the log-likelihood function [30] for the $N_u$ excesses $y_1, \dots, y_{N_u}$ is expressed as

$$
L(\xi,\beta \mid y) =
\begin{cases}
-N_u \ln\beta - \left(1 + \dfrac{1}{\xi}\right)\displaystyle\sum_{i=1}^{N_u} \ln\left(1 + \dfrac{\xi y_i}{\beta}\right), & \xi \neq 0,\\[2ex]
-N_u \ln\beta - \dfrac{1}{\beta}\displaystyle\sum_{i=1}^{N_u} y_i, & \xi = 0.
\end{cases}
$$
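The generalized Pareto distribution function and its log-likelihood can be written down directly. The sketch below is illustrative: the function names are hypothetical, `xi` denotes the shape parameter and `beta` the second (scale) parameter, and the near-zero check on `xi` is a numerical convenience rather than part of the theory.

```python
import math


def gpd_cdf(y, xi, beta):
    """Distribution function of the generalized Pareto law for an excess y."""
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:
        # Exponential limit when the shape parameter is zero.
        return 1.0 - math.exp(-y / beta)
    base = 1.0 + xi * y / beta
    if base <= 0.0:
        # Beyond the finite upper end point that exists when xi < 0.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def gpd_log_likelihood(excesses, xi, beta):
    """Log-likelihood of GPD(xi, beta) for a list of excesses over a threshold."""
    if beta <= 0.0:
        return float("-inf")
    n = len(excesses)
    if abs(xi) < 1e-12:
        return -n * math.log(beta) - sum(excesses) / beta
    total = 0.0
    for y in excesses:
        base = 1.0 + xi * y / beta
        if base <= 0.0:
            # The observation lies outside the support for these parameters.
            return float("-inf")
        total += math.log(base)
    return -n * math.log(beta) - (1.0 + 1.0 / xi) * total
```

Maximum likelihood estimation then amounts to maximizing `gpd_log_likelihood` over `xi` and `beta` with any numerical optimizer; returning negative infinity outside the support keeps the search inside the admissible region.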

The maximum likelihood method is used to estimate the shape parameter $\xi$ and the scale parameter $\beta$ of the GPD. Accurate estimates, however, require an appropriate threshold $u$. If the threshold is too high, too few observations exceed it for reliable parameter estimation; if it is too low, the excesses do not show the “thick tail” behavior of the limit distribution. The mean excess function method [32–34] is therefore often used to select the threshold. When a high threshold $u$ is assumed, the excess $y = z - u$ of the standardized residual $z$ follows the generalized Pareto distribution with $\xi < 1$, the mean of the excess over the threshold is $E(z - u \mid z > u) = \beta/(1-\xi)$, and for any $v > u$ we define the mean excess function $e(v) = E(z - v \mid z > v)$. For any $v > u$, $e(v) = (\beta + \xi (v - u))/(1 - \xi)$. The mean excess function is thus a linear function of $v$ for fixed $\xi$, and the sample mean excess function [34] is as follows:

$$
e_n(u) = \frac{1}{N_u}\sum_{i=1}^{N_u} (z_i - u),
$$

where $N_u$ is the number of samples exceeding the threshold $u$ and $z_i$ is the corresponding sample value. When $v > u$, the graph of the mean excess function is linear in $v$; if the graph slopes upward, the data follow a GPD with positive shape parameter $\xi$, and the distribution is thick-tailed. When the distribution of $z$ is known,

$$
\mathrm{VaR}_q(z) = u + \frac{\beta}{\xi}\left[\left(\frac{n}{N_u}(1-q)\right)^{-\xi} - 1\right]
$$

can be used to obtain the VaR of the standardized residuals at the given confidence level $q$, where $n$ is the total number of standardized residuals. The proposed model based on EVT and the SV-T-MN model is as follows:

$$
\mathrm{VaR}_{q,t} = \mu_t + \sigma_t\,\mathrm{VaR}_q(z),
$$

where $\mu_t$ and $\sigma_t$ are the conditional mean and conditional standard deviation of the return.

The detailed modeling procedure is shown in Figure 1. There are three main steps in the proposed model. The first step is to estimate the parameters with MCMC and EM. The second step is to calculate the volatility based on EKF or AMF. The final step is to select the threshold with the mean excess function and to estimate the GPD parameters by maximum likelihood, from which the VaR is calculated.
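The quantities used in this step can be sketched in a few lines: the sample mean excess over a threshold, the tail quantile implied by a fitted GPD, and its rescaling by the conditional mean and volatility. The formulas follow the standard POT quantile estimator; the function and argument names are assumptions, and the residuals are taken to be losses so that the upper tail is the risky one.

```python
import math


def mean_excess(sample, u):
    """Sample mean excess over threshold u (average of z - u for z > u)."""
    exceed = [z - u for z in sample if z > u]
    if not exceed:
        raise ValueError("no observation exceeds the threshold")
    return sum(exceed) / len(exceed)


def gpd_tail_quantile(u, xi, beta, n, n_u, q):
    """Quantile at level q of the residuals from a GPD tail fit.

    u: threshold, xi: shape, beta: scale, n: sample size,
    n_u: number of exceedances, q: confidence level such as 0.95.
    """
    ratio = n / n_u * (1.0 - q)
    if abs(xi) < 1e-12:
        # Exponential-tail limit of the formula.
        return u - beta * math.log(ratio)
    return u + beta / xi * (ratio ** (-xi) - 1.0)


def dynamic_var(cond_mean, cond_sd, residual_quantile):
    """Rescale the residual quantile by the conditional mean and volatility."""
    return cond_mean + cond_sd * residual_quantile
```

When the confidence level equals the empirical share of non-exceedances, `1 - n_u / n`, the quantile reduces to the threshold itself, which is a convenient sanity check on a fitted model.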

4. Empirical Analysis

4.1. Sample Selection and Description

In this paper, the daily closing price of AU99.99 on the Shanghai Gold Exchange is chosen as the representative sample. The sample runs from November 1, 2002, to November 2, 2015, a total of 3165 closing prices, and the data come from the RESSET database. Because logarithmic returns on financial assets are very close to their percentage changes, this paper uses logarithmic returns in the data analysis. Since logarithmic returns are usually very small, the returns are multiplied by 100 before the model is fitted. The data are divided into two stages. The first stage, from November 1, 2002, to December 31, 2013 (2719 observations), is used as the sample for estimating the model parameters. The second stage, the remaining 445 observations, is used to evaluate and test the model. The statistical analysis of the return sequence, shown in Figure 2, is as follows.
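The data preparation described above is simple enough to state as code. The sketch below computes logarithmic returns scaled by 100 and splits them into an estimation sample and a test sample; the function names are hypothetical and the price list is whatever closing-price series is at hand.

```python
import math


def log_returns_percent(prices):
    """Logarithmic returns of consecutive prices, multiplied by 100."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def split_sample(returns, n_estimation):
    """Split returns into an estimation part and a hold-out test part."""
    return returns[:n_estimation], returns[n_estimation:]
```

A series of 3165 prices yields 3164 returns, so taking the first 2719 for estimation leaves 445 for testing, which matches the counts given in the text.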

Figure 2 shows the time series plot of the logarithmic return sequence of the gold price. The gold market shows significant volatility clustering; that is, a large fluctuation of the return sequence tends to be followed by another large fluctuation. The observations to the left of the reference line form the parameter estimation sample, and those to the right form the prediction (test) sample.

Figure 3 is the Q-Q plot of the logarithmic return series of the gold price. The solid line is the reference line of the standard normal distribution, with slope equal to the standard deviation 1 and intercept equal to the mean 0. The scatter points do not all fall near the reference line, and some deviate from it markedly, indicating that the sequence does not follow a normal distribution.

Table 1 gives descriptive statistics of the logarithmic return series of gold. The mean is 0.0323, the standard deviation is 1.1054, the skewness is −0.4215, and the kurtosis is 10.42138, well above the value 3 of the normal distribution, which shows that the series is clearly peaked and thick-tailed. The JB normality test also confirms that the distribution of the data differs significantly from the normal distribution; that is, the gold return does not follow a normal distribution, and other distribution functions are needed to better reflect the characteristics of the sample data.
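For reference, the skewness, kurtosis, and Jarque-Bera (JB) statistic can be computed from the central moments of the sample as sketched below. The function name is hypothetical; moments are divided by the sample size, which is the usual convention for the JB statistic, and under normality the statistic is approximately chi-square distributed with 2 degrees of freedom.

```python
def jarque_bera(xs):
    """Return (skewness, kurtosis, JB statistic) of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments with divisor n.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0.0:
        raise ValueError("sample has zero variance")
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2   # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb
```

A kurtosis far above 3 dominates the statistic for large samples, which is why a value such as the one in Table 1 leads to a clear rejection of normality.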

In Table 2, the ADF test is used to check the stationarity of the gold return sequence. The $t$ value is −58.44200 and the $p$ value is 0.0001. Therefore, the return sequence can be considered to have no unit root and to be stationary. Based on the above descriptive statistics, this paper chooses the SV model to characterize the “peaked, thick-tailed” volatility of the return.

4.2. SV-T-MN Model Fitting

First, to estimate the unknown parameters of the SV-T model, the Gibbs sampler of the MCMC method is run for 30000 iterations. The first 15000 draws are discarded as the burn-in period, because the marginal distribution of the states cannot be considered stable until the Markov chain converges; the remaining 15000 draws are treated as samples from the stationary distribution and used to estimate the parameters. The parameter estimation results are shown in Table 3.

It can be seen from Table 3 that the estimated values of the persistence parameter $\phi$ in the SV-T and SV-N models are 0.978 and 0.910, respectively, so the volatility persistence of the return sequence is strong. The standard deviations and the Monte Carlo errors of all parameters are small, and the parameter estimates are considered valid. The estimated degrees of freedom $\nu$ is 7.184, indicating that the return does not follow a normal distribution and has a pronounced thick tail. Once the degrees-of-freedom parameter $\nu$ has been estimated, the density function of the log-$t$-squared distribution is known; a sample of size 10000 is generated from this density, and the parameters of the Gaussian mixture approximation of the log-$t$-squared distribution are estimated by the EM algorithm. The optimal parameters are shown in Table 4.

Table 4 gives the parameter values of the Gaussian mixture approximation for $K = 7$ normal factors. With a convergence tolerance of 0.01, the EM algorithm converges after 758 iterations with a log-likelihood of −788973.622. The weight parameter $p_i$ can be interpreted as follows: approximately $100p_i$% of the log-$t$-squared distribution of the innovation process comes from the $i$th normal factor, whose mean and variance are $m_i$ and $v_i^2$, respectively.

4.3. Modeling of Standard Residuals Fitted to GPD

In the POT model based on the GPD, the first step of parameter estimation is to determine an appropriate threshold for the i.i.d. standardized residuals. In this paper, the threshold is determined from the graph of the mean excess function. Experience has shown that the threshold can be selected so that the number of exceedances is about 5% of the total number of samples [35]. The mean excess function is linear in $u$ when $u$ is greater than a certain threshold. Figure 4 shows a reference line at $u = 1.328$; for $u > 1.328$ the graph of the mean excess function slopes upward linearly, so the threshold $u = 1.328$ is a reasonable choice. The parameters of the GPD fitted to the tail of the standardized residuals are then estimated by maximum likelihood, as shown in Table 5.
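The rule of thumb that roughly 5% of the observations should exceed the threshold can be turned into a starting value for the graphical selection. The sketch below is an assumption about how such a starting value might be computed, not the procedure used in the paper; the function name and the handling of ties are illustrative.

```python
def threshold_by_exceedance_share(sample, share=0.05):
    """Pick a threshold so that about `share` of the sample lies above it.

    Returns (threshold, number_of_exceedances). With no ties, exactly that
    many observations are strictly greater than the returned threshold.
    """
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(sample, reverse=True)
    n_exceed = max(1, int(round(share * len(ordered))))
    n_exceed = min(n_exceed, len(ordered) - 1)
    # The threshold is the largest value that is not counted as an exceedance.
    return ordered[n_exceed], n_exceed
```

The value obtained this way would normally be checked against the mean excess plot, since the plot, rather than the fixed share, is what justifies treating the excesses as approximately GPD.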

Figure 5 shows the diagnostic plots of the GPD fitted to the negative logarithmic returns of the gold price: the excess distribution (a), the tail distribution (b), the residual scatterplot (c), and the Q-Q plot (d). Figure 5(a) shows the fit of the GPD to the excess distribution, Figure 5(c) shows the distribution of the tail estimates, and the solid lines in both figures are reference lines. In Figure 5(b), the fitted curve passes through the scatter points, while the scatter points in Figure 5(d) lie around a straight line; both reflect a good fit.

4.4. Backtesting

An extreme value model is a statistical forecasting model based on statistical parameters and distributions estimated from historical data. Different parameter estimation methods and sample selections lead to different predictions, so it is very important for risk management to verify the accuracy of the extreme value model. Backtesting is a statistical testing method for this purpose.

Nielsen (2014) [34] proposed a likelihood ratio test based on the failure rate, which constructs the likelihood ratio (LR) statistic:

$$
\mathrm{LR} = -2\ln\left[(1-p)^{T-N}\,p^{N}\right] + 2\ln\left[\left(1-\frac{N}{T}\right)^{T-N}\left(\frac{N}{T}\right)^{N}\right].
$$

In the above formula, $T$ is the number of days tested, $N$ is the number of failures, the failure frequency is $f = N/T$, and the expected probability of failure is $p$. The null hypothesis is $f = p$, so that assessing the accuracy of the VaR model amounts to testing whether the failure frequency $f$ differs significantly from $p$. Under the null hypothesis, the statistic LR follows the $\chi^2$ distribution with 1 degree of freedom. Within the confidence region, the lower the number of failures, the better the prediction; however, too few failures mean that the model is too conservative. Backtests are carried out at confidence levels of 0.95, 0.975, and 0.99 in turn, and the results are shown in Table 6.
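The failure-rate likelihood ratio statistic is straightforward to compute from the number of test days, the number of failures, and the expected failure probability. The sketch below is illustrative; the function name is hypothetical, and the convention that a term with zero count contributes nothing is an implementation choice to avoid taking the logarithm of zero.

```python
import math


def failure_rate_lr(n_days, n_fail, p):
    """Likelihood ratio statistic of the failure-rate (unconditional coverage) test.

    n_days: number of days tested, n_fail: number of VaR failures,
    p: expected failure probability, e.g. 0.05 for a 95% VaR.
    """
    f = n_fail / n_days
    # Log-likelihood under the null hypothesis (failure probability p).
    log_null = (n_days - n_fail) * math.log(1.0 - p) + n_fail * math.log(p)
    # Log-likelihood at the observed failure frequency f.
    log_alt = 0.0
    if n_fail > 0:
        log_alt += n_fail * math.log(f)
    if n_fail < n_days:
        log_alt += (n_days - n_fail) * math.log(1.0 - f)
    return -2.0 * (log_null - log_alt)
```

The statistic is compared with the critical value of the chi-square distribution with 1 degree of freedom, which is about 3.84 at the 5% significance level; larger values reject the hypothesis that the failure frequency equals the expected probability.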

The VaR backtesting results for the two models are shown in Table 6. The failure counts of the extreme value distribution model all lie within a reasonable region, whereas those of the SV-N model do not. Therefore, the dynamic VaR model based on EVT-POT-SV-T-MN is considered reasonable and effective for measuring gold market risk.

Figure 6 shows the VaR of the return under different distributional assumptions at the 95% confidence level. For a clearer comparison, Figure 7 enlarges part of Figure 6. Compared with the VaR obtained directly from the SV-N residual sequence, the VaR estimated by fitting the GPD to the residual series with extreme value theory is more accurate.

5. Conclusions

In this paper, the SV-T-MN model is selected to describe the volatility persistence and the “peaked, thick-tailed” character of financial return series. High-precision out-of-sample volatility predictions are obtained with an approximate filtering algorithm. Combining these with extreme value theory, a new financial risk forecasting model, the EVT-POT-SV-T-MN dynamic VaR model, is established. Although the computation of the model is complex, it can be carried out with existing statistical software. The empirical analysis of the daily closing price of AU99.99 on the Shanghai Gold Exchange shows that the proposed model describes the volatility characteristics of the gold market accurately and measures and forecasts its VaR effectively. The backtesting results also show that the combined model is more effective and reasonable than the VaR model based on SV-N. The proposed method can improve the accuracy of financial market risk forecasts and supports deeper and more comprehensive management of financial risks. It also provides a practical approach to risk measurement for financial assets with heteroskedasticity, stochastic volatility, thick tails, and similar characteristics.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by National Social Science Research Foundation (Grant no. 15CJY006).