Abstract

In this paper, we propose an extreme conditional quantile estimator. Derivation of the estimator is based on extreme quantile autoregression. A noncrossing restriction is added during estimation to avert possible quantile crossing. Consistency of the estimator is derived, and simulation results to support its validity are also presented. Using Average Root Mean Squared Error (ARMSE), we compare the performance of our estimator with the performances of two existing extreme conditional quantile estimators. Backtest results of the one-day-ahead conditional Value at Risk forecasts are also given.

1. Introduction

Correct specification of a loss/returns distribution is key to the accuracy of a risk measure such as Value at Risk. As noted in [9], the major difference among the many estimators of Value at Risk lies in the estimation of the distribution of returns. The complexity of modeling financial data stems from its failure to exhibit standard statistical properties such as normality, independence, and identical distribution [18]. Statistical tests have revealed that returns exhibit fat tails, time-varying volatility, and volatility clustering. Moreover, [7] showed that returns exhibit serial correlation over long time horizons. Models based on mean autoregression coupled with results from extreme value theory, such as the AR(1)-GARCH(1,1) model in [30], incorporate most of the aforementioned characteristics of financial data but suffer from a lack of robustness due to the effect of extreme observations on the mean. Extreme quantile autoregression, as in [2, 24, 25] among others, leads to a more robust model because it combines the regression quantiles introduced by [17] in an autoregressive fashion while using extreme value techniques on the resulting residuals to capture the tail behaviour. A major challenge of this approach is possible quantile crossing.

The challenge of quantile crossing has been addressed through smoothing suggestions in [5, 6, 16], among others, in a nonparametric setting. Equally, the conditional location-scale model used in obtaining Restricted Regression Quantiles (RRQ) in [11] averts possible crossing in extreme quantiles but can suffer from the same problem when estimating the median. To avert quantile crossing even at the middle, [1] added a forced ordering constraint in the estimation of multiple quantiles. Simulation results revealed that, based on the standard error of the estimates, the noncrossing quantile regression in [1] produces better estimates in the middle. However, the RRQ estimator produced better estimates at the tails, especially when the sample size was large.

The lack of monotonicity in the estimation of conditional quantiles is addressed in [3] through sorting of originally estimated nonmonotone quantile curves using a functional delta method. The monotonic quantile functions obtained were found to be closer to the true quantile than the nonmonotonic ones. Functional limit theory for the rearranged estimators was also derived. The resulting monotonic quantile functions were then used in estimating economic functions using Vietnam data. However, the model was not extended to extreme cases to cover heavy tails beyond the sample.

Parametric quantile regression is used in [27] to estimate percentiles in positive-valued datasets. Specifically, a linear quantile regression model was used with the error term assumed to follow a generalized gamma distribution. The idea of quantile regression was achieved by allowing the parameters of the error distribution to depend on the univariate covariate, leading to a location-scale model. The four-, five-, and six-parameter generalized conditional gamma distributions were considered, and a likelihood ratio test was used to select the best-fit model for each dataset. Asymptotics for the three resulting models were also derived. However, the use of the generalized gamma distribution limits the model to positive-valued data. This, together with the fact that some financial datasets have heavier tails than the gamma distribution, limits the application of the model in finance.

We seek to improve the extreme conditional quantile estimator in [25] using an interquantile dispersion from the central conditional quantile.

2. Methods and Estimation

Let $\{Y_t\}$ be a real-valued financial time series on a complete probability space $(\Omega, \mathcal{F}, P)$. We assume that $Y_t$ is $\mathcal{F}_t$-measurable, where $\{\mathcal{F}_t\}$ is an increasing sequence of $\sigma$-algebras representing information available up to time $t$. In particular, let $P_t$ be the value of a portfolio at trading time $t$. The return on the portfolio at time $t$, used to quantify the gain in value of the portfolio from trading time $t-1$ to trading time $t$, is given by

\[
R_t = \ln\left(\frac{P_t}{P_{t-1}}\right), \tag{1}
\]

so that

\[
Y_t = -R_t \tag{2}
\]

is the corresponding loss return of $P_t$.

Definition 1 (Risk Measure). A risk measure $\rho$ is a function from a set $\mathcal{L}$ of risks in a financial position (in this case, the loss distribution) to $\mathbb{R}$; that is, $\rho : \mathcal{L} \to \mathbb{R}$.
We assume that $Y_t$ can be expressed using a linear heteroscedastic model of the following form:

\[
Y_t = \mu(X_{t-1}) + \varepsilon_t, \tag{3}
\]

where $\mu(X_{t-1})$ is the conditional mean function of $Y_t$ given $X_{t-1}$ and it is defined as $\mu(X_{t-1}) = E(Y_t \mid X_{t-1})$. $\varepsilon_t$ are errors and $X_{t-1}$ is a $d$-dimensional process which is $\mathcal{F}_{t-1}$-measurable. In particular, $X_{t-1}$ has 1 as the first element and a collection of the last $d-1$ observed returns up to time $t-1$; that is, $X_{t-1} = \left(1, Y_{t-1}, \ldots, Y_{t-d+1}\right)^{\top}$. To ensure that the model is smooth and obeys some of the financial norms such as clustering of shocks, we further assume that $\varepsilon_t$ can be decomposed into

\[
\varepsilon_t = \sigma(X_{t-1})\, Z_t, \tag{4}
\]

where $Z_t$ are independent and identically distributed random variables and $\sigma(X_{t-1})$ is the conditional volatility. In this case, $Y_t$ is said to assume a location-scale model of the form

\[
Y_t = \mu(X_{t-1}) + \sigma(X_{t-1})\, Z_t. \tag{5}
\]

The corresponding quantile of $Y_t$ under this formulation is given by

\[
q_\tau(Y_t \mid X_{t-1}) = \mu(X_{t-1}) + \sigma(X_{t-1})\, z_\tau, \tag{6}
\]

where $z_\tau$ is the $\tau$-quantile of $Z_t$. Let us now define a conditional quantile autoregressive model on $Y_t$ of the form

\[
Y_t = q_\theta(Y_t \mid X_{t-1}) + \varepsilon_{\theta,t}, \tag{7}
\]

where $q_\theta(Y_t \mid X_{t-1})$ is the central conditional $\theta$-quantile of $Y_t$ and $\varepsilon_{\theta,t}$ are errors with zero $\theta$-quantile. Let $\varepsilon_{\theta,t} = \sigma_\theta(X_{t-1})\, Z_{\theta,t}$, where $\sigma_\theta(X_{t-1})$ is the central conditional scale of $Y_t$ and $Z_{\theta,t}$ are i.i.d. residuals. Using an approach similar to Points Over Threshold (POT), we propose an extreme conditional quantile of the form given in equation (8) and refer to it as the adjusted extreme conditional quantile. That is, suppose that we are interested in an extreme quantile $q_\tau(Y_t \mid X_{t-1})$ for some $\tau$ close to 1; the idea is to estimate the central quantile $q_\theta(Y_t \mid X_{t-1})$ and scale $\sigma_\theta(X_{t-1})$ for some level $\theta$ in the middle and approximate the extreme quantile as

\[
q_\tau(Y_t \mid X_{t-1}) = q_\theta(Y_t \mid X_{t-1}) + \sigma_\theta(X_{t-1})\left(z_\tau - z_\theta\right), \tag{8}
\]

where $z_\tau$ and $z_\theta$ are the $\tau$- and $\theta$-quantile of $Z_{\theta,t}$, respectively, for $\theta < \tau$. If the parametric distribution of $Z_{\theta,t}$ is known, then $z_\tau$ and $z_\theta$ are easily determined as the inverse of the cumulative distribution of $Z_{\theta,t}$ at probability levels $\tau$ and $\theta$, respectively; otherwise, appropriate estimates are determined. Note that equation (8) is in conditional quantile autoregressive form. Observe that when $\tau = \theta$, equation (8) reduces to

\[
q_\theta(Y_t \mid X_{t-1}), \tag{9}
\]

which is the central conditional quantile of $Y_t$ given $X_{t-1}$. This confirms that $\varepsilon_{\theta,t}$ indeed has a zero $\theta$-quantile. From equation (8), we obtain the following estimator for the extreme conditional quantile:

\[
\hat{q}_\tau(Y_t \mid X_{t-1}) = \hat{q}_\theta(Y_t \mid X_{t-1}) + \hat{\sigma}_\theta(X_{t-1})\left(\hat{z}_\tau - \hat{z}_\theta\right), \tag{10}
\]

where $\hat{z}_\tau$ and $\hat{z}_\theta$ are appropriate estimates of the $\tau$- and $\theta$-quantiles of the i.i.d. residuals, respectively. We compare this estimator with

\[
\hat{q}^{\,\mathrm{ECQ}}_\tau(Y_t \mid X_{t-1}) = \hat{q}_\theta(Y_t \mid X_{t-1}) + \hat{\sigma}_\theta(X_{t-1})\, \hat{z}_\tau, \tag{11}
\]

proposed in [25], and

\[
\hat{q}^{\,\mathrm{RRQ}}_\tau(Y_t \mid X_{t-1}) = \hat{q}_\theta(Y_t \mid X_{t-1}) + \hat{\gamma}_\tau\, \hat{\sigma}_\theta(X_{t-1}), \tag{12}
\]

where $\hat{\gamma}_\tau$ is the resulting coefficient from quantile regression of $Y_t - \hat{q}_\theta(Y_t \mid X_{t-1})$ against $\hat{\sigma}_\theta(X_{t-1})$ at $\tau$. Note that equation (12) is the estimator proposed in [11].
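To make the construction concrete, here is a minimal Python sketch of how the estimator in equation (10) assembles its three estimated ingredients. The function and argument names are hypothetical placeholders; the inputs are assumed to come from the estimation steps described in the following subsections.

```python
import numpy as np

def adjusted_extreme_quantile(q_theta, sigma_theta, z_tau_hat, z_theta_hat):
    """Adjusted extreme conditional quantile of eq. (10).

    q_theta     : array of central conditional theta-quantile estimates
    sigma_theta : array of conditional scale estimates
    z_tau_hat   : estimated tau-quantile of the i.i.d. residuals Z_t
    z_theta_hat : estimated theta-quantile of the i.i.d. residuals Z_t
    """
    # When tau == theta, the adjustment term vanishes and the central
    # conditional quantile of eq. (9) is recovered.
    return q_theta + sigma_theta * (z_tau_hat - z_theta_hat)
```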

2.1. Estimation of Central Quantiles

Let $f \in \mathcal{G}$ be an unknown smooth function and define the loss function $\rho_\theta$:

\[
\rho_\theta(u) = u\left(\theta - \mathbb{1}(u < 0)\right) = \theta\, u^{+} + (1 - \theta)\, u^{-}, \tag{13}
\]

where $u^{+}$ and $u^{-}$ represent the absolute positive and negative values of $u$, respectively, and $\mathbb{1}(\cdot)$ is the indicator function. Assuming that the conditional quantile process is well defined, we expect

\[
P\left(Y_t \le q_\theta(Y_t \mid X_{t-1}) \mid X_{t-1}\right) = \theta. \tag{14}
\]

So the $\theta$-conditional quantile of $Y_t$ is given by

\[
q_\theta(Y_t \mid X_{t-1}) = \arg\min_{f \in \mathcal{G}} E\left[\rho_\theta\left(Y_t - f(X_{t-1})\right)\right]. \tag{15}
\]
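As a quick illustration, the check loss of equation (13) translates into a couple of lines; this is a generic sketch rather than code from the paper.

```python
import numpy as np

def pinball(u, theta):
    """Check loss of eq. (13): theta*u for u >= 0 and (theta - 1)*u for u < 0."""
    u = np.asarray(u, dtype=float)
    return np.where(u >= 0.0, theta * u, (theta - 1.0) * u)
```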

Note that equation (14) can be used to check whether the conditional quantile process is correctly specified or not. We impose the following regularity assumptions to ensure consistency of the conditional quantiles.

Assumption 1. $(Y_t, X_{t-1})$ are identically distributed with the joint probability density $f_{Y,X}(y, x)$ and a continuous conditional probability density $f_{Y \mid X}(y \mid x)$ of $Y_t$ given $X_{t-1}$.

Assumption 2. There exists a constant $K < \infty$ such that $E|Y_t| \le K$.

Assumption 3. $f \in \mathcal{G}$, where $\mathcal{G}$ is a compact subset of the space of continuous functions on the support of $X_{t-1}$.

Assumption 4. $f_{Y \mid X}\left(q_\theta(Y_t \mid X_{t-1}) \mid X_{t-1}\right) > 0$, where $f_{Y \mid X}$ is the conditional probability density of $Y_t$ given $X_{t-1}$.
From the sample analog of equation (15), we obtain

\[
\hat{q}_\theta(Y_t \mid X_{t-1}) = \arg\min_{f \in \mathcal{G}} \frac{1}{n} \sum_{t=1}^{n} \rho_\theta\left(Y_t - f(X_{t-1})\right), \tag{16}
\]

which is the $\theta$-conditional quantile estimator for a sample of size $n$. To overcome the limitation of quantile crossing, we used the approach in [1], where the required quantiles $\theta_1 < \theta_2 < \cdots < \theta_m$ were estimated simultaneously with a noncrossing constraint using the optimization problem

\[
\min_{f_{\theta_1}, \ldots, f_{\theta_m} \in \mathcal{G}} \sum_{j=1}^{m} w(\theta_j) \sum_{t=1}^{n} \rho_{\theta_j}\left(Y_t - f_{\theta_j}(X_{t-1})\right) \quad \text{subject to} \quad f_{\theta_j}(x) \le f_{\theta_{j+1}}(x), \tag{17}
\]

for some weight function $w(\cdot)$. A convenient practical choice of the weight function, which was also adopted for this study, is to weight all quantile levels equally.
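The following Python sketch poses the simultaneous estimation of equation (17) as one linear program, with the noncrossing constraints enforced at the design points in the spirit of [1]. It assumes a linear specification $f_{\theta_j}(x) = x^{\top}\beta_j$ and equal weights across levels, and it is illustrative rather than the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def noncrossing_qr(X, y, thetas):
    """Simultaneous linear quantile regression with noncrossing constraints
    at the design points, posed as one linear program (a sketch in the
    spirit of [1]; equal weights across levels)."""
    n, d = X.shape
    m = len(thetas)
    nb, nu = m * d, m * n               # beta block, then u+ and u- blocks
    nvar = nb + 2 * nu
    # objective: sum_j sum_t [theta_j * u+_jt + (1 - theta_j) * u-_jt]
    c = np.concatenate([np.zeros(nb),
                        np.repeat(thetas, n),
                        np.repeat(1.0 - np.asarray(thetas), n)])
    # residual identities: X beta_j + u+_j - u-_j = y for each level j
    A_eq = np.zeros((m * n, nvar))
    b_eq = np.tile(y, m)
    for j in range(m):
        A_eq[j*n:(j+1)*n, j*d:(j+1)*d] = X
        A_eq[j*n:(j+1)*n, nb + j*n : nb + (j+1)*n] = np.eye(n)
        A_eq[j*n:(j+1)*n, nb + nu + j*n : nb + nu + (j+1)*n] = -np.eye(n)
    # noncrossing: X beta_j <= X beta_{j+1} (thetas must be increasing)
    A_ub, b_ub = None, None
    if m > 1:
        A_ub = np.zeros(((m - 1) * n, nvar))
        for j in range(m - 1):
            A_ub[j*n:(j+1)*n, j*d:(j+1)*d] = X
            A_ub[j*n:(j+1)*n, (j+1)*d:(j+2)*d] = -X
        b_ub = np.zeros((m - 1) * n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(None, None)] * nb + [(0, None)] * (2 * nu),
                  method="highs")
    return res.x[:nb].reshape(m, d)     # one coefficient row per level
```

Here X is assumed to contain an intercept column of ones; the dense LP is only meant for small illustrative samples.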

Lemma 1 (consistency of the central quantiles' estimator). Given that Assumptions 1, 2, 3, and 4 hold, $\hat{q}_\theta(Y_t \mid X_{t-1}) \xrightarrow{p} q_\theta(Y_t \mid X_{t-1})$ as $n \to \infty$.

2.1.1. The Scale Function

To still maintain dependence and ensure positivity of the scale function, this study incorporated a scale function in the form of a quantile autoregressive (QAR) function on the absolute values of the nonstandardized residuals. This was achieved by replacing $q_\theta(Y_t \mid X_{t-1})$ with its corresponding estimate in equation (7) so that $\hat{\varepsilon}_{\theta,t} = Y_t - \hat{q}_\theta(Y_t \mid X_{t-1})$ and

\[
|\hat{\varepsilon}_{\theta,t}| = \sigma_\theta(X_{t-1}) + e_{\theta,t}, \tag{18}
\]

where $\sigma_\theta(X_{t-1})$, the central conditional $\theta$-quantile of $|\hat{\varepsilon}_{\theta,t}|$, is the conditional scale of $Y_t$, $Z_{\theta,t} = \hat{\varepsilon}_{\theta,t} / \sigma_\theta(X_{t-1})$ are i.i.d. residuals, and $e_{\theta,t}$ are errors with zero $\theta$-quantile. Similarly, as is the case in Section 2.1, we let $\sigma \in \mathcal{G}$ be an unknown smooth function and define the loss function $\rho_\theta$. We assume that the QAR process in equation (18) obeys the four regularity assumptions given earlier so that, using the noncrossing quantile regression approach, we obtain the following estimate of the scale:

\[
\hat{\sigma}_\theta(X_{t-1}) = \arg\min_{\sigma \in \mathcal{G}} \frac{1}{n} \sum_{t=1}^{n} \rho_\theta\left(|\hat{\varepsilon}_{\theta,t}| - \sigma(X_{t-1})\right), \tag{19}
\]

where $\rho_\theta$ is a loss function of the form given in equation (13).
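Continuing the previous sketch under the same assumptions, the scale step of equation (19) can reuse the same routine on the absolute residuals of the central fit; beta_theta, X, y, and theta are carried over from that sketch.

```python
import numpy as np

# Scale step (sketch): the theta-quantile of the absolute residuals,
# regressed on the same information set, plays the role of sigma_theta.
eps_hat = y - X @ beta_theta                 # nonstandardized residuals
beta_scale = noncrossing_qr(X, np.abs(eps_hat), [theta])[0]
sigma_hat = X @ beta_scale                   # conditional scale estimates
z_hat = eps_hat / sigma_hat                  # standardized residuals for the EVT step below
```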

Lemma 2 (consistency of the scale estimator). Given that the QAR process defined in equation (18) satisfies Assumptions 1, 2, 3, and 4, $\hat{\sigma}_\theta(X_{t-1}) \xrightarrow{p} \sigma_\theta(X_{t-1})$ as $n \to \infty$.

2.2. Extreme Value Theory (EVT)

Most financial datasets are heavy-tailed [12]. Therefore, it is fundamentally important to incorporate extreme value theory in the estimation of extreme quantiles. A basic requirement for the application of EVT is independence in the particular distribution. The study in [25] observed that it is appropriate to assume that, at high (low) levels of $\theta$, the standardized residuals in equation (7) given by

\[
\hat{Z}_{\theta,t} = \frac{Y_t - \hat{q}_\theta(Y_t \mid X_{t-1})}{\hat{\sigma}_\theta(X_{t-1})}, \tag{20}
\]

where $\hat{\sigma}_\theta(X_{t-1})$ is the estimate of the scale, are approximately independent, which allows us to apply EVT. Let $Z_1, Z_2, \ldots$ follow a common distribution function $F$. Consider a sample $Z_1, \ldots, Z_n$ from $F$ and let $M_n = \max(Z_1, \ldots, Z_n)$, which is such that $M_n \to z_F$ (a.s.), where $z_F = \sup\{z \in \mathbb{R} : F(z) < 1\}$. Pursuant to Fisher–Tippett's theorem in [8], the random variable $M_n$ (or alternatively the distribution $F$ of $Z$) is said to belong to the Maximum Domain of Attraction (MDA) of the extreme value distribution $H_\xi$ if there exist norming constants $c_n > 0$ and $d_n \in \mathbb{R}$ such that

\[
\lim_{n \to \infty} P\left(\frac{M_n - d_n}{c_n} \le z\right) = H_\xi(z). \tag{21}
\]

Consequently, $H_\xi$ is referred to as the Generalized Extreme Value (GEV) distribution.

This study applied the Points Over Threshold (POT) method because it uses more data, leading to better estimates compared to the Block Maxima method. POT models the distribution

\[
F_u(y) = P\left(Z - u \le y \mid Z > u\right), \qquad 0 \le y < z_F - u, \tag{22}
\]

of all excesses above a particular threshold $u$.

Theorem 1 (Pickands–Balkema–de Haan) (see [22]). We can find a positive measurable function $\beta(u)$ such that

\[
\lim_{u \to z_F} \sup_{0 \le y < z_F - u} \left|F_u(y) - G_{\xi, \beta(u)}(y)\right| = 0 \tag{23}
\]

if and only if $F \in \mathrm{MDA}(H_\xi)$, $\xi \in \mathbb{R}$, and $G_{\xi,\beta}$ is the Generalized Pareto Distribution (GPD) given by

\[
G_{\xi,\beta}(y) =
\begin{cases}
1 - \left(1 + \dfrac{\xi y}{\beta}\right)^{-1/\xi}, & \xi \neq 0, \\[2mm]
1 - e^{-y/\beta}, & \xi = 0,
\end{cases} \tag{24}
\]

with $y \ge 0$ when $\xi \ge 0$ and $0 \le y \le -\beta/\xi$ when $\xi < 0$. $\xi$ and $\beta$ are the shape and scale parameters, respectively.
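As a practical aside, once a threshold $u$ is fixed, the GPD of equation (24) can be fitted to the excesses with standard tools. The sketch below uses maximum likelihood via scipy, whereas the paper estimates the parameters by PWM as described further below.

```python
import numpy as np
from scipy.stats import genpareto

def fit_gpd_excesses(z, u):
    """Fit the GPD of eq. (24) to the excesses over threshold u.

    Maximum likelihood via scipy (the paper uses PWM instead); floc=0 pins
    the GPD location at zero, leaving shape xi and scale beta to be fitted.
    """
    excesses = np.sort(z[z > u] - u)
    xi, _, beta = genpareto.fit(excesses, floc=0)
    return xi, beta, excesses
```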

A major challenge in POT is accuracy in choosing a threshold to separate extreme observations from the center of the distribution [30]. Among the methods discussed in [32], most authors, such as those of [21, 28, 30], prefer the conventional method, in which a threshold is chosen that classifies between 5% and 10% of the sample data as extreme observations. Although the conventional method is subjective, the choice of the threshold can be checked for appropriateness using a mean excess plot. The mean excess function for the GPD is given by

\[
e(v) = E\left[Z - v \mid Z > v\right] = \frac{\beta + \xi\left(v - u\right)}{1 - \xi}, \tag{25}
\]

where $\xi < 1$. This implies that an optimal threshold corresponds to the start of approximate linearity of the mean excess plot, with the sign of the slope, $\xi / (1 - \xi)$, indicating the specific family of the GPD. A positive sign corresponds to the Fréchet family, while a negative sign implies the Weibull family [8].
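A simple empirical version of this diagnostic might look as follows; it is a generic sketch, not the authors' code.

```python
import numpy as np

def empirical_mean_excess(z, candidate_thresholds):
    """Empirical mean excess e_n(v): average of (z - v) over z > v.

    Plotted against v, e_n(v) should be roughly linear beyond a good
    threshold u, with a positive slope suggesting the Frechet family
    and a negative slope the Weibull family.
    """
    return np.array([np.mean(z[z > v] - v) for v in candidate_thresholds])

# Example: scan candidate thresholds over the upper half of the sample.
# vs = np.quantile(z, np.linspace(0.50, 0.98, 40))
# me = empirical_mean_excess(z, vs)
```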

For convenience in making inferences on the variability of the estimated quantiles, a recommendation in [13] on the use of the Probability Weighted Moments (PWM) method for estimating the parameters of the GPD was adopted. Using the first and second PWMs, we obtain the corresponding parameter estimates as

\[
\hat{\xi} = 2 - \frac{\hat{w}_0}{\hat{w}_0 - 2 \hat{w}_1}, \qquad \hat{\beta} = \frac{2 \hat{w}_0 \hat{w}_1}{\hat{w}_0 - 2 \hat{w}_1}, \tag{26}
\]

where $\hat{w}_0$ and $\hat{w}_1$ are obtained by substituting $\hat{w}_s$ for $w_s$, $s = 0, 1$, in

\[
w_s = \frac{\beta}{(s + 1)(s + 1 - \xi)}, \tag{27}
\]

which is the PWM of the GPD with $\xi < 1$. See [10, 23] for details on the PWM method. For a sample of size $n$, the corresponding PWM estimates are given by

\[
\hat{w}_s = \frac{1}{n} \sum_{i=1}^{n} \left(1 - p_{i,n}\right)^{s} Z_{(i)}, \tag{28}
\]

where $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(n)}$ is the ordered sample and $p_{i,n} = (i + \gamma)/(n + \delta)$ for suitable constants $\gamma$ and $\delta$. As recommended in [20], $\gamma = -0.35$ and $\delta = 0$.
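Under the plotting-position convention reconstructed above, the PWM estimates of equations (26)–(28) can be computed directly; a sketch:

```python
import numpy as np

def gpd_pwm(excesses, gamma=-0.35, delta=0.0):
    """PWM estimates of the GPD shape xi and scale beta, eqs. (26)-(28),
    using plotting positions p_i = (i + gamma)/(n + delta)."""
    z = np.sort(np.asarray(excesses, dtype=float))
    n = z.size
    p = (np.arange(1, n + 1) + gamma) / (n + delta)
    w0 = z.mean()                    # sample PWM with s = 0
    w1 = np.mean((1.0 - p) * z)      # sample PWM with s = 1
    xi = 2.0 - w0 / (w0 - 2.0 * w1)
    beta = 2.0 * w0 * w1 / (w0 - 2.0 * w1)
    return xi, beta
```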

The overall distribution of the standardized residuals was obtained by splicing the GPD with the empirical bulk distribution at the threshold $u$ using the approach in [21, 29, 30], among others, to obtain

\[
F(z) =
\begin{cases}
F_n(z), & z < u, \\
F_n(u) + \left(1 - F_n(u)\right) G_{\xi,\beta}(z - u), & z \ge u,
\end{cases} \tag{29}
\]

for $F_n$ the empirical bulk distribution and $G_{\xi,\beta}$ the Generalized Pareto Distribution. When $F_n(u)$ is approximated empirically by $(n - N_u)/n$, we obtain the following estimate of $F(z)$ for $z \ge u$:

\[
\hat{F}(z) = 1 - \frac{N_u}{n} \left(1 + \hat{\xi}\, \frac{z - u}{\hat{\beta}}\right)^{-1/\hat{\xi}}, \tag{30}
\]

where $n$ is the sample size, $N_u$ is the number of exceedances above the threshold $u$, and $\hat{\xi}$ together with $\hat{\beta}$ are the estimated GPD parameters.
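Equation (30) translates directly into code; a one-line sketch, assuming $\hat{\xi} \neq 0$:

```python
def tail_cdf(z, u, xi, beta, n, N_u):
    """Spliced tail estimate F_hat(z) of eq. (30), valid for z >= u and xi != 0."""
    return 1.0 - (N_u / n) * (1.0 + xi * (z - u) / beta) ** (-1.0 / xi)
```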

Lemma 3 (consistency of the distribution function estimator). Let $Z_1, \ldots, Z_n$ be i.i.d. random variables from a Cumulative Distribution Function (CDF) $F$ belonging to the MDA of $H_\xi$. Suppose that $F$ has a right endpoint at $z_F$; then $\sup_{u \le z \le z_F} |\hat{F}(z) - F(z)| \xrightarrow{p} 0$ as $n \to \infty$.

From equation (30), we get the following estimate for the quantile of the standardized residuals at level $\tau$:

\[
\hat{z}_\tau = u + \frac{\hat{\beta}}{\hat{\xi}} \left[\left(\frac{n}{N_u}\left(1 - \tau\right)\right)^{-\hat{\xi}} - 1\right]. \tag{31}
\]
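Inverting equation (30) at level $\tau$ gives the quantile of equation (31); as a sketch, again assuming $\hat{\xi} \neq 0$:

```python
def tail_quantile(tau, u, xi, beta, n, N_u):
    """Residual quantile z_tau_hat of eq. (31), valid for tau > 1 - N_u/n."""
    return u + (beta / xi) * (((n / N_u) * (1.0 - tau)) ** (-xi) - 1.0)
```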

Lemma 4 (consistency of error quantiles). Let $Z_1, \ldots, Z_n$ be i.i.d. random variables from a CDF $F$ belonging to the MDA of $H_\xi$ and satisfying $f(z_\tau) > 0$ for any $\tau \in (0, 1)$. Then, for every $\tau \in (0, 1)$ and as $n \to \infty$,

\[
\hat{z}_\tau \xrightarrow{p} z_\tau. \tag{32}
\]

Using equation (10), we obtained the one-step VaR predictions as

\[
\widehat{\mathrm{VaR}}^{\tau}_{t+1} = \hat{q}_\theta(Y_{t+1} \mid X_t) + \hat{\sigma}_\theta(X_t)\left(\hat{z}_\tau - \hat{z}_\theta\right), \tag{33}
\]

where $\hat{q}_\theta(Y_{t+1} \mid X_t)$ and $\hat{\sigma}_\theta(X_t)$ are the corresponding one-step $\theta$-quantile and scale estimates, respectively, from the linear conditional quantile process.

Theorem 2 (consistency of the extreme quantile estimator). Given that the QAR processes defined above satisfy Assumptions 1, 2, 3, and 4 and $F$ belongs to the MDA of $H_\xi$, $\hat{q}_\tau(Y_t \mid X_{t-1}) \xrightarrow{p} q_\tau(Y_t \mid X_{t-1})$ as $n \to \infty$.

3. Simulations

To evaluate the accuracy of our estimators, a sample of size $T = 4250$ was generated from the location-scale model of Section 2 with i.i.d. innovations $Z_t$ following Student's t-distribution with 4 degrees of freedom. The sample was partitioned into design data of size $n$ and test data of size $T - n$. Figure 1 represents a sample path of the model superimposed with the median. Note that, for simulation purposes, $\theta = 0.5$ was used in estimating the central quantile.

Clearly, from the sample path, there is some level of volatility clustering, which is common in most financial data. An ACF plot of the resulting standardized residuals in Figure 2 confirms that indeed they are independent.

The threshold $u$ was chosen to ensure that 10% of the resulting ordered standardized errors were classified as extremes. This was confirmed by approximate linearity of the mean excess plot beyond the threshold, as shown in Figure 3.

The corresponding shape and scale parameter estimates from the GPD fit were $\hat{\xi} = 0.1156182$ and $\hat{\beta} = 1.272937$, respectively. Table 1 outlines the sample statistics of the estimates of the various quantiles together with the data.

Figure 4 shows the corresponding quantile estimates at different quantile levels.

The accuracy of our extreme quantile estimator was evaluated using the Average Root Mean Squared Error (ARMSE). The RMSE seeks to return the Mean Squared Error (MSE) to the original scale of the sample. For $k$ sample paths of simulations of size $n$ of the extreme quantiles, the average RMSE is given by

\[
\mathrm{ARMSE} = \frac{1}{k} \sum_{j=1}^{k} \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left(q_\tau(Y_t \mid X_{t-1}) - \hat{q}^{(j)}_\tau(Y_t \mid X_{t-1})\right)^2}, \tag{34}
\]

where $\hat{q}^{(j)}_\tau$ is replaced by $\hat{q}^{\,\mathrm{ECQ}}_\tau$ or $\hat{q}^{\,\mathrm{RRQ}}_\tau$ when considering extreme conditional quantiles or restricted regression quantiles, respectively. To check how our model behaves under different choices of the central quantile, we computed the ARMSE of the extreme conditional quantile at a fixed extreme level $\tau$ for several choices of the central level $\theta$. 1000 sample paths, each of sizes 250, 500, 1000, 2000, and 4000, were used in the computation of ARMSE. Table 2 reports the obtained ARMSE values.
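A direct transcription of equation (34) as a generic sketch:

```python
import numpy as np

def armse(true_q, est_q):
    """Average RMSE over k simulated paths, eq. (34).

    true_q, est_q : arrays of shape (k, n) holding the true and the
    estimated extreme conditional quantiles along each sample path.
    """
    true_q, est_q = np.asarray(true_q), np.asarray(est_q)
    rmse_per_path = np.sqrt(np.mean((true_q - est_q) ** 2, axis=1))
    return float(rmse_per_path.mean())
```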

We note that, for a large enough sample (2000 observations and above), ARMSE is lowest when $\theta = 0.5$, and thus this choice of $\theta$ is maintained in investigating the accuracy of our estimator and forecasting the one-day-ahead VaR. Table 3 outlines the obtained ARMSE for the extreme conditional quantile at the same extreme level under three different models. The sample sizes and number of replications are maintained as before.

Based on ARMSE, both RRQ and ECQ perform better than AECQ for small samples. However, as the sample size increases, AECQ outperforms both RRQ and ECQ. The decreasing ARMSE with increasing sample size for AECQ and ECQ confirms that both are consistent estimators of the extreme conditional quantile. Also, for sample sizes above 2000, the rate of convergence of the AECQ estimator is higher than that of the ECQ estimator. It was not possible to comment on the consistency of the RRQ estimator, since its ARMSE fluctuated as the sample size increased. ARMSE was also consistently reduced when the noncrossing constraint was added during estimation, confirming that this constraint indeed increases the accuracy of the resulting estimators.

4. Evaluating VaR Forecasts

In Section 3, we evaluated the accuracy of our in-sample quantile estimates. In this section, we extend this by evaluating the out-of-sample VaR forecasts from our quantile estimator. To achieve this, we carry out backtests on 250 one-day-ahead VaR forecasts (as recommended in the Basel Accord) using the coverage tests in [4, 31]. Coverage tests were adopted due to their popularity in the literature and in practice [26]. Consider the failure process $\{I_t\}$, where $I$ is the indicator function such that

\[
I_t = \mathbb{1}\left(Y_t > \widehat{\mathrm{VaR}}^{\tau}_{t}\right), \tag{35}
\]

and $p = P(I_t = 1)$. By Lemma 1 in [4], $I_t \overset{\text{i.i.d.}}{\sim} \operatorname{Bernoulli}(p)$, which is tested using the conditional coverage test that combines both the unconditional coverage test in [19] and the test for independence in [4] under the null hypothesis $H_0 : p = 1 - \tau$. The likelihood under the null hypothesis is

\[
L(p) = \left(1 - p\right)^{n - x} p^{x}, \tag{36}
\]

where $x$ is the number of VaR exceedances and $\hat{p} = x / n$. Now consider the first-order Markov chain generated by the transition probabilities of $\{I_t\}$:

\[
\Pi_1 = \begin{pmatrix} 1 - \pi_{01} & \pi_{01} \\ 1 - \pi_{11} & \pi_{11} \end{pmatrix}, \tag{37}
\]

where $\pi_{ij} = P\left(I_t = j \mid I_{t-1} = i\right)$. This has an approximate likelihood function

\[
L(\Pi_1) = \left(1 - \pi_{01}\right)^{n_{00}} \pi_{01}^{\,n_{01}} \left(1 - \pi_{11}\right)^{n_{10}} \pi_{11}^{\,n_{11}}, \tag{38}
\]

where $n_{ij}$ is the number of times observation $i$ is followed by $j$ in the failure process $\{I_t\}$. From equation (38), we obtain the maximum likelihood estimates of $\pi_{01}$ and $\pi_{11}$ as

\[
\hat{\pi}_{01} = \frac{n_{01}}{n_{00} + n_{01}}, \qquad \hat{\pi}_{11} = \frac{n_{11}}{n_{10} + n_{11}}. \tag{39}
\]

Therefore, the conditional coverage hypothesis can be assessed using the likelihood ratio

\[
LR_{cc} = -2 \ln \frac{L(p)}{L(\hat{\Pi}_1)} = LR_{uc} + LR_{ind}, \tag{40}
\]

which is asymptotically $\chi^2$-distributed with two degrees of freedom.
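A compact sketch of the three likelihood-ratio statistics, following [4, 19]; it assumes every transition count is positive so that all logarithms are finite, and it uses the standard decomposition $LR_{cc} = LR_{uc} + LR_{ind}$.

```python
import numpy as np
from scipy.stats import chi2

def christoffersen_tests(hits, p):
    """LR statistics and p-values for unconditional coverage, independence,
    and conditional coverage, computed from a 0/1 failure process `hits`
    and the nominal exceedance probability p."""
    hits = np.asarray(hits, dtype=int)
    n, x = len(hits), int(hits.sum())
    phat = x / n

    def bern_ll(q, n0, n1):          # Bernoulli log-likelihood
        return n0 * np.log(1.0 - q) + n1 * np.log(q)

    lr_uc = -2.0 * (bern_ll(p, n - x, x) - bern_ll(phat, n - x, x))

    # transition counts n_ij = #{t : I_{t-1} = i, I_t = j}, as in eq. (38)
    prev, curr = hits[:-1], hits[1:]
    n00 = int(np.sum((prev == 0) & (curr == 0)))
    n01 = int(np.sum((prev == 0) & (curr == 1)))
    n10 = int(np.sum((prev == 1) & (curr == 0)))
    n11 = int(np.sum((prev == 1) & (curr == 1)))
    pi01 = n01 / (n00 + n01)                    # eq. (39)
    pi11 = n11 / (n10 + n11)
    pi2 = (n01 + n11) / (n00 + n01 + n10 + n11)

    ll_markov = bern_ll(pi01, n00, n01) + bern_ll(pi11, n10, n11)
    ll_iid = bern_ll(pi2, n00 + n10, n01 + n11)
    lr_ind = -2.0 * (ll_iid - ll_markov)
    lr_cc = lr_uc + lr_ind

    pvals = (chi2.sf(lr_uc, 1), chi2.sf(lr_ind, 1), chi2.sf(lr_cc, 2))
    return (lr_uc, lr_ind, lr_cc), pvals
```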

Table 4 reports the obtained values for the likelihood ratios of the three tests in [4]. The tests were conducted on 250 one-day-ahead 5% VaR forecasts from the three considered models.

$LR_{uc}$ denotes the likelihood ratio for the unconditional coverage test. $LR_{ind}$ denotes the likelihood ratio for the independence test. $LR_{cc}$ denotes the likelihood ratio for the conditional coverage test. Models accepted at the 5% level of significance are highlighted in bold. Note that $n$ is the size of the sample used in estimation, while testing was done using a sample of size 250 for all $n$.

Observe that, as a consequence of consistency, the accuracy of the ECQ and AECQ forecasts improves with increasing sample size. It can also be seen that all three models perform poorly under $LR_{ind}$ and $LR_{cc}$ due to dependence in autoregression. Based on $LR_{uc}$, the RRQ model performed poorly when it comes to forecasting. This can be attributed to its failure to incorporate extreme value theory in estimating the residual quantiles.

5. Conclusions and Recommendations

We have derived an extreme conditional quantile estimator and used it to obtain one-step-ahead conditional Value at Risk forecasts for a simulated financial distribution. Consistency of our estimators has been proved and illustrated through Monte Carlo simulations. We noticed that adding the noncrossing restriction during estimation improves the accuracy of the resulting extreme conditional quantile estimator. Backtesting results from the one-step-ahead conditional Value at Risk forecasts indicate that the independence and conditional coverage tests in [4] are not appropriate for our estimators due to dependence in autoregressive models.

6. Proofs

Proof of Lemma 1. Let $Q_n(f) = \frac{1}{n} \sum_{t=1}^{n} \rho_\theta\left(Y_t - f(X_{t-1})\right)$, $Q(f) = E\left[\rho_\theta\left(Y_t - f(X_{t-1})\right)\right]$, and $q_\theta(\cdot) = \arg\min_{f \in \mathcal{G}} Q(f)$. Note that, by Assumption 1, $Q(f)$ does not depend on $t$. Since $Q(f)$ does not depend on $t$, neither does its minimizer $q_\theta(\cdot)$. We need to show that the objective function satisfies the following conditions for the application of Theorem 12.2 in [34]:

(1) $\rho_\theta\left(Y_t - f(X_{t-1})\right)$ is measurable for each $f \in \mathcal{G}$.
(2) $\rho_\theta\left(Y_t - f(X_{t-1})\right)$ is continuous on $\mathcal{G}$ almost surely.
(3) There exists a measurable function $D(Y_t, X_{t-1})$ such that (i) $\left|\rho_\theta\left(Y_t - f(X_{t-1})\right)\right| \le D(Y_t, X_{t-1})$ for all $f \in \mathcal{G}$ and (ii) $E\left[D(Y_t, X_{t-1})\right] < \infty$.
(4) $Q(f)$ has a unique minimum at $q_\theta(\cdot)$.

The functional form of $\rho_\theta$ and Assumption 3 guarantee the measurability of $\rho_\theta\left(Y_t - f(X_{t-1})\right)$. To prove condition (2), we first show that $\rho_\theta$ is Lipschitz continuous. By definition,

\[
\rho_\theta(u) - \rho_\theta(v) = \theta\left(u^{+} - v^{+}\right) + (1 - \theta)\left(u^{-} - v^{-}\right). \tag{41}
\]

Considering the possible ranges of $u$ and $v$, we have the following:

(i) For $u, v \ge 0$, it follows that $u^{-} = 0$ and $v^{-} = 0$; hence, equation (41) reduces to
\[
\rho_\theta(u) - \rho_\theta(v) = \theta\left(u - v\right). \tag{42}
\]
Since $\theta \in (0, 1)$ and $1 - \theta \in (0, 1)$, we have
\[
\left|\rho_\theta(u) - \rho_\theta(v)\right| \le \theta\, |u - v|, \tag{43}
\]
and so $\left|\rho_\theta(u) - \rho_\theta(v)\right|$ is bounded above by either $\theta |u - v|$ or $(1 - \theta) |u - v|$.
(ii) Similarly, when $u, v < 0$, $u^{+} = v^{+} = 0$; hence,
\[
\left|\rho_\theta(u) - \rho_\theta(v)\right| \le (1 - \theta)\, |u - v|. \tag{44}
\]
(iii) When $u$ and $v$ have opposite signs, $|u - v| = |u| + |v|$; hence,
\[
\left|\rho_\theta(u) - \rho_\theta(v)\right| \le \max(\theta, 1 - \theta)\, |u - v|. \tag{45}
\]

Combining equations (43)–(45), we have
\[
\left|\rho_\theta(u) - \rho_\theta(v)\right| \le \max(\theta, 1 - \theta)\, |u - v|. \tag{46}
\]
Thus, $\rho_\theta$ is Lipschitz continuous, and hence $\rho_\theta$ is differentiable almost everywhere by Rademacher's theorem in [14], which implies that $\rho_\theta$ is continuous everywhere.
To prove Condition 3(i), let $D(Y_t, X_{t-1}) = \sup_{f \in \mathcal{G}} \left|\rho_\theta\left(Y_t - f(X_{t-1})\right)\right|$, where the supremum exists by Assumption 3. Clearly, $D$ is measurable and $\left|\rho_\theta\left(Y_t - f(X_{t-1})\right)\right| \le D(Y_t, X_{t-1})$ for all $f \in \mathcal{G}$. Assumption 2 ensures that Condition 3(ii) is satisfied.
To verify Condition 4, let $\Delta(f) = Q(f) - Q(q_\theta)$; we need to show that $\Delta(f) > 0$ for any $f \neq q_\theta$. By Knight's identity in [15], we have
\[
\rho_\theta(u - v) - \rho_\theta(u) = -v\left(\theta - \mathbb{1}(u < 0)\right) + \int_{0}^{v} \left(\mathbb{1}(u \le s) - \mathbb{1}(u \le 0)\right) \mathrm{d}s. \tag{47}
\]
Thus, $\Delta(f) \ge 0$ for all $f \in \mathcal{G}$ by the monotonicity of the CDF. Therefore, $\hat{q}_\theta(Y_t \mid X_{t-1}) \xrightarrow{p} q_\theta(Y_t \mid X_{t-1})$ by Theorem 12.2 in [34].

Proof of Lemma 2. The proof proceeds in a similar way to the proof of Lemma 1.

Proof of Lemma 3. Observe that

\[
\left|\hat{F}(z) - F(z)\right| \le \left|F_n(u) - F(u)\right| + \left|G_{\hat{\xi}, \hat{\beta}}(z - u) - G_{\xi, \beta}(z - u)\right|. \tag{48}
\]

Note that $\left|F_n(u) - F(u)\right| \to 0$ (a.s.) as $n \to \infty$ by the Glivenko–Cantelli theorem in [33], and $G_{\hat{\xi}, \hat{\beta}} \to G_{\xi, \beta}$ as $n \to \infty$ heuristically from the consistency of the GPD parameters, Lemma 4.1 in [25], and the asymptotic normality of the PWM estimators in [13]. Therefore, since both terms on the right-hand side vanish, we have the result.

Proof of Lemma 4. Let $\hat{z}_\tau = \hat{F}^{-1}(\tau)$ and $z_\tau = F^{-1}(\tau)$, and note that, for any CDF $F$ with $f(z_\tau) > 0$,

\[
\left|\hat{z}_\tau - z_\tau\right| \le \frac{\left|\hat{F}(z_\tau) - F(z_\tau)\right|}{f(z_\tau)}\left(1 + o(1)\right), \tag{49}
\]

which tends to 0 as $n \to \infty$ by Lemma 3.

Proof of Theorem 2. Combining Lemma 1, Lemma 2, and Lemma 4, we have the desired result.

Data Availability

The data used in the article were simulated, and the data generating process (DGP) is included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the Pan-African University of Basic Sciences, Technology and Innovation (PAUSTI) for funding this research.