Journal of Probability and Statistics

Volume 2019, Article ID 7691841, 6 pages

https://doi.org/10.1155/2019/7691841

## Bootstrapping Nonparametric Prediction Intervals for Conditional Value-at-Risk with Heteroscedasticity

Department of Mathematics, Pan African University, Institute of Basic Sciences, Technology, and Innovation, Kenya

Correspondence should be addressed to Emmanuel Torsen; gn.ude.hcetuam@nesrot

Received 24 December 2018; Revised 16 April 2019; Accepted 23 April 2019; Published 7 May 2019

Academic Editor: Dejian Lai

Copyright © 2019 Emmanuel Torsen and Lema Logamou Seknewna. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Using the bootstrap method, we construct nonparametric prediction intervals for Conditional Value-at-Risk for returns that admit a heteroscedastic location-scale model in which the location and scale functions are smooth, the distribution function of the error term is unknown, and the error term is assumed to be uncorrelated with the independent variable. The prediction intervals perform well for large sample sizes and are relatively narrow, which is consistent with results reported in the literature.

#### 1. Introduction

The study of prediction intervals (PIs) spans many decades and can be traced back to the work of Baker in 1935 (see [1] for details). The three intervals most often used in statistics, namely, the confidence interval, the prediction interval, and the tolerance interval, were reviewed by [2]. A prediction interval contains a single future observation from a population with a specified coverage probability; in statistical modeling, such a future observation is assumed to follow a particular distribution. The distribution may have a finite number of unknown parameters (e.g., the normal distribution or the t distribution), and once those parameters are correctly estimated, the prediction interval can be obtained. Clearly, this methodology relies heavily on the assumed distribution of the future observation. If the distributional assumption is incorrect, the prediction interval may be incorrect as well. Alternatively, nonparametric methodology can be used to overcome this problem [3], and that is the focus of this paper. For the case in which the future observation does not depend on the sample under consideration, a good review of both parametric and nonparametric methods is given in [4].

In [5], the authors examined the problem of constructing nonparametric prediction intervals (NPIs) for a future sample median and used asymptotic relative efficiency to compare the NPIs with their parametric prediction interval (PPI) counterparts; the results showed the NPIs to have better properties. Reference [3] studied the construction of NPIs for future observations in mixed linear models, which the authors called distribution-free PIs in mixed linear models. They assumed that the future observation is independent of the present ones. The authors showed that, for standard mixed linear models, PIs based on regression residuals work well. For nonstandard mixed linear models, however, the construction is more involved, since one has to estimate the distribution of the random effect. Also, [6] considered the problem of constructing NPIs for a nonparametric autoregressive (NAR) model structure; the authors compared two strategies, a Gaussian strategy and a bootstrap strategy, and found that the latter outperformed the former.

Forecasting future values (observations) is one of the best-known problems in time-series modeling. To check forecast accuracy, the prediction error is defined and treated as a measure of the uncertainty of the forecast. Constructing PIs for future observations is, in general, closely related to this problem [6].

Value-at-Risk (VaR), as defined by [7], estimates the worst loss that could occur over a specified period of time at a given confidence level. VaR is an important risk measure in the financial industry; it is currently used globally by many financial companies (e.g., banks, investment funds, and brokerage firms) and even by nonfinancial companies.

Hence, in this paper, we consider the problem of using the bootstrap method to construct NPIs for Conditional Value-at-Risk (CVaR) when the returns admit a location-scale model with heteroscedasticity and the distribution of the innovation is assumed to be unknown.

#### 2. The Nonparametric Predictive Intervals (NPIs) for CVaR

We assume that the sequence $\{Y_t\}$ satisfies a certain weak dependence condition. Precisely, we consider the concept of $\alpha$-mixing (strong mixing) coefficients introduced by [8].

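*Definition 1 ($\alpha$-mixing (strong mixing)).* Let $\mathcal{F}_a^b$ be the $\sigma$-algebra of events generated by $\{Y_t,\ a \le t \le b\}$ for $a \le b$. The $\alpha$-mixing coefficient introduced by [8] is defined as

$$\alpha(k) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\ B \in \mathcal{F}_{k}^{\infty}} \left| P(A \cap B) - P(A)P(B) \right|.$$

The series $\{Y_t\}$ is said to be $\alpha$-mixing if

$$\lim_{k \to \infty} \alpha(k) = 0.$$

The dependence described by $\alpha$-mixing is the weakest, as it is implied by the other types of mixing.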

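*Definition 2 (pivotal quantity (pivot)).* Let $X = (X_1, \ldots, X_n)$ be random variables with unknown joint distribution $F$, and let $\theta(F)$ denote a real-valued parameter. A random variable $Q(X, \theta(F))$ is a pivot if the distribution of $Q(X, \theta(F))$ is independent of all parameters. That is, if $X \sim F(x \mid \theta)$, then $Q(X, \theta)$ has the same distribution for all values of $\theta$.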

Consider the estimator $\hat{q}_\tau(x)$ of the conditional $\tau$-quantile (the CVaR) given in (5); the asymptotic distribution of a pivotal quantity is used to construct confidence intervals (CIs). Let us define $Q$ to be the pivotal statistic given as

$$Q = \frac{\hat{q}_\tau(x) - q_\tau(x)}{\sqrt{\widehat{\operatorname{Var}}[\hat{q}_\tau(x)]}},$$

where $\widehat{\operatorname{Var}}[\hat{q}_\tau(x)]$ is the variance of the function estimator defined in (26). Similarly, one can construct PIs by simply replacing the standard deviation of the estimator by the standard deviation of the prediction (see [6] for details). Bias correction is discussed in Section 2.2.
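As a rough illustration of how a pivot yields an interval, the sketch below inverts a standardized statistic that is assumed to be approximately standard normal. The function name `pivot_interval` and the numbers in the usage line are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def pivot_interval(estimate, std_error, level=0.95):
    """Invert the pivot (estimate - target) / std_error, assumed ~ N(0, 1).

    Passing the standard deviation of the estimator gives a confidence
    interval; passing the standard deviation of the prediction gives a
    prediction interval.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical numbers, for illustration only.
lower, upper = pivot_interval(estimate=1.8, std_error=0.25)
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```

The normal approximation is only one way to calibrate the pivot; the bootstrap strategy of Section 2.2 estimates the distribution of the pivot from resampled data instead.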

##### 2.1. The Location-Scale Model Structure

Consider $\{Y_t\}$ to be a stochastic process representing the returns on a given stock, portfolio, or market index, where $t \in \mathbb{Z}$ indexes a discrete measure of time, and let $F_{Y_t \mid X_t = x}$ denote the conditional distribution of $Y_t$ given $X_t = x$. The vector $X_t \in \mathbb{R}^d$ normally includes lagged returns $Y_{t-\ell}$, $1 \le \ell \le p$, for some $p \in \mathbb{N}$, as well as other relevant conditioning variables that reflect economic or market conditions.

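Here, we assume that the process admits a location-scale representation given as

$$Y_t = m(X_t) + \sigma(X_t)\,\varepsilon_t, \tag{3}$$

where $m(\cdot)$ is the unknown nonparametric regression curve and $\sigma(\cdot) > 0$ is a conditional scale function representing heteroscedasticity, defined on the range of $X_t$; $\varepsilon_t$ is independent of $X_t$ and is an independent and identically distributed (iid) innovation process with $E(\varepsilon_t) = 0$, $\operatorname{Var}(\varepsilon_t) = 1$, and a distribution function $F_\varepsilon$ that is assumed to be unknown.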

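From (3),

$$q_\tau(x) = m(x) + \sigma(x)\,q_\varepsilon(\tau), \tag{4}$$

where $q_\tau(x)$ is the conditional $\tau$-quantile associated with $F_{Y_t \mid X_t = x}$ and $q_\varepsilon(\tau)$ is the $\tau$-quantile associated with the unknown $F_\varepsilon$.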

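It follows from (4) that

$$\hat{q}_\tau(x) = \hat{m}(x) + \hat{\sigma}(x)\,\hat{q}_\varepsilon(\tau). \tag{5}$$

The asymptotic properties of this estimator have been studied in [9]. Furthermore, the estimators $\hat{m}(x)$ and $\hat{\sigma}(x)$ in (5) have been studied extensively in [10, 11] (see those articles for details). See also [12] for a detailed discussion of the estimator $\hat{q}_\varepsilon(\tau)$ in (5). Nevertheless, we briefly state the results of these authors.

Reference [10] treated the estimation of $m(x)$ in (4) as a Local Linear Regression (LLR) problem, namely, that of estimating the intercept $a$ of a local linear fit. Suppose that the second derivative of $m$ exists in a small neighborhood of $x$; then, for $z$ in that neighborhood,

$$m(z) \approx m(x) + m'(x)(z - x) \equiv a + b(z - x).$$

Now, let us consider a sample $\{(X_i, Y_i)\}_{i=1}^{n}$ and LLR: find $a$ and $b$ to minimize

$$\sum_{i=1}^{n} \left\{ Y_i - a - b(X_i - x) \right\}^2 K\!\left(\frac{X_i - x}{h_1}\right), \tag{7}$$

where $K$ is a kernel function and $h_1 > 0$ is a bandwidth. Let $\hat{a}$ and $\hat{b}$ be the solution to the Weighted Least Squares (WLS) problem in (7). Then

$$\hat{a} = \frac{\sum_{i=1}^{n} w_i Y_i}{\sum_{i=1}^{n} w_i},$$

where

$$w_i = K\!\left(\frac{X_i - x}{h_1}\right)\left\{ s_{n,2} - (X_i - x)\,s_{n,1} \right\}$$

and

$$s_{n,j} = \sum_{i=1}^{n} K\!\left(\frac{X_i - x}{h_1}\right)(X_i - x)^j, \quad j = 1, 2.$$

Thus, the LLR estimator for $m(x)$ is defined as

$$\hat{m}(x) = \hat{a},$$

where the mean, variance, and bias of this estimator are, asymptotically,

$$E[\hat{m}(x)] \approx m(x) + \frac{h_1^2}{2}\,\mu_2(K)\,m''(x), \qquad \operatorname{Var}[\hat{m}(x)] \approx \frac{\|K\|_2^2\,\sigma^2(x)}{n h_1 f(x)},$$

and

$$\operatorname{Bias}[\hat{m}(x)] \approx \frac{h_1^2}{2}\,\mu_2(K)\,m''(x),$$

with $\mu_2(K) = \int u^2 K(u)\,du$, $\|K\|_2^2 = \int K^2(u)\,du$, and $f$ the density of $X_t$.

Next, we briefly state the estimation of $\sigma^2(x)$ in (4), following the procedure proposed by [11]. The estimator of $\sigma^2(x)$ is based on the identity

$$\sigma^2(x) = E[r_t \mid X_t = x],$$

where

$$r_t = \{Y_t - m(X_t)\}^2.$$

Now, with the estimator $\hat{m}$ in (12), we have the sequence of squared residuals $\hat{r}_i = \{Y_i - \hat{m}(X_i)\}^2$, $i = 1, \ldots, n$. Therefore, applying LLR to the pairs $(X_i, \hat{r}_i)$ with a kernel $W$ and a bandwidth $h_2 > 0$ gives

$$\hat{\alpha} = \frac{\sum_{i=1}^{n} v_i \hat{r}_i}{\sum_{i=1}^{n} v_i},$$

where

$$v_i = W\!\left(\frac{X_i - x}{h_2}\right)\left\{ t_{n,2} - (X_i - x)\,t_{n,1} \right\}$$

and

$$t_{n,j} = \sum_{i=1}^{n} W\!\left(\frac{X_i - x}{h_2}\right)(X_i - x)^j, \quad j = 1, 2.$$

Hence, the smooth estimator for $\sigma^2(x)$ is

$$\hat{\sigma}^2(x) = \hat{\alpha},$$

with mean, variance, and bias given asymptotically as

$$E[\hat{\sigma}^2(x)] \approx \sigma^2(x) + \frac{h_2^2}{2}\,\mu_2(W)\,(\sigma^2)''(x), \qquad \operatorname{Var}[\hat{\sigma}^2(x)] \approx \frac{\|W\|_2^2\,\sigma^4(x)\,\lambda^2}{n h_2 f(x)},$$

and

$$\operatorname{Bias}[\hat{\sigma}^2(x)] \approx \frac{h_2^2}{2}\,\mu_2(W)\,(\sigma^2)''(x),$$

where $\lambda^2 = E[(\varepsilon_t^2 - 1)^2]$.

The estimators in (12) and (20) are then used to get a sequence of standardized nonparametric residuals (SNR) $\{\hat{\varepsilon}_i\}_{i=1}^{n}$, where

$$\hat{\varepsilon}_i = \frac{Y_i - \hat{m}(X_i)}{\hat{\sigma}(X_i)}.$$

The cumulative distribution function estimator of $\varepsilon_t$ is obtained from these SNR:

$$\hat{F}_\varepsilon(z) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\hat{\varepsilon}_i \le z\},$$

which is the unconditional cumulative distribution estimator for $\varepsilon_t$, the error innovation (see [12] for more details).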
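The two-step procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a Gaussian kernel, a scalar conditioning variable, hand-picked bandwidths `h1` and `h2`, a small positive floor on the fitted variance, and toy data; the names `local_linear` and `fit_mean_and_scale` are hypothetical.

```python
import math
import random


def local_linear(x0, xs, ys, h):
    """Local linear estimate at x0 (Gaussian kernel, bandwidth h)."""
    d = [x - x0 for x in xs]
    k = [math.exp(-0.5 * (di / h) ** 2) for di in d]
    s1 = sum(ki * di for ki, di in zip(k, d))
    s2 = sum(ki * di * di for ki, di in zip(k, d))
    # Weights of the closed-form weighted least squares intercept.
    w = [ki * (s2 - di * s1) for ki, di in zip(k, d)]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def fit_mean_and_scale(xs, ys, h1, h2, floor=1e-8):
    """Two-step fit: mean by LLR, variance by LLR on squared residuals."""
    m_hat = [local_linear(x, xs, ys, h1) for x in xs]
    sq_res = [(y - m) ** 2 for y, m in zip(ys, m_hat)]
    # A local linear variance fit can be negative; the floor is an assumption.
    s_hat = [math.sqrt(max(local_linear(x, xs, sq_res, h2), floor)) for x in xs]
    return m_hat, s_hat


# Toy heteroscedastic data (hypothetical, not the paper's simulation design).
rng = random.Random(1)
xs = [rng.uniform(-2.0, 2.0) for _ in range(200)]
ys = [math.sin(x) + (0.5 + 0.25 * abs(x)) * rng.gauss(0.0, 1.0) for x in xs]

m_hat, s_hat = fit_mean_and_scale(xs, ys, h1=0.4, h2=0.6)
snr = [(y - m) / s for y, m, s in zip(ys, m_hat, s_hat)]  # standardized residuals
print(round(sum(snr) / len(snr), 3))
```

A useful sanity check on the weights is that a local linear fit reproduces exactly linear data at any evaluation point, whatever the bandwidth.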

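Now [12] estimated $q_\varepsilon(\tau)$ by $\hat{q}_\varepsilon(\tau)$, the sample $\tau$-quantile of the standardized residuals $\{\hat{\varepsilon}_i\}_{i=1}^{n}$:

$$\hat{q}_\varepsilon(\tau) = \inf\left\{ z : \hat{F}_\varepsilon(z) \ge \tau \right\},$$

where $\hat{F}_\varepsilon$ is the distribution function estimator based on these residuals. To construct the NPIs discussed, we compute the mean and variance of the estimator in (5).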
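To make the plug-in step concrete, the sketch below computes a sample quantile of standardized residuals and combines it with fitted location and scale values at a single point. The helper names and the numbers are hypothetical; the quantile is taken as the smallest order statistic at which the empirical distribution function reaches the level, which is one common convention and an assumption here.

```python
import math


def sample_quantile(values, tau):
    """Smallest order statistic z with empirical CDF(z) >= tau, 0 < tau <= 1."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * tau))  # 1-based rank
    return ordered[rank - 1]


def cvar_estimate(m_x, sigma_x, residuals, tau):
    """Plug-in conditional quantile: m(x) + sigma(x) * q_eps(tau)."""
    return m_x + sigma_x * sample_quantile(residuals, tau)


# Hypothetical standardized residuals and fitted values at one point x.
residuals = [-1.4, -0.6, -0.1, 0.2, 0.3, 0.9, 1.1, 1.8]
print(cvar_estimate(m_x=0.05, sigma_x=1.2, residuals=residuals, tau=0.75))
```

Other quantile conventions (for example, interpolating between order statistics) would change the result slightly in small samples.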

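The mean is

$$E[\hat{q}_\tau(x)] = E[\hat{m}(x)] + \hat{q}_\varepsilon(\tau)\,E[\hat{\sigma}(x)].$$

For simplicity, we use the variance of the variance function estimator instead of that of its standard deviation, and the variance of the estimator is

$$\operatorname{Var}[\hat{q}_\tau(x)] = \operatorname{Var}[\hat{m}(x)] + \hat{q}_\varepsilon^{\,2}(\tau)\,\operatorname{Var}[\hat{\sigma}^2(x)] + 2\,\hat{q}_\varepsilon(\tau)\,\operatorname{Cov}\!\left(\hat{m}(x), \hat{\sigma}^2(x)\right),$$

where $\operatorname{Cov}(\hat{m}(x), \hat{\sigma}^2(x))$ is the covariance between the mean and variance function estimators. This covariance is zero because its arguments are the mean and variance functions, respectively, and hence are independent (see [13]).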

Next, we state the asymptotic distribution of the estimator $\hat{q}_\tau(x)$, which is built from Local Linear Regression estimators. Assume that $f(x)$ is the density of the lag vector of returns at point $x$. Then, by the central limit theorem, the asymptotic normal distribution of $\hat{q}_\tau(x)$ is given as

$$\frac{\hat{q}_\tau(x) - E[\hat{q}_\tau(x)]}{\sqrt{\operatorname{Var}[\hat{q}_\tau(x)]}} \xrightarrow{d} N(0, 1).$$

The asymptotic bias shows that the second derivatives of the mean function and the variance function have to exist in the neighborhood of $x$ for this asymptotic normal distribution to hold. Hence, we assume that $m$ and $\sigma^2$ are twice continuously differentiable; we also assume that $\sigma^2$ and $f$ are both continuous and that $f(x) > 0$, since both appear in the variance.

A consistent bias estimate requires the estimation of second-order derivatives, which may lead to large variability. Hence, as argued by [6], it is sensible to compute NPIs without the bias correction, in our case (26).

With the estimator of the residual distribution (23) proposed by [12], the estimators ((12) and (20)) of the mean and variance functions together with their asymptotic properties, and the asymptotic normal distribution of the CVaR estimator, one can obtain NPIs for CVaR.

##### 2.2. Bootstrap Method

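This strategy consists of estimating the distribution of the pivotal quantity given below:

$$Q = \frac{\hat{q}_\tau(x) - q_\tau(x)}{\sqrt{\widehat{\operatorname{Var}}[\hat{q}_\tau(x)]}} \tag{31}$$

using the bootstrap method. The distribution of (31) is approximated by the distribution of the bootstrapped statistic

$$Q^{*} = \frac{\hat{q}^{*}_\tau(x) - \hat{q}_\tau(x)}{\sqrt{\widehat{\operatorname{Var}}^{*}[\hat{q}^{*}_\tau(x)]}},$$

where the superscript $*$ denotes the bootstrap counterparts of the estimates. Hence, we have the following nonparametric prediction interval with asymptotic coverage probability $1 - \alpha$:

$$\left[\, \hat{q}_\tau(x) - c^{*}_{1-\alpha/2}\sqrt{\widehat{\operatorname{Var}}[\hat{q}_\tau(x)]},\ \ \hat{q}_\tau(x) - c^{*}_{\alpha/2}\sqrt{\widehat{\operatorname{Var}}[\hat{q}_\tau(x)]} \,\right],$$

where $c^{*}_{p}$ denotes the $p$-quantile of the distribution of $Q^{*}$. The algorithm for the bootstrap procedure is given as follows: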

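*Algorithm (Bootstrap)*

(1) Generate $B$ data sets from the unknown probability model of the data-generating process in (37), with independent and identically distributed random errors from some unknown probability distribution function $F_\varepsilon$.

(2) Calculate $\hat{m}(x)$ and $\hat{\sigma}^2(x)$. The estimated errors are as in (23), from which the estimator $\hat{q}_\varepsilon(\tau)$ in (24) is obtained.

(3) Compute $\hat{q}^{(b)}_\tau(x)$ for each process $b$, $b = 1, \ldots, B$.

(4) Compute the average function given by

$$\bar{q}_\tau(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{q}^{(b)}_\tau(x).$$

(5) The standard error between the curves is

$$\widehat{\operatorname{se}}(x) = \sqrt{\frac{1}{B - 1} \sum_{b=1}^{B} \left( \hat{q}^{(b)}_\tau(x) - \bar{q}_\tau(x) \right)^2 }.$$

(6) The lower and upper bounds of the NPIs at level $1 - \alpha$ are therefore given by

$$\bar{q}_\tau(x) \mp z_{1-\alpha/2}\,\widehat{\operatorname{se}}(x),$$

where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$ quantile of the empirical cumulative distribution function of the error in (24) above.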
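The steps above can be sketched as follows. This is a simplified illustration rather than the authors' procedure, and it swaps in several assumptions that are named here plainly: the bootstrap data sets are built by resampling standardized residuals with the design points held fixed (instead of regenerating data from the data-generating process), a single bandwidth `h` and a Gaussian kernel are used throughout, the fitted variance is floored at a small positive value, and the multiplier in the last step is a standard normal quantile. All function names and the toy data are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def local_linear(x0, xs, ys, h):
    """Local linear estimate at x0 (Gaussian kernel, bandwidth h)."""
    d = [x - x0 for x in xs]
    k = [math.exp(-0.5 * (di / h) ** 2) for di in d]
    s1 = sum(ki * di for ki, di in zip(k, d))
    s2 = sum(ki * di * di for ki, di in zip(k, d))
    w = [ki * (s2 - di * s1) for ki, di in zip(k, d)]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def fit_cvar(x0, xs, ys, h, tau):
    """Return the CVaR estimate at x0 plus fitted means, scales, residuals."""
    m_fit = [local_linear(x, xs, ys, h) for x in xs]
    sq_res = [(y - m) ** 2 for y, m in zip(ys, m_fit)]

    def scale_at(x):
        # A local linear variance fit can be negative; the floor is an assumption.
        return math.sqrt(max(local_linear(x, xs, sq_res, h), 1e-8))

    s_fit = [scale_at(x) for x in xs]
    eps = sorted((y - m) / s for y, m, s in zip(ys, m_fit, s_fit))
    q_eps = eps[max(1, math.ceil(len(eps) * tau)) - 1]  # sample tau-quantile
    q_x0 = local_linear(x0, xs, ys, h) + scale_at(x0) * q_eps
    return q_x0, m_fit, s_fit, eps


def bootstrap_npi(x0, xs, ys, h, tau=0.95, n_boot=20, level=0.95, seed=0):
    """Residual-bootstrap interval for the CVaR at x0 (design held fixed)."""
    rng = random.Random(seed)
    _, m_fit, s_fit, eps = fit_cvar(x0, xs, ys, h, tau)
    curves = []
    for _ in range(n_boot):
        # Rebuild responses from the fitted functions and resampled residuals.
        y_star = [m + s * rng.choice(eps) for m, s in zip(m_fit, s_fit)]
        curves.append(fit_cvar(x0, xs, y_star, h, tau)[0])
    centre = mean(curves)   # step (4): average of the bootstrap estimates
    se = stdev(curves)      # step (5): spread between the bootstrap estimates
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal quantile (assumption)
    return centre - z * se, centre + z * se      # step (6): lower, upper bound


# Toy data (hypothetical, not the paper's simulation design).
rng = random.Random(7)
xs = [rng.uniform(-2.0, 2.0) for _ in range(80)]
ys = [math.sin(x) + (0.5 + 0.25 * abs(x)) * rng.gauss(0.0, 1.0) for x in xs]
lower, upper = bootstrap_npi(0.0, xs, ys, h=0.5)
print(f"95% NPI for the 95% CVaR at x = 0: [{lower:.3f}, {upper:.3f}]")
```

The sample size and the number of bootstrap replications are kept small so that the pure-Python version runs quickly; in practice both would be much larger, and for time-series data the bootstrap series would be regenerated recursively rather than with a fixed design.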

#### 3. Simulation Study

To illustrate the NPIs, we conducted a simulation study based on the data-generating location-scale model (37), which has the form (3) with specified mean and scale functions. The initial values of the series are set to zero (0); then $Y_t$ is generated recursively from (37). This data-generating process was also considered by [14] and used by [12]. We simulated $n$ observations and $B$ samples for the purpose of bootstrapping.
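The recursive generation of returns from a location-scale model can be sketched as below. The mean function `mean_fn`, the scale function `scale_fn`, the standard normal innovations, the single lag, and the burn-in period are all illustrative assumptions; they are not the functions or settings used in the simulation study.

```python
import math
import random


def mean_fn(y_prev):
    """Hypothetical location function (not the paper's choice)."""
    return 0.3 * y_prev


def scale_fn(y_prev):
    """Hypothetical scale function (not the paper's choice)."""
    return math.sqrt(0.5 + 0.2 * y_prev ** 2)


def simulate_returns(n, burn_in=200, seed=0):
    """Generate Y_t = m(Y_{t-1}) + sigma(Y_{t-1}) * eps_t, starting from Y_0 = 0."""
    rng = random.Random(seed)
    y_prev, path = 0.0, []
    for _ in range(n + burn_in):
        # Standard normal innovations are an assumption made for illustration.
        y_prev = mean_fn(y_prev) + scale_fn(y_prev) * rng.gauss(0.0, 1.0)
        path.append(y_prev)
    return path[burn_in:]  # drop the burn-in so the zero start is forgotten


returns = simulate_returns(500, seed=42)
print(len(returns), round(min(returns), 3), round(max(returns), 3))
```

Discarding a burn-in period is a common way to reduce the influence of the zero initial value on the simulated series.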

As can be seen in Figures 2 and 3, the nonparametric prediction intervals obtained by the bootstrap method for Conditional Value-at-Risk perform well: the 95% Conditional Value-at-Risk is contained within the prediction bounds. Figure 1 shows the time-series plot of the simulated daily returns generated using the data-generating process in (37). The procedure for NPIs discussed in this paper works well for large sample sizes.