Abstract
The use of a BDF method as a tool to correct the direction of predictions made using curve fitting techniques is investigated. Random data is generated in such a fashion that it has the same properties as the data we are modelling. The data is assumed to have “memory” such that certain information embedded in the data will remain within a certain range of points. Data within this period where “memory” exists is curve-fitted to produce a prediction at the next discrete time step. In this manner a vector of predictions is generated and converted into a discrete ordinary differential equation representing the gradient of the data. The BDF method implemented with this lower order approximation is used as a means of improving upon the direction of the generated predictions. The use of the BDF method in this manner improves the prediction of the direction of the time series by approximately 30%.
1. Introduction
In this brief note we show how a BDF method can be used as a corrector for predictions. BDF methods are backward differentiation formulae which are a family of multistep implicit methods. They are designed to solve initial value ordinary differential equations. The derivative of a function is approximated using information computed from earlier time steps, thereby increasing the accuracy of the approximation. This characteristic makes BDF methods ideal for our purposes where we seek to improve upon already existing data in the form of predictions. The predictions aim at accurately reflecting the direction of random data and are made using curve fitting techniques. This research comes out of a project undertaken to predict the direction of a subset of the South African market. The approach taken in analysing the data assumes that the data has a “memory.” More precisely this presupposes that a time series will have certain periods when the data has the same inherent information and dynamics. This allows us to conclude that the same information embedded in a previous selection of data points is still contained within the data point to be predicted from that set. While the data we generate in this paper is random with zero mean we are still able to show how an application of a BDF method improves the degree of accuracy in predicting the direction the data takes. The BDF formulae are constructed by satisfying the differential equation exactly at one point and then interpolating previous points. A Lagrangian interpolation is typically used. The initial prediction of direction is made using linear/spline curve fitting. The implementation of the BDF is not done directly; rather it is combined with a lower order approximation of the gradient of the data vector which leads to the difference equation we aim to use. 
The novelty of the approach taken here is that we iterate the difference equation structured from the BDF formula and the lower order approximation of the gradient to convergence.
The random walk hypothesis has had its fair share of attention as a means of explaining stock price movements. This financial theory states that stock market prices evolve according to a random walk and thus the prices of the stock market cannot be predicted. While the work undertaken in this paper concurs with this theory with regard to the random walk followed by the relevant data, we still maintain an assumption that there exists embedded information within the relevant data which can be modelled. Thus we assume a consistent underlying dynamic which can be identified and used as a means of extrapolating beyond known data points, that is, predict future movements in market prices. Theoretical developments in mathematics of finance have centred around the random walk hypothesis [1, 2] and the fact that the market cannot be predicted when using this hypothesis. Various modifications to this hypothesis have been proposed with the development of the theory of Martingales [3, 4]. The validity of the random walk hypothesis has been questioned in many studies. Most prominent among these is the book by Lo and MacKinlay [5].
While the mathematical theory used to develop the mathematical aspects of finance has not really focused on predicting returns there has been a strong interest in developing tools which can give a sense of the direction the market is going to take or when possible turning points will occur. The inclusion of ideas from the social sciences in financial mathematics has heralded the potential development of tools that can be used to aid the prediction of market trends. Among these ideas are aspects of behavioral science [6, 7] which studies the influence of psychology on the behaviour of financial practitioners and the subsequent effect on markets [8]. This theory suggests that, since the irrational behaviours of traders impact price movements, a time series of prices contains information which does not reflect what could be termed logical or mathematical dynamics. Behavioural science brings our attention to the possibility that “noise” may have been incorporated into the data obtained from price movements which makes it difficult to determine some identifiable characteristics which can be used to predict future movements. This theory is related to the random walk hypothesis in the sense that both indicate some irrational behaviour in the data. In this paper we have assumed that there is however some rational underlying dynamics which can be investigated mathematically which allows us to make predictions of future price movements. In some sense the impact of “noise,” due to the irrational behaviour of traders, on price movements is an obstacle which we believe we have overcome by being able to improve upon the direction of our predictions. Other tools considered as aids for predicting market trends are notions of overreaction, underreaction, and contrarian strategies [7, 9–12]. Berman [13] has attempted one of the first studies to analyse the Global Real Estate Securities market using aspects of these contrarian ideas.
The paper is set out as follows. In Section 2 we develop and motivate the algorithm. Convergence properties of the algorithm are discussed in Section 3. Results and concluding remarks are presented in Section 4.
2. Algorithm Description
The first part of this analysis is to generate random data that may be used to simulate an actual financial time series. Here we use the MATLAB function randn, which generates pseudorandom scalar values drawn from a normal distribution with mean zero and standard deviation one. The code used to generate this data is presented in Algorithm 1. We start out with an element that has value zero and then start stepping along the axis. The direction in which we step depends on whether the number output by randn is greater or less than a half. The magnitude of the step is determined by the magnitude of the number output by randn. We continue stepping in this way a finite number of times; 100 steps were chosen in this instance. The last data point is put into a vector which we use to simulate our time series of returns. Only the last point is chosen since the vector which has just been created consists of points with relatively small errors between them; that is, this vector is not a good representation of the data we are trying to simulate. We continue running the random walk until we have a vector of returns. We subtract the mean and divide by the variance in order to produce data that is more reflective of financial returns obtained from price movements, that is, data within a restricted range. The return data generated in this way has zero mean with standard deviation smaller than one. The standard deviation is not one since the data has not been normalised; that is, we divided by the variance and not the standard deviation. This was done in order to maintain this strict range for the data. In Figure 1 we plot three simulations of return data obtained in this way. When we compare the generated data to an actual time series of returns from the South African market we find that our simulation is fairly accurate, since the movements exhibited by the returns of the South African market are similar to those depicted in Figure 1.
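The data generation described above can be sketched as follows. This is a minimal Python sketch, not the authors' Algorithm 1 (which is written in MATLAB): `random.gauss` stands in for `randn`, the function name `simulate_returns` and its parameters are hypothetical, the threshold of a half is taken literally from the description, and the use of the sample variance is an assumption.

```python
import random
import statistics


def simulate_returns(n_returns=100, n_steps=100, threshold=0.5, seed=None):
    """Simulate a vector of returns from the end points of random walks."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n_returns):
        position = 0.0  # each walk starts at zero
        for _ in range(n_steps):
            draw = rng.gauss(0.0, 1.0)  # stand-in for MATLAB's randn
            step = abs(draw)            # magnitude of the step
            # Direction depends on whether the draw exceeds the threshold.
            position += step if draw > threshold else -step
        returns.append(position)        # keep only the last point of the walk
    mean = statistics.fmean(returns)
    variance = statistics.variance(returns)  # sample variance (assumption)
    # Subtract the mean and divide by the variance, not the standard deviation.
    return [(r - mean) / variance for r in returns]


returns = simulate_returns(seed=0)
```

Dividing by the variance rather than the standard deviation leaves the simulated returns with zero mean and a standard deviation below one, in line with the description above.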

We then choose an appropriate length of data to indicate what we term “memory” in the data. For the purposes of this paper we assume that using four initial data points is sufficient to account for the memory. Given that the behaviour observed in Figure 1 is highly oscillatory, a short memory span of four points seems sensible for a data series of 100 points as was chosen here. This indicates that any rational underlying dynamics are not observed in the long term but rather in the short term. We then fit a curve through the initial four points of the vector which represents our actual known data points, and use this fitted curve to predict the value of the fifth point. We then use data values two to five to predict the sixth value. Continuing in this way we end up with a vector of actual known values and a vector of predicted values, the latter four data points shorter than the original.
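The sliding-window prediction step can be illustrated with a short sketch. It assumes that the "linear curve" is a least-squares straight line through the four points evaluated one step beyond the window; the paper does not state the fitting details, and the function names `linear_fit_predict` and `rolling_predictions` are hypothetical. The spline variant is not shown.

```python
def linear_fit_predict(window):
    """Fit a least-squares line through the window and evaluate it one step ahead."""
    n = len(window)
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window))
    slope = sxy / sxx
    # The next point sits at x = n, one step past the end of the window.
    return y_mean + slope * (n - x_mean)


def rolling_predictions(data, memory=4):
    """Predict data[i + memory] from data[i : i + memory] for every full window."""
    return [linear_fit_predict(data[i:i + memory])
            for i in range(len(data) - memory)]


data = [2.0 * t + 1.0 for t in range(10)]  # exactly linear test data
predictions = rolling_predictions(data)     # predictions for data[4:], memory points shorter
```

On exactly linear data the predictions reproduce the known values, which gives a quick sanity check before the routine is applied to simulated returns.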
Our aim is to predict—and improve upon—the direction of the data, that is, whether the quantitative value is positive or negative, which is indicative of whether market prices are moving up or down. Since most forecasting systems are far from accurate when predicting the quantitative value, predicting direction is far easier and can be equally profitable. For instance, for a system that draws its conclusion of how to trade tomorrow from the closing price of today’s action, getting the direction is vitally important. We are not looking at this from a multisignal/asset point of view which would require the determination of how much to invest in each asset. This would be along the lines of an efficient frontier in Modern Portfolio Theory which would take factors like standard deviation and error in the forecast into account. Rather, we are simply wishing to trade on the back of successfully predicting direction. Thus while considering the distribution of the time series itself and the relevant mean and standard deviation may be more accurate quantitatively, trading on the predictions of the direction the prices are moving in can in itself be very profitable.
As a consequence the success of our methodology is calculated by considering the percentage of times the sign of the predicted data matches the sign of the actual originally generated data. To improve the accuracy of predicting direction we make use of the fact that the direction is just the gradient of the data. By creating a vector of gradients we have the numerical representation of an ordinary differential equation. We then use the structure of a BDF method to numerically solve the ordinary differential equation. BDF methods are appropriate because they depend on previous values. Some examples of BDF methods for solving the first order ordinary differential equation (1) are given by (2)–(4).
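For reference, the standard textbook backward differentiation formulae of orders one to three for a first order equation $y' = f(t, y)$ with step size $h$ are written out below. The notation ($t_n = t_0 + nh$, $f_n = f(t_n, y_n)$) is the usual one and is an assumption here; the formulae numbered (2)–(4) in this paper may use different indexing.

```latex
\begin{align*}
  y_{n+1} - y_n &= h f_{n+1}, \\
  y_{n+2} - \tfrac{4}{3} y_{n+1} + \tfrac{1}{3} y_n &= \tfrac{2}{3} h f_{n+2}, \\
  y_{n+3} - \tfrac{18}{11} y_{n+2} + \tfrac{9}{11} y_{n+1} - \tfrac{2}{11} y_n &= \tfrac{6}{11} h f_{n+3}.
\end{align*}
```

Each formula is implicit, since the right-hand side is evaluated at the newest point, and each uses one more previous value than the one before it.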
Using a forward difference approximation to the derivative we obtain (5), which involves the value we are trying to improve. With this approximation the BDF method (3) becomes (6). Similarly, by approximating the derivative by a central difference approximation we obtain (7).
As stated in their names, BDF methods are backward approximations of the first order derivative in a first order ordinary differential equation. In this instance, however, we are not applying the BDF method to an ODE but rather to actual discrete data points. Equations (2)–(4), being discrete, are applicable when considering actual data instead of an ODE as is usually the case. Since we do not have the right-hand side function of (1), we take an approximation of the first order derivative as per (5). The BDF method approximates a differential equation, whereas a finite difference approximation is a means of approximating the gradient, which makes it a convenient substitute for this function. This means that a lower order approximation of a gradient, that is, the finite difference approximation, is incorporated into a higher order approximation of what we can term our discrete ODE.
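The two gradient approximations mentioned here can be compared on discrete data with a small sketch. These are the standard forward and central difference quotients; the function names and the unit step size are assumptions, and the indexing need not match that of (5) or (7).

```python
def forward_difference(y, i, h=1.0):
    """First order approximation of the gradient at index i."""
    return (y[i + 1] - y[i]) / h


def central_difference(y, i, h=1.0):
    """Second order approximation of the gradient at index i."""
    return (y[i + 1] - y[i - 1]) / (2.0 * h)


samples = [float(t * t) for t in range(6)]  # y = t^2, exact gradient at t = 2 is 4
fwd = forward_difference(samples, 2)         # (9 - 4) / 1
ctr = central_difference(samples, 2)         # (9 - 1) / 2
```

For this quadratic the central difference recovers the exact gradient while the forward difference is off by the step size, which illustrates the difference in order between the two.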
Thus (6) mixes first and second order approximations of the first order derivative. Equation (7) mixes a centred approximation at one position with a backward approximation at another, which is equivalent to a 0 order approximation. The terms in formulae (6) and (7) do not all cancel out, and the approach remains relevant, because the formulae mix approximations of different orders. Our purpose is to improve upon already obtained predictions, or an already obtained solution. More precisely, our predictions play the role of the function, and it is exactly this function, as discrete points, which we aim to improve upon, in the same way that predictor-corrector methods are used to improve upon solutions obtained via other numerical methods such as Euler's method. Hence we are in fact not solving the ODE again, only improving on already known data in a pointwise fashion.
It is important to note that we are not implementing the BDF method in a direct fashion. We are incorporating a lower order approximation into the method in order to obtain a means of improving on already generated data. In the computational implementation of this work, the BDF method given by (6) is iterated to convergence. The initial value is obtained from either a linear or spline interpolation. The value of can be obtained using a neural network as discussed in other literature [14–16].
3. Convergence Properties
In this section we investigate the convergence properties of the difference equations (6) and (7). We want to show that (8) holds by considering the general solution to (6), which is given by (9). From (9) we find that (8) holds in the limit. In a similar fashion it is relatively easy to show that (8) is satisfied for (7), indicating convergence. In fact, scheme (7) converges faster than (6) since the relevant coefficient is smaller.
When either (6) or (7) is iterated to convergence, the iteration is stopped once the stopping criterion is met to within a prescribed tolerance. For both (6) and (7) we find that (11) holds at convergence. It is in fact the right-hand side of (11) which indicates the “correction” made to the original prediction. By implementing a lower order approximation within a higher order approximation, as we have done in the previous section, we have been able to obtain this necessary difference equation, which is the manner in which the predictions are improved upon.
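The link between the size of a coefficient and the speed of convergence can be illustrated on a generic linear difference equation iterated to a tolerance. This is only a sketch: the update `x_new = a * x + b`, the coefficients `a` and `b`, and the stopping test on successive iterates are hypothetical placeholders and are not the coefficients or the criterion of (6) or (7).

```python
def iterate_to_convergence(a, b, x0=0.0, tol=1e-10, max_iter=10_000):
    """Iterate x <- a * x + b until successive iterates differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = a * x + b
        if abs(x_new - x) < tol:
            return x_new, k  # converged value and number of iterations used
        x = x_new
    raise RuntimeError("iteration did not converge")


# For |a| < 1 the iteration converges to b / (1 - a).
limit_slow, n_slow = iterate_to_convergence(a=0.5, b=1.0)
limit_fast, n_fast = iterate_to_convergence(a=0.25, b=1.0)
```

With the smaller coefficient the error shrinks by a larger factor per step, so fewer iterations are needed to reach the same tolerance.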
The results and concluding remarks are presented in the next section.
4. Results and Concluding Remarks
As a means of evaluating the effectiveness of the BDF method as implemented above to improve upon the prediction of the direction of data, we compare the accuracy of the originally fitted data and the corrected data. Direction success rate can be seen as a simple bimodal result: a prediction counts as a success when its sign matches the sign of the true value and as a failure otherwise. Choosing a sign-based criterion is a standard method in financial analysis. The criteria are appropriate within the context of stock market modelling given the growing interest in developing tools which can give a sense of the direction in which the market may move or when possible turning points or shocks in returns will occur. This information is critical irrespective of the actual numerical value assigned to the movement, given that the numerical values of the data used and analysed within this context are not necessarily useful in identifying characteristics which can be used to predict future movements. Hence, if the return on a stock decreases from 0.15 to 0.14, that is, by 0.01, for example, a prediction of −1 is of more value than a prediction of +0.01. This is because the former prediction indicates the market trend, whereas the latter only provides a numerical value which, within the context of market modelling, loses value since it indicates a rise in the return at the next time step, which is false. Thus the numerical value does not hold as much value in market trend trading as does the direction indicated by the sign of our prediction. In terms of our criterion, we have simply followed convention based upon methods currently employed by traders who trade according to market directions or trends.
In Table 1 we compare the percentage accuracy of predictions obtained through fitting a linear curve or spline through four points and the improved predictions found via (6). On average, this simple approach ensures that we get the direction correct 87% of the time in comparison to 48% for a linear curve, and 86% of the time in comparison to 52% for a spline curve. These percentages have been calculated by counting the number of times the sign of the actual generated point (as in the three samples represented in Figure 1) matches the sign of the fitted or corrected point, and dividing by the overall number of points. The number of points used, as indicated before, was 100. This is an arbitrary choice made simply for convenience.
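The sign-matching percentage described above can be computed with a few lines of code. This is a minimal sketch; the function names are hypothetical, and treating a zero as matching only another zero is an assumption, since that case is not discussed in the text.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def direction_success_rate(actual, predicted):
    """Percentage of points where the predicted sign matches the actual sign."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    hits = sum(1 for a, p in zip(actual, predicted) if sign(a) == sign(p))
    return 100.0 * hits / len(actual)


actual = [0.2, -0.1, 0.05, -0.3]
predicted = [0.1, 0.4, 0.01, -0.9]
rate = direction_success_rate(actual, predicted)  # three of the four signs agree
```

Only the signs enter the calculation, so a prediction with a large numerical error still counts as a success when it points in the right direction.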
Table 1 reflects the implementation of (6) obtained from the BDF method (3) and by using a forward difference approximation for the first derivative. If we instead incorporate a central difference approximation as per (7) we are able to improve a direction success rate of 50% for a linear curve to 80%. A comparison done against the spline fitting showed an improvement from 50% to 79%. We have also considered (4) with a forward and central difference approximation for the first derivative, respectively. The former choice showed an increase from approximately 48% to 80% for the linear comparison and 49% to 79% for the spline comparison. The latter case indicated an increase in the accuracy of the direction from 50% to 81% when the original predictions are obtained via a linear fitting and 49% to 81% when a spline is implemented.
In this paper we have shown how the implementation of the BDF method with a lower order approximation of the gradient of discrete data can improve the accuracy of predicting the direction of random data. The motivation for using a numerical ordinary differential equation solver comes from the fact that direction is just a gradient. We form a discrete ordinary differential equation from our vector of predictions. This discrete ordinary differential equation is solved with the implementation of a lower order approximation of the gradient with a BDF method. We find that the accuracy of our predictions improves from an accuracy of approximately 50% to an accuracy of approximately 87%.
An advantage of the approach taken in this paper is that the BDF scheme is a marching scheme. Irrespective of how big the data set is, the scheme will march through the data accordingly. The “speed” of the algorithm on very large data sets can be improved upon by using a computer with a faster CPU.
Acknowledgment
E. Momoniat and C. Harley acknowledge support from the National Research Foundation of South Africa.