Abstract
This paper proposes a multivariate and online approach to stock price prediction via the paradigm of kernel adaptive filtering (KAF). Traditional classification- and regression-based approaches to stock price prediction require independent, batch-oriented training. In this article, we challenge this existing notion in the literature and propose an online kernel adaptive filtering-based approach to predict stock prices. We experiment with ten different KAF algorithms to analyze stocks' performance and show the efficacy of the work presented here. In addition, and in contrast to the current literature, we look at granular-level data. The experiments are performed with quotes gathered at windows of one minute, five minutes, ten minutes, fifteen minutes, twenty minutes, twenty-five minutes, thirty minutes, one hour, and one day. These time windows represent some of the windows frequently used by traders. The proposed framework is tested on the 50 stocks making up the Indian stock index Nifty50. The experimental results show that online learning with KAF is not only a good option but is also practical enough to be deployed in high-frequency trading.
1. Introduction
Prediction has applications in a multitude of areas such as economics [1], business planning and production [2], and weather forecasting [3]. However, accurately predicting the value of a variable is one of the basic yet nontrivial problems in the literature. In this article, we focus our attention on financial time-series prediction and its application to stock price forecasting. The stock market is often considered a chaotic [4], complex [5], volatile [6], and dynamic mixture of forces driving the movement of a stock. Undoubtedly, its prediction is one of the significant challenges in the literature [7]. Moreover, the Efficient Market Hypothesis [8] states that stock prices reflect all current information, and any new information leads to unpredictability in stock prices. Naturally, significant work has been done in this area. Nevertheless, research clearly indicates that prediction of stocks, especially nonlinear and nonstationary financial time-series forecasting, is still challenging [9]. In this regard, several models have been developed; for instance, studies have focused on volatility [6, 10], option pricing [11], classification of stock movements [12], predicting prices [13], and so on. In addition, studies have used a plethora of techniques, for example, support vector machines (SVMs) [14], neural networks (NNs) [15], and genetic algorithms [16]. Nevertheless, a true solution is yet to be found. Moreover, during our literature survey, we found that the paradigm of KAF has not been thoroughly investigated. Although there are a few papers on the topic, e.g., [17, 18], a comprehensive investigation conducted at a large scale eludes the literature. The existing literature focuses on the multiple-kernel learning method and addresses issues such as kernel size and step size. We follow the same line of thought and take the existing methods [17, 19, 20] as the foundation of the proposed work to develop a KAF-based approach for close-price prediction.
We noted in the previous paragraph that existing work has largely ignored KAF as an effective tool for financial time-series forecasting. In this context, working with KAF has several advantages. First, it is one of the most favoured tools in the literature for predicting a time series [21, 22]; KAF techniques have achieved excellent predictive accuracy. Second, the convergence speed of KAF-based algorithms is excellent: they achieve convergence in fewer iterations. Third, they have universal function approximation properties [23], which is precisely the mathematical property desired for predicting a financial time series. Owing to these reasons, we focus on predicting the financial time series via the paradigm of KAF. Despite these advantages, one of the issues with existing work is batch learning. We would argue that batch learning is an ineffective tool in financial time-series forecasting. The rationale is backed by the fact that financial time-series data is nonstationary; therefore, relying on models trained in an offline manner and expecting them to perform well in real market scenarios is a rather strong assumption. To fix this, online learning is proving to be a highly efficient approach [24–27]. In this method, the basis is selected during sample-by-sample training. Moreover, changing circumstances are quickly incorporated, and the algorithm adjusts its weight vector to make accurate predictions. Hence, we complement the idea of using KAF with online learning to predict a financial time series.
In light of the challenges and the potential solutions specified in this section, we propose the paradigm of online KAF for stock price prediction. Thus, this study aims to predict stock movements in an online manner. Although the issue of financial time-series forecasting is challenging, the goal of this article is to take one more step towards addressing it and to lay the groundwork for future work. To do this, we use the National Stock Exchange (NSE) Nifty50 dataset, which contains 50 leading stocks. The following points summarize the contribution of this paper:
(i) We propose the use of online KAF techniques for stock price prediction.
(ii) The data is collected at multiple time windows, i.e., one day, sixty minutes, thirty minutes, twenty-five minutes, twenty minutes, fifteen minutes, ten minutes, five minutes, and one minute. The proposed idea is applied to each of these time windows to find the best window for stock price prediction.
(iii) The main objective is to predict the closing price of a stock. To do that, we apply ten different KAF-based algorithms and present a comprehensive discussion detailing every aspect of the analysis. With numerical testing performed on all fifty stocks of the main index (Nifty50), we show the efficacy of the work.
(iv) We experiment with two different years. First, we try to predict stock prices for the year 2020. Second, we apply the same set of parameters to the most recent data (2021). Through experiments performed on these two years, we found that the method proposed in the paper outperforms similar methods in the literature.
(v) Lastly, we also show that although the KAF class of algorithms is new in the arena of stock prediction, it is nevertheless a practically viable candidate.
The rest of the paper is organized as follows: In Section 2, we discuss the related work. A discussion on different KAF algorithms is presented in Section 3. The experimental results are described in Section 4. Finally, the conclusion is given in Section 5.
2. Related Work
Stock price prediction is one of the nontrivial problems in the literature. Numerous studies have explained that stock price prediction is difficult because of the inherent nonstationarity of the data [28, 29]. Previous research has shown that the stock market is noisy, chaotic, and nonlinear [4, 6, 30]. Nonlinear modelling methodologies have proven effective for modelling systems in a variety of domains, such as in [31, 32], and various applications have found different modelling approaches to address nonlinearity, such as in [33–35]. Traditional prediction techniques were based on technical analysis with notions of resistance, support, and indicators computed from past prices [36]. Previous research has also studied various linear techniques such as moving averages, autoregressive models, discriminant analyses, and correlations [37, 38]. Much of the current literature on stock market prediction pays particular attention to machine learning (ML) techniques. ML has emerged as another popular area for time series prediction; among the available techniques, ML methods are researched mostly due to their capability for recognizing complex patterns in stock prices [39–42].
Based on the time-varying and nonlinear aspects of time series, there is a massive demand for online prediction algorithms. Such algorithms follow the idea of sequential computation and generate faster, more accurate outcomes [26]. To date, various methods have been developed, such as neural networks (NNs), kernel adaptive filter (KAF) algorithms, and online support vector regression (SVR) [22]. However, neural networks suffer from slow convergence and significant computing needs. In contrast, SVR and kernel methods do not suffer from the problem of falling into local optima. SVR has strong generalization ability [43], but it is only suitable for smaller datasets. In addition, a multifilter neural network (MFNN) has also been used to predict stock price movement; the performance of the MFNN was found to be better than other NN approaches, SVM, and random forests [44]. In [45], the authors combined support vector regression (SVR) and kernel principal component analysis (KPCA) to enhance prediction accuracy, which may help investors with short-term decisions. However, the high dimension of the input variables makes the learning process long, and the final model's computational complexity becomes very large. These machine learning methods share the drawback of long training times.
To reduce the computational burden, kernel-based online learning algorithms have gradually become popular [44, 46]. In this respect, recurrent kernel online learning has been applied to predict the transaction price of specific products; the model was observed to be stable, with low dependency on parameter settings [47]. Similarly, convolutional neural networks (CNNs) have also been suggested for predicting next-day prices [48]. In all, there is sufficient literature suggesting that modelling the movement of a stock price is nontrivial. In this respect, adaptive filtering has proved to be a standard option for prediction models on streaming data with nonstationary properties [49–51]. KAF can therefore be used for sequential prediction of stock prices by exploiting market interdependence. KAF algorithms are preferred because they are nonparametric, have low computational complexity, and converge very fast [21, 52–55]. In this domain, multiple algorithms have been proposed for nonstationary data; they are preferred due to their insensitivity towards design parameters [49]. Multistep prediction for stocks using metacognitive recurrent kernel online learning is proposed in [56]. The advantage of the KAF method is that it addresses the problem of balancing efficiency and prediction accuracy.
Currently, the use of KAF approaches in stock price prediction is limited [19, 20]. In [19], a multiple-kernel learning method was proposed to address KAF's two main issues: kernel size and step size. In [20], the idea of local models was proposed to learn the behavior of different stock markets and was compared with other online learning methods such as LSTM, quantized kernel least mean square (QKLMS), nearest instance centroid estimation (NICE), vector autoregression (VAR), and the vector error-correction model (VECM) for daily closing price prediction. In other research, the study in [57] proposed adaptive stock trading strategies with deep reinforcement learning, extracting informative financial features via two methods: the gated deep Q-learning trading strategy (GDQN) and the gated deterministic policy gradient trading strategy (GDPG). This paper proposes an online KAF-based learning approach. The selection of basis functions can be done during sample-by-sample training in online kernel learning, which is a more efficient option. This method can be highly efficient and successful because it requires only one pass over the training data.
To the best of our knowledge, the work presented in this article is the first to comprehensively analyze the price of a stock across multiple time windows and to comprehensively test the application of the KAF class of algorithms to stocks. The following points summarize the fundamental differences between this paper and existing work:
(i) To the best of our knowledge, we are the first to use KAF algorithms with multiple time windows to analyze and predict stock prices.
(ii) Stock prediction using existing online methods requires a lot of computation time. This article presents a general framework wherein the price prediction can be made in significantly less time.
(iii) Stock traders can quickly sell and buy specific stocks across numerous time windows using the proposed strategy, resulting in larger earnings.
3. Methodology
As discussed in Section 1, we work with KAF-based techniques and use online prediction methods. In this regard, KAFs work by self-tuning: the input-output mapping is adjusted according to an optimization criterion, usually driven by an error signal. There are two types of adaptive filters: linear and nonlinear. A linear filter follows a supervised technique and depends upon error-correction learning. The filter adaptively adjusts its weights w(n), where n denotes the discrete-time index. The input signal x(n) is mapped to an output that is compared with the desired response d(n). Correspondingly, an error signal is denoted by e(n). The error signal adjusts the weights by an incremental value Δw(n). At the next iteration, w(n+1) = w(n) + Δw(n) becomes the current value of the weight to be updated. This process is repeated until the filter reaches convergence, which generally occurs when the weight adjustment is small enough. Linear adaptive filters do not give satisfactory performance for nonlinear systems, for which the results vary in a non-intuitive manner. In real-world problems, where data patterns are more complex, classes may not be separable by hyperplanes. Consequently, we have to look to nonlinear methods. In this paradigm, data is projected into a high-dimensional linear feature space, and prediction is done in this high-dimensional space. Compared with other existing techniques for regression and classification, KAF has the following advantages:
(i) KAFs are universal approximators.
(ii) KAFs handle complexity issues in terms of computation and memory. Moreover, their cost functions have no local minima.
(iii) KAFs follow the idea of online learning and handle nonstationary conditions well.
It was discussed that nonlinear adaptive techniques are well suited for real-world problems. In this regard, kernel methods transform data into a set of points in an RKHS (Reproducing Kernel Hilbert Space). The main idea of KAF can be summarized as the transformation of the input data into a high-dimensional feature space G via a Mercer kernel; the problem can then be solved via inner products. There is no need to do expensive computations in the high-dimensional space, owing to the famous "kernel trick." Considering KAF, suppose we have an input-output mapping f: x → d, based on a known sequence (x(i), d(i)). Here, x(i) is the system input with i = 1, …, n, and d(i) is the desired response. The goal is to estimate f from the data. In KAFs, the computation generally involves the use of a kernel. An example of a kernel (the Gaussian kernel) is given as follows:
κ(x, x′) = exp(−‖x − x′‖² / (2h²))
Here, κ denotes the kernel and h denotes the kernel width.
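As a concrete illustration, the Gaussian kernel above can be written in a few lines of Python (the function name and default width h = 1.0 are our choices for this sketch; the paper itself relies on the MATLAB KAFBox toolbox):

```python
import numpy as np

def gaussian_kernel(x, x_prime, h=1.0):
    """Gaussian (RBF) kernel kappa(x, x') with kernel width h."""
    x, x_prime = np.asarray(x, float), np.asarray(x_prime, float)
    return np.exp(-np.sum((x - x_prime) ** 2) / (2.0 * h ** 2))
```

Because every algorithm below needs only such kernel evaluations, no coordinates in the feature space G are ever computed explicitly.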
3.1. Discussion on KAF Algorithms
In this section, we discuss some of the most popular methods in KAF. For reasons of brevity, we keep the discussion short.
3.1.1. Least Mean Square (LMS) Algorithm
According to [46], the main aim of the LMS algorithm is to minimize the following empirical risk function:
J(w) = ∑_{i=1}^{n} (d(i) − wᵀx(i))²
Applying stochastic gradient descent (SGD), equation (2) can be represented as
w(n) = w(n−1) + μ e(n) x(n),
where μ is the step size and e(n) = d(n) − wᵀ(n−1) x(n) is known as the a priori error.
The weightupdate equation results in the following form:
Representing the idea in terms of inner products, we get
f(x) = wᵀ(n) x = μ ∑_{i=1}^{n} e(i) xᵀ(i) x
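A minimal sketch of this sample-by-sample recursion in Python (function name, zero initialization, and the step-size default are ours, not the paper's):

```python
import numpy as np

def lms_predict(X, d, mu=0.1):
    """Run the LMS filter over (x(i), d(i)) pairs sample by sample.

    Returns the sequences of a priori predictions and errors.
    """
    X, d = np.asarray(X, float), np.asarray(d, float)
    w = np.zeros(X.shape[1])          # weight vector, initialised to zero
    preds, errs = [], []
    for x_i, d_i in zip(X, d):
        y_i = w @ x_i                 # a priori prediction w'(n-1) x(n)
        e_i = d_i - y_i               # a priori error e(n)
        w = w + mu * e_i * x_i        # LMS weight update
        preds.append(y_i)
        errs.append(e_i)
    return np.array(preds), np.array(errs)
```

On a stationary target the error shrinks geometrically, which is the convergence behaviour the kernelized variants below inherit.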
3.1.2. Kernel Least Mean Square Algorithm (KLMS)
KLMS [21] is an extension of the LMS algorithm; the main difference is that the input x(n) is transformed to φ(x(n)) in the high-dimensional RKHS. Applying the LMS algorithm to the new sequence (φ(x(n)), d(n)), we get
Ω(n) = Ω(n−1) + μ e(n) φ(x(n)), with e(n) = d(n) − Ωᵀ(n−1) φ(x(n)),
where e(n) is the prediction error, Ω is the weight vector in the RKHS, and μ is the step size.
Using the kernel trick, KLMS can now be written as
f(x) = μ ∑_{i=1}^{n} e(i) κ(x(i), x)
KLMS assigns a new unit for every input x(n) as a center, with μ e(n) as its coefficient. With the radial basis function (RBF) kernel, the algorithm is represented as follows:
f(x) = μ ∑_{i=1}^{n} e(i) exp(−‖x(i) − x‖² / (2h²))
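The center-plus-coefficient bookkeeping described above can be sketched as follows (class name, parameter defaults, and the Gaussian kernel choice are our assumptions for this illustration):

```python
import numpy as np

class KLMS:
    """Kernel LMS with a Gaussian kernel: each new input becomes a
    center with coefficient mu * e(n)."""

    def __init__(self, mu=0.5, h=1.0):
        self.mu, self.h = mu, h
        self.centers, self.coeffs = [], []

    def _kernel(self, a, b):
        return np.exp(-np.sum((a - b) ** 2) / (2.0 * self.h ** 2))

    def predict(self, x):
        x = np.asarray(x, float)
        return sum(a * self._kernel(c, x)
                   for c, a in zip(self.centers, self.coeffs))

    def update(self, x, d):
        """One sample-by-sample step: predict, compute error, add center."""
        x = np.asarray(x, float)
        e = d - self.predict(x)
        self.centers.append(x)
        self.coeffs.append(self.mu * e)
        return e
```

Note that the dictionary grows by one center per sample, which motivates the quantized and sparsified variants discussed later.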
3.1.3. Kernel Recursive Least Square Algorithm (KRLS)
According to [21], in KRLS, the objective function is complemented by a regularization term. This can be represented as follows:
J(Ω) = ∑_{i=1}^{n} (d(i) − Ωᵀφ(x(i)))² + λ‖Ω‖²,
where λ stands for the regularization parameter.
It can be shown that Ω(n) = Φ(n) a(n), where Φ(n) = [φ(x(1)), …, φ(x(n))]; also, a(n) = Q(n) d(n), Q(n) = (G(n) + λI)⁻¹, and G(n) is the Gram matrix with entries κ(x(i), x(j)).
Complementing the previous equation with the RBF kernel, we get
f(x) = ∑_{i=1}^{n} a_i(n) exp(−‖x(i) − x‖² / (2h²))
The whole idea here can now be summarized as
f(x) = ∑_{i=1}^{n} a_i(n) κ(x(i), x)
Following the sequential property of KRLS, with h(n) = [κ(x(1), x(n)), …, κ(x(n−1), x(n))]ᵀ, z(n) = Q(n−1) h(n), r(n) = λ + κ(x(n), x(n)) − zᵀ(n) h(n), and e(n) = d(n) − hᵀ(n) a(n−1), we have
Q(n) = (1/r(n)) [Q(n−1) r(n) + z(n) zᵀ(n), −z(n); −zᵀ(n), 1],
a(n) = [a(n−1) − z(n) e(n)/r(n); e(n)/r(n)]
KRLS updates all previous coefficients through a(n), whereas KLMS never updates previous coefficients. Here, a_i(n) is the i-th component of a(n). The computational complexity of KRLS at iteration n is O(n²).
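The recursive update can be sketched as a naive implementation that keeps every sample as a center, so memory and per-step cost grow with n (class and parameter names are ours; the recursion is the standard block-inverse update under a Gaussian kernel, for which κ(x, x) = 1):

```python
import numpy as np

class KRLS:
    """Naive kernel RLS: stores every sample as a center and maintains
    Q(n) = (G(n) + lam * I)^-1 recursively."""

    def __init__(self, lam=1e-2, h=1.0):
        self.lam, self.h = lam, h
        self.X = None        # centers, shape (n, dim)
        self.alpha = None    # coefficient vector a(n)
        self.Q = None        # inverse regularized Gram matrix

    def _k(self, A, x):
        return np.exp(-np.sum((A - x) ** 2, axis=1) / (2 * self.h ** 2))

    def predict(self, x):
        if self.X is None:
            return 0.0
        return float(self._k(self.X, np.asarray(x, float)) @ self.alpha)

    def update(self, x, d):
        x = np.asarray(x, float)
        if self.X is None:                      # first sample
            self.Q = np.array([[1.0 / (self.lam + 1.0)]])
            self.alpha = np.array([d / (self.lam + 1.0)])
            self.X = x[None, :]
            return d
        h_vec = self._k(self.X, x)              # kernels to old centers
        e = d - h_vec @ self.alpha              # a priori error
        z = self.Q @ h_vec
        r = self.lam + 1.0 - z @ h_vec          # kappa(x, x) = 1 (Gaussian)
        # grow Q(n) and a(n) via the standard recursive block update
        m = len(z)
        Q_new = np.empty((m + 1, m + 1))
        Q_new[:-1, :-1] = self.Q * r + np.outer(z, z)
        Q_new[:-1, -1] = -z
        Q_new[-1, :-1] = -z
        Q_new[-1, -1] = 1.0
        self.Q = Q_new / r
        self.alpha = np.append(self.alpha - z * (e / r), e / r)
        self.X = np.vstack([self.X, x])
        return e
```

Because a(n) = (G(n) + λI)⁻¹ d(n) is maintained exactly, a small λ makes the filter interpolate the observed targets closely.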
3.1.4. Kernel Affine Projection Algorithms (KAPAs)
KAPA [58] extends the idea of KLMS while boosting performance and reducing gradient noise. In KAPA, we work with the sequences x(n) and d(n) to minimize the cost function J(Ω) = ∑ (d(i) − Ωᵀφ(x(i)))² and estimate the mapping with the weight vector Ω, using the K most recent observations at each step.
Using stochastic gradient descent, we replace the covariance matrix and cross-covariance vector by local approximations computed directly from the data. Hence, we get the following update:
Ω(n) = Ω(n−1) + μ Φ(n) e(n), with e(n) = d(n) − Φᵀ(n) Ω(n−1),
where Φ(n) = [φ(x(n−K+1)), …, φ(x(n))] and K is the number of recent observations and regressors used.
3.1.5. Quantized Kernel Least Mean Square Algorithm (QKLMS)
QKLMS is a well-known algorithm proposed in [50]. It is an extension of the KLMS algorithm that deals with the issue of data redundancy. Using a quantization operator Q[·], the core idea can be written as
Ω(n) = Ω(n−1) + μ e(n) φ(Q[x(n)]),
where Q[·] denotes the quantization whose effect is realized in the feature space G. The learning rule for QKLMS is as follows: compute the error e(n) = d(n) − f(x(n)); if the distance from x(n) to its closest center is at most the quantization size ε_q, add μ e(n) to that center's coefficient; otherwise, add x(n) as a new center with coefficient μ e(n).
QKLMS and KLMS have almost the same computational complexity. The only difference between the two algorithms is that QKLMS deals with the issue of data redundancy by locally updating the coefficient of the closest center.
In short, the central theme of QKLMS is given in Algorithm 1.
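The central theme described above can be sketched in Python as follows (the class name and the quantization-threshold parameter name eps_q are our stand-ins; the Gaussian kernel choice mirrors the earlier sections):

```python
import numpy as np

class QKLMS:
    """Quantized KLMS: an input closer than eps_q to an existing center
    updates that center's coefficient instead of growing the dictionary."""

    def __init__(self, mu=0.5, h=1.0, eps_q=0.1):
        self.mu, self.h, self.eps_q = mu, h, eps_q
        self.centers, self.coeffs = [], []

    def predict(self, x):
        x = np.asarray(x, float)
        return sum(a * np.exp(-np.sum((c - x) ** 2) / (2 * self.h ** 2))
                   for c, a in zip(self.centers, self.coeffs))

    def update(self, x, d):
        x = np.asarray(x, float)
        e = d - self.predict(x)
        if self.centers:
            dists = [np.linalg.norm(c - x) for c in self.centers]
            j = int(np.argmin(dists))
            if dists[j] <= self.eps_q:        # quantize onto closest center
                self.coeffs[j] += self.mu * e
                return e
        self.centers.append(x)                # otherwise grow the dictionary
        self.coeffs.append(self.mu * e)
        return e
```

With eps_q = 0 this reduces exactly to KLMS; larger thresholds trade a little accuracy for a much smaller dictionary.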

3.1.6. Kernel Normalized Least Mean Square Algorithm (KNLMS)
According to [49], the KNLMS algorithm is used for dictionary design with a coherence criterion. Here, we discuss KNLMS from the point of view of MKNLMS-CS (the multikernel normalized least mean square algorithm with coherence-based sparsification).
Assume κ_m ∈ {κ_1, …, κ_M}, where {κ_1, …, κ_M} is a set of M distinct kernels.
Consider D(n) to be the dictionary {x_ω1, …, x_ωr} of retained inputs at time n.
Here, r is the size of the dictionary. The filter output works as per the following rule:
f(x) = ∑_{j=1}^{r} θ_j κ(x_ωj, x),
where θ(n) = [θ_1, …, θ_r]ᵀ is the coefficient vector. The estimated error can be written as
e(n) = d(n) − hᵀ(n) θ(n−1),
where h(n) = [κ(x_ω1, x(n)), …, κ(x_ωr, x(n))]ᵀ.
Let the initial dictionary be indicated as D(0) = ∅, making θ(0) an empty vector. Following the coherence criterion, we only add a new point x(n) into the dictionary if the following condition holds:
max_j |κ(x(n), x_ωj)| ≤ δ,
where δ is the threshold. Let η denote the step size and ε denote the regularization parameter. The update rule is given as follows:
(i) If the condition is satisfied, x(n) is added to the dictionary and θ is augmented with a zero entry; then
θ(n) = θ(n−1) + (η / (ε + ‖h(n)‖²)) e(n) h(n).
(ii) If the condition is not satisfied, the dictionary is unchanged and the same normalized update is applied to the existing coefficients.
For KNLMS, the value of M is 1.
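A hedged sketch of the coherence-based rule for the single-kernel case (M = 1); the parameter names eta, eps, and delta are our stand-ins for the step size, regularization parameter, and coherence threshold:

```python
import numpy as np

def gauss(a, b, h=1.0):
    """Gaussian kernel used for both coherence test and filtering."""
    return np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2) / (2 * h ** 2))

class KNLMS:
    """KNLMS sketch (M = 1): coherence-based sparsification plus a
    normalized LMS step on the coefficient vector."""

    def __init__(self, eta=0.5, eps=1e-2, delta=0.5, h=1.0):
        self.eta, self.eps, self.delta, self.h = eta, eps, delta, h
        self.dictionary, self.theta = [], np.zeros(0)

    def update(self, x, d):
        x = np.asarray(x, float)
        if not self.dictionary:
            self.dictionary.append(x)
            self.theta = np.zeros(1)
        # coherence test: admit x only if it is not too similar to the
        # existing centers (kernel value at or below the threshold delta)
        elif max(gauss(x, c, self.h) for c in self.dictionary) <= self.delta:
            self.dictionary.append(x)
            self.theta = np.append(self.theta, 0.0)
        k = np.array([gauss(x, c, self.h) for c in self.dictionary])
        e = d - k @ self.theta
        # normalized LMS step on the coefficients
        self.theta = self.theta + (self.eta / (self.eps + k @ k)) * e * k
        return e
```

The dictionary stops growing once the input space is covered at resolution delta, which keeps the per-sample cost bounded.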
3.1.7. Probabilistic Least Mean Square Algorithm (PROBLMS)
The probabilistic approach to the LMS filter is an efficient approximation method. It provides an adaptive step-size LMS algorithm together with a measure of uncertainty about the estimation. In addition, it preserves the linear complexity of the standard LMS. Some of the advantages of probabilistic models are as follows: (1) they force the designer to specify all the assumptions of the model, (2) they provide a clear separation between the model and the algorithm used to solve it, and (3) they usually provide some measure of uncertainty about the estimation. The observation model is assumed to be Gaussian with the distribution
p(d(n) | w) = N(d(n); wᵀx(n), σ²),
where w is the parameter vector, σ² is the variance of the observation noise, and x(n) is the regression vector.
3.1.8. Kernel Maximum Correntropy Criterion (KMCC)
The algorithm's main aim is to maximize the correntropy between the desired and actual output [55]. Using the maximum correntropy criterion (MCC) and SGD, the algorithm can be written as
Ω(n) = Ω(n−1) + μ exp(−e²(n) / (2σ²)) e(n) φ(x(n)),
where σ is the kernel width and μ is the step size.
The complete prediction and error calculation can be summarized as
f(x(n)) = ∑_{i=1}^{n−1} a(i) κ(x(i), x(n)), e(n) = d(n) − f(x(n))
3.1.9. Leaky Kernel Affine Projection Algorithm (LKAPA)
The LKAPA [58] is an extension of the KAPA discussed in Section 3.1.4. According to equation (14), the weight update is a difficult task in a high-dimensional space; here, equation (14) is modified by introducing a leakage (forgetting) factor (1 − μλ) into the KAPA weight update.
The weight vector is computed using the empirical risk minimization criterion; that is, we minimize the following regularized objective function:
J(Ω) = ∑_{i=1}^{n} (d(i) − Ωᵀφ(x(i)))² + λ‖Ω‖²,
where λ is the regularization parameter. Then, we get the leaky update
Ω(n) = (1 − μλ) Ω(n−1) + μ Φ(n) e(n),
where e(n) = d(n) − Φᵀ(n) Ω(n−1). Finally, in terms of the expansion coefficients, each old coefficient is scaled as a_i(n) = (1 − μλ) a_i(n−1), and the coefficients of the K most recent centers are additionally incremented by the entries of μ e(n).
3.1.10. Normalized Online Regularized Risk Minimization Algorithm (NORMA)
NORMA [58] is a kernel-based relative of the LKAPA described in Section 3.1.9. It is also related to the KLMS algorithm summarized in Section 3.1.2. NORMA combines regularization with a nonlinear functional approach, and it allows old values to be discarded in a sliding-window manner.
3.2. Problem Formulation
In this section, we formulate the stock prediction problem solved by the ten discussed algorithms. The purpose of stock prediction is to determine the future values of a stock depending upon its historical values. As discussed in the Introduction, our main aim is to predict the close price. To this end, we calculate the percentage change in the close price. Subsequently, we apply the idea of autoregression of order L to predict the future change in the stock price. An autoregressive (AR) model forecasts future behavior using data from the past; when there is a correlation between the values in a time series and the values that precede and succeed them, AR models have shown tremendous potential. In the context of the work presented here, the problem is formulated as
ŷ(n) = Ωᵀ φ(x(n)), with x(n) = [y(n−1), y(n−2), …, y(n−L)],
where y(n) denotes the percentage change in the close price at step n.
Here, φ(x(n)) is the close-price history mapped into the high-dimensional space, and Ω is the weight vector. Since we follow the AR model, it is imperative to estimate the weight vector; to do so, the KAF techniques discussed in the previous section are used. A sample of the formulation is shown in Table 1, where the formulation is illustrated using day-wise closing prices. This type of procedure is commonly followed in multivariate time series prediction, e.g., [59, 60]. It should be noted that the procedure was followed for all window sizes. The problem thereby becomes autoregression-based prediction of the next percentage change; the actual closing price can then be computed from the percentage change easily. The overall framework followed in the article is shown in Figure 1. The experiments were performed on the Nifty50 dataset, and the data used in the experimentation is available at shorturl.at/lnvF2. The kernel adaptive filtering (KAF) algorithms used in this work are available at https://github.com/steven2358/kafbox.
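The autoregressive formulation can be sketched with a small helper (ours, not the authors') that turns a close-price series into input-target pairs of L past percentage changes and the next change:

```python
import numpy as np

def make_ar_samples(close, L=2):
    """Build (x(n), y(n)) pairs from a close-price series: y is the next
    percentage change, x holds the L previous percentage changes."""
    close = np.asarray(close, float)
    pct = np.diff(close) / close[:-1] * 100.0   # percentage-change series
    X = np.array([pct[i:i + L] for i in range(len(pct) - L)])
    y = pct[L:]
    return X, y
```

Given a predicted change p (in percent), the next close is recovered as last_close * (1 + p / 100), which is the easy back-conversion mentioned above.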
4. Experimental Results
4.1. Dataset Description
In this section, we describe the experimental details of the Nifty50 dataset. Nifty50 is the flagship index of the National Stock Exchange (NSE), the largest stock exchange in India by total and average daily turnover for equity shares. We collected the data of all stocks from 9:15 AM to 3:30 PM. In addition, we collected the data for two different periods. First, we try to predict stock prices for the year 2020, from January 01, 2020, to December 31, 2020, and second, for the most recent data (2021), between January 01, 2021, and May 31, 2021. The original data was available as one-minute open, high, low, and close (OHLC) prices. From this granular data, we aggregated the OHLC quotes to obtain the data for the other time windows. In particular, we created and preprocessed the dataset according to nine prediction windows (one day, sixty minutes, thirty minutes, twenty-five minutes, twenty minutes, fifteen minutes, ten minutes, five minutes, and one minute). Recall that we focus on predicting the percentage changes in the close price. To that end, we also normalized the data to the range of 0 to 1. Then, the ten distinct KAF algorithms were applied to the final preprocessed data for every stock. Finally, it is worth noting that the experimental findings obtained with the KAF algorithms on the Nifty50 dataset demonstrate the work's strength and could serve as a benchmark for future work in the field.
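The aggregation of one-minute OHLC bars into larger windows can be sketched with pandas (the column names and resampling-rule string are assumptions; the paper does not specify its tooling):

```python
import pandas as pd

def resample_ohlc(df, window="5min"):
    """Aggregate one-minute OHLC bars into a larger time window.

    df: DataFrame indexed by timestamp with open/high/low/close columns.
    """
    agg = {"open": "first", "high": "max", "low": "min", "close": "last"}
    return df.resample(window).agg(agg).dropna()
```

The same call with "10min", "15min", "25min", "30min", "60min", or "1D" produces the remaining prediction windows used in the experiments.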
4.2. Evaluation Criterion
To evaluate and compare the performance of various KAF algorithms, we use standard error evaluation metrics such as mean squared error (MSE), mean absolute error (MAE), and directional symmetry (DS). The metrics are elaborated in the following text.
4.2.1. Mean Squared Error (MSE)
MSE, also known as the mean squared deviation (MSD), calculates the average squared difference between the actual and predicted observations:
MSE = (1/n) ∑_{i=1}^{n} (y_i − ŷ_i)²
4.2.2. Mean Absolute Error (MAE)
MAE calculates the average magnitude of the difference between actual and predicted observations in a set of predictions, without considering their direction, i.e., the average absolute prediction error:
MAE = (1/n) ∑_{i=1}^{n} |y_i − ŷ_i|
4.2.3. Directional Symmetry (DS)
Directional symmetry, in terms of time series analysis, measures the model's ability to predict positive and negative trends from one time period to the next:
DS = (100/(n−1)) ∑_{i=2}^{n} d_i,
where
d_i = 1 if (y_i − y_{i−1})(ŷ_i − ŷ_{i−1}) ≥ 0, and d_i = 0 otherwise,
and where n is the number of time steps, y_i represents the actual values, and ŷ_i represents the predicted output. In the following procedure, we discuss the details of computing the error values.
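The three metrics can be sketched directly in vectorized Python (a tie, i.e., a zero product of consecutive moves, is counted as agreement, a common convention for DS):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error between actual and predicted series."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    """Mean absolute error between actual and predicted series."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean(np.abs(y - y_hat))

def directional_symmetry(y, y_hat):
    """Percentage of steps where actual and predicted moves agree in sign."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    agree = (np.diff(y) * np.diff(y_hat)) >= 0
    return 100.0 * np.mean(agree)
```

MSE and MAE measure magnitude accuracy, while DS measures trend accuracy, which is why the best-performing algorithm can differ between the two groups of metrics.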
4.3. Procedure: Error Computation
(1) We worked with the Nifty50 firms on the 2020 and 2021 datasets, as mentioned in Section 1. Moreover, as also pointed out, we work with ten different algorithms. The parameters listed in Table 2 were tuned manually; multiple experiments were performed to find their optimal values.
(2) To compute the error values for each stock and every algorithm, we formulated the problem as an autoregressive problem (see Section 3.2) and computed the error values for all 50 stocks. In total, we get 50 × 3 error values: one each for MSE, MAE, and DS per stock. Moreover, since there are nine different prediction windows, error estimation was done for all stocks, all windows, and all ten algorithms.
(3) Subsequently, for a particular algorithm and a single time window, we take the average of all 50 error metrics (one per stock) to arrive at the final number reported in the article. This number shows the overall predictive capability of the model across all fifty stocks.
4.4. Prediction, Convergence, and Residual Analysis
In this section, we analyze the performance of KAF algorithms for close-price prediction. In this regard, the prediction graph for one stock (Reliance) with KRLS is considered. Figure 2 shows the results for the 2020 and 2021 datasets. It is visible from the figure that we obtain good results. Note that we present the result for one stock (Reliance) and one prediction window (sixty minutes); similar results were obtained for the other companies in the dataset. It is also visible from the figure that although the prediction is not perfectly accurate, it is close, which implies the strong performance of KAF algorithms in prediction. It should be noted that although we obtain good results, there is always a chance of overfitting. Since we use online learning, the architecture itself naturally minimizes the chances of overfitting, but it remains possible that the superior results are partly due to overfitting.
Any machine learning algorithm is expected to converge as the model is trained on more instances; in other words, the error should decrease to an acceptable range as training progresses. In this regard, in addition to the prediction results in Figure 2, we present convergence results in Figure 3 for the 2020 and 2021 datasets, respectively. As in the previous case, we plot the result for a single stock (Reliance) and one prediction time window. The convergence graphs were plotted taking MSE as the error metric. Figure 3 shows the error convergence for both datasets with the KRLS algorithm on the Reliance stock; the x-axis shows the number of instances and the y-axis shows the MSE. It can be seen that the algorithm converged very quickly, around the 1000th data point. Convergence is very important in KAF, as it shows the algorithm's ability to adapt and learn from the data quickly. Though there are minor fluctuations towards the end, this is acceptable, as there will always be minor changes in the new data.
To complement the prediction results, we also present the distribution of the error residuals in Figure 4 for the 2020 and 2021 datasets, respectively. As visible from the figure, the residuals follow a normal distribution. This behavior is excellent, as there are very few outliers. Moreover, the overall variance of the residuals is small, showing the strong predictive potential of the algorithm.
4.5. Comprehensive Evaluation of KAF Algorithms
In contrast to batch learning techniques, which generate the best predictor by learning on the full training dataset at once, we employ online learning, in which data becomes available in sequential order (sample-by-sample training) and is used to update the predictor at each step. As we use ten different algorithms, it is logical to compare the performance of all of them. In this regard, we show results on two different datasets. First, we attempt to forecast stock prices for 2020. Second, we apply the same set of parameters to the most recent data (2021) to demonstrate the work's efficacy. To evaluate the performance of the KAF-based methods, we experiment with different values of the embedding dimension L, varying it from 2 to 7 with a step size of 1, i.e., L ∈ {2, 3, 4, 5, 6, 7}. With this setup, the results are presented in Tables 3 and 4. It is visible from the tables that, once again, KRLS performed well in terms of error minimization. The best embedding dimension is two when we consider MSE and MAE. However, when it comes to DS, the best values and algorithms differ, since DS measures the direction in which a market moves over time; a trend can generally move upward or downward. For instance, considering daily data (1 day in the table), the best-performing algorithm is LKAPA. In fact, for this metric (DS), there is much conflict regarding the best algorithm. Nevertheless, the experimentation revealed the superiority of KRLS, PROBLMS, and LKAPA.
4.6. Comparison with Other StateoftheArt Methods
We compared our results with other learning methods such as [61–63]. The deep learning (DL) algorithms were trained and evaluated over 25 epochs using an 80 : 20 split, and the time taken to train and make predictions was recorded. The DL-based methods [61–63] were reimplemented based on the architecture details and hyperparameter settings provided in the respective articles. All of the techniques were trained on the Nifty50 dataset. To maintain uniformity across methods, we chose fifty equities at the sixty-minute time window for experimentation. All of the approaches were then compared to the proposed KAF method (KRLS) in terms of MSE, RMSE, and execution time. For the 2020 and 2021 datasets, Tables 5 and 6 summarize the comparative outcomes of the learning approaches. The results in Tables 5 and 6 show that the proposed approach outperforms previous stock prediction methods in the literature.
We must point out here that since all the KAF models belong to the same category of kernel adaptive filtering, their complexities are almost similar. For the neural networks used in the article, we take the architectures from their respective papers [61–63]. It should be noted that a KAF is analogous to a neural network with a single layer; even so, it gives good results.
4.7. Experimentation with Dictionary Size
In addition to the experiment conducted in the previous section, we also experimented with the dictionary size of the KAF algorithms. The results are presented in Table 7. As is visible, increasing the dictionary size decreases the performance of the system and increases the execution time. It should be noted that the execution time for predicting the next closing price for a single stock with dictionary size 500 is 0.675 seconds. This figure clearly shows the applicability of the KAF class of algorithms in high-frequency trading, where latency is a key factor.
4.8. Important Note: Error Minimization and Profitability
From Tables 3 and 4, we can see that KRLS performed well in minimizing error. Moreover, the lowest error (MSE) is obtained for the time window of one minute. Intuitively, minimizing the error brings predictions close to the actual values, which is indeed true. However, the one-minute window comes with a caveat: over this interval, the fluctuation in the price is low. In other words, one-minute volatility is small, so predictions are very close to the actual values, but the chance of taking a position and making a profit in such a low-volatility environment is also very small. Therefore, one has to maintain a balance between error minimization and profitability.
5. Conclusion
This paper introduced a framework to predict stock prices using KAF. We comprehensively analyzed the Indian stock index Nifty50 and showed predictive results for all 50 stocks in the main index. We experimented with ten different algorithms belonging to the KAF class, across nine different windows ranging from one minute to one day. This is the first time, to our knowledge, that numerous KAF algorithms have been applied at such granular levels. The evidence offered in the Experimental Results section demonstrated the work's overall predictive capability. It was found that the KAF class of algorithms not only outperformed other algorithms in terms of error minimization but also had a very short execution time, underlining its usefulness in the field of high-frequency trading.
For future work, we will test the framework with hyperparameter optimization. This would be beneficial because KAF algorithms must deal with a wide range of hyperparameter settings. We will also apply several hyperparameter optimization strategies to improve the model's accuracy.
Data Availability
The datasets and all related materials are available for download from the following website: shorturl.at/lnvF2.
Conflicts of Interest
The authors declare that there are no conflicts of interest in this research.