Abstract

The aim of this paper is to present modified neural network algorithms to predict whether it is best to buy, hold, or sell shares (trading signals) of stock market indices. Most commonly used classification techniques are not successful in predicting trading signals when the distribution of the actual trading signals among these three classes is imbalanced. The modified network algorithms are based on the structure of feedforward neural networks and a modified Ordinary Least Squares (OLS) error function. When modifying the OLS function, we accounted for an adjustment relating to the contribution from the historical data used for training the networks and for the penalisation of incorrectly classified trading signals. A global optimization algorithm was employed to train these networks. These algorithms were employed to predict the trading signals of the Australian All Ordinary Index. The algorithms with the modified error functions introduced by this study produced better predictions.

1. Introduction

A number of previous studies have attempted to predict the price levels of stock market indices [1–4]. However, in the last few decades, there has been a growing number of studies attempting to predict the direction or the trend of the movements of financial market indices [5–11]. Some studies have suggested that trading strategies guided by forecasts of the direction of price change may be more effective and may lead to higher profits [10]. Leung et al. [12] also found that classification models based on the direction of stock returns outperform those based on the level of stock returns in terms of both predictability and profitability.

The most commonly used techniques to predict the trading signals of stock market indices are feedforward neural networks (FNNs) [9, 11, 13], probabilistic neural networks (PNNs) [7, 12], and support vector machines (SVMs) [5, 6]. An FNN outputs the value of the stock market index (or a derivative of it), and this value is subsequently classified into classes (directions). Unlike FNNs, PNNs and SVMs directly output the corresponding class.

Almost all of the above-mentioned studies considered only two classes: the upward and the downward trends of the stock market movement, which were treated as buy and sell signals [5–7, 9, 11]. It was noticed that the time series data used for these studies were approximately equally distributed between these two classes.

In practice, traders do not participate in trading (either buying or selling shares) if there is no substantial change in the price level. Instead of buying/selling, they hold the money/shares in hand. In such a case it is important to consider an additional class which represents a hold signal. For instance, the following criterion can be applied to define the three trading signals: buy, hold, and sell.

Criterion A.
\[
\text{signal} =
\begin{cases}
\text{buy} & \text{if } Y(t+1) \ge l_u,\\
\text{hold} & \text{if } l_l < Y(t+1) < l_u,\\
\text{sell} & \text{if } Y(t+1) \le l_l,
\end{cases}
\tag{1.1}
\]
where $Y(t+1)$ is the relative return of the Close price of day $(t+1)$ of the stock market index of interest, while $l_l$ and $l_u$ are thresholds.

The values of $l_l$ and $l_u$ depend on the trader's choice. There is no standard criterion in the literature for deciding the values of $l_l$ and $l_u$, and these values may vary from one stock index to another. A trader may set these thresholds according to his/her knowledge and experience.

The proper selection of the values for $l_l$ and $l_u$ could be done by performing a sensitivity analysis. The Australian All Ordinary Index (AORD) was selected as the target stock market index for this study. We experimented with different pairs of values for $l_l$ and $l_u$ [14]. For different windows, different pairs gave better predictions. These values also varied according to the prediction algorithm used. However, for the definition of trading signals, these values needed to be fixed.

By examining the data distribution (during the study period, the minimum, maximum, and average of the relative returns of the Close price of the AORD were −0.0687, 0.0573, and 0.0003, resp.), we chose $l_u = -l_l = 0.005$ for this study, assuming that a 0.5% increase (or decrease) in the Close price of day $t+1$ compared to that of day $t$ is reasonable enough to consider the corresponding movement a buy (or sell) signal. It is unlikely that a change in the values of $l_l$ and $l_u$ would make a qualitative change in the prediction results obtained.
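To make Criterion A concrete, a minimal Python sketch of this labelling rule with the thresholds $l_u = -l_l = 0.005$ might look as follows (the function and variable names are ours, not part of the original study):

```python
import numpy as np

def label_signals(returns, l_u=0.005, l_l=-0.005):
    """Map relative returns of day t+1 to trading signals (Criterion A)."""
    returns = np.asarray(returns, dtype=float)
    return np.where(returns >= l_u, "buy",
           np.where(returns <= l_l, "sell", "hold"))

# Hypothetical daily relative returns of the AORD Close price
print(label_signals([0.007, 0.001, -0.0062, -0.003]))
# -> ['buy' 'hold' 'sell' 'hold']
```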

According to Criterion A with $l_u = -l_l = 0.005$, one cannot expect a balanced distribution of data among the three classes (trading signals), because more data fall into the hold class and fewer data fall into the other two classes.

Due to the imbalance of data, most classification techniques, such as SVM and PNN, produce less precise results [15–17]. FNN can be identified as a suitable alternative technique for classification when the data to be studied have an imbalanced distribution. However, a standard FNN itself shows some disadvantages: (a) use of local optimization methods which do not guarantee a deep local optimal solution; (b) because of (a), an FNN needs to be trained many times with different initial weights and biases (multiple training runs result in more than one solution, and having many solutions for the network parameters prevents getting a clear picture of the influence of the input variables); (c) use of the ordinary least squares (OLS; see (2.1)) error function to be minimised may not be suitable for classification problems.

To overcome the problem of being stuck in a local minimum, finding a global solution to the error minimisation function is required. Several past studies attempted to find global solutions for the parameters of FNNs by developing new algorithms (e.g., [18–21]). Minghu et al. [19] proposed a hybrid algorithm of global optimization of dynamic learning rate for FNNs, and this algorithm was shown to have global convergence for error backpropagation multilayer FNNs (MLFNNs). The study by Ye and Lin [21] presented a new approach to the supervised training of weights in MLFNNs. Their algorithm is based on a "subenergy tunneling function" to reject searching in unpromising regions and a "ripple-like" global search to avoid local minima. Jordanov [18] proposed an algorithm which makes use of a stochastic optimization technique based on so-called low-discrepancy sequences to train FNNs. Toh et al. [20] also proposed an iterative algorithm for global FNN learning.

This study aims at modifying neural network algorithms to predict whether it is best to buy, hold, or sell the shares (trading signals) of a given stock market index. This trading system is designed for short-term traders to trade under normal conditions. It assumes that stock market behaviour is normal and does not take exceptional conditions, such as bottlenecks, into consideration.

When modifying algorithms, two matters were taken into account: (1) using a global optimization algorithm for network training and (2) modifying the ordinary least squares error function. By using a global optimization algorithm for network training, this study expected to find deep solutions to the error function. Also this study attempted to modify the OLS error function in a way suitable for the classification problem of interest.

Many previous studies [5–7, 9, 11] have used technical indicators of the local markets or economic variables to predict stock market time series. The other novel idea of this study is the incorporation of intermarket influence [22, 23] to predict the trading signals.

The organisation of the paper is as follows. Section 2 explains the modification of neural network algorithms. Section 3 describes the network training, quantification of intermarket influence, and the measures of evaluating the performance of the algorithms. Section 4 presents the results obtained from the proposed algorithms together with their interpretations. This section also compares the performance of the modified neural network algorithms with that of the standard FNN algorithm. The last section is the conclusion of the study.

2. Modified Neural Network Algorithms

In this paper, we used modified neural network algorithms for forecasting the trading signals of stock market indices. We used the standard FNN algorithm as the basis of these modified algorithms.

A standard FNN is a fully connected network with every node in the lower layer linked to every node in the next higher layer. These linkages are attached with some weights, $w = (w_1, \ldots, w_M)$, where $M$ is the number of all possible linkages. Given the weights $w$, the network produces an output for each input vector. The output corresponding to the $i$th input vector will be denoted by $o_i \equiv o_i(w)$.

FNNs adopt backpropagation learning, which finds the optimal weights $w$ by minimising the error between the network outputs and the given targets [24]. The most commonly used error function is the Ordinary Least Squares (OLS) function:

\[
E_{\mathrm{OLS}} = \frac{1}{N} \sum_{i=1}^{N} \left( a_i - o_i \right)^2, \tag{2.1}
\]
where $N$ is the total number of observations in the training set, while $a_i$ and $o_i$ are the target and the output corresponding to the $i$th observation in the training set.
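As a point of reference for the modified functions that follow, (2.1) amounts to the mean squared error over the training set; a minimal numpy sketch (illustrative naming only) is:

```python
import numpy as np

def e_ols(targets, outputs):
    """Ordinary Least Squares error (2.1): mean of squared differences
    between the targets a_i and the network outputs o_i."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return np.mean((targets - outputs) ** 2)
```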

2.1. Alternative Error Functions

As described in the Introduction (see Section 1), in financial applications it is more important to predict the direction of a time series than its value. Therefore, the minimisation of the absolute errors between the target and the output may not produce the desired prediction accuracy [24, 25]. With this idea in mind, some past studies aimed to modify the error function associated with FNNs (e.g., [24–27]). These studies incorporated factors which represent the direction of the prediction (e.g., [24–26]) and the contribution from the historical data used as inputs (e.g., [24, 25, 27]).

The functions proposed in [24–26] penalised the incorrectly predicted directions more heavily than the correct predictions. In other words, a higher penalty was applied if the predicted value, $o_i$, was negative when the target, $a_i$, was positive or vice versa.

Caldwell [26] proposed the Weighted Directional Symmetry (WDS) function which is given as follows:

\[
f_{\mathrm{WDS}}(i) = \frac{100}{N} \sum_{i=1}^{N} w_{ds}(i) \left| a_i - o_i \right|, \tag{2.2}
\]
where

π‘€π‘‘π‘ ξƒ―ξ€·π‘Ž(𝑖)=1.5ifπ‘–βˆ’π‘Žπ‘–βˆ’1π‘œξ€Έξ€·π‘–βˆ’π‘œπ‘–βˆ’1≀0,0.5,otherwise,(2.3) and 𝑁 is the total number of observations.

Yao and Tan [24, 25] argued that the weight associated with $f_{\mathrm{WDS}}$ (i.e., $w_{ds}(i)$) should be heavily adjusted if a wrong direction is predicted for a larger change, while it should be only slightly adjusted if a wrong direction is predicted for a smaller change, and so on. Based on this argument, they proposed the Directional Profit adjustment factor:

\[
f_{\mathrm{DP}}(i) =
\begin{cases}
c_1 & \text{if } \left( \Delta a_i \times \Delta o_i \right) > 0,\ \Delta a_i \le \sigma,\\
c_2 & \text{if } \left( \Delta a_i \times \Delta o_i \right) > 0,\ \Delta a_i > \sigma,\\
c_3 & \text{if } \left( \Delta a_i \times \Delta o_i \right) < 0,\ \Delta a_i \le \sigma,\\
c_4 & \text{if } \left( \Delta a_i \times \Delta o_i \right) < 0,\ \Delta a_i > \sigma,
\end{cases}
\tag{2.4}
\]
where $\Delta a_i = a_i - a_{i-1}$, $\Delta o_i = o_i - o_{i-1}$, and $\sigma$ is the standard deviation of the training data (including the validation set). For their experiments the authors used $c_1 = 0.5$, $c_2 = 0.8$, $c_3 = 1.2$, and $c_4 = 1.5$ [24, 25]. By giving these weights, they tried to impose a higher penalty on predictions whose direction is wrong and whose error magnitude is larger, compared to the other predictions.
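A rough Python sketch of the adjustment factor (2.4) is given below (our own naming; the case $\Delta a_i \times \Delta o_i = 0$ is not specified in (2.4) and is treated here as an incorrectly predicted direction):

```python
def f_dp(delta_a, delta_o, sigma, c=(0.5, 0.8, 1.2, 1.5)):
    """Directional Profit adjustment factor (2.4) for one observation,
    with the weights c1..c4 reported by Yao and Tan."""
    c1, c2, c3, c4 = c
    if delta_a * delta_o > 0:               # direction predicted correctly
        return c1 if delta_a <= sigma else c2
    return c3 if delta_a <= sigma else c4   # direction predicted incorrectly
```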

Based on this Directional Profit adjustment factor (2.4), Yao and Tan [24, 25] proposed the Directional Profit (DP) model:

\[
E_{\mathrm{DP}} = \frac{1}{N} \sum_{i=1}^{N} f_{\mathrm{DP}}(i) \left( a_i - o_i \right)^2. \tag{2.5}
\]
Refenes et al. [27] proposed the Discounted Least Squares (DLS) function by taking the contribution from the historical data into account as follows:

\[
E_{\mathrm{DLS}} = \frac{1}{N} \sum_{i=1}^{N} w_b(i) \left( a_i - o_i \right)^2, \tag{2.6}
\]
where $w_b(i)$ is an adjustment relating to the contribution of the $i$th observation and is described by the following equation:

\[
w_b(i) = \frac{1}{1 + \exp\left( b - 2bi/N \right)}. \tag{2.7}
\]
The discount rate $b$ denotes the contribution from the historical data. Refenes et al. [27] suggested $b = 6$.
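A minimal sketch of the weight (2.7), assuming the indexing $i = 1, \ldots, N$ used above (naming is ours):

```python
import numpy as np

def w_b(i, n, b=6.0):
    """DLS weight (2.7): observations late in the training set (large i)
    get weights near 1, early observations get weights near 0."""
    return 1.0 / (1.0 + np.exp(b - 2.0 * b * np.asarray(i, dtype=float) / n))

n = 768                              # e.g., one three-year training window
print(w_b([1, n // 2, n], n))        # -> approx. [0.0025, 0.5, 0.9975]
```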

Yao and Tan [24, 25] proposed another error function, the Time Dependent Directional Profit (TDP) model, by incorporating the approach suggested by Refenes et al. [27] into their Directional Profit model (2.5):

\[
E_{\mathrm{TDP}} = \frac{1}{N} \sum_{i=1}^{N} f_{\mathrm{TDP}}(i) \left( a_i - o_i \right)^2, \tag{2.8}
\]
where $f_{\mathrm{TDP}}(i) = f_{\mathrm{DP}}(i) \times w_b(i)$. $f_{\mathrm{DP}}(i)$ and $w_b(i)$ are described by (2.4) and (2.7), respectively.

Note. Refenes et al. [27] and Yao and Tan [24, 25] used $1/(2N)$ instead of $1/N$ in the formulas given by (2.5), (2.6), and (2.8).

2.2. Modified Error Functions

We are interested in classifying trading signals into three classes: buy, hold, and sell. The hold class includes both positive and negative values (see Criterion A in Section 1). Therefore, the least squares functions in which the cases with incorrectly predicted directions (positive or negative) are penalised (e.g., the error functions given by (2.5) and (2.8)) will not give the desired prediction accuracy. For example, suppose that $a_i = 0.0045$ and $o_i = -0.0049$. In this case the predicted signal is correct according to Criterion A. However, the algorithms used in [24, 25] try to minimise the error function heavily, as $\Delta a_i \times \Delta o_i < 0$ (refer to (2.8)). In fact such a minimisation is not necessary, as the predicted signal is correct. Therefore, instead of the weighing schemes suggested by previous studies, we propose a different weighing scheme.

Unlike the weighing schemes suggested in [24, 25], which impose a higher penalty on the predictions whose sign (i.e., negative or positive) is incorrect, this novel scheme is based on the correctness of the classification of trading signals. If the predicted trading signal is correct, we assign a very small (close to zero) weight and, otherwise, assign a weight equal to 1. Therefore, the proposed weighing scheme is

𝑀𝑑(𝑖)=𝛿ifthepredictedtradingsignaliscorrect,1,otherwise,(2.9) where 𝛿 is a very small value. The value of 𝛿 needs to be decided according to the distribution of data.

2.2.1. Proposed Error Function 1

The weighing scheme, $f_{\mathrm{DP}}(i)$, incorporated in the Directional Profit (DP) error function (2.5) considers only two classes, the upward and downward trends (directions), which correspond to buy and sell signals. In order to deal with three classes, buy, hold, and sell, we modified this error function by replacing $f_{\mathrm{DP}}(i)$ with the new weighing scheme $w_d(i)$ (see (2.9)). Hence, the new error function ($E_{CC}$) is defined as

\[
E_{CC} = \frac{1}{N} \sum_{i=1}^{N} w_d(i) \left( a_i - o_i \right)^2. \tag{2.10}
\]
When training backpropagation neural networks using (2.10) as the error minimisation function, the error is forced to take a smaller value if the predicted trading signal is correct. On the other hand, the actual size of the error is considered in the cases of misclassification.

2.2.2. Proposed Error Function 2

The contribution from the historical data also plays an important role in the prediction accuracy of financial time series. Therefore, Yao and Tan [24, 25] went further by combining the DP error function (see (2.5)) with the DLS error function (see (2.6)) and proposed the Time Dependent Directional Profit (TDP) error function (see (2.8)).

Following Yao and Tan [24, 25], this study also proposed a similar error function, $E_{TCC}$, by combining the first new error function ($E_{CC}$), described by (2.10), with the DLS error function ($E_{\mathrm{DLS}}$). Hence the second proposed error function is

\[
E_{TCC} = \frac{1}{N} \sum_{i=1}^{N} w_b(i) \times w_d(i) \left( a_i - o_i \right)^2, \tag{2.11}
\]
where $w_b(i)$ and $w_d(i)$ are defined by (2.7) and (2.9), respectively.
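Under the same labelling rule as Criterion A, the two proposed error functions (2.10) and (2.11) could be computed as in the following sketch (function names are ours; $\delta = 0.01$ and $b = 5$ are the values adopted later in Sections 3 and 4):

```python
import numpy as np

def _signal(r, l_u=0.005, l_l=-0.005):
    """Criterion A classification of a relative return."""
    return "buy" if r >= l_u else ("sell" if r <= l_l else "hold")

def e_cc(targets, outputs, delta=0.01):
    """Proposed error function 1 (2.10): squared errors down-weighted by
    delta whenever the predicted trading signal is already correct."""
    targets, outputs = np.asarray(targets, float), np.asarray(outputs, float)
    wd = np.array([delta if _signal(a) == _signal(o) else 1.0
                   for a, o in zip(targets, outputs)])
    return np.mean(wd * (targets - outputs) ** 2)

def e_tcc(targets, outputs, b=5.0, delta=0.01):
    """Proposed error function 2 (2.11): the E_CC weighting combined with
    the DLS weight w_b(i), so recent observations count more."""
    targets, outputs = np.asarray(targets, float), np.asarray(outputs, float)
    n = len(targets)
    i = np.arange(1, n + 1)
    wb = 1.0 / (1.0 + np.exp(b - 2.0 * b * i / n))
    wd = np.array([delta if _signal(a) == _signal(o) else 1.0
                   for a, o in zip(targets, outputs)])
    return np.mean(wb * wd * (targets - outputs) ** 2)
```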

The difference between the TDP error function (see (2.8)) and this second new error function (2.11) is that $f_{\mathrm{DP}}(i)$ is replaced by $w_d(i)$ in order to deal with three classes: buy, hold, and sell.

2.3. Modified Neural Network Algorithms

Modifications to neural network algorithms were done by (i) using the OLS error function as well as the modified least squares error functions; (ii) employing a global optimization algorithm to train the networks.

The importance of using global optimization algorithms for the FNN training was discussed in Section 1. In this paper, we applied the global optimization algorithm, AGOP (introduced in [28, 29]), for training the proposed network algorithms.

As the error function to be minimised, we considered $E_{\mathrm{OLS}}$ (see (2.1)) and $E_{\mathrm{DLS}}$ (see (2.6)) together with the two modified error functions $E_{CC}$ (see (2.10)) and $E_{TCC}$ (see (2.11)). Based on these four error functions, we proposed the following algorithms:

(i) $NN_{\mathrm{OLS}}$: neural network algorithm based on the Ordinary Least Squares error function, $E_{\mathrm{OLS}}$ (see (2.1));
(ii) $NN_{\mathrm{DLS}}$: neural network algorithm based on the Discounted Least Squares error function, $E_{\mathrm{DLS}}$ (see (2.6));
(iii) $NN_{CC}$: neural network algorithm based on the newly proposed error function 1, $E_{CC}$ (see (2.10));
(iv) $NN_{TCC}$: neural network algorithm based on the newly proposed error function 2, $E_{TCC}$ (see (2.11)).

The layers are connected in the same structure as the FNN (Section 2). A tan-sigmoid function was used as the transfer function between the input layer and the hidden layer, while the linear transformation function was employed between the hidden and the output layers.

Algorithm $NN_{\mathrm{OLS}}$ differs from the standard FNN algorithm since it employs a new global optimization algorithm for training. Similarly, $NN_{\mathrm{DLS}}$ also differs from the respective algorithm used in [24, 25] for the same reason. In addition to the use of the new training algorithm, $NN_{CC}$ and $NN_{TCC}$ are based on two different modified error functions. The only way to examine whether these new modified neural network algorithms perform better than the existing ones (in the literature) is to conduct numerical experiments.

3. Network Training and Evaluation

The Australian All Ordinary Index (AORD) was selected as the stock market index whose trading signals are to be predicted. Previous studies by the authors [22] suggested that the lagged Close prices of the US S&P 500 Index (GSPC), the UK FTSE 100 Index (FTSE), the French CAC 40 Index (FCHI), and the German DAX Index (GDAXI), as well as that of the AORD itself, showed an impact on the direction of the Close price of day $t$ of the AORD. It was also found that only the Close prices at lag 1 of these markets influence the Close price of the AORD [22, 23]. Therefore, this study considered the relative returns of the Close prices at lag 1 of two combinations of stock market indices when forming input sets: (i) a combination which includes the GSPC, FTSE, FCHI, and GDAXI; (ii) a combination which includes the AORD in addition to the markets included in (i).

The input sets were formed with and without incorporating the quantified intermarket influence [22, 23, 30] (see Section 3.1). By quantifying intermarket influence, this study tries to identify the influential patterns between the potential influential markets and the AORD. Training the network algorithms with preidentified patterns may enhance their learning. Therefore, it can be expected that using the quantified intermarket influence for training the algorithms produces more accurate output.

The quantification of intermarket influence is described in Section 3.1, while Section 3.2 presents the input sets used for network training.

Daily relative returns of the Close prices of the selected stock market indices from 2nd July 1997 to 30th December 2005 were used for this study. If no trading took place on a particular day, the rate of change of price should be zero. Therefore, before calculating the relative returns, the missing values of the Close price were replaced by the corresponding Close price of the last trading day.

The minimum and the maximum values of the data (relative returns) used for network training are −0.137 and 0.057, respectively. Therefore, we selected the value of $\delta$ (see Section 2.2) as 0.01. If the trading signals are correctly predicted, 0.01 is small enough to set the value of the proposed error functions (see (2.10) and (2.11)) to approximately zero.

Since influential patterns between markets are likely to vary with time [30], the whole study period was divided into a number of moving windows of a fixed length. Overlapping windows of length three trading years were considered (1 trading year ≡ 256 trading days). A period of three trading years provides enough data (768 daily relative returns) for neural network experiments. Also, the chance of outdated data (which is not relevant for studying the current behaviour of the market) being included in the training set is very low.

The most recent 10% of the data (the last 76 trading days) in each window were set aside for out-of-sample predictions, while the remaining 90% of the data were allocated for network training. We call the part of the window allocated for training the training window. Different numbers of neurons for the hidden layer were tested when training the networks with each input set.
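The windowing scheme described above can be sketched as follows (an illustration with our own naming and synthetic data; the actual study used six overlapping three-year windows of AORD returns):

```python
import numpy as np

TRADING_YEAR = 256              # 1 trading year = 256 trading days
WINDOW = 3 * TRADING_YEAR       # 768 daily relative returns per moving window

def split_window(window_data, test_fraction=0.10):
    """Split one moving window into a training window (first 90%) and an
    out-of-sample part (the most recent 10%, i.e. 76 trading days here)."""
    n_test = int(test_fraction * len(window_data))
    return window_data[:-n_test], window_data[-n_test:]

# Synthetic returns standing in for one three-year window of data
returns = np.random.normal(0.0003, 0.01, WINDOW)
train, test = split_window(returns)
print(len(train), len(test))    # -> 692 76
```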

As described in Section 2.1, the error function $E_{\mathrm{DLS}}$ (see (2.6)) includes a parameter $b$ (the discount rate) which decides the contribution from the historical observations in the time series. Refenes et al. [27] fixed $b = 6$ for their experiments. However, the discount rate may vary from one stock market index to another. Therefore, this study tested different values for $b$ when training network $NN_{\mathrm{DLS}}$. Observing the results, the best value for $b$ was selected, and this value was also used when training network $NN_{TCC}$.

3.1. Quantification of Intermarket Influences

Past studies [31–33] confirmed that most of the world's major stock markets are integrated. Hence, one integrated stock market can be considered as a part of a single global system. The influence from one integrated stock market on a dependent market includes the influence from one or more other stock markets on the former.

If there is a set of markets that influence a given dependent market, it is not straightforward to separate the influence of the individual markets. Instead of measuring the individual influence of one influential market on the dependent market, the strength of its influence can be measured relative to the influence of the other influential markets. This study used the approach proposed in [22, 23] to quantify intermarket influences. This approach estimates the combined influence of a set of influential markets and also the contribution from each influential market to the combined influence.

Quantification of the intermarket influences on the AORD was carried out by finding the coefficients $\xi_i$, $i = 1, 2, \ldots$ (see Section 3.1.1), which maximise the median rank correlation between the relative return of the Close price of day $(t+1)$ of the AORD and the sum of the relative returns of the Close prices of day $t$ of a combination of influential markets, each multiplied by $\xi_i$, over a number of small nonoverlapping windows of a fixed size. The two combinations of markets previously mentioned in this section were considered. $\xi_i$ measures the contribution from the $i$th influential market to the combined influence, which is estimated by the optimal correlation.

There is a possibility that the maximum value leads to a conclusion about a relationship which does not exist in reality. In contrast, the median is more conservative in this respect. Therefore, instead of selecting the maximum of the optimal rank correlation, the median was considered.

Spearman's rank correlation coefficient was used as the rank correlation measure. For two variables $X$ and $Y$, Spearman's rank correlation coefficient, $r_s$, can be defined as

π‘Ÿπ‘ =𝑛𝑛2ξ€Έβˆ‘π‘‘βˆ’1βˆ’6𝑖2βˆ’ξ€·π‘‡π‘₯βˆ’π‘‡π‘¦ξ€Έ/2𝑛𝑛2ξ€Έβˆ’1βˆ’π‘‡π‘₯𝑛𝑛2ξ€Έβˆ’1βˆ’π‘‡π‘Œξ€Έ,(3.1) where 𝑛 is the total number of bivariate observations of π‘₯ and 𝑦, 𝑑𝑖 is the difference between the rank of π‘₯ and the rank of 𝑦 in the 𝑖th observation, and 𝑇π‘₯ and 𝑇𝑦 are the number of tied observations of 𝑋 and π‘Œ, respectively.

The same six training windows employed for the network training were considered for the quantification of intermarket influence on the AORD. The correlation structure between stock markets also changes with time [31]. Therefore, each moving window was further divided into a number of small windows of length 22 days, since 22 days of a stock market time series represent a trading month. Spearman's rank correlation coefficients (see (3.1)) were calculated for these smaller windows within each moving window.

The absolute value of the correlation coefficient was considered when finding the median optimal correlation. This is appropriate as the main concern is the strength rather than the direction of the correlation (i.e., either positively or negatively correlated).

The objective function to be maximised (see Section 3.1.1 below) is defined by Spearman's correlation coefficient, which uses the ranks of data. Therefore, the objective function is discontinuous. Solving such a global optimization problem is extremely difficult because of the unavailability of gradients. We used the same global optimization algorithm, AGOP, which was also used for training the proposed algorithms (see Section 2.3), to solve this optimization problem.

3.1.1. Optimization Problem

Let π‘Œ(𝑑+1) be the relative return of the Close price of a selected dependent market at time 𝑑+1, and let 𝑋𝑗(𝑑) be the relative return of the Close price of the 𝑗th influential market at time 𝑑. Define π‘‹πœ‰(𝑑) as

π‘‹πœ‰(𝑑)=π‘—πœ‰π‘—π‘‹π‘—(𝑑),(3.2) where the coefficient πœ‰π‘—β‰₯0, 𝑗=1,2,…,π‘š measures the strength of influence from each influential market 𝑋𝑗, while π‘š is the total number of influential markets.

The aim is to find the optimal values of the coefficients, $\xi = (\xi_1, \ldots, \xi_m)$, which maximise the rank correlation between $Y(t+1)$ and $X_{\xi}(t)$ for a given window.

The correlation can be calculated for a window of a given size. This window can be defined as

𝑇𝑑0ξ€Έ=𝑑,𝑙0,𝑑0+1,…,𝑑0ξ€Ύ+(π‘™βˆ’1),(3.3) where 𝑑0 is the starting date of the window, and 𝑙 is its size (in days). This study sets 𝑙=22 days.

Spearman's correlation (see (3.1)) between the variables $Y(t+1)$ and $X_{\xi}(t)$, $t \in T\left(t_0, l\right)$, defined on the window $T\left(t_0, l\right)$, will be denoted as

𝐢(πœ‰)=Corrπ‘Œ(𝑑+1),π‘‹πœ‰ξ€·π‘‘(𝑑)‖𝑇0,𝑙.(3.4) To define optimal values of the coefficients for a long time period, the following method is applied. Let [1,𝑇]={1,2,…,𝑇} be a given period (e.g., a large window). This period is divided into 𝑛 windows of size 𝑙 (we assume that 𝑇=𝑙×𝑛, 𝑛>1 is an integer) as follows:

π‘‡ξ€·π‘‘π‘˜ξ€Έ,𝑙,π‘˜=1,2,3,…,𝑛,(3.5) so that,

π‘‡ξ€·π‘‘π‘˜ξ€Έξ€·π‘‘,π‘™βˆ©π‘‡π‘˜β€²ξ€Έ,𝑙=πœ™forβˆ€π‘˜β‰ π‘˜ξ…ž,π‘›ξšπ‘˜=1π‘‡ξ€·π‘‘π‘˜ξ€Έ=[].,𝑙1,𝑇(3.6) The correlation coefficient between π‘Œ(𝑑+1) and π‘‹πœ‰(𝑑) defined on the window 𝑇(π‘‘π‘˜,𝑙) is denoted as πΆπ‘˜ξ€·π‘Œ(πœ‰)=Corr(𝑑+1),π‘‹πœ‰ξ€·π‘‘(𝑑)β€–π‘‡π‘˜,𝑙,π‘˜=1,…,𝑛.(3.7) To define an objective function over the period [1,𝑇], the median of the vector, (𝐢1(πœ‰),…,𝐢𝑛(πœ‰)), is used. Therefore, the optimization problem can be defined as

\[
\begin{aligned}
\text{Maximise} \quad & f(\xi) = \mathrm{Median}\left( C_1(\xi), \ldots, C_n(\xi) \right),\\
\text{s.t.} \quad & \sum_{j} \xi_j = 1, \quad \xi_j \ge 0, \quad j = 1, 2, \ldots, m.
\end{aligned} \tag{3.8}
\]
The solution to (3.8) is a vector, $\xi = (\xi_1, \ldots, \xi_m)$, where $\xi_j$, $j = 1, 2, \ldots, m$, denotes the strength of the influence from the $j$th influential market.
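A minimal sketch of the objective function in (3.8) is given below (our own naming; the constraint handling and the AGOP global optimization algorithm used to maximise it are not reproduced here):

```python
import numpy as np
from scipy.stats import spearmanr

def objective(xi, X, y, l=22):
    """Median over nonoverlapping l-day windows of the absolute Spearman
    correlation between X_xi(t) = sum_j xi_j X_j(t) and y(t) = Y(t+1).

    X: (T, m) array of relative returns of the m influential markets,
    y: length-T array of next-day relative returns of the dependent market.
    """
    x_combined = X @ np.asarray(xi, dtype=float)
    corrs = []
    for k in range(len(y) // l):
        window = slice(k * l, (k + 1) * l)
        rho, _ = spearmanr(x_combined[window], y[window])
        corrs.append(abs(rho))      # absolute value, as described in Section 3.1
    return np.median(corrs)

# The constraints sum(xi) = 1 and xi >= 0 are left to the chosen optimizer.
```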

In this paper, the quantity $\xi_j X_j$ is called the quantified relative return corresponding to the $j$th influential market.

3.2. Input Sets

The following six sets of inputs were used to train the modified network algorithms introduced in Section 2.3.

(1) Four input features of the relative returns of the Close prices of day $t$ of the market combination (i) (i.e., GSPC($t$), FTSE($t$), FCHI($t$), and GDAXI($t$)), denoted by GFFG.
(2) Four input features of the quantified relative returns of the Close prices of day $t$ of the market combination (i) (i.e., $\xi_1$GSPC($t$), $\xi_2$FTSE($t$), $\xi_3$FCHI($t$), and $\xi_4$GDAXI($t$)), denoted by GFFG-q.
(3) A single input feature consisting of the sum of the quantified relative returns of the Close prices of day $t$ of the market combination (i) (i.e., $\xi_1$GSPC($t$) + $\xi_2$FTSE($t$) + $\xi_3$FCHI($t$) + $\xi_4$GDAXI($t$)), denoted by GFFG-sq.
(4) Five input features of the relative returns of the Close prices of day $t$ of the market combination (ii) (i.e., GSPC($t$), FTSE($t$), FCHI($t$), GDAXI($t$), and AORD($t$)), denoted by GFFGA.
(5) Five input features of the quantified relative returns of the Close prices of day $t$ of the market combination (ii) (i.e., $\xi^A_1$GSPC($t$), $\xi^A_2$FTSE($t$), $\xi^A_3$FCHI($t$), $\xi^A_4$GDAXI($t$), and $\xi^A_5$AORD($t$)), denoted by GFFGA-q.
(6) A single input feature consisting of the sum of the quantified relative returns of the Close prices of day $t$ of the market combination (ii) (i.e., $\xi^A_1$GSPC($t$) + $\xi^A_2$FTSE($t$) + $\xi^A_3$FCHI($t$) + $\xi^A_4$GDAXI($t$) + $\xi^A_5$AORD($t$)), denoted by GFFGA-sq.

(πœ‰1, πœ‰2, πœ‰3, πœ‰4) and (πœ‰π΄1, πœ‰π΄2,πœ‰π΄3, πœ‰π΄4) are solutions to (3.8) corresponding to the market combinations (i) and (ii), previously mentioned in Section 3. These solutions relating to the market combinations (i) and (ii) are shown in the Tables 1 and 2, respectively. We note that πœ‰π‘– and πœ‰π΄π‘–, 𝑖=1,2,3,4 are not necessarily be equal.

3.3. Evaluation Measures

The networks proposed in Section 2.3 output the $(t+1)$th day relative return of the Close price of the AORD. Subsequently, the output was classified into trading signals according to Criterion A (see Section 1).

The performance of the networks was evaluated by the overall classification rate ($r_{CA}$) as well as by the overall misclassification rates ($r_{E1}$ and $r_{E2}$), which are defined as follows:

π‘ŸπΆπ΄=𝑁0𝑁𝑇×100,(3.9) where 𝑁0 and 𝑁𝑇 are the number of test cases with correct predictions and the total number of cases in the test sample, respectively, as follows:

π‘ŸπΈ1=𝑁1π‘π‘‡π‘ŸΓ—100,𝐸2=𝑁2𝑁𝑇×100,(3.10) where 𝑁1 is the number of test cases where a buy/sell signal is misclassified as a hold signals or vice versa. 𝑁2 is the test cases where a sell signal is classified as a buy signal and vice versa.

From a trader's point of view, the misclassification of a hold signal as a buy or sell signal is a more serious mistake than misclassifying a buy signal or a sell signal as a hold signal. The reason is that in the former case a trader loses money by taking part in an unwise investment, while in the latter case he/she only loses the opportunity of making a profit, with no monetary loss. The most serious monetary loss occurs when a buy signal is misclassified as a sell signal or vice versa. Because of the seriousness of this mistake, $r_{E2}$ plays a more important role in performance evaluation than $r_{E1}$.

4. Results Obtained from Network Training

As mentioned in Section 3, different values for the discount rate, $b$, were tested; $b = 1, 2, \ldots, 12$ was considered when training $NN_{\mathrm{DLS}}$. The prediction results improved with the value of $b$ up to 5, and for $b > 5$ the prediction results remained unchanged. Therefore, the value of $b$ was fixed at 5. As previously mentioned (see Section 3), $b = 5$ was also used as the discount rate in the $NN_{TCC}$ algorithm.

We trained the four neural network algorithms by varying the structure of the network, that is, by changing the number of hidden layers as well as the number of neurons per hidden layer. The best prediction results for all four networks were obtained when the number of hidden layers was equal to one and the number of neurons per hidden layer was equal to two (results are shown in Tables 12, 13, 14, and 15). Therefore, only the results relevant to networks with two hidden neurons are presented in this section. Tables 3 to 6 present the results relating to the neural networks $NN_{\mathrm{OLS}}$, $NN_{\mathrm{DLS}}$, $NN_{CC}$, and $NN_{TCC}$, respectively.

The best prediction results from $NN_{\mathrm{OLS}}$ were obtained when the input set GFFG-q (see Section 3.2) was used as the input features (see Table 3). This input set consists of four inputs: the quantified relative returns of the Close prices of day $t$ of the GSPC and the three European stock indices.

$NN_{\mathrm{DLS}}$ yielded nonzero values for the more serious classification error, $r_{E2}$, when multiple inputs (either quantified or not) were used as the input features (see Table 4). The best results were obtained when the networks were trained with the single input representing the sum of the quantified relative returns of the Close prices of day $t$ of the GSPC, the European market indices, and the AORD (input set GFFGA-sq; see Section 3.2). When the networks were trained with the single inputs (input sets GFFG-sq and GFFGA-sq; see Section 3.2), the serious misclassifications were prevented.

The overall prediction results obtained from $NN_{\mathrm{OLS}}$ seem to be better than those relating to $NN_{\mathrm{DLS}}$ (see Tables 3 and 4).

Compared to the predictions obtained from $NN_{\mathrm{DLS}}$, those relating to $NN_{CC}$ are better (see Tables 4 and 5). In this case the best prediction results were obtained when the relative returns of day $t$ of the GSPC and the three European stock market indices (input set GFFG) were used as the input features (see Table 5). The classification rate increased by 1.02% compared to that of the best prediction results produced by $NN_{\mathrm{OLS}}$ (see Tables 3 and 5).

Table 6 shows that $NN_{TCC}$ also produced serious misclassifications. However, these networks produced high overall classification accuracy and also prevented serious misclassifications when the quantified relative returns of the Close prices of day $t$ of the GSPC and the European stock market indices (input set GFFG-q) were used as the input features. This accuracy was the best among all four types of neural network algorithms considered in this study.

$NN_{TCC}$ provided a 1.34% increase in the overall classification rate compared to $NN_{CC}$. When compared with $NN_{\mathrm{OLS}}$, $NN_{TCC}$ showed a 2.37% increase in the overall classification rate, which can be considered a good improvement in predicting trading signals.

4.1. Comparison of the Performance of Modified Algorithms with that of the Standard FNN Algorithm

Table 7 presents the average (over six windows) classification rates and misclassification rates related to the prediction results obtained by training the standard FNN algorithm, which consists of one hidden layer with two neurons. In order to compare the prediction results with those of the modified neural network algorithms, the number of hidden layers was fixed as one, while the number of hidden neurons was fixed as two. These FNNs were trained for the same six windows (see Section 3) with the same six input sets (see Section 3.2). The transfer functions employed are the same as those of the modified neural network algorithms (see Section 2.3).

When the overall classification and misclassification rates given in Table 7 are compared with the respective rates (see Tables 3 to 6) corresponding to the modified neural network algorithms, it is clear that the standard FNN algorithm performs more poorly than all four modified neural network algorithms. Therefore, it can be suggested that all of the modified neural network algorithms perform better when predicting the trading signals of the AORD.

4.2. Comparison of the Performance of the Modified Algorithms

The best predictions obtained by each algorithm were compared using classification and misclassification rates. The classification rate indicates the proportion of correctly classified signals of a particular class out of the total number of actual signals in that class, whereas the misclassification rate indicates the proportion of signals from a particular class incorrectly classified into another class out of the total number of actual signals in the former class.

4.2.1. Prediction Accuracy

The average (over six windows) classification and misclassification rates related to the best prediction results obtained from $NN_{\mathrm{OLS}}$, $NN_{\mathrm{DLS}}$, $NN_{CC}$, and $NN_{TCC}$ are shown in Tables 8 to 11, respectively.

Among the best networks corresponding to the four algorithms considered, the best network of the algorithm based on the proposed error function 2 (see (2.11)) showed the best classification accuracies relating to buy and sell signals (27% and 25%, resp.; see Tables 8 to 11). This network also classified more than 89% of the hold signals accurately, the second best rate for the hold signal. The rate of misclassification from hold signals to buy signals is the lowest when this network was used for prediction. The rate of misclassification from the hold class to the sell class is also comparatively low (6.22%, the second lowest among the four best predictions).

The network corresponding to the algorithm based on the proposed error function 1 (see (2.10)) produced the second best prediction results. This network attained the second best prediction accuracies relating to buy and sell signals, while it produced the best predictions relating to hold signals (Table 10).

4.3. Comparisons of Results with Other Similar Studies

Most of the studies [8, 9, 11, 13, 22] which used FNN algorithms for prediction aimed at predicting the direction (up or down) of a stock market index. Only a few studies [14, 17], which used the AORD as the target market index, predicted whether to buy, hold, or sell stocks. These studies employed the standard FNN algorithm (that is, with the OLS error function) for prediction. However, a comparison of the results obtained in this study with those of the above-mentioned two studies is impossible, as they are not in the same form.

5. Conclusions

The results obtained from the experiments show that the modified neural network algorithms introduced by this study perform better than the standard FNN algorithm in predicting the trading signals of the AORD. Furthermore, the neural network algorithms based on the modified OLS error functions introduced by this study (see (2.10) and (2.11)) produced better predictions of the trading signals of the AORD. Of these two algorithms, the one based on (2.11) showed the better performance. This algorithm produced the best predictions when the network consisted of one hidden layer with two neurons, and the quantified relative returns of the Close prices of the GSPC and the three European stock market indices were used as the input features. This network prevented serious misclassifications, such as the misclassification of buy signals as sell signals and vice versa, and also predicted trading signals with a higher degree of accuracy.

Also it can be suggested that the quantified intermarket influence on the AORD can be effectively used to predict its trading signals.

The algorithms proposed in this paper can also be used to predict whether it is best to buy, hold, or sell the shares of any company listed under a given sector of the Australian Stock Exchange. In this case, the potential influential variables would be the share price indices of the companies listed under the sector of interest.

Furthermore, the approach proposed by this study can be applied to predict the trading signals of any other global stock market index. Such a research direction would be very interesting, especially in a period of economic recession, as the stock indices of the world's major economies are strongly correlated during such periods.

Another useful research direction can be found in the area of marketing research, that is, the modification of the proposed prediction approach to predict whether the market share of a certain product will go up or not. In this case, the market shares of competing brands could be considered as the influential variables.