Abstract

Since the breakdown of the Bretton Woods system in the early 1970s, the foreign exchange (FX) market has become an important focus of both academic and practical research. FX matters for many reasons, but one of the most important is the determination of foreign investment values; FX therefore serves as the backbone of international investment and global trading. Additionally, because fluctuations in FX rates affect the value of imported and exported goods and services, they have an important impact on the economic competitiveness of multinational corporations and countries. The volatility of FX rates is thus a major concern for scholars and practitioners, and forecasting FX volatility is a crucial financial problem that attracts significant attention because of its diverse implications. Recently, various deep learning models based on artificial neural networks (ANNs) have been widely employed in finance and economics, particularly for forecasting volatility. The main goal of this study was to predict FX volatility effectively using ANN models. To this end, we propose a hybrid model that combines the long short-term memory (LSTM) and autoencoder models, both of which are known to perform well in time-series prediction. We therefore expect our approach to be well suited to FX volatility prediction because it combines the merits of these two models. Methodologically, we employ the Foreign Exchange Volatility Index (FXVIX) as a measure of FX volatility. In particular, the three major FXVIX indices (EUVIX, BPVIX, and JYVIX) from 2010 to 2019 are considered, and we predict their future values using the proposed hybrid model, which utilizes an LSTM model as the encoder and decoder inside an autoencoder network. Additionally, we investigate the FXVIX indices through subperiod analysis to examine how the proposed model’s forecasting performance is influenced by data distributions and outliers. Based on the empirical results, we conclude that the proposed hybrid method, which we call the autoencoder-LSTM model, outperforms the traditional LSTM method. In addition, the ability to learn the magnitude of data spread and singularities determines the accuracy of predictions made using deep learning models. In summary, this study establishes that FX volatility can be accurately predicted using a combination of deep learning models. Our findings have important implications for practitioners: because forecasting volatility is an essential task for financial decision-making, this study will enable traders and policymakers to hedge or invest efficiently and make policy decisions based on volatility forecasting.

1. Introduction

Among various financial asset markets, the foreign exchange (FX) market has become increasingly volatile and fluid over the past decade. According to data released by the Bank for International Settlements (BIS) in April 2019, the global trading volume of FX markets was $6.6 trillion per day, a roughly 30% increase over April 2016 ($5.1 trillion). With the advent of globalization and increased demand for overseas investment, the number of FX transactions has grown rapidly, driven by investments in companies across various countries. Additionally, FX rates significantly affect the estimation of currency risks and profits for international trade, and governments and policymakers keep a close watch on FX fluctuations for risk management. Therefore, FX is considered the most important financial index for international monetary markets (Huang et al. [1]).

In addition to FX rates, FX volatility has also been a significant source of concern for practitioners. FX volatility is defined by fluctuations in FX rates, so it is also known as a measure of FX risk. Because FX risk is directly linked to transaction costs related to international trade, it is of great importance for multinational firms, financial institutions, and traders who wish to hedge currency risks. In this regard, FX volatility has affected the external sector competitiveness of international trade and the global economy.

In particular, financial asset price volatility is a crucial concern for scholars, investors, and policymakers. This is because volatility is important for derivative pricing, hedging, portfolio selection, and risk management (see Vasilellis and Meade [2], Knopf et al. [3], Brownlees and Gallo [4], Gallo and Otranto [5], and Bollerslev et al. [6]). Therefore, the forecasting and modeling of volatility have recently become the focus of many empirical studies and theoretical investigations in academia. Forecasting volatility accurately remains a crucial challenge for scholars.

Because many academics and practitioners are interested in volatility, many studies on volatility prediction have been reported, and many approaches have been utilized for forecasting. The generalized autoregressive conditional heteroscedasticity (GARCH) model proposed by Bollerslev [7], an extension of the ARCH model, is mainly used to predict volatility (Vee et al. [8], Dhamija and Bhalla [9], Bala and Asemota [10], Kambouroudis et al. [11], and Köchling et al. [12]). Various characteristics of volatility, such as leverage effects, volatility clustering, and persistence (Cont [13] and Cont [14]), are the main reasons for employing GARCH-based models. Based on the recent development of artificial neural network (ANN) models, the use of ANN methods for forecasting volatility has increased (Pradeepkumar and Ravi [15], Liu [16], Ramos-Pérez et al. [17], and Bucci [18]). Previous studies have employed various machine learning models, such as the random forest (RF) (Breiman [19]), support vector machine (SVM) (Cortes and Vapnik [20]), and long short-term memory (LSTM) (Hochreiter and Schmidhuber [21]). Several studies have shown that ANN methods outperform GARCH-based models for forecasting time series (see Pradeepkumar and Ravi [15], Liu [16], and Bucci [18]). Additionally, hybrid models based on ANNs and GARCH-type models have been introduced (Hajizadeh et al. [22], Kristjanpoller et al. [23], Kristjanpoller and Minutolo [24], Kim and Won [25], Baffour et al. [26], and Hu et al. [27]); such models are reported to have advantages over ANNs or GARCH-based models alone. Additional literature on this topic is covered in Section 2.

Based on the discussion above, we focus on forecasting FX volatility. As measures of FX volatility, we adopt three FX volatility indices (FXVIXs), namely, the FX euro volatility index (EUVIX), FX British pound volatility index (BPVIX), and FX yen volatility index (JYVIX), which are equally weighted indices of the Chicago Board Options Exchange's (CBOE's) 30-day implied volatility readings for the euro (EUR), pound sterling (GBP), and Japanese yen (JPY), respectively. Because EUR/USD, USD/JPY, and GBP/USD are the three most heavily traded currency pairs on the FX market, we selected the corresponding FXVIX indices. Additionally, these indices reflect global economic trends (see Ishfaq et al. [28], Dicle and Dicle [29], and Pilbeam [30]). As mentioned previously, forecasting volatility in the FX market is important for global firms, financial institutions, and traders who wish to hedge currency risks (see Guo et al. [31], Abdalla [32], and Menkhoff et al. [33]).

Practically, the FX market consists of three associated components: spot transactions, forward transactions, and derivative contracts (Baffour et al. [26]). Additionally, because an FX rate is by definition a relationship between two currencies, more observable factors affect its changes than those of other financial indices. Furthermore, according to Liu et al. [34], the periodic characteristics of the FX market are among the main reasons why changes in the FX market are difficult to predict. Therefore, we utilize ANN models as data-driven methods, rather than model-driven methods such as GARCH-type models, to forecast the three aforementioned FXVIXs. In particular, we employ the LSTM and autoencoder (Rumelhart et al. [35]) models as ANN techniques and propose a hybrid neural network model based on these two models. To combine an autoencoder with LSTM, we apply LSTM as the encoder and decoder for sequence data inside an autoencoder network. The proposed hybrid model can therefore leverage the advantages of both the autoencoder and LSTM. A detailed discussion of this topic is presented in Section 3.

Methodologically, we adopt a machine learning algorithm (LSTM) to implement an autoencoder-LSTM model for forecasting FXVIXs from 2010 to 2019. We optimize the adopted algorithms using a grid search procedure implemented with the Python stack. Testing is also performed using subperiod analysis to investigate whether data deviations and outliers affect model training. Such subperiod analysis has been commonly implemented in previous studies (Sharma et al. [36], García and Kristjanpoller [37], Ramos-Pérez et al. [17], and Choi and Hong [38]). Specifically, we split the entire sample period into three subperiods called Period 1 (January 2010 to December 2015), Period 2 (January 2016 to December 2016), and Period 3 (January 2017 to December 2019). Period 2 reflects uncertainty in the European market stemming from the Brexit movement. In this manner, we investigate prediction accuracy and model performance under different data conditions.

There are two major aspects of this study that differ from previous studies. First, we use FXVIXs, which play key roles in the FX market. Although previous empirical studies have predicted various types of financial asset price volatility with various models, and FX rates and return volatilities have been forecast with many approaches, research on forecasting FXVIXs themselves is scarce. Therefore, predicting the FXVIX is a worthwhile task. Second, we propose a hybrid model based on an autoencoder and LSTM to forecast the three FXVIXs. LSTM is known to be good at forecasting time series (Fischer and Krauss [39], Kumar et al. [40], and Muzaffar and Afshari [41]), and one of the advantages of an autoencoder is that it can automatically extract features from input data (Phaisangittisagul and Chongprachawat [42], Zhang et al. [43], and Zeng et al. [44]). The autoencoder technique has therefore been widely used to predict time-series data (Saha et al. [45], Lv et al. [46], Sagheer and Kotb [47], and Boquet et al. [48]). The proposed hybrid model has excellent potential as a novel method for forecasting the FXVIX and other time series.

The main contributions of this paper can be summarized as follows. First, we expand upon previous studies by forecasting the FXVIX using ANN models. Our experiments were motivated by the observation that previous studies on the FX market have mainly focused on the FX rate, volatility of returns, or historical volatility. In particular, FXVIXs represent future FX risk measures for market participants. Therefore, our findings have important implications for practitioners managing FX risk exposure. Second, we propose a hybrid ANN model based on an autoencoder and LSTM. Forecasting performance results demonstrate that the proposed hybrid model outperforms traditional LSTM models. Consequently, this study contributes to the literature on developing ANN models by introducing a novel hybrid model. Our third major contribution is the optimization of model forecasting performance through subperiod analysis. Based on the empirical results of subperiod analysis, we can conclude that a wide distribution of input data and an acceptable number of outliers improve forecasting performance.

The remainder of this paper is organized as follows. Section 2 presents a brief literature review on FX volatility and studies using machine learning in finance. Section 3 describes the data and methodology adopted in this study. Section 4 presents the results of empirical analysis for the full sample period and subperiod analysis. Finally, we provide concluding remarks in Section 5.

2. Literature Review

There is a vast body of literature on forecasting financial time series. In this section, we divide previous research into FX rate and FX volatility research according to the main focus of each paper. We also discuss the literature on time-series forecasting using ANNs.

First, because the FX rate directly affects the income of multinational firms, many studies have focused on forecasting FX rates, and many have used ANN models to predict future FX rates. For example, Liu et al. [34] predicted EUR/USD, GBP/USD, and JPY/USD rates using a model based on a convolutional neural network (CNN). They demonstrated that such a model is suitable for processing 2D structural exchange rate data. Fu et al. [49] developed evolutionary support vector regression (SVR) models to forecast four Renminbi (RMB, Chinese yuan) exchange rates (CNY against USD, EUR, JPY, and GBP). They also demonstrated that the proposed model outperforms the multilayer perceptron (MLP) neural network, Elman neural network, and SVR models in terms of level forecasting accuracy measures. Sun et al. [50] introduced a novel ensemble deep learning approach based on LSTM and a bagging ensemble learning strategy to predict four major currency pairs (EUR/USD, GBP/USD, JPY/USD, and USD/CNY). According to their empirical results, their proposed model provided significantly improved forecasting accuracy compared to a traditional LSTM model.

As discussed in the previous section, FX volatility is also important for many academics and practitioners, so many studies have focused on FX volatility forecasting. In general, GARCH-based models have been used in many studies to predict FX volatility. Additionally, some studies have predicted FX volatility by incorporating different methodologies into GARCH models to improve forecasting power. For example, the authors of Vilasuso [51] predicted various FX rate volatilities (Canadian dollar, French franc, German mark, Italian lira, Japanese yen, and British pound) using a fractionally integrated GARCH (FIGARCH) model (Baillie et al. [52]). The empirical results of their study demonstrated that the FIGARCH model is better at capturing the features of FX volatility compared to the original GARCH model. The authors of Rapach and Strauss [53] demonstrated that structural breaks in the unconditional variance of FX rate returns can improve the forecasting performance of GARCH(1,1) models for FX volatility by incorporating the daily returns of the US dollar against the currencies of Canada, Denmark, Germany, Japan, Norway, Switzerland, and the UK. Pilbeam and Langeland [54] investigated whether various GARCH-based models can effectively forecast the FX volatility of the four currency pairs of the euro, pound, Swiss franc, and yen against the US dollar. In particular, their empirical results demonstrated that GARCH models perform better in periods of low volatility compared to periods of high volatility. You and Liu [55] employed the GARCH-MIDAS approach (Engle et al. [56]) to forecast the short-run volatility of six FX rates based on monetary fundamentals. They demonstrated that the forecasting power of daily FX volatility is significantly improved by including monthly monetary fundamental volatilities.

Various machine learning models have also been used to forecast time series originating from various fields, including engineering and finance. In finance, many studies have used machine learning to predict future stock prices. For example, Trafalis and Ince [57] compared SVR with backpropagation and radial basis function networks for forecasting daily stock prices. Similarly, Henrique et al. [58] utilized SVR and a random walk (RW) method to predict daily stock prices in three different markets (Brazilian, American, and Chinese) and determined that SVR models may outperform RW models in terms of predictive performance. Recently, various studies using machine learning and deep learning methodologies have been reported. For example, Selvin et al. [59] employed deep learning models, namely, a recurrent neural network (RNN), LSTM, and CNN, to predict minute-wise stock prices and determined that the CNN algorithm provided the best performance. Chong et al. [60] employed an autoencoder to extract features from stock data and constructed a deep neural network (DNN) to predict future stock returns. They determined that it is possible to extract features from a large set of raw data without relying on prior knowledge regarding predictors, which is one of the main advantages of DNNs. Pradeepkumar and Ravi [15] proposed a particle swarm optimization-trained quantile RNN to forecast FX volatility; their model provides superior forecasting performance compared to the GARCH model. In [16] and [18], various ANN models were employed to predict the volatility of the S&P 500 stock index. According to the findings of these studies, ANN models are able to outperform traditional econometric methods, including GARCH and autoregressive moving average models. In particular, LSTM models seem to improve the accuracy of volatility forecasts. Additionally, Ramos-Pérez et al. [17] predicted S&P 500 index volatility using a stacked ANN model based on a set of machine learning techniques, including gradient boosting, RF, and SVM. They demonstrated that volatility forecasts can be improved by stacking machine learning algorithms and that, regardless of the volatility model adopted, high-volatility regimes lead to higher error rates.

Several studies have proposed hybrid models based on GARCH-based models and ANN models. For example, various GARCH-based models have been combined with ANNs based on MLPs and many hybrid models have been used to enhance the ability of GARCH models to forecast the volatility of stocks, gold, and FX rate returns (Hajizadeh et al. [22], Kristjanpoller et al. [23], Kristjanpoller and Minutolo [24], and Baffour et al. [26]). Additionally, some studies have proposed hybrids of LSTM and GARCH models and have used such models to predict the volatility of financial assets (Kim and Won [25] and Hu et al. [27]). According to empirical results, hybrid models based on GARCH and ANN techniques exhibit improved forecasting performance in terms of volatility accuracy.

In particular, we focus on studies using LSTM and autoencoder approaches for forecasting time series. LSTM, which was introduced by Hochreiter and Schmidhuber [21], has been widely used to forecast time series in many prediction studies. This method is mainly used to analyze time-series data because it can retain information about past data. Some studies have compared LSTM to traditional methods using neural networks or investigated such models by combining both types of methods. As discussed by Siami-Namini et al. [61] and Ohanyan [62], as computing power improves, implementing deep learning models becomes more practical, and their performance exceeds that of traditional models. Additionally, Deorukhkar et al. [63] demonstrated that neural network models combined with autoregressive integrated moving average or LSTM models provide greater accuracy than either type of model individually. In [64], preprocessing stock prices with a wavelet transform before feeding them to an LSTM model was shown to be superior to traditional methods.

The autoencoder presented in [35] aims to generate a representation as close to an original input as possible from reduced encoding results. Variants of the basic model, such as stacked, denoising, and sparse autoencoders, are used for financial time-series prediction. Bao et al. [65] used LSTM and stacked autoencoders to forecast stock prices and demonstrated that this type of hybrid model is more powerful than an RNN or LSTM model alone. In [66], a stacked denoising autoencoder combined with a gravitational search algorithm was effective at predicting the direction of stock index movement, which is affected by underlying assets. Additionally, Sun et al. [67] explained that a stacked denoising autoencoder trained on sets selected using a K-nearest neighbors approach can improve accuracy compared to traditional methods.

This study enhances the existing literature in two main aspects. We first propose a hybrid model that combines LSTM and an autoencoder to forecast FX volatility. There are other studies that have used hybrid models, but they have used models other than autoencoders and LSTM. Additionally, most studies have developed hybrid models based on GARCH models. However, as discussed above, LSTM and autoencoders perform well at time-series prediction, so we adopted these two types of models to forecast FX volatility. Second, as discussed in Section 1, FX volatility has great significance, but there is a significant lack of research on forecasting its changes. We contribute to the finance literature by forecasting FXVIXs using the proposed hybrid model.

3. Data Description and Methodologies

3.1. Data Description

The VIX was first introduced by the CBOE in 1993. This index is based on the real-time prices of options on the S&P 500 index. Because it is derived from the price inputs of S&P 500 index options, this index not only represents market expectations regarding 30-day forward-looking volatility but also provides a measure of market risk and investor sentiment. Subsequently, various VIXs with different underlying assets were developed.

In this study, we investigated whether machine learning methods are suitable for forecasting FX volatility time-series data. Our data samples come from the CBOE, which is one of the world's largest exchange holding companies and provides several derivatives related to implied volatility indices. We adopted three currency-related volatility indices, namely, the BPVIX, JYVIX, and EUVIX. Similar to the VIX, each FXVIX is calculated using a formula that averages the weighted prices of out-of-the-money puts and calls.

We collected 2520 daily FXVIX observations from January 2010 to December 2019. Owing to the instabilities caused by the Brexit movement in 2016, the data were divided into subsets covering 2010 to 2015, 2016, and 2017 to 2019. The first period represents the recovery following the subprime mortgage crisis and contains the most data (1514 daily observations). As shown in Figure 1, the variability over this period appears to be large. This observation is confirmed by Table 1: the standard deviations of the BPVIX, JYVIX, and EUVIX in this period are the largest among all periods, except for the BPVIX in 2016.

The second period represents the time around Brexit, which caused fluctuations in the global stock market, particularly in the European market. As shown in Figure 2, the UK index fluctuates the most, which affects the volatility of the European index. According to Table 1, BPVIX not only exhibits a high standard deviation but also has the largest difference between the maximum and minimum values.

The final period represents the time of uncertainty following the Brexit movement and recovery around the world. This period exhibits cyclic characteristics because the same problems arise repeatedly. Because it shows trends intermediate between those of the first and second periods, this period has no especially noteworthy features relative to the others. As shown in Figure 3, this period is longer than the second period, shorter than the first period, and less volatile than both, except for the JYVIX.

In this paper, for convenience, the three periods are referred to as Period 1, Period 2, and Period 3. Specifically, Period 1 ranges from 2010 to 2015, Period 2 covers 2016, and Period 3 ranges from 2017 to 2019. Similar subperiod analysis has been conducted in other studies (Gazioglu [68] and Grammatikos and Vermeulen [69]).
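As a purely illustrative sketch, this split can be expressed in pandas as follows, assuming a hypothetical DataFrame fxvix indexed by trading date with columns EUVIX, BPVIX, and JYVIX; the variable and column names are not from the original experiments.

import pandas as pd

def split_periods(fxvix: pd.DataFrame):
    # Subperiods follow the definitions above; boundaries are calendar-year based.
    period1 = fxvix.loc["2010-01-01":"2015-12-31"]  # post-crisis recovery
    period2 = fxvix.loc["2016-01-01":"2016-12-31"]  # Brexit-related turbulence
    period3 = fxvix.loc["2017-01-01":"2019-12-31"]  # post-Brexit uncertainty
    return period1, period2, period3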

In machine learning, performance evaluations are conducted when constructing a model. If a model trained on a particular training dataset is evaluated on the same set, performance will be inflated by overfitting. Therefore, an original dataset should be divided into training and testing data, and a model should be trained on the training data. When evaluating performance, testing data, which were not used for training, are fed into the trained model. There is no ideal data allocation ratio for training and testing. With more training data, a model can see more examples and find better solutions, but overfitting may occur. Conversely, more testing data can lead to better generalization, but underfitting may occur (Hastie et al. [70]).

According to Gu et al. [71], a simple data organization strategy generally allocates the larger share of the data for training and the remainder for testing; this strategy was applied to the development of the Cubist regression tree model. We organized our data in the same manner, using most of the data for training and the rest for testing to avoid overfitting. The resulting data divisions are summarized in Table 2.

Cross-validation techniques were also applied to prevent overfitting. However, when a cross-validation method that selects random samples (e.g., K-fold cross-validation) is applied to time-series data, past values are predicted using future values. Therefore, in this study, time-series nested cross-validation was adopted to maintain the temporal order of the dataset for gradual overlapping and learning. The proposed model was trained and tuned on training and validation sets in each fold and then evaluated on a testing set. This allowed errors to be averaged to obtain an unbiased error estimate (Varma and Simon [72]).
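A minimal sketch of this walk-forward scheme, using scikit-learn's TimeSeriesSplit with an expanding training window, is given below; X, y, and build_model are placeholders for the lagged features, targets, and model factory, not objects defined in the paper.

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def nested_cv_error(X: np.ndarray, y: np.ndarray, build_model, n_splits: int = 5) -> float:
    errors = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = build_model()                  # fresh model for each fold
        model.fit(X[train_idx], y[train_idx])  # train only on past observations
        pred = model.predict(X[test_idx])      # evaluate on the subsequent fold
        errors.append(np.mean((pred - y[test_idx]) ** 2))
    return float(np.mean(errors))              # averaged error estimate across folds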

3.2. LSTM

An RNN is a representative neural network with a recurrent hidden layer. Through this hidden layer, updates are backpropagated to train the model. By taking the outputs of previous hidden nodes as input data, the model can learn sequential patterns. Therefore, RNNs are often used to analyze or predict sequential time-series data, such as stock prices.

LSTM, which is a specific case of an RNN, was proposed by Hochreiter and Schmidhuber [21]. This model is designed to overcome the vanishing gradient problem of RNNs, where early layers are not trained properly when a network becomes deeper. Figure 4 presents the flow of RNN and LSTM progression. In contrast to the hidden units of the RNN, the LSTM structure consists of memory blocks. There are three gates in the LSTM model: the forget gate, which is the main advantage of LSTM, the input gate, and the output gate. First, the forget gate $f_t$ applies the sigmoid function, an activation function that converts the current input $x_t$ and the previous hidden state $h_{t-1}$ into numbers ranging from zero to one. Specifically, an output close to zero means that information cannot be passed to the next cell, whereas an output close to one means that information is passed to the next cell.

Second, the input gate $i_t$ is a sigmoid function that decides which information in $h_{t-1}$ and $x_t$ is stored in the cell state $C_t$. At this step, there is also a $\tanh$ layer that creates a vector of new candidate values ($\tilde{C}_t$) that could be added to the cell state $C_t$. The cell state is updated by combining the outputs from the forget gate $f_t$ and input gate $i_t$. Multiplying $f_t$ by $C_{t-1}$ determines how much information from the previous time step's cell is retained, and $i_t$ times $\tilde{C}_t$ represents the update information from the input gate.

Finally, the output gate $o_t$ applies the sigmoid function to the previous hidden state $h_{t-1}$ and current input $x_t$ to decide what the next hidden state should be. In addition, the current cell state $C_t$ is passed through a $\tanh$ function, and we multiply this $\tanh$ output with the sigmoid output to decide what information the hidden state $h_t$ should carry. In summary, the LSTM transition equations are defined as follows:

Gates:
$$ i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \quad f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \quad o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o). $$

Input transformation:
$$ \tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c). $$

Memory update:
$$ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \qquad h_t = o_t \odot \tanh(C_t), $$

where $W$, $U$, and $b$ are the weights and biases, respectively, and $\odot$ denotes elementwise multiplication.

In Figure 5, the input values travel through these three gates to overcome long-term dependencies using the following activation functions: $\sigma$ (sigmoid) and $\tanh$. The sigmoid function outputs a number between zero and one, which is a measure of how much information each component should convey. The $\tanh$ function helps preserve the gradient for as long as possible to prevent vanishing gradient problems.
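For concreteness, the following is a minimal Keras sketch of a plain LSTM forecaster of the kind used as the baseline in this study; the window length, layer width, and loss function are illustrative assumptions rather than values taken from the paper, and the sigmoid and tanh gate operations described above are internal to the LSTM layer.

import tensorflow as tf

def build_lstm(window: int = 20, n_features: int = 1, units: int = 64) -> tf.keras.Model:
    # Maps a window of past FXVIX values to a one-step-ahead forecast.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, n_features)),  # sequence of past index values
        tf.keras.layers.LSTM(units),                 # gated recurrent memory (sigmoid/tanh gates inside)
        tf.keras.layers.Dense(1),                    # next-day volatility estimate
    ])
    model.compile(optimizer="adam", loss="mse")
    return model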

3.3. Autoencoder
3.3.1. Basic Autoencoder

The autoencoder, which was first introduced in [35], utilizes a neural network consisting of an input layer, output layer, and hidden layers for self-supervised learning. Although this structure is similar to that of a typical neural network, the output and input layers have vectors of the same dimension. The goal of this model is to derive a representation for an input dataset (e.g., dimensionality reduction) and make the reconstructed data as close as possible to the input data. As shown in Figure 6, the encoder represents a stage at which the model can learn important characteristics of inputs, and the decoder forms outputs similar to the inputs. The output represents a state in which the noise of the inputs has been removed, resulting in more distinct characteristics. Based on these features, autoencoders are mainly used for image restoration or noise reduction. Formally,
$$ Y = f(WX + b), \qquad \hat{X} = g(W'Y + b'), $$
where $W$ is the weight between the input $X$ and hidden representation $Y$, $W'$ is the weight between the hidden representation $Y$ and output $\hat{X}$, and $b$ and $b'$ are the biases. The functions $f$ and $g$ represent the encoder and decoder, respectively: $f$ accepts and compresses the input data ($X$) into a latent space ($Y$), and $g$ is responsible for accepting the latent representation ($Y$) and reconstructing the original input ($\hat{X}$).
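As an illustration, a minimal dense autoencoder corresponding to the mapping above can be written in Keras as follows; the input width and latent dimension are illustrative assumptions, not values from the paper.

import tensorflow as tf

def build_autoencoder(input_dim: int = 20, latent_dim: int = 4) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(input_dim,))
    latent = tf.keras.layers.Dense(latent_dim, activation="relu")(inputs)    # encoder f: X -> Y
    outputs = tf.keras.layers.Dense(input_dim, activation="linear")(latent)  # decoder g: Y -> X_hat
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")  # trained to reconstruct its own input
    return model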

This type of model is extended in several ways to improve performance by manipulating the hidden layers. A stacked autoencoder mitigates the vanishing gradient problem by stacking hidden layers when a neural network is deep; Figure 7(a) presents a simple example of a stacked autoencoder, whose structure increases the number of hidden nodes by stacking autoencoders hierarchically. A denoising autoencoder aims to extract stable, structured representations by adding noise to the input data and training the network so that the output corresponds to the clean input values. As shown in Figure 7(b), this model has a structure similar to that of a typical autoencoder, but it takes noise-corrupted data as its new input.

3.3.2. Autoencoder-LSTM

The autoencoder-LSTM model, which combines an autoencoder and advanced RNN, is implemented with an LSTM encoder and decoder for sequence data. This model has the same basic frame as an autoencoder, but is composed of LSTM layers, as shown in Figure 8(a). This model can learn complex and dynamic input sequence data from adjacent periods by using memory cells to remember long input sequence data.

The encoder and decoder components consist of two LSTM layers. To implement this structure, we adopted the “RepeatVector” tool provided by Keras, which is a deep learning API. Figure 8(b) presents the resulting structure.
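The following sketch shows how such an architecture can be assembled in Keras with the RepeatVector layer mentioned above: two LSTM encoding layers compress the input window into a latent vector, RepeatVector copies that vector for each time step, and two LSTM decoding layers reconstruct a sequence. The layer widths and window length are illustrative assumptions, not the exact configuration used in the experiments.

import tensorflow as tf

def build_autoencoder_lstm(window: int = 20, n_features: int = 1) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, n_features)),
        tf.keras.layers.LSTM(64, return_sequences=True),  # encoder layer 1
        tf.keras.layers.LSTM(32),                          # encoder layer 2 -> latent vector
        tf.keras.layers.RepeatVector(window),               # repeat the latent vector per time step
        tf.keras.layers.LSTM(32, return_sequences=True),   # decoder layer 1
        tf.keras.layers.LSTM(64, return_sequences=True),   # decoder layer 2
        tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_features)),  # sequence output
    ])
    model.compile(optimizer="adam", loss="mse")
    return model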

3.4. Hyperparameter Optimization

A hyperparameter is a parameter that is set before training and has a significant impact on the learning process. Maximizing model performance by finding the hyperparameter values that minimize a loss function is called hyperparameter optimization. This procedure is widely used in machine learning and deep learning. In this study, the well-known grid search method was adopted.

A grid search finds the best parameters among a parameter set defined by a user by applying the parameter candidates to the model sequentially and identifying the cases with the best performance. If there are few parameter candidates, optimal values can be obtained rapidly. However, as the number of candidates grows, the number of combinations, and hence the optimization time, grows rapidly.

In this study, we adopted the grid search algorithm because it is the simplest and most widely used algorithm for obtaining optimal hyperparameters (Schilling et al. [73]). Although a random search can perform much better than a grid search on high-dimensional problems according to Hutter et al. [74], our data represent a simple time series and the candidate parameter set is limited, which is why we adopted the grid search algorithm (Sun et al. [75] and Thornton et al. [76]). The Python stack was used for our experiments, and we implemented the machine learning algorithms and grid search using the Scikit-Learn, Keras, and TensorFlow packages.

We used a grid search to identify and apply optimal parameters for each section of our model. The optimized parameters are the batch size, activation function, and optimizer function. Two or three candidate groups were defined for each parameter.

More parameters and candidate groups could be defined, but it would increase training time significantly. We divided the data into three intervals and attempted to compare two models, thereby limiting the candidate groups to make the most of our limited resources.

Next, we optimized three hyperparameters related to stochastic gradient-based training. The candidate batch sizes were 50 and 100, the activation functions were linear and ReLU, and the optimizers were Adam, rmsprop, and nadam. The learning rates were the default values built into each optimizer (rmsprop: 0.001, Adam: 0.001, and nadam: 0.002).
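A hedged sketch of this search is shown below, looping over the candidate grid with scikit-learn's ParameterGrid rather than the exact tooling used in the original experiments; build_model, the epoch count, and the validation split are placeholders.

from sklearn.model_selection import ParameterGrid

param_grid = {
    "batch_size": [50, 100],
    "activation": ["linear", "relu"],
    "optimizer": ["adam", "rmsprop", "nadam"],  # default learning rates for each optimizer
}

def grid_search(build_model, X_train, y_train, X_val, y_val):
    # build_model is a placeholder factory returning a compiled Keras model
    # for a given activation/optimizer pair.
    best_params, best_loss = None, float("inf")
    for params in ParameterGrid(param_grid):  # 2 x 2 x 3 = 12 combinations
        model = build_model(activation=params["activation"], optimizer=params["optimizer"])
        model.fit(X_train, y_train, batch_size=params["batch_size"], epochs=50, verbose=0)
        loss = model.evaluate(X_val, y_val, verbose=0)  # validation loss (MSE)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss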

Finally, the autoencoder and autoencoder-LSTM models were unified into four layers: two encoding layers and two decoding layers. Based on the small amount of testing data, this small depth was determined to be sufficient.

4. Empirical Results

We used the aforementioned grid search to find optimal parameter combinations. For each of the two models (LSTM and autoencoder-LSTM) and each of the three periods, the best among the 12 parameter combinations was identified in the same manner, resulting in six separate optimizations. The results obtained via hyperparameter optimization are listed in Table 3.

The goal of this study was to obtain an accurate model for forecasting FXVIXs. We considered three FXVIXs whose distributions and outliers differ, and we compare the forecasting performances of our models in terms of these distributions and outliers. To this end, the forecasting results are split by period and separated by index. As error measures, the regression error metrics of mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) were adopted. Additionally, distributions were characterized by variances and standard deviations. Outlier detection was performed using Tukey's box plot method, which defines outliers as samples that fall outside the range
$$ [\,Q_1 - 1.5\,\mathrm{IQR},\; Q_3 + 1.5\,\mathrm{IQR}\,], $$
where $Q_1$ and $Q_3$ are the first and third quartiles and $\mathrm{IQR}$ is the interquartile range defined as $\mathrm{IQR} = Q_3 - Q_1$. To identify extreme outliers, the multiplier 1.5 is replaced with 3.
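For reference, these error metrics and the Tukey rule can be computed as in the following sketch, assuming one-dimensional NumPy arrays of actual and predicted index values.

import numpy as np

def error_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": np.mean(np.abs(err / y_true)) * 100,  # percentage error
    }

def tukey_outliers(x: np.ndarray, k: float = 1.5) -> np.ndarray:
    # k = 1.5 gives ordinary outliers; k = 3 flags extreme outliers.
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1  # interquartile range
    return x[(x < q1 - k * iqr) | (x > q3 + k * iqr)]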

Our main findings can be summarized as follows. First, the opportunity to learn volatility and forecasting accuracy are proportionally related; in other words, when the training data contain many rising and falling segments, learning these trends improves prediction accuracy. As shown in Figure 9, the spread of the distribution and the number of outliers are largest in Period 1 (Figure 9(a)), followed by Period 3 (Figure 9(c)) and then Period 2 (Figure 9(b)). Second, as shown in Tables 4–6 and Figures 10–15, the autoencoder-LSTM is affected more by variance and outliers than the LSTM alone. When variance and outliers are present in moderation, the LSTM model with an autoencoder, which can accurately extract the features of the inputs, performs better than the model without an autoencoder. Third, among the deep learning methods considered, the autoencoder-LSTM exhibits the best prediction performance. The results in Tables 4–6 confirm that the autoencoder-LSTM, which exploits the characteristics of the input data, outperforms the general LSTM, and Figures 10–15 present this trend graphically.

5. Summary and Concluding Remarks

The goal of this study was to develop a hybrid model based on deep learning models for forecasting FX volatility. In particular, we utilized the three FXVIXs as measures of FX volatility. An FXVIX represents the relationship between the currency of a country and the US dollar. Therefore, this study is meaningful because the FXVIX, which is related to the US and the global economy, sensitively reflects international economic trends.

Data-driven methods are more powerful than model-driven methods for forecasting asset price time-series data (see Kim et al. [77]). In this study, we investigated how event-driven data, which focus on events such as outliers in data-driven analysis, contribute to model performance. According to Shahid et al. [78], events and outliers are different, but outliers can be considered as a type of event. Because there is only one type of outlier in the data considered in this study, comparing differences in model performance accordingly is meaningful.

Our empirical results provide several interesting conclusions with useful practical implications. Our main findings can be summarized as follows. First, the spread of the data and the presence of outliers increase the forecasting accuracy of the proposed model. Second, improvements in prediction accuracy are more pronounced with the autoencoder-LSTM than with the LSTM. Finally, for predicting FXVIXs, the autoencoder-LSTM model is superior to the LSTM.

Based on the empirical findings in Section 4, some implications can be observed. First, because a neural network is a model created by mimicking the human brain, the data it learns from are important. As shown in this study, the forecasting accuracy of the hybrid model is affected by how much variability and how many outliers it can learn from; however, the extreme outliers in Period 2 degraded the model's performance. Next, the use of an autoencoder, which can transform important properties of input data in a manner similar to principal component analysis, is meaningful. Autoencoders are used for denoising images, watermark removal, dimensionality reduction, and feature variation, among other tasks, and in this study we built on the concept of feature variation. Additionally, several studies using autoencoders to predict time series have been published recently (Gensler et al. [79], Bao et al. [65], and Sagheer and Kotb [47]). Our study contributes to the literature by introducing a new approach, the autoencoder-LSTM, for forecasting time series.

In practice, our findings can be helpful to researchers in economic research laboratories or policy managers who determine national economic policies, because FXVIXs reveal important FX trends that affect the global economy and, as volatility measures, can reveal the psychology of market participants. For example, Menkhoff et al. [33] demonstrated that global exchange rate volatility has a significant effect on trading strategies based on financial data, and Guo et al. [31] confirmed the effects of exchange rate volatility on the stock market. Similar experiments can be considered for future research on different financial indices, such as the S&P 500 and Dow Jones Industrial Average, which are important for understanding US and global markets (Ivanov et al. [80] and Liu et al. [81]). The US has the world's largest financial market and plays an important role in determining the trends of the international financial market. Therefore, we expect that predicting these indices will be as meaningful as predicting FXVIXs. Additionally, we expect that prediction accuracy can be improved by learning and incorporating data that can affect each index, whereas in this study we considered only FXVIXs. Finally, hyperparameter optimization was performed using only a grid search, which is a commonly used machine learning technique, and the reliability of prediction could be increased by considering additional optimization algorithms.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors have declared that there are no conflicts of interest.

Acknowledgments

The work of S. Y. Choi was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (no. 2019R1G1A1010278).