Abstract

Financial text-based risk prediction is an important part of financial analysis. Through automatic analysis of public financial comments, current financial expectations can be evaluated. A deep learning method for financial risk prediction based on sentiment classification is proposed in this paper. The proposed method consists of two steps. Firstly, an abstract of the financial message is extracted with the seq2seq model, which can cope with input messages of different lengths. The abstraction filters out invalid information in the financial messages and thus accelerates the subsequent sentiment classification step. Secondly, sentiment classification is performed by the GRU model on the abstracted texts. The proposed method has the following advantages: (1) it can handle financial messages of different lengths; (2) it can filter out the invalid information of financial messages; (3) because the extracted abstract is more refined, it can speed up the subsequent sentiment classification step; and (4) it has better sentiment classification accuracy. The proposed method is verified on a financial message dataset from the financial social network StockTwits. Compared with the classical SVM and LSTM methods, the proposed method improves the accuracy of sentiment classification by 5.57% and 2.58%, respectively.

1. Introduction

Despite new tools and algorithms, financial market analysis is still a complex subject. Its main purpose is to assist investors in making decisions by analyzing the price fluctuations of financial products. Financial risk analysis is a subtopic of financial market analysis that aims to predict the future price trends of financial products, and it is an active research topic. Traditional financial analysis methods can be divided into two categories: fundamental analysis [1–3] and technical analysis [4–6]. Fundamental analysis predicts the future by studying the basic attributes of the company and is suitable for relatively long-term forecasts. Technical analysis does not explicitly consider the company's internal and external characteristics but predicts the future directly from price fluctuations. Technical analysts believe that price fluctuations already reflect all fundamental factors. Technical analysis models price fluctuations as time series and turns prediction into a pattern recognition problem. With the in-depth application of machine learning in financial analysis in recent years, another approach has become more and more popular: by mining important features from financial messages, such as financial news [7, 8], financial comments [9, 10], and social networks [11, 12], the sentiment tendency of a large number of users or authors can be evaluated and then used to predict financial prices. This approach incorporates the factors considered in fundamental analysis and also has the automation advantages of technical analysis, which has made it a research hotspot in recent years.

With the accumulation of data samples in the financial text field, financial message sentiment analysis based on deep learning becomes possible. Through deep neural networks, the public's sentiment tendency toward individual stocks or the overall economic situation can be obtained. The sentiment tendency usually falls into two categories: bullish and bearish. The authors in [13] studied the correlation between financial sentiment tendency and actual financial conditions. Using the Pearson correlation test, they concluded that the public's financial sentiment tendencies and the actual financial situation (including market closing price, trading volume, and so on) are strongly correlated, so sentiment tendency can be adopted for financial risk prediction.

The methods of financial message sentiment analysis can generally be divided into two categories: those based on traditional machine learning and those based on deep learning. Among the traditional machine learning methods, the authors in [14, 15] proposed to classify financial messages with the support vector machine (SVM), representing each message by the bag-of-words (BoW) approach. Traditional classifiers also include multinomial Naïve Bayes (mNB), random forest (RF), and so on [16–19]. Traditional machine learning-based methods only extract shallow features from the messages, so their classification performance is often worse than that of deep learning-based methods. Deep learning-based sentiment classification methods can use a large number of labeled samples to automatically extract more abstract and deeper-level features from the financial messages, thus yielding more accurate sentiment estimation.

Financial message-based sentiment estimation methods adopting deep learning can be divided into three categories according to the neural network used. The first type of method adopts convolutional neural networks (CNNs) [20]. In the CNN, deep-level feature information can be extracted through the convolutional layers, which makes it effective for a wide range of image processing applications. However, as the CNN-based method cannot use the correlation or the context information in the messages, its accuracy for sentiment classification is limited. The second type of method is based on the recurrent neural network (RNN), which can effectively make use of the context information for financial sentiment classification. The RNN-based methods usually have better classification performance than the CNN-based ones. Hiew et al. [21] adopted the long short-term memory (LSTM) network to classify financial messages and obtained a better classification performance. The publication in [22] compared the classification results of various RNNs in detail, including the simple RNN, LSTM, stacked LSTM, gated recurrent unit (GRU), stacked GRU, bi-directional LSTM, bi-directional GRU, and so on. The comparison concluded that LSTM and its variations have better classification performance. The publication also showed that, for these classifiers, different optimizers, such as rmsprop and Adam, give similar accuracy.

For the third type of method, financial message sentiment analysis is carried out through transfer learning. By using a large number of messages with sentiment tags from other text domains together with a small number of messages from the financial domain, transfer learning can be adopted to solve the problem of insufficient training samples. In [23], transfer learning was carried out by fine-tuning. The convolutional neural network is pretrained on source domain samples, and then the financial domain messages are adopted to fine-tune the parameters. In actual implementation, all parameters are trained during the pretraining process. For fine-tuning, only the high-level parameters are variable, while the low-level parameters are frozen. This lets the network adapt quickly to the samples in the financial message domain. The authors in [24] proposed to perform transfer learning through the stacked denoising autoencoder (SDA). Firstly, the shared feature space of the source message domains and the financial message domain is extracted through the SDA. Then, the shared feature vector is used to perform the financial message sentiment classification through other classifiers. In [25], transfer learning was carried out through adversarial learning. Through a gradient reversal module, the sentiment classifier and the data domain classifier are trained adversarially, so as to obtain shared feature representations of the different data domains.

A new method for financial sentiment classification is proposed in this paper. The method is deep learning based and has two steps. Firstly, an abstract for the financial messages can be extracted according to the seq2seq model. During the extraction process, the situation of text input with different lengths can be dealt with. After the extraction, invalid information in financial messages can be effectively filtered out, which is favorable for the subsequent classification. The sentiment classification step is then performed through the GRU model according to the extracted abstracts. The proposed method has the advantage of coping with the situation of different message lengths. The extracted abstract can be regarded as a refinement of the raw message, which can filter out the invalid information of financial messages. As a result, it can speed up the subsequent sentiment classification process and can bring better sentiment classification accuracy. The proposed method in this paper is verified through financial message dataset from the financial social network StockTwits. By comparing the classification performances, the results show that compared with the classical SVM and LSTM methods, the proposed method in this paper can improve the accuracy of sentiment classification by 5.57% and 2.58%, respectively.

2. Methods

The overall structure of the proposed method in this paper is shown in Figure 1. It can be seen that the proposed method mainly consists of two steps. (1) Firstly, the abstract of the input financial message is extracted through the seq2seq model, noting that the extracted abstract has a fixed length. (2) The abstract is then adopted for sentiment classification through the GRU model. The results for the classification are twofold: bullish and bearish. The two-step method in this paper has the following advantages: (1) the seq2seq model can effectively be adopted to extract valid information from the financial messages, and the redundant information irrelevant to classification can be filtered out; and (2) the extracted abstract has a limited length and thus can accelerate the subsequent recognition. According to the above two steps, the proposed method will be discussed as follows.

2.1. Seq2seq Model-Based Message Abstraction

The seq2seq model has a relatively mature application in the translation task in the field of natural language processing (NLP). Since there exists similarity between the translation task and the abstract extraction task, the seq2seq model can also be adopted for abstract extraction. In this section, the characteristics of the seq2seq model are introduced firstly, and then the application of the seq2seq model in abstract extraction is introduced in detail.

The seq2seq model was proposed in [26], which can cope with the situation of different input and output sequence lengths. Therefore, it is suitable for translation tasks in NLP. The basic structure of the seq2seq model is shown in Figure 2, which has an encoder-decoder structure. In this structure, the input of one language is encoded into a vector through the RNN, and then the vector is used as the input of the decoding RNN. Then, the output obtained is another language after translation. The biggest advantage of this model is that it can cope with the input of varying lengths and can learn the mapping relationship between two different domains. Therefore, it is also suitable for abstract extraction herein, that is, from the original financial message domain to the financial abstract domain.

In our implementation, the adoption of the seq2seq model for financial abstract extraction has the following key points:

(1) Word embedding: due to the huge number of words in the financial message domain, direct one-hot encoding may result in large representation dimensions and a huge number of training parameters, which makes it unsuitable for practical applications. Therefore, it is necessary to introduce word embedding to reduce the dimensionality of the word representation vector. Traditional word embedding methods include the word2vec method, which learns a vector for each word from the words surrounding it in the corpus according to a certain model, such as continuous bag of words (CBOW). With the CBOW model, the obtained word vector has a lower dimensionality and can better reflect the similarity of words in the word vector space [27, 28]. Figure 3 shows a schematic diagram of the training samples in the CBOW model. It can be seen that the training samples contain the relevant information of the context, so that the word embedding after training can reflect this context information.

(2) Encoder-decoder structure: the basic encoder-decoder model is shown in Figure 2. The output of the encoder is a context vector, which has a low dimension and summarizes the input sequence (in this paper, the input is the financial message sequence). The decoder then outputs another sequence according to this vector. This sequence makes full use of the context vector information and is therefore correlated with the original sequence. Generally speaking, the encoder and decoder have a similar structure and can both adopt long short-term memory (LSTM) units to form the RNN. This makes the RNN capable of memory learning over its sequence input and output.

(3) Attention mechanism: the seq2seq model based on the attention mechanism adopted herein is shown in Figure 4 and has been described in detail in [29]. The attention mechanism can solve the problem of information loss in the traditional seq2seq model: when the input sequence is long while the context vector dimension is limited, the context vector from the encoder loses more information, resulting in inaccurate output. In Figure 4, $x_i$ denotes the word embedding representation of the input financial message sequence, which is pretrained according to the aforementioned CBOW model, $y_i$ represents the output abstract word embedding representation, and $c_i$ denotes the context vector, which can be considered as the encoded information of the input sequence. The subscript $i$ denotes the position index. The input length is variable, while the length of the abstract is fixed. In the model, both the encoder and the decoder contain only one LSTM layer. The key to the seq2seq model based on the attention mechanism is to calculate the new context vector according to the following formula:

$$c_i = \sum_{j} \alpha_{ij} h_j,$$

where $h_j$ denotes the hidden variable corresponding to each output of the encoder, the subscript $j$ denotes the encoder output index, and $\alpha_{ij}$ denotes the attention weight, which is determined by $e_{ij}$:

$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})},$$

where $e_{ij}$ is jointly determined by the hidden variables of the encoder and decoder at different times, which can be expressed as

$$e_{ij} = \mathrm{score}(s_{i-1}, h_j),$$

where $s_{i-1}$ is the hidden variable of the decoder at the previous step and $\mathrm{score}$ is a scoring function; different strategies correspond to different scoring functions.

(4) Prediction mechanism: after the seq2seq model is trained, the abstract has to be predicted from the input. In the seq2seq model, the output of the previous state affects the subsequent output. Therefore, if the greedy algorithm is always adopted to select the word with the highest output probability at each step, an early error can degrade the overall quality of the abstract. The beam search algorithm [30] can be used instead: it keeps the several partial sequences with the largest cumulative output probability, which prevents an accidental output error of the decoder from affecting the overall sequence output.
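The attention weighting in point (3) and the decoding strategy in point (4) can be illustrated with a minimal sketch. It is not the implementation used in this paper: the dot-product scoring function, the toy next-word probabilities, and the names `attention_context`, `beam_search`, and `toy_model` are assumptions chosen only for illustration.

```python
import math


def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Assumption: the scoring function is a plain dot product; the paper
    leaves the choice of scoring function open.
    """
    # Score each encoder hidden vector against the decoder state.
    scores = [sum(s * h for s, h in zip(decoder_state, h_j))
              for h_j in encoder_states]
    # Softmax over the scores (shifted by the maximum for stability).
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # Context vector: attention-weighted sum of the encoder hidden vectors.
    dim = len(encoder_states[0])
    context = [sum(w * h_j[d] for w, h_j in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context


def beam_search(step_log_probs, beam_width, length):
    """Keep the `beam_width` best prefixes at every step.

    `step_log_probs(prefix)` must return a dict mapping each candidate
    next token to its log probability given the prefix.
    """
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for token, logp in step_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


def toy_model(prefix):
    """Hypothetical next-word probabilities, invented for this example."""
    table = {
        (): {"stock": 0.6, "shares": 0.4},
        ("stock",): {"up": 0.55, "down": 0.45},
        ("shares",): {"rally": 0.9, "drop": 0.1},
    }
    return {tok: math.log(p) for tok, p in table[prefix].items()}


weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
greedy_sequence, _ = beam_search(toy_model, beam_width=1, length=2)
beam_sequence, _ = beam_search(toy_model, beam_width=2, length=2)
print(weights, context)
print(greedy_sequence, beam_sequence)
```

With a beam width of 1 the search reduces to the greedy algorithm and commits to the locally most probable first word. With a beam width of 2 it also keeps the second candidate, whose continuation has a higher overall probability in this toy table (0.4 × 0.9 = 0.36 against 0.6 × 0.55 = 0.33).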

2.2. GRU-Based Sentiment Classification

The GRU-based model is essentially an instance of the RNN model. The RNN model captures the correlations within a sequence input by unrolling the calculation graph in the time domain. In the unrolled graph, the inputs at different times are processed by the same RNN calculation unit with shared weights, so the network is capable of context learning. However, due to the weight sharing and the unrolling in time, the RNN tends to suffer from serious vanishing and exploding gradient problems. In order to solve this problem and to make the RNN more practical, the gated recurrent neural network (gated RNN) was proposed in [31]. In the gated RNN, the calculation through time is redesigned so that the network can not only accumulate the information of previous times but also gradually forget the less important information, thereby reducing redundancy. The most commonly adopted gating unit is the LSTM gating unit. As shown in Figure 5, each LSTM gating unit contains three gates: input gate, forget gate, and output gate. The input gate processes the input data as follows:

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i), \qquad \tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C),$$

where $h_{t-1}$ represents the output of the gating unit from the previous moment $t-1$, which can also be regarded as the context vector, $x_t$ represents the input at the current moment $t$, $\sigma$ denotes the sigmoid activation function, and $W$ and $b$ denote the parameters of the cell. The output gate separately processes the output of the current unit and the context vector passed from the current unit to the next unit:

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(C_t),$$

where $\odot$ denotes element-wise multiplication and $C_t$ represents the state obtained from the forget gate. The processing of the forget gate can be expressed by the following formulas:

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f), \qquad C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t.$$

The LSTM gating unit is widely adopted in RNNs because it can effectively avoid the problem of gradient disappearance or explosion. However, since one LSTM includes the above three different gate operations, the training cost is still relatively high.
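The three gate operations of the LSTM unit can be sketched as a single time step in plain Python. This is a minimal sketch of the textbook LSTM formulation, not the code used in this paper. The parameter names (`W_f`, `b_f`, and so on), the helper `affine`, and the toy dimensions are assumptions for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    z = h_prev + x_t  # concatenation of previous output and current input
    # Forget gate: how much of the previous cell state is kept.
    f = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]
    # Input gate and candidate state: how much new information is written.
    i = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]
    c_hat = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]
    # Output gate: how much of the cell state is exposed as output.
    o = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]
    c_t = [f_k * c_k + i_k * g_k
           for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_hat)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


# Toy sizes (assumed): 2 hidden units, 3 input features, constant weights.
hidden, features = 2, 3
demo_params = {name: [[0.1] * (hidden + features) for _ in range(hidden)]
               for name in ("W_f", "W_i", "W_c", "W_o")}
demo_params.update({name: [0.0] * hidden
                    for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [0.0, 0.0], demo_params)
print(h_t, c_t)
```

Each step evaluates four affine maps of size `hidden × (hidden + features)`, which is the source of the relatively high training cost mentioned above.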

In this paper, the gated recurrent unit (GRU) [32] is adopted to replace the traditional LSTM unit. The GRU is a simplification of the LSTM. It contains only two gates: the reset gate and the update gate. Its block diagram is shown in Figure 6. The update gate of the GRU combines the functions of the input gate and the forget gate in the LSTM unit.

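The reset gate:

$$r_t = \sigma(W_r [h_{t-1}, x_t] + b_r)$$

The update gate:

$$z_t = \sigma(W_z [h_{t-1}, x_t] + b_z)$$

The output of the GRU is

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

Among them, $\tilde{h}_t$ can be denoted as

$$\tilde{h}_t = \tanh(W_h [r_t \odot h_{t-1}, x_t] + b_h)$$

As for the LSTM unit, $h_{t-1}$ is the output at the previous moment, $x_t$ is the input at the current moment, $\sigma$ is the sigmoid activation function, $\odot$ denotes element-wise multiplication, and $W$ and $b$ are the parameters of the cell.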
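The reset gate, update gate, and output of the GRU can be sketched as one time step in plain Python. This is a minimal sketch of a common textbook GRU formulation, not the code used in this paper. The parameter names (`W_r`, `W_z`, `W_h` and the matching biases), the helper `affine`, and the toy dimensions are assumptions for illustration. The interpolation convention, in which the update gate weights the candidate state, is also an assumption, since some references swap the roles of the update gate and its complement.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def gru_step(x_t, h_prev, params):
    """One GRU time step; returns the new output h_t."""
    z_in = h_prev + x_t  # concatenation of previous output and current input
    # Reset gate: how much of the previous output enters the candidate.
    r = [sigmoid(a) for a in affine(params["W_r"], params["b_r"], z_in)]
    # Update gate: how much of the candidate replaces the previous output.
    z = [sigmoid(a) for a in affine(params["W_z"], params["b_z"], z_in)]
    # Candidate state computed from the reset-scaled previous output.
    cand_in = [r_k * h_k for r_k, h_k in zip(r, h_prev)] + x_t
    h_hat = [math.tanh(a)
             for a in affine(params["W_h"], params["b_h"], cand_in)]
    # Interpolate between the previous output and the candidate.
    return [(1.0 - z_k) * h_k + z_k * g_k
            for z_k, h_k, g_k in zip(z, h_prev, h_hat)]


# Toy sizes (assumed): 2 hidden units, 3 input features, constant weights.
hidden, features = 2, 3
demo_params = {name: [[0.1] * (hidden + features) for _ in range(hidden)]
               for name in ("W_r", "W_z", "W_h")}
demo_params.update({name: [0.0] * hidden for name in ("b_r", "b_z", "b_h")})
h_t = gru_step([1.0, 2.0, 3.0], [0.5, -0.5], demo_params)
print(h_t)
```

Compared with the LSTM step, there is no separate cell state and only three affine maps are evaluated per time step instead of four.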

3. Results and Discussion

In order to verify the effectiveness of the proposed method, text messages from the financial social network StockTwits are adopted for verification. On this platform, users can post short messages and annotate them with one of two tags: bullish or bearish. In order to facilitate method comparisons, similar to the experiments in [22], messages from May to September 2019 covering 12 stocks are selected to form the dataset. The dataset contains a total of about 55,000 labeled samples, of which about 39,000 are labeled bullish and about 16,000 are labeled bearish. In our implementation, 80% of the samples are randomly selected as the training dataset, and the remaining 20% form the testing dataset.
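The random 80/20 split and the class balance of the dataset can be illustrated with a short sketch. The sample counts below are the rounded figures quoted above. The function name `split_dataset`, the fixed seed, and the use of label strings in place of real messages are assumptions for illustration; the paper does not state how the random split was implemented.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder labels standing in for the approximate class counts.
labels = ["bullish"] * 39000 + ["bearish"] * 16000
train, test = split_dataset(labels)

# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = labels.count("bullish") / len(labels)
print(len(train), len(test), round(majority_baseline, 4))
```

Because about 71% of the samples are bullish, a classifier that always answers "bullish" would already reach roughly that accuracy on a representative test set, which is a useful reference point when reading the accuracies reported below.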

3.1. The Effects of Different Parameters over Classification Accuracy

According to the proposed method, the abstract is firstly extracted through the seq2seq model, and then sentiment classification is performed according to the extracted abstract. In this process, different parameter choices produce different performances. In this section, the impact of three parameters on classification accuracy is studied experimentally: (1) the dimension of the word embedding vector, chosen as 16, 32, or 64, which are commonly used dimensions for word embedding; (2) the length of the extracted abstract, that is, the sequence length, which varies from 8 to 64 with a step of 8; and (3) the number of hidden units in the GRU cell. The following describes the experimental results for the three parameters.
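The parameter study can be written down as a small configuration sketch. The candidate values are those reported in this section (the hidden unit counts 16, 32, 64, and 128 are the ones compared in Table 2). Varying one parameter at a time around a fixed default configuration is an assumption, as is the choice of defaults (here the finally selected values 32, 40, and 64); the names `grid`, `defaults`, and `one_at_a_time` are hypothetical.

```python
grid = {
    "embedding_dim": [16, 32, 64],
    "abstract_length": list(range(8, 65, 8)),  # 8, 16, ..., 64
    "gru_hidden_units": [16, 32, 64, 128],
}

# Assumed defaults: the values finally selected in this section.
defaults = {"embedding_dim": 32, "abstract_length": 40, "gru_hidden_units": 64}


def one_at_a_time(grid, defaults):
    """List configurations that change a single parameter from the defaults."""
    configs = []
    for name, values in grid.items():
        for value in values:
            config = dict(defaults)
            config[name] = value
            configs.append(config)
    return configs


configs = one_at_a_time(grid, defaults)
print(len(configs))
```

A one-at-a-time sweep needs 3 + 8 + 4 = 15 training runs, whereas the full Cartesian product of the three lists would need 3 × 8 × 4 = 96.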

Table 1 shows the classification accuracy corresponding to the three different dimensions of the word embedding vector. The recognition accuracy reaches its maximum of 81.33% when the dimension is 64. Compared with the dimensions of 32 and 16, this is an improvement of 0.10% and 15.43%, respectively. Increasing the dimension from 32 to 64 therefore has no obvious effect on the classification accuracy. Considering both the classification accuracy and the computational burden, the dimension of the word embedding vector is selected as 32.

Figure 7 shows the effect of different lengths of the abstract sequence on the classification accuracy. Note that the length of the extracted abstract is fixed within each experiment. In the experiments, the sequence length is varied from 8 to 64 with a step of 8. As can be seen from the figure, the classification accuracy generally increases with the sequence length. In the range from 8 to 40, the increase in recognition accuracy is pronounced. When the sequence length is greater than 40, the increase becomes less significant. At a sequence length of 40, the recognition rate is 43.63% higher than at a sequence length of 8. At a sequence length of 64, the recognition rate is only 2.43% higher than at a sequence length of 40. Considering the limited gain in recognition rate and the increase in computational cost, the final abstract sequence length is selected as 40.

Table 2 gives the influence of the number of hidden units in the GRU cells on the recognition accuracy. It can be seen that when the number of hidden units is 64, the recognition accuracy rate is significantly improved compared to the hidden unit number of 16 and 32. But when compared to the number of 128, the improvement in recognition rate is insignificant. Therefore, the number of hidden neurons adopted herein is selected as 64.

3.2. Method Comparisons

The proposed method extracts the abstract of financial messages, which can reduce the influence of redundant information on subsequent classification on one hand, and on the other hand, sentiment classification can be accelerated since the input length is reduced. The proposed method in this paper is compared with two representative methods. One of the methods is based on the traditional SVM classifier. The other method is based on deep learning, which has two obvious differences from the proposed method: (1) the original financial message is directly adopted as the input without abstract extraction and (2) in the sentiment classification step, the gating unit used is the LSTM unit. The correct rate of classification corresponding to the three methods is shown in Table 3. It can be seen that the proposed method has a relatively obvious improvement in the accuracy. Compared with the SVM and LSTM methods, the accuracy of sentiment classification is improved by 5.57% and 2.58%, respectively.

In order to fully illustrate the effectiveness of the proposed method, it is also compared with two variants that use different strategies. In strategy one, the GRU is adopted for classification directly, without abstract extraction. In strategy two, abstract extraction is performed, but LSTM is adopted for classification instead of GRU. The results of the comparison are shown in Table 4. It can be seen that abstract extraction significantly improves the classification accuracy, by approximately 2.87%. This illustrates the effectiveness of the abstract extraction strategy, which filters out redundant information in the original financial message and extracts effective information for sentiment classification. Using GRU instead of LSTM for classification gives a similar recognition rate. However, the model using GRU is easier to train than the one using LSTM. In our experiments, the model training time is reduced by about 48%.
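One reason the GRU is cheaper to train can be illustrated by counting the parameters of a single recurrent layer. The sketch below uses the textbook formulations, in which the LSTM has four weight blocks (input gate, forget gate, output gate, and candidate state) and the GRU has three (reset gate, update gate, and candidate state). The input dimension 32 and the 64 hidden units are the values selected in Section 3.1; assuming that the classifier consists of one such layer fed directly by the embeddings is a simplification, and specific libraries may add extra bias terms.

```python
def recurrent_params(input_dim, hidden_units, weight_blocks):
    """Parameters of one recurrent layer in the textbook formulation.

    Each block has a weight matrix over [previous output, input]
    plus one bias vector.
    """
    per_block = hidden_units * (hidden_units + input_dim) + hidden_units
    return weight_blocks * per_block


lstm_params = recurrent_params(input_dim=32, hidden_units=64, weight_blocks=4)
gru_params = recurrent_params(input_dim=32, hidden_units=64, weight_blocks=3)
print(lstm_params, gru_params, gru_params / lstm_params)
```

Under these assumptions the GRU layer has 18,624 parameters against 24,832 for the LSTM layer, that is, 25% fewer. The parameter count is only one contributor to training time, so this figure should not be read as an explanation of the full 48% reduction measured in the experiments.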

4. Conclusions

A new method for financial sentiment classification based on deep learning is proposed in this paper. The proposed method has two steps. (1) The abstract in the financial messages is extracted according to the seq2seq model. The extraction process can deal with the situation of text input with different lengths. After extraction, invalid information in financial messages can be effectively filtered out, and the subsequent classification can be accelerated. (2) After the abstract extraction, sentiment classification is performed through the GRU model according to the abstracts. The proposed method in this paper is verified through financial message dataset from the financial social network StockTwits. The results show that compared with the classical SVM and LSTM methods, the proposed method in this paper can improve the accuracy of sentiment classification by 5.57% and 2.58%, respectively. Moreover, the effectiveness of the proposed method can be proved with the strategy of adopting abstract extraction and adopting GRU. Compared with the strategy of not extracting abstract, the recognition rate is improved by 2.87%. After adopting GRU instead of LSTM, the training time of the model has reduced by about 48%.

Data Availability

The data adopted in the paper are available on the following website: https://stocktwits.com.

Conflicts of Interest

The authors declare that they have no conflicts of interest.