#### Abstract

Online product reviews contain a large number of opinions and emotions. To identify the emotional and tendentious information that the public expresses, this paper presents reinforcement learning models and discusses sentiment classification algorithms for a product online review corpus. To explore the classification effect of different sentiment classification algorithms, we study the Naive Bayesian algorithm, the support vector machine algorithm, and a neural network algorithm and compare them on a concrete example. The three algorithms are compared on the evaluation indexes for different sentence lengths and word vector dimensions. The results show that the neural network algorithm is effective for sentiment classification of the product online review corpus.

#### 1. Introduction

In natural language processing, emotion analysis has long been an active research field. With the development of the Internet, a large number of business reviews have emerged on various platforms, most of which contain users’ personal opinions on commodities. Discriminating the emotional polarity of these review texts can therefore help enterprises better understand customer satisfaction with their products or services [1]. Based on the emotional polarity of comments, we can mine the advantages and disadvantages of products and then obtain suggestions for product promotion and improvement. Since the 1990s, the traditional approach to discriminating the emotional polarity of text has been based on machine learning. Traditional machine learning methods consist of two main steps. The first step is to construct word vector features manually to obtain the required text information. The second step is to construct a classifier that classifies the emotional polarity of the text. Classical machine learning methods such as support vector machine, random forest, Naive Bayesian, and neural network algorithms can all be used for text classification.

##### 1.1. Construct Word Vector Features Manually

In this step, the traditional method of generating word vectors is the bag-of-words (BOW) model [2], which converts each word into a one-hot vector based on a pre-established dictionary. A disadvantage of this method is that the resulting text vectors are high-dimensional and lack semantics. Therefore, methods such as TF-IDF and SVD are used to weight terms and reduce the dimension of the word vector representation. Karie and Venter input words into search engines and calculate the semantic similarity of the returned results, using external engines to expand the semantic information of word vectors [3]. To enable vectors to represent context information, models such as LDA and word embedding have also been proposed [4]. The word embedding model is an important research result that introduces deep learning into the field of natural language processing.
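As a minimal sketch of the bag-of-words representation described above, the following Python code builds a dictionary from a tokenized corpus and turns each sentence into a count vector, which is the sum of the one-hot vectors of its words. The toy corpus and the function names `build_vocabulary` and `bag_of_words` are illustrative assumptions, not part of the original study.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Map each distinct word to a column index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for word in sentence:
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def bag_of_words(sentence, vocab):
    """Return the count vector of a tokenized sentence.

    The vector equals the sum of the one-hot vectors of the sentence's words.
    """
    vector = [0] * len(vocab)
    for word, count in Counter(sentence).items():
        if word in vocab:  # words outside the dictionary are ignored
            vector[vocab[word]] = count
    return vector


# Hypothetical toy corpus, already segmented into words.
corpus = [["screen", "is", "great"], ["battery", "is", "bad", "bad"]]
vocab = build_vocabulary(corpus)
vectors = [bag_of_words(sentence, vocab) for sentence in corpus]
```

The length of each vector equals the dictionary size, which illustrates why the representation becomes high-dimensional on a real corpus.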

##### 1.2. Research on Text Classification Model Based on Naive Bayesian

The Naive Bayesian model is one of the earliest classification algorithms used for text classification. Its principle is simple. Based on Bayes’ theorem, it assumes that the words of a text are independent of each other given the category. The prior probability of each category and the probability of each word in each category are estimated from the training set, and the probability that each test sample belongs to each category is then predicted [5]. Although the Bayesian algorithm is simple, it depends heavily on the prior probability of the samples. Therefore, when the sample categories in the training set are unevenly distributed, the features of the categories with few samples are overwhelmed by those of the categories with many samples.

##### 1.3. Research on Text Classification Based on Support Vector Machine Model

The support vector machine (SVM) model, proposed by Vapnik in 1995, is based on a combination of VC dimension theory and the structural risk minimization principle. The SVM needs only limited sample information, so it performs well on small-sample and nonlinear text classification problems. By solving the dual problem and introducing a kernel function, the SVM handles samples that are linearly inseparable in a low-dimensional space with a linear method in a higher-dimensional feature space. An SVM learning algorithm that combines word, part-of-speech, and named entity features has achieved good results on text classification tasks with named entity elements [6].

##### 1.4. Research on Text Classification Model Based on Deep Learning

In recent years, deep learning has been widely applied to constructing classifiers for the emotional polarity of texts. Deep learning models can automatically extract features from the data [7, 8]. For example, Bengio et al. build a neural probabilistic model based on the idea of deep learning and use various deep neural networks to learn on a large-scale English corpus [9]. Deep learning can solve multiple natural language processing tasks such as named entity recognition and syntactic analysis. In the industrial field, the controlled system usually has great nonlinearity [10–15], and neural network models have been applied to the identification of nonlinear systems. The Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have proved to be effective models for classification tasks in nonlinear systems. For the emotional classification of text, recurrent and convolutional neural network models have been used to classify the emotions of short texts with excellent results [16–20]. However, because the RNN model suffers from vanishing and exploding gradients, the LSTM and GRU models derived from it are more commonly used [21–25]. Miyamoto et al. applied the LSTM model to text prediction [26]. Duyu et al. applied the LSTM model to emotion classification and achieved good results [27]. The LSTM model has good long-distance feature extraction ability and can capture the relationship between two elements of a sequence that are far apart. However, the information that matters for classification is not uniformly distributed in the text. To solve this problem, researchers have put forward the attention mechanism [28]. The mechanism assigns a different weight to each element of the text, and these weights are updated iteratively while the LSTM model is trained.

#### 2. Reinforcement Learning of Text Sentiment Classification Algorithm

Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. Naive Bayesian algorithm, support vector machine, and neural network model are models of reinforcement learning for text sentiment classification.

##### 2.1. Naive Bayesian Algorithm

Naive Bayesian is a machine learning algorithm based on probability theory. Its core is Bayes’ theorem. Suppose that, after word segmentation, a sentence $d$ of the corpus is composed of the words $w_1, w_2, \ldots, w_n$; $d$ is represented as an *n*-dimensional vector by cut-off or by padding with 0. There are $m$ categories in total, denoted $c_1, c_2, \ldots, c_m$. The Bayesian classifier returns the category with the highest probability, which is computed using

$$c^{*} = \arg\max_{c_j} P(c_j \mid d) = \arg\max_{c_j} \frac{P(d \mid c_j)\,P(c_j)}{P(d)}. \tag{1}$$

For each sentence $d$ of the fixed corpus, $P(d)$ is a fixed value. So, formula (1) can be transformed into solving the maximum value of

$$P(d \mid c_j)\,P(c_j). \tag{2}$$

The Naive Bayesian classifier is based on the assumption that the dimensions of the word vector are independent of each other, that is, each feature of the data is treated as statistically independent. So, formula (2) can be converted into solving the maximum value of

$$P(c_j)\prod_{i=1}^{n} P(w_i \mid c_j), \tag{3}$$

where $P(w_i \mid c_j)$ is the probability of word $w_i$ in category $c_j$, estimated a priori from the training set as the frequency of occurrence of the word in that category:

$$P(w_i \mid c_j) = \frac{N_{ij}}{N_j}, \tag{4}$$

where $N_{ij}$ indicates the number of times the current word $w_i$ appears in the current category $c_j$ and $N_j$ indicates the total number of words in the current category.
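The classification rule described above (pick the category that maximizes the category prior times the product of per-word frequencies) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the toy data are hypothetical, scores are summed in log space to avoid underflow, and add-one (Laplace) smoothing with parameter `alpha` is an assumption introduced here so that a word unseen in one category does not force a zero probability.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count categories and per-category word occurrences in the training set."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # word_counts[c][w]: times w appears in category c
    vocabulary = set()
    for words, label in zip(documents, labels):
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(words, model, alpha=1.0):
    """Return the category with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior of the category
        for word in words:
            if word not in vocabulary:
                continue  # skip words never seen in training
            # Smoothed frequency of the word in this category (alpha is an assumption).
            numerator = word_counts[label][word] + alpha
            denominator = total_words + alpha * len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, already segmented into words.
train_docs = [["good", "great"], ["great", "screen"], ["bad", "slow"], ["bad", "battery"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_naive_bayes(train_docs, train_labels)
```

Calling `predict(["great", "screen"], model)` on this toy model selects the positive category, because both words occur only in positive training sentences.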

##### 2.2. Support Vector Machine

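The support vector machine model is used to classify the data. $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ is the set of sample data points, where $x_i$ is a sentence of the corpus and $y_i \in \{-1, +1\}$ is the corresponding label. Text sentiment classification is then defined as the optimization problem

$$\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{s.t.} \quad y_i\left(w^{\top} x_i + b\right) \ge 1, \quad i = 1, \ldots, N. \tag{5}$$

Because the data are linearly inseparable in the process of training, a kernel function $K(\cdot, \cdot)$ and a penalty factor $C$ are introduced, and the problem is solved from

$$\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \tag{6}$$

where $\alpha^{*} = (\alpha_1^{*}, \ldots, \alpha_N^{*})$ is the optimal solution. Formula (7) is the decision function:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i^{*} y_i K(x_i, x) + b^{*}\right). \tag{7}$$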
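To make the role of the kernel function in the decision function concrete, the sketch below evaluates a kernel SVM decision rule for a given set of support vectors. It is an illustration only: the RBF kernel, the value of `gamma`, and the support vectors, multipliers, and bias are hypothetical values chosen by hand, not the result of training on the review corpus.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision(x, support_vectors, alphas, labels, bias, kernel=rbf_kernel):
    """Sign of the kernel expansion over the support vectors plus the bias."""
    score = bias
    for sv, alpha, y in zip(support_vectors, alphas, labels):
        score += alpha * y * kernel(sv, x)
    return 1 if score >= 0 else -1


# Hypothetical hand-picked model: one support vector per class.
# The multipliers satisfy the constraint sum(alpha_i * y_i) == 0.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, +1]
alphas = [1.0, 1.0]
bias = 0.0

near_positive = decision([1.8, 1.9], support_vectors, alphas, labels, bias)
near_negative = decision([0.1, 0.2], support_vectors, alphas, labels, bias)
```

A point is assigned to the class of the support vector it is most similar to under the kernel, which is how a linear rule in feature space yields a nonlinear boundary in the input space.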

##### 2.3. Neural Network Model

We use the Gated Recurrent Unit (GRU) model as the basic model for text sentiment classification in this paper. The GRU is a recurrent unit that uses a reset gate and an update gate to control how much information flows from the history state and from the current input, respectively, which allows it to learn long-term dependence information. The current unit state is obtained by combining the previous unit state with a candidate state computed from the current input. The GRU model can therefore use both historical and current information, which is very helpful for extracting the preceding context in language processing.

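The hidden layer of the GRU model does most of the work. The GRU has two gates, a reset gate and an update gate. The reset gate is calculated from

$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t]\right), \tag{8}$$

where $h_{t-1}$ is the state of the previous time step, $x_t$ is the input of the current time step, and $W_r$ is the weight. The update gate is calculated from

$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t]\right). \tag{9}$$

The GRU unit status is updated from

$$\tilde{h}_t = \tanh\left(W \cdot [r_t \odot h_{t-1}, x_t]\right). \tag{10}$$

The unit output is calculated from

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t. \tag{11}$$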

The input layer is noted as . The output of the hidden layer is noted as . The calculation method for each layer of the GRU model is as follows:

Here, and .
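The gate computations of a GRU unit can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the network trained in this paper: bias terms are omitted, the weights act on the concatenation of the previous state and the current input, the output mixes the previous state and the candidate state as `(1 - z) * h_prev + z * h_cand` (some references swap the roles of `z` and `1 - z`), and all weight values are hypothetical.

```python
import math


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def tanh(values):
    return [math.tanh(v) for v in values]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU time step; each weight matrix has shape hidden x (hidden + input)."""
    concat = h_prev + x_t                      # [h_{t-1}, x_t]
    r = sigmoid(matvec(W_r, concat))           # reset gate
    z = sigmoid(matvec(W_z, concat))           # update gate
    reset_concat = [ri * hi for ri, hi in zip(r, h_prev)] + x_t
    h_cand = tanh(matvec(W_h, reset_concat))   # candidate state
    # Interpolate between the previous state and the candidate state.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def run_gru(inputs, hidden_size, W_r, W_z, W_h):
    """Run the unit over a sequence, starting from a zero state."""
    h = [0.0] * hidden_size
    for x_t in inputs:
        h = gru_step(x_t, h, W_r, W_z, W_h)
    return h


# Hypothetical weights for hidden size 2 and input size 1.
W_r = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
W_z = [[0.2, 0.1, -0.3], [-0.1, 0.3, 0.2]]
W_h = [[0.5, -0.4, 0.1], [0.3, 0.2, -0.6]]
final_state = run_gru([[0.5], [-1.0], [2.0]], 2, W_r, W_z, W_h)
```

In a sentiment classifier, the final state would typically be passed to an output layer that predicts the emotional polarity; that layer is not shown here.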

To evaluate the generalization ability of the model accurately, we use ten-fold cross-validation to test the model performance. The basic steps are as follows. Step 1: we divide the data set into ten parts. Step 2: we use nine parts to train the classifier and the remaining part as the test set, and we calculate the accuracy and recall rate of the trained classifier on the test set. Step 3: we repeat Step 2, each time selecting as the test set a part that has not yet been used for testing and returning the previous test part to the training set, until each of the ten parts has served as the test set once. Step 4: we calculate the average value of the evaluation parameters over the ten runs, which is the final result.
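The four steps above can be sketched as a short Python routine. This is a minimal sketch under stated assumptions: the folds are formed by interleaving sample indices without shuffling, a single score is averaged, and the majority-class classifier and toy labels are hypothetical stand-ins for the real classifiers and corpus.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) so that each fold is the test set once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def cross_validate(samples, labels, train_fn, score_fn, k=10):
    """Train on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        scores.append(score_fn(model,
                               [samples[i] for i in test_idx],
                               [labels[i] for i in test_idx]))
    return sum(scores) / len(scores)


def train_majority(xs, ys):
    """Hypothetical baseline: always predict the most frequent training label."""
    return max(set(ys), key=ys.count)


def accuracy_of_constant(model, xs, ys):
    """Accuracy of a classifier that predicts the same label for every sample."""
    return sum(1 for y in ys if y == model) / len(ys)


# Hypothetical toy data: 20 samples, 14 positive and 6 negative.
toy_samples = list(range(20))
toy_labels = [1] * 14 + [0] * 6
mean_accuracy = cross_validate(toy_samples, toy_labels,
                               train_majority, accuracy_of_constant)
```

Any classifier can be plugged in through `train_fn` and `score_fn`; in practice the samples are usually shuffled before the folds are formed.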

#### 3. Training and Evaluation Parameters of Reinforcement Learning

We select accuracy and the *F* value as evaluation parameters in this paper. First, we introduce the confusion matrix from information retrieval, which is shown in Table 1. TP is a feature opinion pair that is correctly classified as positive emotion. FP is a feature opinion pair that is misclassified as positive emotion. FN is a feature opinion pair that is misclassified as negative emotion. TN is a feature opinion pair that is correctly classified as negative emotion.

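The calculation formulas for accuracy and *F* value are as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN},$$

$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad \text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}.$$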
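As a small worked example, the following Python sketch counts the four confusion matrix entries from true and predicted labels and derives accuracy and the *F* value from them. The labels are hypothetical, the positive class is assumed to be encoded as 1, and the *F* value is taken to be the harmonic mean of precision and recall.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1  # negative sample predicted as positive
        elif truth == positive:
            fn += 1  # positive sample predicted as negative
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def f_value(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = positive emotion, 0 = negative emotion.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
f = f_value(tp, fp, fn)
```

Here three positive and three negative samples are classified correctly, with one error of each kind, so both accuracy and the *F* value equal 0.75.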

In the training process, we use grid search to find the optimal parameters of the Naive Bayesian classifier and the support vector machine classifier. The training parameters of each classifier are shown in Tables 2 and 3.
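Grid search simply evaluates every combination of candidate parameter values and keeps the best one. The sketch below shows the idea in Python under stated assumptions: the parameter names `C` and `gamma`, their candidate values, and the scoring function `toy_evaluate` are hypothetical placeholders, since in practice the score would be the cross-validated accuracy of the classifier trained with those parameters.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every parameter combination and return the best one with its score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Hypothetical stand-in for a validation score; it peaks at C=10, gamma=0.1."""
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)


# Hypothetical candidate values for two SVM-style parameters.
grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}
best_params, best_score = grid_search(grid, toy_evaluate)
```

The cost grows with the product of the numbers of candidate values, so the grid is usually kept coarse.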

#### 4. Model Test of Reinforcement Learning

To evaluate the classification models for different word vector dimensions and sentence lengths, we train the models with different word vector dimensions and different sentence lengths. The data set contains 7000 online comments on a certain brand of tablet computer (a product online review corpus from 2017-2018 that we crawled from an online mall; it can be downloaded from https://pan.baidu.com/s/16AYTrzjWZDKXJ0iPqlZUvw, extraction code: wr01). The data are divided into a training set and a test set at a ratio of 7 : 3. The statistics of the corpus show that most sentences contain fewer than 50 words. If a sentence contains too few words, training on it loses its meaning. Therefore, we select sentence lengths in the interval [15, 50] and word vector dimensions in the interval [100, 250].
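The data preparation described here, fixing every sentence to a chosen length by cut-off or zero padding and splitting the data at a ratio of 7 : 3, can be sketched as follows. The function names, the padding identifier 0, the random seed, and the toy inputs are illustrative assumptions.

```python
import random


def pad_or_truncate(token_ids, length, pad_id=0):
    """Cut a sentence off at `length` tokens or pad it with `pad_id` up to `length`."""
    return token_ids[:length] + [pad_id] * max(0, length - len(token_ids))


def train_test_split(items, train_ratio=0.7, seed=0):
    """Shuffle a copy of the items and split it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sentences given as lists of word identifiers.
short_sentence = pad_or_truncate([5, 6, 7], 5)
long_sentence = pad_or_truncate([1, 2, 3, 4], 2)
train_part, test_part = train_test_split(range(10))
```

With 7000 comments, the same split would give 4900 training samples and 2100 test samples.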

The training results are shown in Tables 4–6. Table 4 shows the classification results of the Bayesian classifier for different sentence lengths and word vector dimensions.

Table 5 shows the classification results of the support vector machine classifier for different sentence lengths and word vector dimensions.

Table 6 shows the classification results of the GRU classifier for different sentence lengths and word vector dimensions.

From Tables 4–6, we find that the GRU model classifies significantly better than the other two algorithms. Table 4 shows that the accuracy of the Naive Bayesian model is stable at about 64%, while its *F* value rises with sentence length but still peaks at only about 67%. We believe the reason is that the Bayesian model is a prior probability model and depends more heavily on large data. The sample set in this paper is small to medium in size, so its prior probability distribution is not estimated accurately, and the result of the Naive Bayesian model is not as good as expected. In contrast, the result of the support vector machine model is much better. As the sentence length and word vector dimension increase, its accuracy and *F* value remain at about 78%. When the sentence length increases to 40, the accuracy decreases slightly, and similar accuracy is maintained before and after this length, so the support vector machine model can be considered to have converged at this length. The result of the neural network model is the best, with an accuracy of around 90%. Owing to its powerful feature extraction capability, the neural network model outperforms the other two models. Table 6 shows that the GRU model classifies best on sentences of 40 words with 200-dimensional word vectors.

#### 5. Conclusions

At present, more general scenarios for reinforcement learning and adaptive optimization present a major challenge in complex dynamic systems. Judging the sentiment tendency of text is an active direction in natural language processing. We study sentiment classification algorithms for online reviews. Because of the remarkable effect of machine learning, we select three machine learning methods, Naive Bayesian, support vector machine, and neural network, for comparative research. To evaluate the performance of the algorithms for different sentence lengths and word vector dimensions, we train the three models under different settings of both. Finally, using an experiment on an online review corpus, we find that the neural network algorithm is effective in classification.

#### Data Availability

Data used to support the findings of this study are available from the corresponding author upon request or can be downloaded from https://pan.baidu.com/s/16AYTrzjWZDKXJ0iPqlZUvw, Extraction code: wr01.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This work was funded by the National Philosophy and Social Science General Foundation of China (no. 19BGL234) and Ministry of Education of Humanities and Social Science Foundation of China (no. 17YJCZH199). The authors gratefully acknowledge the National Office for Philosophy and Social Sciences of China and Ministry of Education of China for financial support.