Opinion mining plays an important role in public opinion monitoring, commodity evaluation, government governance, and other areas. One of its basic tasks is to extract subjective expression elements, which can be further divided into direct subjective expressions and expressive subjective expressions. For this task, neural network-based methods can learn features automatically without exhaustive feature engineering and have proved effective for opinion mining. A challenge for such approaches is constructing an input vector that encodes sufficient information. To cope with this problem, we propose a novel representation method that combines additional features with word vectors. We then use a neural network and a conditional random field to train the model and predict the expressions, and we carry out comparative experiments on different methods and feature combinations. Experimental results show that the F value of the proposed model outperforms that of the other methods on the comparative experimental dataset. Our work can provide hints for further research on opinion expression extraction.

1. Introduction

Information retrieved from social networks includes not only the objective facts of events but also the opinions expressed by people, organizations, or the media. Sometimes the opinions about an event are more valuable than the event itself. The main opinions on events often need to be tracked, analyzed, and extracted manually, which is time-consuming and laborious, and it is difficult to quickly locate and extract the views conveyed in news events from huge amounts of information. If we can automatically mine and extract all relevant opinion information about these news events and then identify the main viewpoints, it will help us understand the attitudes and positions of the related parties towards the events. Public opinion monitoring and analysis can then move from the level of facts to the level of viewpoints, further improving the monitoring and understanding of public opinion on events. In short, opinion mining is of great significance to the monitoring and analysis of public opinion in social networks.

Opinion mining can be carried out at the word level, sentence level, or text level. Word-level methods tend to focus on the attitudes, opinions, and sentiment polarity contained in the words themselves. A common mechanism is to heuristically search for similar words starting from existing opinion seed words, expand the original opinion lexicon, and then identify opinion sentences according to the vocabulary. Text-level methods often evaluate the emotional tendency of a text by extracting opinion sentences from it and synthesizing multiple opinion sentences.

Sentence-level opinion mining methods not only make better use of context but also identify subjective expression elements, opinion holders, entities, opinion content, sentiment polarity, and emotional strength in more detail. Therefore, this paper carries out opinion expression extraction at the sentence level.

In general, the elements of opinion sentences include subjective expression elements, opinion holders, entities, opinion content, sentiment polarity, and emotional intensity [1]. Among them, the subjective expression is the core of an opinion: it expresses the personal attitude and emotion of the opinion holder towards the related event and is composed of words or phrases that carry emotional, evaluative, or speculative information. There may be more than one viewpoint in an opinion sentence. These viewpoints are marked by subjective expression elements and have emotional tendencies and intensity. Moreover, subjective expression elements can be divided into direct subjective expressions (DSE) and expressive subjective expressions (ESE). In contrast to subjective expression elements, objective expressions are used to state objective facts. Figure 1 shows the classification of expression elements.

The key difference between DSE and ESE is that the former explicitly gives a subjective expression and its source, while the latter presupposes a subjective expression and its source but does not introduce them. For instance, the sentence “She insists that the screen of the new phone is so cool” is a typical example of DSE. When annotating and identifying these elements, the annotator must consider them closely together with the context to avoid overlooking the more implicit ones.

Because of the diversity of linguistic expressions and the frequent omission of opinion elements, identifying and extracting free-form or implicit subjective expressions is hard. Automatic extraction of subjective expressions is still a challenging task that needs further research.

The main contributions of this paper are summarized as follows:
(1) A subjective expression extraction model based on a neural network and CRF is proposed, which supports both DSE and ESE extraction and achieves better recognition results on open datasets.
(2) A semantic representation method based on word vectors is proposed, which combines part-of-speech and named entity features. It captures sentence information and feeds it into the neural network model for recognition.
(3) Comparative experiments are carried out on the experimental data, and the differences, advantages, and disadvantages of different subjective expression extraction methods are compared and analyzed, which can provide a reference for follow-up studies.

The article is organized as follows. Section 2 gives a brief introduction to the related research. Section 3 focuses on the methods proposed in this paper. In Section 4, the experiment and analysis are carried out, and the conclusion is given in Section 5.

2. Related Work

A news report is generally objective, but besides objective facts it can also describe the opinions of the parties concerned and even the attitude and tendency of the authors themselves. Generally, a news report may include multiple opinions from many opinion holders on different targets. Extracting opinion sentences and searching for subjective expression elements, opinion holders, target objects, sentiment polarity, and intensity are the subtasks of opinion mining [25].

Kim and Hovy were the first to use a semisupervised method to mine opinion words, sentiment polarity, and opinion holders [6]. They use WordNet to find the closest synonyms of a seed vocabulary with three emotional polarities, identify the opinion words and their polarities in the text, and then use a maximum entropy classifier to identify the opinion holder according to syntactic characteristics. In another work, they identify opinion expression elements, opinion holders, and content through semantic role tagging (SRT) [7]. First, they summarize the semantic roles of the three elements and construct a corpus. Then, they determine the semantic roles of words in the text with a maximum entropy classifier and from these determine the three opinion elements in news text.

Wiebe et al. define opinion sentences in news texts as sentences that express a private state of the opinion holder [1]. A private state is often expressed by words or phrases, which are called subjective expression elements and carry a certain sentiment polarity and intensity. Following this definition, they annotated the Multiperspective Question Answering (MPQA) corpus of news text for opinion mining, and many subsequent studies are based on MPQA. Breck et al. [8] transform the recognition of subjective expression elements into a sequence tagging task. A conditional random field (CRF) is used to identify subjective expression elements by combining word features, part-of-speech features, lexical features, and WordNet. They find that the WordNet features play a significant role in improving the recall rate and that subjective expression elements often appear in clusters.

Yang and Cardie [5] used Semi-Markov Conditional Random Field (Semi-CRF) to identify subjective expressive elements in sentences, which is innovative in dealing with phrase level rather than word level. Semi-CRF allows the acquisition of features from clause sequences. It can deal with the boundary better, but the time complexity of the method is relatively high.

Later research extends to a variety of opinion elements: not only the identification of subjective expression elements but also the identification of sentiment polarity, intensity, opinion holders, and target entities. Fine-grained element extraction further deepens the research level of opinion mining. Johansson and Moschitti combined syntactic and semantic features to identify subjective expression elements and their polarity [9]. With more and more research on the extraction of multiple opinion elements, many joint extraction methods have been proposed. Choi et al. [10, 11] proposed two methods: one combines CRF with Integer Linear Programming (ILP), and the other uses a CRF with hierarchical parameter sharing.

With the breakthrough of deep learning in the field of natural language processing, researchers have applied it to opinion mining. Irsoy et al. introduce a recursive neural network to identify subjective expression elements, opinion holders, and target objects. Their experimental results show that the combination of a bidirectional recursive neural network and a bidirectional recurrent neural network performs better in multiple subtasks. A multilayer recurrent neural network model [4] is then proposed to identify subjective expression elements, and it is found that a relatively narrow and deep recurrent neural network performs better.

Wang et al. [12] use bidirectional long short-term memory (BiLSTM) to identify subjective expression elements and explored the internal mechanism of long short-term memory (LSTM). It was found that LSTM has better adaptability to context than regular RNN. Du et al. used the model of multilayer BiLSTM combined with attention mechanism [13] to detect subjective expression factors. Experiments show that attention mechanism helps to link information in context. Zhang et al. [14] proposed a model based on encoder-decoder to jointly mine the relationship between opinion elements.

Compared with traditional machine learning methods, neural network-based methods for extracting opinion elements usually use word vectors directly as features and do not need a large number of complex hand-crafted features, which reduces the dependence of the model on manually constructed features. This paper builds a compound model of a bidirectional long short-term memory network (BiLSTM) and a conditional random field, combining the advantages of both to extract subjective expression elements more accurately.

To sum up, whether they are based on CRF or on neural network models, and whether or not they combine lexical and syntactic features at different levels, existing methods for subjective expression extraction still face problems of insufficient coverage and accuracy. How to make full use of the advantages of the various methods and of richer features to improve extraction performance still needs further research.

3. Subjective Expression Extraction Model

In this section, we describe the construction of the subjective expression extraction model, which uses a combination of BiLSTM and CRF with multiple features to predict the optimal tag sequence of opinion elements according to the context of each sentence in the document. The flow chart is shown in Figure 2. It is divided into four steps, which are described as follows:
(1) Text preprocessing: the CoreNLP natural language processing tool of Stanford University is used for sentence splitting, tokenization, and stemming of the documents in the corpus.
(2) Feature construction: we adopt pretrained word vectors, part of speech (PoS), and Named Entity Recognition (NER) types as sentence features for semantic representation. The CoreNLP tool is used to identify the PoS and NER types of the words in the text and transform them into corresponding feature vectors. In this paper, we use the 300-dimensional word vectors published by Google Word2vec, whose training corpus is the Google News dataset containing about 100 billion words.
(3) Target vector construction: we treat subjective expression extraction as a sequence labeling problem, and the input keeps the word order of the sentence because the three features are concatenated word by word. By encoding each tag as a numerical vector, we convert the tag sequence of each sentence into the target vectors for model training.
(4) Model training: we use the constructed features and target vectors to fit the model and obtain the optimal parameters by iteration. The resulting model can then be used for the expression extraction task.

3.1. Feature Construction
3.1.1. Word Vector Feature

After text preprocessing, each sentence is a word sequence $(w_1, w_2, \ldots, w_n)$, where $w_i$ is the ith word in the sentence and n is the total number of words in the sentence. The word sequence is transformed into the word vector sequence $(x_1^{\mathrm{w}}, x_2^{\mathrm{w}}, \ldots, x_n^{\mathrm{w}})$ by mapping through the word vector matrix $E$. The word vector of the ith word $w_i$ is calculated as follows:

$$x_i^{\mathrm{w}} = E\,v_i. \tag{1}$$

The matrix $E \in \mathbb{R}^{300 \times |V|}$ holds all the word vectors of the corpus vocabulary $V$ of size $|V|$; its jth column is the 300-dimensional word vector of the jth word in the vocabulary. $v_i$ is a one-hot vector of size $|V|$: the entry at the index j of $w_i$ in the vocabulary is 1, and all other entries are 0. The word vector matrix is treated as a parameter of the model and is learned during training; the pretrained word vectors serve as its initial value, so that each word in a sentence is transformed into a word vector in the above way.

However, not all words in the documents are included in the vocabulary; such words are called out of vocabulary (OOV). The initial word vectors of these out-of-vocabulary words are represented by a single shared UNK word vector. In this paper, a 300-dimensional vector is randomly generated from a normal distribution with values in the range [−0.01, 0.01] to represent UNK words. During model training, all the word vectors in the corpus are trained and fine-tuned as model parameters.
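
The lookup with an UNK fallback described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the toy vocabulary, the row-wise storage of the matrix, the standard deviation `sigma`, and the clipping used to keep samples inside the stated range are all assumptions made for the example.

```python
import random

EMBEDDING_DIM = 300  # dimensionality stated in the text


def make_unk_vector(dim=EMBEDDING_DIM, low=-0.01, high=0.01, sigma=0.005, seed=0):
    """Draw a small random vector for out-of-vocabulary words.

    Assumption: each component is sampled from a zero-mean normal
    distribution and clipped to [low, high]; sigma is a hypothetical value.
    """
    rng = random.Random(seed)
    return [min(high, max(low, rng.gauss(0.0, sigma))) for _ in range(dim)]


def lookup(tokens, vocab, matrix, unk_vector):
    """Map each token to its word vector, falling back to the UNK vector."""
    vectors = []
    for token in tokens:
        index = vocab.get(token)
        vectors.append(matrix[index] if index is not None else unk_vector)
    return vectors


# Toy vocabulary with two known words (hypothetical values, not real Word2vec).
# Here one ROW per word is stored, which is equivalent to selecting a column
# of the word vector matrix with a one-hot vector.
vocab = {"Bush": 0, "rejected": 1}
matrix = [[0.1] * EMBEDDING_DIM, [0.2] * EMBEDDING_DIM]
unk = make_unk_vector()
sentence_vectors = lookup(["Bush", "rejected", "Kyoto"], vocab, matrix, unk)
```

In this toy run, "Kyoto" is not in the vocabulary and therefore receives the shared UNK vector; in a trainable model, both the matrix rows and the UNK vector would be updated during fine-tuning.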

3.1.2. POS and NER Feature

Word vectors are trained on a large-scale corpus and can effectively capture the characteristics of words. Moreover, pretrained word vectors include a certain amount of contextual information, because the word vector model usually takes context words as input, as in the CBOW model [15]. However, limited by the corpus and the training method, word vectors alone can hardly represent enough semantic information, so extra features are introduced to improve performance, and experiments are then conducted to determine whether additional feature vectors are necessary and, if so, which ones.

Part-of-speech tagging determines whether a word is a verb, noun, adjective, or another part of speech based on context information. In this paper, the CoreNLP (http://stanfordnlp.github.io/CoreNLP/) POS Tagger Annotator tool of Stanford University is used to annotate the POS tag of each word in the corpus text. Table 1 shows the part-of-speech information of subjective expression elements and opinion holder elements in the MPQA corpus. From Table 1, we can see that verbs account for as much as one-half of the words in DSE subjective expression elements, while verbs, adjectives, and adverbs account for about 40% of the words in ESE subjective expression elements. Therefore, it is reasonable to adopt POS features for the identification of opinion elements.

The POS tag of each word is obtained from the POS tagging result of the dataset. For example, in “Bush rejected the Kyoto pact last March” each of the seven words receives one tag (for instance, “Bush” is tagged NNP and “rejected” is tagged VBD). Our research uses 45 types of POS tags.

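The POS tag of each word is transformed into a feature vector using a one-hot representation. Assume the part-of-speech feature vector of a word $w_i$ is

$$x_i^{\mathrm{pos}} = (p_{i,1}, p_{i,2}, \ldots, p_{i,K_{\mathrm{pos}}}), \tag{2}$$

where $K_{\mathrm{pos}}$ is the number of POS tag types. Each dimension of the vector is calculated as shown in the following:

$$p_{i,k} = \begin{cases} 1, & \text{if the POS tag of } w_i \text{ is the } k\text{th tag type}, \\ 0, & \text{otherwise}. \end{cases} \tag{3}$$

The named entity label of each word is likewise transformed into a named entity feature vector using a one-hot representation. The feature vector of the named entity corresponding to the word $w_i$ is $x_i^{\mathrm{ner}} = (e_{i,1}, e_{i,2}, \ldots, e_{i,K_{\mathrm{ner}}})$, and each dimension of the vector is calculated as shown in the following:

$$e_{i,k} = \begin{cases} 1, & \text{if the named entity type of } w_i \text{ is the } k\text{th type}, \\ 0, & \text{otherwise}. \end{cases} \tag{4}$$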

NER identifies text fragments that represent specific types of entities, such as persons, places, institutions, and proper nouns [16]. Opinion holders are people or organizations that express their opinions, so the words that refer to them are more likely to belong to named entities. The CoreNLP NER tool is used to identify named entities in the corpus, and the proportion of each named entity type among opinion holders in the MPQA corpus is counted and tabulated.

It can be seen that nearly one-half of the opinion holders are named entities. Therefore, it is reasonable to speculate that using the named entity type as a word feature favors the recognition of opinion holder elements. Twenty-four named entity types are used in this paper.

The number of named entity types, $K_{\mathrm{ner}}$, is 24 in this article. The sequence of named entity feature vectors of a sentence is expressed as $(x_1^{\mathrm{ner}}, x_2^{\mathrm{ner}}, \ldots, x_n^{\mathrm{ner}})$.

The three features of each word (the word vector, the POS feature vector, and the NER feature vector) are concatenated to construct the feature vector of the word. In this way, a sentence is represented as a feature sequence.
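
As an illustration of this concatenation, the following Python sketch builds the feature vector of a single word from a word vector and two one-hot vectors. The tag inventories are small illustrative subsets and the 3-dimensional word vector is a stand-in; with the sizes reported in the text (300-dimensional word vectors, 45 POS tags, 24 named entity types), the same construction would give 300 + 45 + 24 = 369 dimensions per word.

```python
# Illustrative subsets only; the paper reports 45 POS tags and 24 NE types.
POS_TAGS = ["NNP", "VBD", "DT", "NN", "JJ"]
NER_TYPES = ["O", "PERSON", "ORGANIZATION", "DATE"]


def one_hot(label, inventory):
    """Return a list with 1.0 at the position of `label` and 0.0 elsewhere."""
    vector = [0.0] * len(inventory)
    vector[inventory.index(label)] = 1.0
    return vector


def word_features(word_vector, pos_tag, ner_type):
    """Concatenate word vector, POS one-hot, and NER one-hot, in that order."""
    return list(word_vector) + one_hot(pos_tag, POS_TAGS) + one_hot(ner_type, NER_TYPES)


# Toy 3-dimensional word vector standing in for a 300-dimensional one.
features = word_features([0.5, -0.2, 0.1], "NNP", "PERSON")
print(len(features))  # 3 + 5 + 4 = 12
```

Applying `word_features` to every word of a sentence, in order, yields the feature sequence that is fed to the model.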

3.2. Target Vector Construction

In this paper, subjective expression extraction is considered as a sequence labeling problem. A sentence is a sequence in which each word has a corresponding tag for the opinion element it belongs to, and words outside any opinion element are tagged “O.” Most opinion elements are phrases or even clauses; therefore, the label of an opinion element must identify not only its category but also the position of the word within the element. In this paper, the BIO tagging method [17] is used to represent the labels of words. A label with the B- prefix marks the first word of an opinion element, and a label with the I- prefix marks a non-initial word of an opinion element. Figure 3 is an example of the BIO tag sequence corresponding to the word sequence of a sentence in DSE subjective expression element recognition, which is then transformed into the target vector sequence. The sentence word sequence is “It is very regrettable that the United States has taken a negative stand, Masaaki Nakajima of Friends of the Earth Japan said,” in which “has taken a negative stand” and “said” are the subjective expression elements of DSE. Treating the comma as a token, the tag sequence of the whole sentence is “O, O, O, O, O, O, O, O, B-DSE, I-DSE, I-DSE, I-DSE, I-DSE, O, O, O, O, O, O, O, O, O, B-DSE.”
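
The BIO encoding can be sketched as a small Python function that converts element spans into a tag sequence. The whitespace tokenization and the token offsets of the two DSE spans are assumptions made for this example (the comma is treated as its own token); they are not taken from the annotation files.

```python
def spans_to_bio(num_tokens, spans, label):
    """Convert (start, end) token spans, end exclusive, into BIO tags."""
    tags = ["O"] * num_tokens
    for start, end in spans:
        tags[start] = "B-" + label          # first word of the element
        for position in range(start + 1, end):
            tags[position] = "I-" + label   # remaining words of the element
    return tags


tokens = ("It is very regrettable that the United States has taken a negative "
          "stand , Masaaki Nakajima of Friends of the Earth Japan said").split()
# Assumed offsets: "has taken a negative stand" -> [8, 13), "said" -> [22, 23).
dse_spans = [(8, 13), (22, 23)]
tags = spans_to_bio(len(tokens), dse_spans, "DSE")
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Each resulting tag can then be mapped to a one-hot target vector over the three classes B, I, and O.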

After obtaining the tag sequence of the sentence, we transform it into target vectors in the same way as the POS and NER feature vectors are constructed, which gives the target output of the model. The tag of each word is represented as a one-hot vector $y_i$, and the sequence of target vectors of a sentence can be expressed as $(y_1, y_2, \ldots, y_n)$. The extraction of DSE subjective expression elements and the extraction of ESE subjective expression elements are each treated as a three-class labeling problem, so the length of $y_i$ is set to 3.

3.3. Model Construction

The opinion extraction model we propose is based on a neural network and a conditional random field, and its structure is shown in Figure 4. In this paper, the feature sequence $X = (x_1, x_2, \ldots, x_n)$ is used to represent a sentence $S$. Context information in text is often bidirectional: a target word is related not only to the preceding text but also to the following text, and BiLSTM is more powerful than a unidirectional long short-term memory network in many tasks. The forward LSTM network receives the feature sequence in its original order, and its output at time t is $\overrightarrow{h}_t$; the backward LSTM network receives the feature sequence in reverse order, and its output at time t is $\overleftarrow{h}_t$.

The output of BiLSTM at time t is obtained by concatenating the two, $h_t = [\overrightarrow{h}_t; \overleftarrow{h}_t]$, and the output sequence of the BiLSTM network is $(h_1, h_2, \ldots, h_n)$. The context feature sequence $C = (c_1, c_2, \ldots, c_n)$ is then extracted through a linear layer with the sigmoid activation function $\sigma$, using the following formula:

$$c_t = \sigma(W_c h_t + b_c), \tag{5}$$

where $W_c$ and $b_c$ are the weight matrix and bias of the linear layer.

In the output layer of a general BiLSTM model, the output at time t is determined by the current hidden unit and input but is independent of the outputs at other times. At time t, the model seeks the most probable label $y_t$ based on the current input and context information, while the labels at other times have no impact on $y_t$. If there are strong dependencies between labels (for example, an I-DSE tag can only follow a B-DSE or I-DSE tag), BiLSTM alone cannot model such constraints, and its ultimate effect is limited. A linear-chain CRF is therefore introduced to learn the relationships between the tags in a sequence.

The linear-chain CRF can effectively capture the relationship between adjacent elements in the output sequence. Moreover, it finds the sequence with the highest probability as a whole, so the result is an optimal output sequence rather than a concatenation of the locally optimal label at each time step, which a recurrent neural network alone cannot provide. Our model combines the advantages of both.

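The conditional probability of the linear-chain CRF for a target vector sequence $Y = (y_1, y_2, \ldots, y_n)$ given the input feature sequence $C = (c_1, c_2, \ldots, c_n)$ is defined as follows:

$$P(Y \mid C) = \frac{\exp\left(\sum_{i=1}^{n} f(y_{i-1}, y_i, C)\right)}{\sum_{Y'} \exp\left(\sum_{i=1}^{n} f(y'_{i-1}, y'_i, C)\right)}. \tag{6}$$

Here $f$ is the feature function of the linear-chain CRF, $y_i$ represents the ith target vector in the target vector sequence $Y$, and $Y'$ is a predicted output sequence, with the sum in the denominator running over all candidate sequences. In the training phase, maximum likelihood estimation is used to maximize the conditional probability of formula (6). When predicting, we use the Viterbi algorithm to find the most probable tag sequence, which gives the predicted output sequence of opinion elements for a sentence.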
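
To make the decoding step concrete, the following is a minimal Viterbi sketch in plain Python. It assumes that the sequence score decomposes into per-position emission scores and tag-to-tag transition scores; the numbers are invented for illustration, with a large negative score used to forbid the transition from O to I-DSE. It is a sketch of the standard algorithm, not the authors' code.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence under additive scores.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n, k = len(emissions), len(tags)
    score = list(emissions[0])      # best score of a path ending in each tag
    backpointers = []
    for t in range(1, n):
        new_score, pointers = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(range(k), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[j] for j in path]


TAGS = ["O", "B-DSE", "I-DSE"]
FORBIDDEN = -1e4  # hypothetical penalty: I-DSE may not directly follow O
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-DSE
    [0.0, 0.0, 0.0],        # from I-DSE
]
emissions = [
    [1.0, 0.8, 0.0],
    [0.2, 0.1, 1.5],
    [1.0, 0.0, 0.3],
]
best_path = viterbi(emissions, transitions, TAGS)
greedy_path = [TAGS[max(range(3), key=lambda j: row[j])] for row in emissions]
```

In this toy example, choosing the best tag independently at each position gives O, I-DSE, O, which violates the BIO constraint, whereas the sequence-level decoding returns B-DSE, I-DSE, O.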

The hyperparameters of the model are set as follows: the number of hidden units of BiLSTM is 25, the dropout rate of the hidden layer is 0.5, and the linear layer has 25 units. These values were selected through repeated experiments. The number of training epochs is set to 50; we use early stopping to prevent overfitting and the stochastic gradient descent algorithm for optimization.

4. Experiment Result and Analysis

4.1. MPQA Corpus Introduction

The dataset used in the experiment is the MPQA 1.2 opinion mining corpus, which was constructed jointly by the University of Pittsburgh and Cornell University. It contains 535 news articles from multiple media sources about news events that occurred between 2001 and 2002. The text fragments of each document are annotated with multiple types of tags.

4.1.1. Corpus Analysis

Before conducting the experiments, we investigate the subjective expression elements in the corpus. Table 2 presents some of their statistics, where DSE and ESE denote direct and expressive subjective expression elements, respectively.

Sentences containing DSE or ESE elements account for about half of all sentences, which shows that the news corpus does convey a considerable amount of subjective opinion. The average number of DSE elements in an opinion sentence is 1.56, and more than half of the opinion sentences have more than one DSE element. We find that a sentence often conveys its opinion information through speech events, so the number of DSEs in an opinion sentence is often more than 1. The expression of ESE is flexible, with a length ranging from 1 to 40 words. Richer and freer forms of expression make the extraction problem harder.

4.1.2. Corpus Division

This paper divides the 535 articles into a training-validation set of 400 articles and a test set of 135 articles, in the same way as [13]. The training-validation set is split into a training set and a validation set at a ratio of 9 : 1, ten times, as in tenfold cross-validation. The test set is used to evaluate the model, and the reported result is the average over the ten cross-validation models.
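
The 9 : 1 splitting of the 400 training-validation articles can be sketched as follows. The document identifiers are placeholders, and assigning every tenth document to the held-out fold is an assumption made for the example; the actual assignment follows [13] and is not reproduced here.

```python
def ten_fold_splits(doc_ids, folds=10):
    """Yield (train, validation) lists; each fold holds out a distinct tenth."""
    for fold in range(folds):
        validation = doc_ids[fold::folds]   # every `folds`-th document
        held_out = set(validation)
        train = [d for d in doc_ids if d not in held_out]
        yield train, validation


train_val_docs = list(range(400))  # placeholder identifiers for 400 articles
splits = list(ten_fold_splits(train_val_docs))
for train, validation in splits:
    print(len(train), len(validation))  # 360 40 for every fold
```

Each of the ten models would be trained on its 360 articles, tuned on its 40 validation articles, and evaluated on the fixed test set, with the ten test scores averaged.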

4.2. Evaluation Method

This paper evaluates the performance of the model with the measures widely used in the field of opinion mining. When annotating the MPQA corpus, the annotators found that the boundaries of opinion elements cannot be determined strictly [8]. Breck et al. [9] and Johansson and Moschitti [17] therefore proposed two soft metrics: Binary Overlap and Proportional Overlap. Compared with the commonly used measures based on exact boundary matching, these two metrics are better suited to evaluating the extraction effect.

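In the Binary Overlap metric, a predicted element is counted as correct if its boundaries overlap those of a manually labeled element. The precision, recall, and F-measure of Binary Overlap are calculated by the following formulas:

$$\mathrm{Precision} = \frac{|\{p \in P : \exists\, c \in C,\ c \cap p \neq \emptyset\}|}{|P|}, \tag{7}$$

$$\mathrm{Recall} = \frac{|\{c \in C : \exists\, p \in P,\ c \cap p \neq \emptyset\}|}{|C|}, \tag{8}$$

$$F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{9}$$

Here C and P represent the set of correct, manually labeled elements and the set of elements predicted by the model, respectively, and $|\cdot|$ represents the number of labels of a certain expression element in a set.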

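The Proportional Overlap measure pays more attention than Binary Overlap to the proportion of overlap between predicted and gold labels. It starts by measuring the degree of overlap cvrg between two labeled spans s and s′:

$$\mathrm{cvrg}(s, s') = \frac{|s \cap s'|}{|s'|}. \tag{10}$$

Here, $|s|$ denotes the length of a labeled span in words. Then, the overlap rate CVRG of two span sets S and S′ is calculated:

$$\mathrm{CVRG}(S, S') = \sum_{s \in S} \sum_{s' \in S'} \mathrm{cvrg}(s, s'). \tag{11}$$

Finally, the precision, recall rate, and F value of the Proportional Overlap measure are calculated by

$$\mathrm{Precision} = \frac{\mathrm{CVRG}(C, P)}{|P|}, \tag{12}$$

$$\mathrm{Recall} = \frac{\mathrm{CVRG}(P, C)}{|C|}, \tag{13}$$

$$F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{14}$$

In formulas (12) and (13), C and P are the manually labeled set and the predicted set, and $|\cdot|$ represents the number of labeled spans of a certain element in each set.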
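
The two soft metrics can be sketched in Python as follows. The sketch represents each element as a (start, end) token span with an exclusive end and assumes that the coverage of one span by another is the overlap length divided by the length of the covered span, which is the usual reading of these metrics; the gold and predicted spans are invented for illustration.

```python
def overlap(a, b):
    """Number of token positions shared by two (start, end) spans, end exclusive."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def f_measure(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def binary_overlap(gold, predicted):
    """A span counts as correct if it overlaps any span in the other set."""
    matched_pred = sum(1 for p in predicted if any(overlap(p, g) for g in gold))
    matched_gold = sum(1 for g in gold if any(overlap(g, p) for p in predicted))
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


def coverage(covering, covered):
    """Sum, over span pairs, of the overlapped fraction of each `covered` span."""
    return sum(overlap(s, t) / (t[1] - t[0]) for s in covering for t in covered)


def proportional_overlap(gold, predicted):
    precision = coverage(gold, predicted) / len(predicted) if predicted else 0.0
    recall = coverage(predicted, gold) / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


gold = [(8, 13), (22, 23)]       # hypothetical manually labeled spans
predicted = [(9, 13), (15, 17)]  # hypothetical model output
binary = binary_overlap(gold, predicted)
proportional = proportional_overlap(gold, predicted)
```

Here the first prediction covers four of the five gold tokens, so it counts fully under Binary Overlap but contributes only 0.8 to the proportional recall; this difference is why the proportional measure is more sensitive to boundary quality.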

4.3. Multifeature Contrast Experiment

We compare the effect of each feature on the recognition of each opinion element through experiments and look for an effective combination of features for each opinion element. The labels used in the following experiments have these meanings:
(1) +Pretrained word vector: the model uses pretrained word vectors as features.
(2) +Fine-tuning the pretrained word vector: the model uses pretrained word vectors as features, and the word vectors are updated as model parameters during training.
(3) +Part-of-speech feature: the model adds the POS feature.
(4) +Entity features: the model adds the named entity feature.

4.3.1. Subjective Expression Element Recognition of DSE

Firstly, the model is applied to the recognition of DSE subjective expression elements. Table 3 shows the results under Binary Overlap and Proportional Overlap when the model is combined with the word vector, the part-of-speech feature, the named entity feature, and the combined part-of-speech and named entity features. The bold numbers represent the highest score under an evaluation method.

By looking further into Table 4, we compare the impact of each feature as follows.
(1) The influence of fine-tuning word vectors: without fine-tuning, the precision of the pretrained word vectors is higher but the recall rate is lower; fine-tuning has the reverse effect, with a higher recall rate and lower precision. In the DSE recognition task, the fine-tuned word vectors perform slightly better overall than the non-fine-tuned ones, so the pretrained word vectors are fine-tuned when combined with other features in the later experiments.
(2) The influence of part-of-speech features: after fine-tuning the pretrained word vectors and adding part-of-speech features, the recall rate is the highest among all feature combinations under both measures, the F value under the Binary Overlap measure is the highest, and the F value under the Proportional Overlap measure is the second highest. Part-of-speech features improve the recognition of DSE elements.
(3) The influence of named entity features: after fine-tuning the pretrained word vectors and adding named entity features, the recall rate and F value are slightly improved under both measures. Compared with the part-of-speech feature, the named entity feature plays a less important role.
(4) The influence of the combination of part-of-speech and named entity features: after fine-tuning the pretrained word vectors and adding both part-of-speech and named entity features, the recall rate increases slightly under both measures, the F value under Proportional Overlap is the highest, and the F value under Binary Overlap is the second highest. The effect is similar to that of the part-of-speech feature alone.

In general, the combination of fine-tuned pretrained word vectors and part-of-speech features is adequate for the recognition of DSE subjective expression elements. The combination of part-of-speech and entity features performs similarly to part-of-speech features alone, while named entity features have little impact on DSE recognition. Therefore, considering the complexity of the model, only part-of-speech features are added for the recognition of DSE subjective expression elements. We presume that DSE elements are generally short and their parts of speech are distinctive, which is why part-of-speech features are effective.

4.3.2. ESE Subjective Expressive Element Recognition

This paper combines the model with a variety of feature combinations and applies them to the recognition of ESE subjective expression elements. Table 3 shows the results under the Binary Overlap and Proportional Overlap measures when the model is combined with word vector, part-of-speech, and named entity features to identify ESE subjective expression elements. The bold numbers represent the highest score under an evaluation method. The impact of each feature, as shown in Table 5, is as follows.
(1) The influence of fine-tuning word vectors: without fine-tuning, the precision of the pretrained word vectors is higher and the recall rate is lower; the reverse is true with fine-tuning. Under the two measures, fine-tuning raises the F value of the ESE recognition task by about 3 and 2 percentage points, respectively, and raises the recall rate by about 6 to 7 percentage points. Fine-tuning the pretrained word vectors improves the recognition of ESE elements, so the pretrained word vectors are fine-tuned when combined with other features.
(2) The influence of part-of-speech features: when part-of-speech features are added to the fine-tuned pretrained word vectors, the precision under the two measures increases by only about 1 percentage point, while the other scores decrease. It is not necessary to add part-of-speech features alone.
(3) The influence of named entity features: when named entity features are added to the fine-tuned pretrained word vectors, the precision under the two measures increases by only about 1 percentage point, while the other scores decrease. It is not necessary to add named entity features alone.
(4) The influence of the combination of part-of-speech and named entity features: after fine-tuning the pretrained word vectors and adding both part-of-speech and named entity features, the precision, recall rate, and F value under the two measures are all improved, and the F value is the highest of all feature combinations.

In general, the combination of fine-tuned pretrained word vectors, part-of-speech features, and named entity features is effective for ESE subjective expression element recognition. ESE elements are generally long and contain clauses that express opinions implicitly. These long spans may contain many POS types and named entity phrases, which enrich the context information; we presume this is why named entity and part-of-speech features work effectively together.

4.4. Contrast Experiment with Other Methods
4.4.1. Baseline Model

This paper chooses four methods as baseline models, listed as follows:
(1) CRF-OE: Breck et al. [8] use conditional random fields to extract subjective expression elements. Word features (one-hot vectors), syntactic features, and semantic features (WordNet lexicon) are used.
(2) RecursiveNN: Irsoy and Cardie [18] use recursive (tree-structured) neural networks to identify opinion holder elements.
(3) BSRNN: Irsoy and Cardie [4] use a multilayer bidirectional recurrent neural network (RNN) to extract subjective expression elements. The number of layers of the bidirectional RNN is set to three; the other hyperparameters and the word vectors are the same as in this paper.
(4) BiLSTM: Wang et al. [12] use the BiLSTM network model to extract the two kinds of subjective expression elements. The hyperparameters and word vectors are kept the same as in this paper.

4.4.2. Contrast and Analysis

The proposed deep neural network model is compared with the baseline models in identifying the corresponding opinion elements. The results are shown in Table 5.

The experimental results show that the proposed model has the best recognition effect on DSE, while BiLSTM, BSRNN, and CRF-OE are less effective. The results also show that this model is better than the BSRNN model at recognizing DSE elements in nested opinion sentences. The BSRNN model sometimes omits part of a DSE element and misjudges ESE elements as DSE elements.

In ESE recognition, the most effective models are ours and BSRNN. The proposed model scores higher than BSRNN under the Proportional Overlap metric and lower than BSRNN under the Binary Overlap metric. The BiLSTM model performs worse than both. Compared with the BiLSTM and BSRNN models, this model is good at correctly identifying ESE elements, whereas BSRNN and BiLSTM tend to misidentify short ESEs as DSE elements.
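
To make the difference between the two metrics concrete, the following sketch computes both for a toy example. It follows one common formulation of the measures: under Binary Overlap a span counts as matched if it shares at least one token with a span in the other set, and under Proportional Overlap each span is credited with the fraction of its tokens that is covered. The exact definitions used in the experiments may differ; the span representation (half-open token intervals) and all function names are assumptions for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token interval [start, end), with end > start


def overlap(a: Span, b: Span) -> int:
    """Number of tokens shared by two spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def f_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def binary_scores(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """A span is fully correct as soon as it overlaps any span of the other set."""
    pred_hits = sum(any(overlap(p, g) > 0 for g in gold) for p in pred)
    gold_hits = sum(any(overlap(g, p) > 0 for p in pred) for g in gold)
    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    return precision, recall, f_value(precision, recall)


def proportional_scores(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Each span earns partial credit equal to the covered fraction of its length.

    Assumes the spans inside each set do not overlap one another.
    """
    pred_cov = sum(overlap(p, g) / (p[1] - p[0]) for p in pred for g in gold)
    gold_cov = sum(overlap(g, p) / (g[1] - g[0]) for g in gold for p in pred)
    precision = pred_cov / len(pred) if pred else 0.0
    recall = gold_cov / len(gold) if gold else 0.0
    return precision, recall, f_value(precision, recall)


# Toy example: one prediction covers only half of a gold span, one is spurious.
gold = [(0, 4), (6, 8)]
pred = [(2, 4), (10, 12)]
print(binary_scores(gold, pred))
print(proportional_scores(gold, pred))
```

In this toy case the truncated prediction counts as a full match under Binary Overlap but earns only half of the recall credit under Proportional Overlap, which is why a model that predicts longer, better-aligned spans can rank differently under the two metrics.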

The proposed model is a slight improvement over the other models. An analysis of the recognition results of the different models shows that this model identifies more opinion elements than the other models, and that the identified elements are clearly longer than those of BSRNN, which is closer to the result of manual labeling. This indicates that, when the number of identified viewpoint elements is similar, the quality of the viewpoint elements identified by this model is higher. It is also observed that boundary determination is better than in the BSRNN and BiLSTM models.

Although this model is superior to the baseline models in opinion expression extraction to some extent, the recognition performance still needs to be improved. Compared with the manual tagging of the corpus, several problems can be summarized:
(1) Similar ESE subjective expression elements may be merged
(2) Shorter ESE subjective expression elements are easily neglected
(3) A few of the identified DSE subjective expression elements are shorter than those labeled by hand

5. Conclusion

In this paper, we studied the extraction of subjective expression elements and proposed a compound model based on a multifeature deep neural network and a conditional random field. We then described the various features used in the model, the reasons for selecting these features, and the way the features are constructed. We carried out experiments on subjective expression elements (DSE and ESE) with multifeature combinations, tested the effect of each feature combination on the corresponding subtask, and recommended the most suitable feature combination for each subtask model.

Experimental results show that the proposed opinion element extraction model can effectively extract viewpoint elements and achieves better accuracy, recall rate, and F value. However, the experiments also show that the extraction effect of the model is limited when the opinion elements are short and similar, which we will address in future work. In a word, our work can provide a reference for follow-up research on the extraction of subjective expression elements.

Data Availability

The experiment was conducted with the open dataset MPQA Opinion Corpus v1.2, which can be downloaded at http://mpqa.cs.pitt.edu/corpora/mpqa_corpus/mpqa_corpus_1_2/.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant nos. 61272447 and 61802271 and in part by the Fundamental Research Funds for the Central Universities under Grant nos. SCU2018D018 and SCU2018D022.