Abstract

Emotion Distribution Learning (EDL) is a recently proposed multiemotion analysis paradigm that identifies the basic emotions expressed in a sentence together with their degrees of expression. Unlike traditional methods, EDL quantitatively models the expression degree of each emotion on a given instance as an emotion distribution. However, emotion labels are crisp in most existing emotion datasets. To make traditional emotion datasets usable in EDL, label enhancement converts logical emotion labels into emotion distributions. This paper proposes a novel label enhancement method, called Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE), which utilizes the linguistic emotional information of affective words and the psychological knowledge of Plutchik’s emotion wheel. EWLLE generates separate discrete Gaussian distributions for the emotion label of the sentence and for the emotion labels of its affective words based on the psychological emotion distance, and combines the two types of information into a unified emotion distribution by superposing the distributions. Extensive experiments on 4 commonly used text emotion datasets show that the proposed EWLLE method has a distinct advantage over existing EDL label enhancement methods in the emotion classification task.

1. Introduction

Text emotion classification (recognition) is an important research topic with many promising novel applications [1], such as emotional human-computer interaction [2], intelligent customer service [3], music emotion classification [4], anticipating corporate financial performance [5], and online product review analysis [6]. The goal of text emotion recognition is to find out the writers’ emotional states contained in sentences [1]. In recent years, researchers have proposed a lot of effective and fruitful work in the field of text emotion classification [26].

In general, the emotion expressed in a sentence is a mixture of a variety of basic emotions (e.g., anger, fear, joy, or sadness), where each basic emotion has a certain degree of contribution to the overall expression [7]. Traditionally, text emotion recognition models are based on Single-Label Learning (SLL) or Multi-Label Learning (MLL). In SLL, one sentence is assumed to be associated with only one emotion label [8]. To cope with the situation where one sentence simultaneously evokes several different emotions, MLL assigns multiple emotion labels to a sentence [9]. However, the modeling ability of MLL is insufficient to quantitatively analyze the ambiguity in multiple emotions. Given a sentence, MLL identifies the prominent emotion labels, while it cannot tell the specific expression intensity of each emotion [10].

To address this problem, by drawing on the idea of Label Distribution Learning (LDL) [11,12], Zhou et al. [7] proposed Emotion Distribution Learning (EDL) for facial emotion recognition. In the next year, EDL was applied to text emotion classification [13]. EDL deals with multiple emotions by associating each instance (e.g., facial image or sentence) with an emotion distribution vector, where the vector’s dimension is the number of all possible emotions. In the emotion distribution, each component represents the intensity of the corresponding emotion on the given instance. Obviously, EDL is suitable to solve the quantitative multiemotion analysis problem, especially when emotion ambiguities occur [7]. In recent years, many effective EDL methods have been proposed. For example, Zhou et al. [13] designed an EDL model by introducing the constraint of interrelationships between emotions. Jia et al. [14] proposed an EDL method that considers the local correlation of emotion labels. Zhao et al. [15] put forward an EDL model based on meta-learning with small samples. He et al. [16] designed an EDL method based on graph convolutional neural networks, where the correlation between emotions is considered. Xiong et al. [17] proposed an EDL model based on convolutional neural networks that utilizes the polarity and the sparsity of emotion labels. Fan et al. [18] designed an EDL method to predict image emotion distribution by learning labels’ correlation. Xi et al. [19] proposed emotion distribution learning based on surface electromyography for predicting the intensities of basic emotions. Liang et al. [20] proposed a novel semisupervised multimodal emotion recognition model based on cross-modality distribution matching. All these EDL methods exhibited better performance than traditional models.

Most existing EDL works focused on improving emotion recognition accuracy by proposing novel prediction models. Few methods have been proposed to determine the emotion distribution from existing annotated datasets, which only contain single-labeled emotions. A radical solution to the problem would be the creation of datasets, in which emotion distributions (or quantitative multilabeled instances) are annotated instead of single emotion labels, which is difficult to do in practice. A convenient method to obtain EDL datasets is to transfer the existing quantitative multilabeled datasets by label score normalization (normalizing the sum of label score of each instance to be 1). However, the quantitative multilabeled text emotion datasets are scarce, except for a few special ones such as the SemEval 2007 Task 14 [21]. Most of the existing text emotion datasets are single-labeled. To utilize the traditional emotion datasets in EDL, label enhancement aims to transform single-labeled emotions to emotion distributions, whose idea is similar to that of LDL [22].

Label enhancement needs to leverage extra knowledge to convert a single label into a distribution, so the effectiveness of that knowledge is essential. An up-to-date introduction to label enhancement in label distribution learning can be found in Xu et al. [22]. Up to now, only a few EDL label enhancement methods have been proposed [23, 24]. Yang et al. [23] proposed the Mikels’ Wheel-based emotion distribution Label Enhancement (MWLE) method, which utilizes psychological emotional knowledge to transform emotion labels into distributions. However, the MWLE method was proposed for facial emotion classification and does not consider the affective words that are effective in text emotion analysis. Affective words differ in intensity and emotional tendency, which are generally annotated based on linguistic knowledge. They carry rich emotional information that is discriminative for text emotion recognition [25]. Zhang et al. [24] proposed the Lexicon-based emotion distribution Label Enhancement (LLE) method, which generates emotion distributions from a single label by introducing the linguistic information of affective words. The experimental results show that the LLE method performs better than the MWLE method [24], but its disadvantage is that the psychological correlation between human emotions is not considered. Psychological emotional knowledge can be used to effectively obtain the intrinsic correlations among emotions [23], while affective words contain effective discriminating information about emotions [25]. Both are essential to effective emotion classification. However, to the best of our knowledge, no existing label enhancement work has considered both psychological and linguistic knowledge.

In this paper, we present a novel Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) method, which calculates the psychological distances between emotions according to Plutchik’s emotion wheel and utilizes the linguistic information of affective words from some classical lexicons. Plutchik’s wheel of emotions is a well-known psychological model proposed by Robert Plutchik in 1980 to describe human emotional relationships [26]. We exploit Plutchik’s wheel of emotions to determine similarities between different emotions through Gaussian distribution. For a given sentence, based on psychological emotion distances, EWLLE generates discrete Gaussian distribution of sentence emotion labels and the affective words’ emotion labels and then superposes them into a unified emotion distribution. Different from existing EDL label enhancement methods, EWLLE takes into consideration the psychological and linguistic emotional knowledge at the same time during the label enhancement procedure. Extensive experiments on 4 public text emotion datasets, TEC [27], Fairy Tales [28], CBET [29], and ISEAR [30], demonstrate that the proposed EWLLE method performs favorably against the state-of-the-art approaches in the task of text emotion recognition.

The rest of the paper is organized as follows. Section 2 introduces some related works of emotion label enhancement. Section 3 describes the proposed method of label enhancement based on the emotion wheel and lexicon in detail. Section 4 provides a series of comparative experiments to verify the effectiveness of the proposed method. Finally, Section 5 concludes the paper.

2. Related Work

2.1. Emotion Distribution Learning

In general, the emotional expression in a text sentence or facial image is usually a combination of multiple basic emotions. All related basic emotions play a certain role in the overall expression and together constitute an emotion distribution [7]. Figure 1 shows two representative sentences and the corresponding annotations from the SemEval dataset [21]. For the sentence (a), the dominating emotion, surprise, accounts for 41.9% expression level; meanwhile, the other two major emotions, joy and anger, present 20.1% and 16.9%, respectively. The situation of the sentence (b) is analogous. The examples show that a single sentence may possibly contain multiple emotions with different intensities rather than a single label.

As we stated earlier, most traditional emotion recognition methods are based on Single-Label Learning (SLL) or Multi-Label Learning (MLL), which address the problem of “which emotions are used to describe the sample,” rather than the problem of “how to describe the sample’s emotions quantitatively.” In fact, neither SLL nor MLL can obtain the specific intensity of each basic emotion [11]. If we simply annotate sentences with multiple major emotions, whose expression levels are higher than a given threshold, the multiemotion recognition task could be solved by MLL, but such a simplistic approach would discard the information of the emotion expression level, which makes it impossible to quantitatively analyze related emotions in the subsequent affective computing task [7].

In light of the success of the novel machine learning paradigm of Label Distribution Learning (LDL) [11], Emotion Distribution Learning (EDL) was proposed to solve the facial and text emotion classification tasks [7, 13]. Different from SLL and MLL, EDL assigns each instance with an emotion distribution, where each distribution component represents the intensity of the corresponding emotion to the given instance.

In the traditional single-label text emotion recognition task, each sentence $s$ has a corresponding emotion label $y$, $y \in \{1, 2, \ldots, C\}$, where $C$ is the number of all possible emotions. Rather than just predicting the emotion label $y$, the goal of EDL is to find a function that maps the sentence $s$ to an emotion distribution $d_s = (d_s^1, d_s^2, \ldots, d_s^C)$, where $d_s^j \in [0, 1]$ represents the intensity of the $j$-th emotion in the sentence $s$, and $\sum_{j=1}^{C} d_s^j = 1$. Note that, unlike the probability distribution that assumes only one label is correct at a time, the emotion distribution allows an instance to have multiple emotion labels simultaneously. Any emotion with an intensity higher than 0 is a possible label for the instance, while the expression level of each emotion label can vary.

EDL predicts the intensity values of sentences across a set of emotion categories by the emotion distribution. Such information is important for understanding fine-grained emotions, especially when ambiguities exist [7]. In recent years, many effective EDL methods have been proposed [13–20]. However, one of the major obstacles to the development of EDL models is the lack of datasets annotated with emotion distributions, owing to the difficulty of annotating them. Among the existing text emotion datasets, multiemotion score annotations, which can be converted to emotion distributions by label score normalization, are rare (e.g., the SemEval 2007 Task 14 [21]). In order to utilize the abundant traditional single-labeled text emotion datasets, label enhancement methods are required.
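Label score normalization, as mentioned above, simply rescales the per-emotion scores of an instance so that they sum to 1. The following is a minimal sketch of that idea; the function name `normalize_scores` and the example scores are hypothetical and are not taken from any dataset.

```python
def normalize_scores(scores):
    """Rescale non-negative emotion scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        # No emotion was scored; a distribution cannot be derived.
        raise ValueError("at least one emotion score must be positive")
    return {emotion: value / total for emotion, value in scores.items()}


# Hypothetical raw annotation scores for one sentence.
raw = {"anger": 10, "joy": 30, "surprise": 60}
distribution = normalize_scores(raw)
print(distribution)
```

The resulting dictionary can be used directly as the emotion distribution of the instance.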

2.2. Label Enhancement

In label distribution learning, label enhancement refers to the process of recovering the label distribution from a single-label or multilabel dataset [22]. If we regard the original ground-truth label $y$ as having the score of 1, the basic idea of most label enhancement methods is to reduce the score of $y$ and increase the scores of some other related labels in the generated distribution, where the score of $y$ generally remains the highest. The label score adjustment strategies vary among different label enhancement methods. In practice, both the correlation among the labels and the topological information of the feature space are usually utilized to recover the label distribution from the logical label. Since the concept of label distribution learning was proposed, several effective label enhancement methods have appeared in the literature. The existing label enhancement algorithms are roughly divided into three types [22], i.e., fuzzy theory-based methods [31, 32], graph model-based methods [33, 34], and prior knowledge-based methods [35, 36]. We give a brief introduction to these 3 kinds of label enhancement methods as follows.

The fuzzy theory-based label enhancement method digs out the correlation among the labels by introducing the fuzziness into the originally rigid logical labels and transforms logical label into label distribution. For example, the label enhancement algorithm based on fuzzy clustering uses the membership degree of the examples generated in the fuzzy clustering process to each cluster and converts the membership degree of the example to the cluster into the membership degree of the category, thereby generating the label distribution [31]. The label enhancement algorithm based on the fuzzy kernel membership degree uses the kernel technique to calculate the fuzzy membership degree of the example to each category in the high-dimensional space, so as to mine the correlation between the categories in the training set [32].

The graph model-based label enhancement algorithm uses graph models to represent the topological structure between examples, establishes the relationship between examples and labels through some model assumptions, and then enhances logical label to label distribution. For instance, the label enhancement algorithm based on label propagation expresses the topology structure between examples through a graph model and uses the difference of path weights in the propagation process to make descriptive differences, so as to mine the relationship between the labels in the training set [33]. The manifold-based method reconstructs the manifolds of the feature space and the label space and uses the smoothing assumption to migrate the topological relationship of the feature space to the label space, thereby enhancing logical label to label distribution [34].

The prior knowledge-based label enhancement algorithm introduces some kind of prior knowledge, mines the implicit correlation between labels according to the characteristics of the dataset, and enhances logical label into label distribution. Obviously, the validity of prior knowledge is the key to the success of this kind of method. By introducing the prior knowledge of correlation among different head poses, Geng et al. [35] built label distribution from logical label and its neighboring head poses. In the application of facial age estimation, the lack of facial images with definite age labels makes traditional age prediction algorithms inefficient. Based on the prior knowledge of the facial similarity between adjacent human ages, Geng et al. [36] recovered label distribution from ground-truth age and its adjacent ages and proposed an adaptive label distribution learning model to learn the human age ambiguity. Furthermore, Zhang et al. [37] presented a developed prior assumption of facial age correlation, which limits age label distribution that only covers a reasonable number of neighboring ages. Based on the developed prior knowledge, Zhang et al. [37] proposed a practical label distribution paradigm and outperformed current state-of-the-art facial age recognition methods.

2.3. Lexicon-Based Label Enhancement

In addition to the annotations (emotion label or emotion distribution), text-based EDL can utilize the extra prior information of the affective words contained in sentences, which is not available in classical LDL. Affective words are words with different intensities of emotional tendency [25], which are associated with certain emotion labels. The affective lexicon is a dictionary of affective words, which is one of the most important linguistic resources in affective computing for text [25]. One sentence may include multiple affective words, and one affective word can be associated with several emotion labels. Many existing studies showed that affective words contain abundant discriminative information for text emotion recognition [25]. Researchers have proposed a variety of emotion recognition methods based on emotional lexicons, where affective words are extracted from sentences and then used to predict emotion labels. For example, Agrawal and An [38] utilized the constructed emotion lexicon to classify emotional sentences; Wang and Pal [39] proposed a multiconstraint emotion classification model based on emotional lexicon.

Similar to that of LDL [22], label enhancement in EDL converts the emotion label $y$ of a sentence $s$ to the emotion distribution $d_s$. Given the rich information encoded in affective words, it is desirable to introduce affective words into text-based EDL label enhancement models. Following this idea, Zhang et al. [24] proposed the method of Lexicon-based emotion distribution Label Enhancement (LLE), whose main idea is to attach secondary emotions based on affective words to the ground-truth emotion. As shown in Figure 2, the ground-truth emotion label of the example is sadness, which has the highest score in the generated emotion distribution. Meanwhile, four secondary emotions of anger, fear, joy, and disgust are extracted based on the lexicon and added to the emotion distribution. Secondary emotions have a lower score than the dominating emotion.

Given a sentence $s$ and its emotion label $y$, the specific approach of LLE is as follows [24]: (1) extracting all affective words from the sentence $s$ based on affective lexicons and obtaining the corresponding set of emotion labels; (2) assigning intensity scores to the corresponding emotion labels if the sentence contains affective words of emotions other than the ground-truth emotion; otherwise, the one-hot vector is used as the emotion distribution for the sentence $s$.

Formally, the intensity score of the ground-truth emotion is calculated by

Meanwhile, the expression level of the $j$-th emotion with $j \neq y$ is computed by

where $n_j$ is the number of affective words of the $j$-th emotion in the sentence, and a weight parameter controls the share of the ground-truth emotion label. After calculating the intensity scores for all emotion labels, $\sum_{j=1}^{C} d_s^j = 1$ is guaranteed by normalization.

Given a sentence with its logical emotion label, LLE extracts affective words from text and attaches the corresponding emotional information to the logical label, which is used to generate the final emotion distribution. Compared to the label enhancement method without using affective words [23], experimental results demonstrated that the emotion distribution generated by LLE has better performance [24].

In summary, the lack of textual emotional datasets with annotated emotion distributions is a distinct obstacle to the development of EDL. To address this problem, some prior knowledge-based label enhancement methods have been recently proposed by scholars to enhance logical label to emotion distribution [23, 24]. However, most existing methods are deficient in the sense of utilizing extra prior knowledge, where the validity of prior knowledge is the key to success. MWLE calculates emotion distances by Mikels’ wheel and then adopts the Gaussian function to transform the sentence label to emotion distribution. However, MWLE is a label enhancement method without using the prior linguistic knowledge of affective words [23]. LLE builds the emotion distribution based on the information of affective words. Nevertheless, LLE does not consider the prior psychology knowledge of human emotions, which contains the information of widely observed intercorrelation among emotions [24]. The major difference between MWLE and LLE is the use of two different kinds of prior knowledge. To the best of our knowledge, the prior knowledge of psychology and linguistics has never been used together in label enhancement methods.

In order to effectively integrate both psychological knowledge and linguistic knowledge into the label enhancement model, this paper uses Plutchik’s wheel of emotions to calculate the psychological distance between emotions and proposes an emotion distribution label enhancement method based on the emotion wheel and the emotion lexicon. The details of the proposed method will be discussed in the next section.

3. The Emotion Wheel and Lexicon-Based Label Enhancement Method

3.1. Plutchik’s Wheel of Emotions

Human emotional expression is a complex phenomenon, in which strong intrinsic intercorrelations among emotions exist widely [26]. Some emotions often appear together in a face or a sentence, which indicates a high positive correlation. Other emotions rarely co-occur, which can be regarded as a negative correlation. Robert Plutchik [26] proposed the emotion wheel theory, a classic model that describes human emotional relationships from a psychological perspective. Plutchik’s wheel of emotions contains 8 basic emotions: anger, disgust, sadness, surprise, fear, trust, joy, and anticipation. As shown in Figure 3, these 8 emotions are divided into 4 pairs of opposite emotions and arranged accordingly in the emotion wheel. Two emotions in diagonal positions are opposites (negative correlation), and adjacent emotions share some similarity (positive correlation).

Since the similarity in Plutchik’s emotion wheel represents the corresponding psychological distance of emotions, we use the interval angle in the emotion wheel to measure the distance between emotions, where each 45-degree interval in the wheel is defined as 1 unit. The bigger the interval angle, the larger the emotional distance. For example, the adjacent emotions anger and anticipation are separated by 45°, so their distance is 1. The distance between joy and sadness is 4, since they are two opposite emotions separated by 180°. In our study, the distance between any two different emotions is an integer from 1 to 4, and the distance between an emotion and itself is 0.
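The interval-angle distance can be sketched in a few lines. The ordering of the list `WHEEL` below is an assumption based on the standard layout of Plutchik's wheel (the eighth emotion is listed under its standard name, anticipation), and the function name `wheel_distance` is hypothetical; the sketch reproduces the two worked examples in the text.

```python
# Assumed circular order of the 8 basic emotions on Plutchik's wheel.
# Opposite emotions sit 4 positions apart (e.g., joy and sadness).
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]


def wheel_distance(a, b):
    """Number of 45-degree intervals between two emotions (0 to 4)."""
    i, j = WHEEL.index(a), WHEEL.index(b)
    diff = abs(i - j)
    # The wheel is a loop, so take the shorter way around.
    return min(diff, len(WHEEL) - diff)


print(wheel_distance("anger", "anticipation"))  # 1 (adjacent)
print(wheel_distance("joy", "sadness"))         # 4 (opposite)
```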

In previous work of EDL, some effective models based on the prior psychological knowledge of human emotions have been proposed [13, 23, 40]. For instance, Zhou et al. [13] introduced the emotion label constraints into the optimization function of the maximum entropy based EDL model, where the constraints are calculated according to Plutchik’s emotion wheel. Mikels et al. proposed an EDL method based on Mikels’ wheel of emotions [40] and the convolutional neural network, where Mikels’ emotion wheel is another classical psychological emotion model. The main difference between two emotion wheels is that they contain different emotions. Both studies using wheels achieved excellent results.

For the task of emotion distribution label enhancement, Yang et al. [23] proposed the method of Mikels’ Wheel-based emotion distribution Label Enhancement (MWLE). The MWLE method calculates the emotion distances based on Mikels’ wheel of emotions and transfers emotion label into emotion distribution by the Gaussian function. However, the information of affective words is ignored, since no text is available in facial emotion analysis. As a result, the performance of MWLE is inferior to that of LLE [24]. Until now, we have not found existing work on emotion distribution label enhancement that considers both psychological and linguistic emotional knowledge.

3.2. The Proposed Method

Just as upgrading a low-resolution image to a high-resolution one requires additional information, label enhancement needs to introduce some external knowledge to effectively transform a single label into a distribution. In addition to the label $y$, we propose using both psychological emotional knowledge and linguistic information of affective words. Based on the ground-truth sentence label and the emotion labels of affective words, combined with the emotion correlation knowledge learned from Plutchik’s emotion wheel, we propose the Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) method. EWLLE defines the psychological emotion distance according to the corresponding interval angle in Plutchik’s wheel of emotions, then generates the discrete Gaussian distributions across all emotion categories for the ground-truth sentence label and the emotion labels of affective words, respectively, and finally integrates them into a unified emotion distribution by the superposition operation.

In particular, for the sentence $s$, EWLLE extracts all affective words $w_1, w_2, \ldots, w_n$ by looking up the emotion lexicon, where $n$ is the number of affective words in $s$; $n$ is zero when the sentence contains no affective word. Each affective word $w_i$ has at least one associated emotion label, $e_{i1}, e_{i2}, \ldots, e_{im_i}$, where $m_i$ is the number of emotion labels of $w_i$. The total number of emotion labels of the affective words extracted from sentence $s$ is $M = \sum_{i=1}^{n} m_i$.

For the enhancement from an emotion label α to an emotion distribution, we make two reasonable assumptions. Firstly, the ground-truth emotion label α should have the highest value in the generated distribution to ensure its dominating position. Secondly, the score of other emotions is reduced along with the distance to the label α, which reflects the fact that an emotion similar to the dominating emotion has a higher weight than distanced ones. Since the emotions form a loop in Plutchik’s emotion wheel, the emotion distribution based on psychological distance will be a symmetrical distribution centered on the label α with decreasing values on both sides. The distribution may look like a Gaussian distribution or triangular distribution. From the label enhancement results in the facial age estimation task, Gaussian distribution is a better choice than triangular distribution [41]. We follow this work and use Gaussian distribution.

Based on the above principles, the EWLLE method adopts the discrete Gaussian function to enhance the emotion label α to the distribution $f_\alpha$. Formally, the discrete Gaussian distribution centered on the label α is calculated as follows:

$$f_\alpha(j) = \frac{1}{\sigma\sqrt{2\pi}\,Z}\exp\left(-\frac{|j-\alpha|^2}{2\sigma^2}\right), \quad j = 1, 2, \ldots, C, \tag{3}$$

where $\sigma$ is the standard deviation of the Gaussian function, $Z$ is the normalization factor that ensures $\sum_{j=1}^{C} f_\alpha(j) = 1$, and $|j-\alpha|$ is the distance between the emotion $j$ and the ground-truth emotion α. We use the psychological distance described in Section 3.1 to calculate $|j-\alpha|$. When the standard deviation $\sigma$ is larger, more similar emotions are considered in the generated emotion distribution since the corresponding Gaussian distribution is flatter. The standard deviation $\sigma$ is set to 1 in our experiments.
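As an illustration of the discrete Gaussian enhancement described above, the following sketch turns a single emotion label into a distribution over a set of emotions using the wheel distance. The wheel ordering, the function names, and the dictionary return type are assumptions made for this example. Because the result is normalized to sum to 1, the constant factor of the Gaussian cancels and is omitted.

```python
import math

# Assumed circular order of the 8 basic emotions on Plutchik's wheel.
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]


def wheel_distance(a, b):
    """Number of 45-degree intervals between two emotions (0 to 4)."""
    diff = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(diff, len(WHEEL) - diff)


def gaussian_distribution(center, emotions, sigma=1.0):
    """Discrete Gaussian over `emotions`, centered on the label `center`."""
    raw = [math.exp(-wheel_distance(e, center) ** 2 / (2 * sigma ** 2))
           for e in emotions]
    total = sum(raw)  # plays the role of the normalization factor
    return {e: r / total for e, r in zip(emotions, raw)}


dist = gaussian_distribution("joy", WHEEL, sigma=1.0)
for emotion, score in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{emotion:12s} {score:.3f}")
```

With sigma set to 1, the center label keeps the highest score, its two neighbors on the wheel receive equal secondary scores, and the opposite emotion receives the lowest score.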

Once all emotion labels in sentence $s$ are obtained, EWLLE generates the Gaussian distributions $f_y$ and $f_{e_{ik}}$ by formula (3) for the sentence label and the affective words’ emotion labels, respectively. Then, in order to combine the two kinds of information, EWLLE interpolates the distributions $f_y$ and $f_{e_{ik}}$ to obtain the emotion distribution $d_s$ by

$$d_s = \lambda f_y + (1-\lambda)\,\frac{1}{M}\sum_{i=1}^{n}\sum_{k=1}^{m_i} f_{e_{ik}}, \tag{4}$$

where $n$ is the number of affective words in sentence $s$, $m_i$ is the number of emotion labels of the $i$-th affective word $w_i$, $M = \sum_{i=1}^{n} m_i$ is the total number of these labels, $e_{ik}$ is the $k$-th emotion label of the affective word $w_i$, $y$ is the ground-truth emotion label of sentence $s$, $f_{e_{ik}}$ and $f_y$ are the generated Gaussian distributions of the emotion labels $e_{ik}$ and the sentence label $y$, respectively, and the weight coefficient λ is used to control the proportion of $f_y$ in the emotion distribution $d_s$. The specific steps of the EWLLE algorithm are shown in Algorithm 1.

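Algorithm 1: The EWLLE algorithm.
Input: training sentence $s$ and its emotion label $y$, weighting parameter λ, emotion lexicon L
Output: emotion distribution $d_s$ of $s$
(1) Extract all affective words $w_1, \ldots, w_n$ from $s$ by looking up L
(2) for each $w_i$
(3)   Obtain all emotion labels $e_{i1}, \ldots, e_{im_i}$ of $w_i$ by looking up L
(4)   Generate discrete Gaussian distribution $f_{e_{ik}}$ for each $e_{ik}$ according to (3)
(5) end for
(6) Generate discrete Gaussian distribution $f_y$ for $y$ according to (3)
(7) return $d_s$ obtained by combining $f_y$ and all $f_{e_{ik}}$ with the weight λ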
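The steps of Algorithm 1 can be sketched end to end as follows. This is an illustrative reading of the procedure under stated assumptions, not the authors' implementation: the wheel ordering, the toy lexicon `TOY_LEXICON`, the default `lam=0.7`, averaging the affective-word distributions before mixing, and falling back to the sentence-label distribution when no affective word is found are all assumptions of this example.

```python
import math

# Assumed circular order of the 8 basic emotions on Plutchik's wheel.
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]
# The 6 emotions retained in the experiments.
EMOTIONS = ["anger", "fear", "joy", "disgust", "sadness", "surprise"]
# Hypothetical toy lexicon: affective word -> emotion labels.
TOY_LEXICON = {"cry": ["sadness"], "afraid": ["fear"],
               "betray": ["anger", "disgust", "sadness"]}


def wheel_distance(a, b):
    diff = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(diff, len(WHEEL) - diff)


def gaussian_distribution(center, emotions, sigma=1.0):
    raw = [math.exp(-wheel_distance(e, center) ** 2 / (2 * sigma ** 2))
           for e in emotions]
    total = sum(raw)
    return {e: r / total for e, r in zip(emotions, raw)}


def ewlle(tokens, label, lexicon, lam=0.7, sigma=1.0, emotions=EMOTIONS):
    """Mix the sentence-label Gaussian with the affective-word Gaussians."""
    f_y = gaussian_distribution(label, emotions, sigma)       # step (6)
    # Steps (1)-(3): collect the emotion labels of all affective words.
    word_labels = [e for tok in tokens for e in lexicon.get(tok, [])
                   if e in emotions]
    if not word_labels:
        return f_y  # assumption: no affective word, keep the label only
    # Step (4): one Gaussian per affective-word label, then average them.
    mean_words = {e: 0.0 for e in emotions}
    for word_label in word_labels:
        f_e = gaussian_distribution(word_label, emotions, sigma)
        for e in emotions:
            mean_words[e] += f_e[e] / len(word_labels)
    # Step (7): weighted superposition controlled by lam.
    return {e: lam * f_y[e] + (1 - lam) * mean_words[e] for e in emotions}


d = ewlle(["they", "betray", "me", "and", "i", "cry"], "sadness", TOY_LEXICON)
print({e: round(v, 3) for e, v in d.items()})
```

Because both components are normalized distributions, their weighted mixture also sums to 1, so no further normalization is needed in this sketch.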

The range of the parameter λ is $[0, 1]$. The emotion distribution is generated solely from the ground-truth emotion label when λ = 1, in which case EWLLE uses no affective word information. In contrast, when λ = 0, EWLLE produces the emotion distribution based only on the affective words and the lexicon, without the help of the annotation $y$. Since manual labels are generally accurate, their emotion discriminating power should be greater than that of the automatically extracted emotional information of affective words. Therefore, we expect the optimal value of the parameter λ to be greater than 0.5, reflecting the fact that the sentence label is more important than the affective words’ labels. We will investigate the effect of the parameter λ in the experiments.

The EWLLE method defines the psychological distance between emotions based on the interval angle on Plutchik’s emotion wheel, generates discrete Gaussian distributions for the ground-truth emotion label and for the emotion labels of affective words, respectively, based on this distance, and finally superposes the two kinds of distributions into a unified emotion distribution. Unlike the existing label enhancement methods, the EWLLE method integrates the psychological and linguistic knowledge of emotions, so the generated label distributions contain more information. Furthermore, the EWLLE method is not a simple combination of the existing MWLE and LLE methods. LLE counts the affective words in a sentence and assigns scores to the secondary emotions according to these counts: the more affective words of an emotion appear, the higher its score. These modeling steps cannot be simply combined with the psychological information used in the MWLE approach. The following experimental results will demonstrate that the EWLLE method can obtain better results by combining both psychological and lexical knowledge.

4. Experiments

4.1. Experimental Setup

The experiments were conducted on 4 widely used single-labeled text emotion datasets, i.e., TEC [27], Fairy Tales [28], CBET [29], and ISEAR [30]. For detailed information of all experimental datasets, we list the sentence number of each emotion, the total sentence number, and the averaged word number of each sentence in Table 1.

We preprocessed the text in a standard manner: all numbers and stop words were removed, words were converted to lowercase, and word stemming was performed. Then, the pretrained word2vec word embedding model [42] was used to represent each word as a 300-dimensional vector. Words unseen in the word2vec model were initialized randomly from a uniform distribution. Each sentence was then converted into a matrix and fed to the EDL prediction model.

The affective lexicon utilized in the EWLLE and LLE methods is a combination of two classical lexicons, i.e., NRC [43] and EmoSenticNet [44]. NRC contains 14,182 affective words and 10 emotions. EmoSenticNet includes 13,189 affective words and 6 emotions. We retained the 6 emotions shared by the two lexicons, namely, anger, fear, joy, disgust, sadness, and surprise, and removed affective words not marked by any retained emotion. The emotion labels of an affective word were set as the union of the corresponding original labels. In total, we obtained 15,603 affective words, each with 1.31 emotion labels on average.
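The lexicon combination described above (keep the shared emotions, drop words without any retained emotion, take the union of labels per word) can be sketched as follows. The function name `merge_lexicons` and the two toy dictionaries are hypothetical stand-ins; they do not reproduce the file formats or contents of the actual NRC or EmoSenticNet resources.

```python
# The 6 emotions shared by both lexicons, as stated in the text.
RETAINED = {"anger", "fear", "joy", "disgust", "sadness", "surprise"}


def merge_lexicons(*lexicons):
    """Union the retained emotion labels of each word across lexicons."""
    merged = {}
    for lexicon in lexicons:
        for word, labels in lexicon.items():
            kept = set(labels) & RETAINED
            if kept:  # skip words with no retained emotion
                merged.setdefault(word, set()).update(kept)
    return merged


# Hypothetical toy entries standing in for the two lexicons.
lexicon_a = {"abandon": {"fear", "sadness", "negative"}, "ally": {"trust"}}
lexicon_b = {"abandon": {"sadness", "anger"}, "gift": {"joy", "surprise"}}

merged = merge_lexicons(lexicon_a, lexicon_b)
average_labels = sum(len(v) for v in merged.values()) / len(merged)
print(sorted(merged))     # ['abandon', 'gift']
print(average_labels)     # 2.5
```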

The state-of-the-art EDL model based on a multitask convolutional neural network (CNN) was used as the prediction model [24]. The emotion with the highest score in the output emotion distribution was regarded as the predicted emotion. For the CNN settings, we used filter windows of 3, 4, and 5 with 100 feature maps each, a dropout rate of 0.5, a mini-batch size of 50, and the SGD algorithm as the optimizer, following the same settings as Zhang et al. [24].
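The prediction rule stated above, taking the emotion with the highest score, can be sketched as below. The ordering of the emotion list and the function names are assumptions for illustration; ties are resolved here in favor of the first emotion in the list, which the text does not specify.

```python
# Assumed label order; the paper does not fix a particular ordering.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]


def predict_emotion(distribution, emotions=EMOTIONS):
    """Return the emotion whose score is highest in the predicted distribution."""
    best = max(range(len(distribution)), key=distribution.__getitem__)
    return emotions[best]


def accuracy(distributions, gold_labels):
    """Fraction of sentences whose top-scoring emotion equals the gold label."""
    correct = sum(predict_emotion(d) == g for d, g in zip(distributions, gold_labels))
    return correct / len(gold_labels)


predicted = predict_emotion([0.10, 0.05, 0.05, 0.60, 0.10, 0.10])
```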

The standard stratified 10-fold cross-validation procedure was applied. We divided each dataset into ten subsets of equal size while preserving the category proportions. Each subset was used in turn as the test set, and the remaining subsets were used as the training set. To make the experimental results comparable, all compared models used the same data division. The final performance was reported as the mean emotion classification accuracy and the corresponding standard deviation over the 10 folds.
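A minimal sketch of the stratified splitting and the reporting of fold results is given below. It is a plain-Python illustration under stated assumptions rather than the code used in the experiments: the function names are hypothetical, the random seed is arbitrary, and the sample standard deviation is assumed because the text does not state which variant was used.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds, roughly preserving per-class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]


def summarize(fold_accuracies):
    """Mean and sample standard deviation over the folds."""
    return mean(fold_accuracies), stdev(fold_accuracies)


toy_labels = ["joy"] * 30 + ["fear"] * 20
splits = list(train_test_splits(toy_labels, k=10))
```

Fixing the seed and reusing the same splits for every compared method corresponds to the requirement that all models share the same data division.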

The code was implemented in Python with the PyTorch 1.3.1 machine learning framework and run on a Lenovo PC with an Intel(R) Core(TM) i7-6700 3.40 GHz CPU and 32 GB RAM.

To evaluate the performance of our proposed method, we conducted the following two sets of experiments: (1) analyzing the effect of the weight coefficient λ on the EWLLE method; (2) comparing the classification accuracy of EWLLE to some state-of-the-art EDL label enhancement methods.

3.2. Effects of the Parameter λ

As described in Section 2, the emotion distribution generated by EWLLE is a combination of the Gaussian distributions of the sentence emotion label and of the affective words’ emotion labels. The parameter λ controls the relative proportion of these two kinds of information. To investigate the effects of λ, we varied its value from 0 to 1 with a step of 0.1 and recorded the corresponding classification accuracy of the CNN-based EDL model. Figure 4 shows the results on all experimental datasets.

As we can see from Figure 4, although the absolute classification accuracy differs considerably across datasets, the accuracy curves show similar trends on all of them. When the value of λ increases from 0 to 0.7, the accuracies are consistently improved, which indicates that it is beneficial to incorporate the sentence emotion label at this stage. However, when λ is beyond a certain point, the scores generally drop, which illustrates that relying too much on the information from sentence labels, to the detriment of affective words, is harmful. The value of 0.8 is the optimal value of λ on all 4 datasets, which is the point where the information from the sentence emotion label and that from the affective words reach a balance. Furthermore, the fact that the optimal value of λ is much greater than 0.5 verifies our previous conjecture that sentence emotion labels are more important than affective words.

Furthermore, we find that the optimal accuracy score with λ = 0.8 is significantly higher than that of λ = 0 (only considering the information from affective words) or that of λ = 1 (only including the information of sentence emotion label) on all 4 datasets. This demonstrates that it is essential to consider both sentence emotion labels and affective words’ emotion labels in the label enhancement process.
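To make the role of λ concrete, the following sketch mixes a sentence-label distribution with an affective-word distribution. It is an illustration under explicit assumptions, not the exact EWLLE formulation: the distance is taken as the number of steps between emotions on Plutchik’s eight-emotion wheel, σ = 1 is an arbitrary choice, the word distribution is a simple average over the word labels, and a sentence without affective words falls back to the sentence distribution alone. All function names are hypothetical.

```python
import math

# Order of the eight basic emotions around Plutchik's wheel.
WHEEL = ["joy", "trust", "fear", "surprise", "sadness", "disgust", "anger", "anticipation"]
# The six emotions retained in the experiments (order assumed).
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]


def wheel_distance(a, b):
    """Assumed distance: fewest steps between two emotions around the wheel."""
    steps = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(steps, len(WHEEL) - steps)


def gaussian_distribution(label, sigma=1.0):
    """Discrete Gaussian centred on `label`, normalised over EMOTIONS."""
    weights = [math.exp(-wheel_distance(label, e) ** 2 / (2 * sigma ** 2)) for e in EMOTIONS]
    total = sum(weights)
    return [w / total for w in weights]


def word_distribution(word_labels, sigma=1.0):
    """Average of the Gaussians of all affective-word labels (assumed aggregation)."""
    if not word_labels:
        return None
    rows = [gaussian_distribution(label, sigma) for label in word_labels]
    return [sum(column) / len(rows) for column in zip(*rows)]


def combine(sentence_label, word_labels, lam=0.8, sigma=1.0):
    """lam = 1 uses only the sentence label; lam = 0 uses only the affective words."""
    sentence = gaussian_distribution(sentence_label, sigma)
    words = word_distribution(word_labels, sigma)
    if words is None:  # assumed fallback when the sentence has no affective words
        return sentence
    return [lam * s + (1 - lam) * w for s, w in zip(sentence, words)]


mixed = combine("joy", ["sadness", "fear"], lam=0.8)
```

Because both components are normalised and the weights sum to 1, the mixed result remains a valid distribution for any λ in [0, 1].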

3.3. Comparative Results of Different Label Enhancement Methods

We compared the proposed EWLLE method with several state-of-the-art label enhancement methods, namely One-hot, MWLE, and LLE. The comparative experiment worked in a pipeline way. Firstly, the traditional single-label datasets were converted into emotion distribution labeled ones by label enhancement. Then, the CNN model was trained on the enhanced datasets to predict the emotion distribution, in which the emotion with the highest score was selected as the final prediction. At last, the emotion classification accuracy was recorded to represent the performance of the corresponding method.

(i) One-hot: Representing the sentence emotion label directly by the one-hot vector, where the vector component of the ground-truth label is 1 and all other components are 0. The length of the one-hot vector is the number of all possible emotion labels.

(ii) MWLE: Yang et al. [23] proposed an emotion distribution Label Enhancement method based on Mikels’ Wheel. MWLE calculates emotion distances by Mikels’ wheel and then adopts the Gaussian function to transform the sentence label to an emotion distribution. Two versions of MWLE are proposed by Yang et al. [23], and we use the better version (constraint 1) in our experiments. Note that Mikels’ emotion wheel in MWLE was replaced by Plutchik’s emotion wheel, because its emotions better match the emotion labels in the experimental datasets.

(iii) LLE: Zhang et al. [24] designed the Lexicon-based emotion distribution Label Enhancement method. In the emotion distribution generated by LLE, the ground-truth label is assigned a certain score, and the remaining score is allocated across all other emotion labels by counting the corresponding affective words. LLE does not include any psychological emotional knowledge.

(iv) EWLLE: Our proposed Emotion Wheel and Lexicon-based emotion distribution Label Enhancement method. EWLLE considers both the psychological and linguistic information. The weight coefficient λ was set to 0.8.
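As a rough illustration of the two simplest baselines above, the sketch below builds a one-hot vector and an LLE-style distribution. The LLE-style function only follows the verbal description given here: the ground-truth score of 0.7 is a hypothetical value, the proportional allocation by counts and the one-hot fallback for sentences without affective words are assumptions, and the details in Zhang et al. [24] may differ.

```python
# Assumed label order for the six retained emotions.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]


def one_hot(label):
    """1 for the ground-truth emotion, 0 for every other emotion."""
    return [1.0 if e == label else 0.0 for e in EMOTIONS]


def lle_like(label, word_labels, truth_score=0.7):
    """Give `truth_score` to the ground truth; split the rest by affective-word counts.

    `word_labels` is the flat list of emotion labels of the affective words
    found in the sentence. `truth_score` is a hypothetical value.
    """
    counts = {e: 0 for e in EMOTIONS if e != label}
    for word_label in word_labels:
        if word_label in counts:
            counts[word_label] += 1
    total = sum(counts.values())
    if total == 0:  # assumed fallback: no secondary evidence in the sentence
        return one_hot(label)
    remainder = 1.0 - truth_score
    return [truth_score if e == label else remainder * counts[e] / total for e in EMOTIONS]


baseline = one_hot("joy")
enhanced = lle_like("joy", ["sadness", "sadness", "fear", "joy"])
```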

The detailed comparative results of the 4 label enhancement methods are shown in Table 2, where the mean classification accuracy ± the standard deviation over the ten-fold cross-validation is reported. The last row of Table 2 lists the score of each method averaged over the 4 datasets. The best score in each row is highlighted in bold.

The results in Table 2 show clearly that EWLLE outperforms the other label enhancement methods on all four datasets. Regarding the accuracy averaged over all datasets, EWLLE achieves a score of 0.663, which is 0.017 higher than LLE, 0.027 higher than MWLE, and 0.043 higher than One-hot. Compared to the second-best method, LLE, the performance of EWLLE is significantly improved, which indicates that introducing the psychological emotional knowledge into label enhancement is necessary.

In addition, the performance of LLE is superior to that of MWLE, which is consistent with the experimental results of Zhang et al. [24]. The performance difference between LLE and MWLE illustrates the discriminative power of affective words, which is beneficial to text-based EDL. It is not enough to enhance a single label to an emotion distribution solely based on the psychological emotional knowledge, as done in MWLE. Since it includes neither the prior emotional knowledge nor the information of affective words, the One-hot method has, as expected, the worst performance in the experiment.

4. Conclusions

In the field of Emotion Distribution Learning (EDL), label enhancement is an important way to address the shortage of datasets annotated with emotion distributions. In this paper, we proposed a method of Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) to effectively enhance the sentence emotion labels in single-labeled datasets to emotion distributions. Unlike existing methods, EWLLE adopts both the psychological emotional knowledge and the linguistic information of affective words. Based on Plutchik’s wheel of emotions, EWLLE generates discrete Gaussian distributions for sentence emotion labels and emotion labels of affective words, respectively, and then superposes them into a unified emotion distribution. Extensive experimental results showed that EWLLE performs favorably against the state-of-the-art label enhancement methods.

In future work, we will introduce more prior affective knowledge into the EDL label enhancement method and explore different affective modeling methods to make use of prior knowledge more effectively. In addition, the recognition of negative and other sophisticated affective words in the label enhancement method is a problem we will study in the future.

Data Availability

The raw data required to reproduce these findings are available in the cited references in Section 3.1 of the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported in part by the Natural Science Foundation of China under Grant nos. 61866017, 61866018, and 61966019, in part by the Support Program for Outstanding Youth Talents in Jiangxi Province no. 20171BCB23013, and in part by the Natural Science Foundation of Jiangxi Province under Grant no. 20192BAB207027.