Emotion Label Enhancement via Emotion Wheel and Lexicon
Emotion Distribution Learning (EDL) is a recently proposed multiemotion analysis paradigm that identifies the basic emotions expressed in a sentence together with their degrees of expression. Unlike traditional methods, EDL quantitatively models the expression degree of each emotion on a given instance as an emotion distribution. However, emotion labels are crisp in most existing emotion datasets. To utilize traditional emotion datasets in EDL, label enhancement converts logical emotion labels into emotion distributions. This paper proposes a novel label enhancement method, called Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE), which utilizes the linguistic emotional information of affective words and the psychological knowledge of Plutchik’s emotion wheel. Based on the psychological emotion distance, EWLLE generates separate discrete Gaussian distributions for the emotion label of the sentence and for the emotion labels of its affective words, and then combines the two types of information into a unified emotion distribution by superposing the distributions. Extensive experiments on 4 commonly used text emotion datasets show that the proposed EWLLE method has a distinct advantage over existing EDL label enhancement methods in the emotion classification task.
Text emotion classification (recognition) is an important research topic with many promising applications, such as emotional human-computer interaction, intelligent customer service, music emotion classification, anticipating corporate financial performance, and online product review analysis. The goal of text emotion recognition is to identify the writer’s emotional state expressed in a sentence. In recent years, researchers have produced a large body of effective work in the field of text emotion classification [2–6].
In general, the emotion expressed in a sentence is a mixture of a variety of basic emotions (e.g., anger, fear, joy, or sadness), where each basic emotion contributes to the overall expression to a certain degree. Traditionally, text emotion recognition models are based on Single-Label Learning (SLL) or Multi-Label Learning (MLL). In SLL, one sentence is assumed to be associated with only one emotion label. To cope with the situation where one sentence simultaneously evokes several different emotions, MLL assigns multiple emotion labels to a sentence. However, the modeling ability of MLL is insufficient to quantitatively analyze the ambiguity among multiple emotions. Given a sentence, MLL identifies the prominent emotion labels, but it cannot tell the specific expression intensity of each emotion.
To address this problem, by drawing on the idea of Label Distribution Learning (LDL) [11,12], Zhou et al.  proposed Emotion Distribution Learning (EDL) for facial emotion recognition. In the next year, EDL was applied to text emotion classification . EDL deals with multiple emotions by associating each instance (e.g., facial image or sentence) with an emotion distribution vector, where the vector’s dimension is the number of all possible emotions. In the emotion distribution, each component represents the intensity of the corresponding emotion on the given instance. Obviously, EDL is suitable to solve the quantitative multiemotion analysis problem, especially when emotion ambiguities occur . In recent years, many effective EDL methods have been proposed. For example, Zhou et al.  designed an EDL model by introducing the constraint of interrelationships between emotions. Jia et al.  proposed an EDL method that considers the local correlation of emotion labels. Zhao et al.  put forward an EDL model based on meta-learning with small samples. He et al.  designed an EDL method based on graph convolutional neural networks, where the correlation between emotions is considered. Xiong et al.  proposed an EDL model based on convolutional neural networks that utilizes the polarity and the sparsity of emotion labels. Fan et al.  designed an EDL method to predict image emotion distribution by learning labels’ correlation. Xi et al.  proposed emotion distribution learning based on surface electromyography for predicting the intensities of basic emotions. Liang et al.  proposed a novel semisupervised multimodal emotion recognition model based on cross-modality distribution matching. All these EDL methods exhibited better performance than traditional models.
Most existing EDL works focus on improving emotion recognition accuracy by proposing novel prediction models. Few methods have been proposed to determine the emotion distribution from existing annotated datasets, which contain only single-labeled emotions. A radical solution to the problem would be to create datasets in which emotion distributions (or quantitative multilabeled instances) are annotated instead of single emotion labels, but this is difficult to do in practice. A convenient way to obtain EDL datasets is to convert the existing quantitative multilabeled datasets by label score normalization (normalizing the label scores of each instance so that they sum to 1). However, quantitative multilabeled text emotion datasets are scarce, except for a few special ones such as SemEval 2007 Task 14. Most of the existing text emotion datasets are single-labeled. To utilize the traditional emotion datasets in EDL, label enhancement aims to transform single-labeled emotions into emotion distributions, an idea similar to that of label enhancement in LDL.
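As a minimal sketch of the label score normalization mentioned above, the following Python snippet rescales per-emotion scores so that they sum to 1. The function name `normalize_scores` and the example scores are illustrative assumptions and are not taken from any dataset.

```python
def normalize_scores(scores):
    """Rescale non-negative per-emotion scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {emotion: value / total for emotion, value in scores.items()}


# Hypothetical multiemotion scores for one sentence (illustrative only).
raw = {"anger": 10, "disgust": 0, "fear": 5, "joy": 25, "sadness": 0, "surprise": 60}
dist = normalize_scores(raw)
```

The resulting dictionary can be used directly as an emotion distribution, since every component is non-negative and the components sum to 1.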
Label enhancement needs to leverage some extra knowledge to convert a single label to a distribution, in which the effectiveness of knowledge is essential. An up-to-date introduction of label enhancement in label distribution learning can be found in Xu et al. . Up to now, only several EDL label enhancement methods have been proposed [23, 24]. Yang et al.  proposed Mikels’ Wheel-based emotion distribution Label Enhancement (MWLE) method, which utilizes psychological emotional knowledge to transform emotion labels into distributions. However, the MWLE method is proposed for facial emotion classification without considering the affective words that are effective in text emotion analysis. The affective words have different intensity and emotional tendency, which are generally annotated based on linguistic knowledge. Emotional words contain a lot of emotional information, which is discriminative for text emotion recognition . Zhang et al.  proposed the Lexicon-based emotion distribution Label Enhancement (LLE) method, which generates emotion distributions from a single-label by introducing the linguistic information of affective words. The experimental results show that the performance of the LLE method is better than that of the MWLE method , but the disadvantage of the LLE method is that the psychological correlation between human emotions is not considered. Psychological emotional knowledge can be used to effectively obtain the intrinsic correlations among emotions , while affective words contain effective discriminating information about emotions . Both of them are essential to efficient emotional classification. However, to the best of our knowledge, no existing label enhancement work has considered both the psychological and linguistic knowledge.
In this paper, we present a novel Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) method, which calculates the psychological distances between emotions according to Plutchik’s emotion wheel and utilizes the linguistic information of affective words from some classical lexicons. Plutchik’s wheel of emotions is a well-known psychological model proposed by Robert Plutchik in 1980 to describe human emotional relationships . We exploit Plutchik’s wheel of emotions to determine similarities between different emotions through Gaussian distribution. For a given sentence, based on psychological emotion distances, EWLLE generates discrete Gaussian distribution of sentence emotion labels and the affective words’ emotion labels and then superposes them into a unified emotion distribution. Different from existing EDL label enhancement methods, EWLLE takes into consideration the psychological and linguistic emotional knowledge at the same time during the label enhancement procedure. Extensive experiments on 4 public text emotion datasets, TEC , Fairy Tales , CBET , and ISEAR , demonstrate that the proposed EWLLE method performs favorably against the state-of-the-art approaches in the task of text emotion recognition.
The rest of the paper is organized as follows. Section 2 introduces some related works of emotion label enhancement. Section 3 describes the proposed method of label enhancement based on the emotion wheel and lexicon in detail. Section 4 provides a series of comparative experiments to verify the effectiveness of the proposed method. Finally, Section 5 concludes the paper.
2. Related Works
2.1. Emotion Distribution Learning
In general, the emotional expression in a text sentence or facial image is usually a combination of multiple basic emotions. All related basic emotions play a certain role in the overall expression and together constitute an emotion distribution . Figure 1 shows two representative sentences and the corresponding annotations from the SemEval dataset . For the sentence (a), the dominating emotion, surprise, accounts for 41.9% expression level; meanwhile, the other two major emotions, joy and anger, present 20.1% and 16.9%, respectively. The situation of the sentence (b) is analogous. The examples show that a single sentence may possibly contain multiple emotions with different intensities rather than a single label.
As we stated earlier, most traditional emotion recognition methods are based on Single-Label Learning (SLL) or Multi-Label Learning (MLL), which address the problem of “which emotions are used to describe the sample,” rather than the problem of “how to describe the sample’s emotions quantitatively.” In fact, neither SLL nor MLL can obtain the specific intensity of each basic emotion. If we simply annotate sentences with the major emotions whose expression levels are higher than a given threshold, the multiemotion recognition task can be solved by MLL, but such a simplistic approach discards the information about the emotion expression level, which makes it impossible to quantitatively analyze the related emotions in subsequent affective computing tasks.
In light of the success of the novel machine learning paradigm of Label Distribution Learning (LDL) , Emotion Distribution Learning (EDL) was proposed to solve the facial and text emotion classification tasks [7, 13]. Different from SLL and MLL, EDL assigns each instance with an emotion distribution, where each distribution component represents the intensity of the corresponding emotion to the given instance.
In the traditional single-label text emotion recognition task, each sentence $s$ has a corresponding emotion label $\alpha \in \{1, 2, \ldots, C\}$, where C is the number of all possible emotions. Rather than just predicting the emotion label $\alpha$, the goal of EDL is to find a function that maps the sentence $s$ to an emotion distribution $d_s = (d_s^1, d_s^2, \ldots, d_s^C)$, where $d_s^j$ represents the intensity of the $j$-th emotion in the sentence $s$, and $\sum_{j=1}^{C} d_s^j = 1$. Note that, unlike a probability distribution, which assumes that only one label is correct at a time, the emotion distribution allows an instance to have multiple emotion labels simultaneously. Any emotion with an intensity higher than 0 is a possible label for the instance, while the expression level of each emotion label can vary.
EDL predicts the intensity values of sentences across a set of emotion categories by means of the emotion distribution. Such information is important for understanding fine-grained emotions, especially when ambiguities exist. In recent years, many effective EDL methods have been proposed [13–20]. However, one of the major obstacles to the development of EDL models is the lack of datasets annotated with emotion distributions, owing to the difficulty of annotating them. Among the existing text emotion datasets, multiemotion score annotations (e.g., SemEval 2007 Task 14), which can be converted into emotion distributions by label score normalization, are rare. In order to utilize the abundant traditional single-labeled text emotion datasets, label enhancement methods are required.
2.2. Label Enhancement
In label distribution learning, label enhancement refers to the process of recovering the label distribution from a single-label or multilabel dataset. If we regard the original ground-truth label as having a score of 1, the basic idea of most label enhancement methods is to reduce the score of the ground-truth label and increase the scores of some related labels in the generated distribution, where the score of the ground-truth label generally remains the highest. The label score adjustment strategies vary among different label enhancement methods. In practice, both the correlation among the labels and the topological information of the feature space are usually utilized to recover the label distribution from the logical label. Since the concept of label distribution learning was introduced, several effective label enhancement methods have been proposed in the literature. The existing label enhancement algorithms are roughly divided into three types, i.e., fuzzy theory-based methods [31, 32], graph model-based methods [33, 34], and prior knowledge-based methods [35, 36]. We give a brief introduction to these 3 kinds of label enhancement methods as follows.
The fuzzy theory-based label enhancement method digs out the correlation among the labels by introducing the fuzziness into the originally rigid logical labels and transforms logical label into label distribution. For example, the label enhancement algorithm based on fuzzy clustering uses the membership degree of the examples generated in the fuzzy clustering process to each cluster and converts the membership degree of the example to the cluster into the membership degree of the category, thereby generating the label distribution . The label enhancement algorithm based on the fuzzy kernel membership degree uses the kernel technique to calculate the fuzzy membership degree of the example to each category in the high-dimensional space, so as to mine the correlation between the categories in the training set .
The graph model-based label enhancement algorithm uses graph models to represent the topological structure between examples, establishes the relationship between examples and labels through some model assumptions, and then enhances logical label to label distribution. For instance, the label enhancement algorithm based on label propagation expresses the topology structure between examples through a graph model and uses the difference of path weights in the propagation process to make descriptive differences, so as to mine the relationship between the labels in the training set . The manifold-based method reconstructs the manifolds of the feature space and the label space and uses the smoothing assumption to migrate the topological relationship of the feature space to the label space, thereby enhancing logical label to label distribution .
The prior knowledge-based label enhancement algorithm introduces some kind of prior knowledge, mines the implicit correlation between labels according to the characteristics of the dataset, and enhances logical label into label distribution. Obviously, the validity of prior knowledge is the key to the success of this kind of method. By introducing the prior knowledge of correlation among different head poses, Geng et al.  built label distribution from logical label and its neighboring head poses. In the application of facial age estimation, the lack of facial images with definite age labels makes traditional age prediction algorithms inefficient. Based on the prior knowledge of the facial similarity between adjacent human ages, Geng et al.  recovered label distribution from ground-truth age and its adjacent ages and proposed an adaptive label distribution learning model to learn the human age ambiguity. Furthermore, Zhang et al.  presented a developed prior assumption of facial age correlation, which limits age label distribution that only covers a reasonable number of neighboring ages. Based on the developed prior knowledge, Zhang et al.  proposed a practical label distribution paradigm and outperformed current state-of-the-art facial age recognition methods.
2.3. Lexicon-Based Label Enhancement
In addition to the annotations (emotion labels or emotion distributions), text-based EDL can utilize the extra prior information of the affective words contained in sentences, which is unavailable to classical LDL. Affective words are words with different intensities of emotional tendency, which are associated with certain emotion labels. The affective lexicon is a dictionary of affective words, which is one of the most important linguistic resources in affective computing for text. One sentence may include multiple affective words, and one affective word can be associated with several emotion labels. Many existing studies have shown that affective words contain abundant discriminative information for text emotion recognition. Researchers have proposed a variety of emotion recognition methods based on emotional lexicons, where affective words are extracted from sentences and then used to predict emotion labels. For example, Agrawal and An utilized a constructed emotion lexicon to classify emotional sentences; Wang and Pal proposed a multiconstraint emotion classification model based on an emotional lexicon.
Similar to label enhancement in LDL, label enhancement in EDL converts the emotion label of a sentence to an emotion distribution. Given the rich information encoded in affective words, it is desirable to introduce affective words into text-based EDL label enhancement models. Following this idea, Zhang et al. proposed the method of Lexicon-based emotion distribution Label Enhancement (LLE), whose main idea is to attach secondary emotions based on affective words to the ground-truth emotion. As shown in Figure 2, the ground-truth emotion label of the example is sadness, which has the highest score in the generated emotion distribution. Meanwhile, four secondary emotions (anger, fear, joy, and disgust) are extracted based on the lexicon and added to the emotion distribution. Secondary emotions have a lower score than the dominating emotion.
Given a sentence and its emotion label, the specific approach of LLE is as follows: (1) extract all affective words from the sentence based on affective lexicons and obtain the corresponding emotion label set; (2) assign intensity scores to the corresponding emotion labels if there are affective words other than those with the ground-truth emotion; otherwise, use the one-hot vector as the emotion distribution of the sentence.
LLE computes an intensity score for the ground-truth emotion and an expression level for every other emotion. These scores are defined in terms of the number of affective words of each emotion in the sentence and a weight parameter for the ground-truth emotion label. After the intensity scores of all emotion labels are calculated, normalization guarantees that they sum to 1.
Given a sentence with its logical emotion label, LLE extracts affective words from the text and attaches the corresponding emotional information to the logical label to generate the final emotion distribution. Experimental results demonstrated that the emotion distribution generated by LLE performs better than that of the label enhancement method that does not use affective words.
In summary, the lack of textual emotional datasets with annotated emotion distributions is a distinct obstacle to the development of EDL. To address this problem, some prior knowledge-based label enhancement methods have been recently proposed by scholars to enhance logical label to emotion distribution [23, 24]. However, most existing methods are deficient in the sense of utilizing extra prior knowledge, where the validity of prior knowledge is the key to success. MWLE calculates emotion distances by Mikels’ wheel and then adopts the Gaussian function to transform the sentence label to emotion distribution. However, MWLE is a label enhancement method without using the prior linguistic knowledge of affective words . LLE builds the emotion distribution based on the information of affective words. Nevertheless, LLE does not consider the prior psychology knowledge of human emotions, which contains the information of widely observed intercorrelation among emotions . The major difference between MWLE and LLE is the use of two different kinds of prior knowledge. To the best of our knowledge, the prior knowledge of psychology and linguistics has never been used together in label enhancement methods.
In order to effectively integrate both psychological knowledge and linguistic knowledge into the label enhancement model, this paper uses Plutchik’s wheel of emotions to calculate the psychological distance between emotions and proposes an emotion distribution label enhancement method based on the emotion wheel and the emotion lexicon. The details of the proposed method will be discussed in the next section.
3. The Emotion Wheel and Lexicon-Based Label Enhancement Method
3.1. Plutchik’s Wheel of Emotions
Human emotional expression is a complex phenomenon in which strong intrinsic intercorrelations among emotions exist widely. Some emotions often appear simultaneously in a face or a sentence, which indicates a high positive correlation. Meanwhile, some other emotions rarely co-occur, which can be regarded as a negative correlation. Robert Plutchik proposed the emotion wheel theory, a classic model that describes human emotional relationships from a psychological perspective. Plutchik’s wheel of emotions contains 8 basic emotions: anger, disgust, sadness, surprise, fear, trust, joy, and anticipation. As shown in Figure 3, these 8 emotions are divided into 4 pairs of opposite emotions and arranged accordingly in the emotion wheel. Two emotions in diagonal positions are opposites (negative correlation), and adjacent emotions have some degree of similarity (positive correlation).
Since the similarity in Plutchik’s emotion wheel reflects the psychological distance between emotions, we use the interval angle in the emotion wheel to measure the distance between emotions, where each 45-degree interval in the wheel is defined as 1 unit. The bigger the interval angle, the larger the emotional distance. For example, the adjacent emotions of anger and anticipation are separated by 45°, so their distance is 1. The distance between joy and sadness is 4, since they are two opposite emotions separated by 180°. In our study, the distance between any two different emotions is an integer from 1 to 4, and the distance between identical emotions is 0.
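The distance definition above can be sketched in a few lines of Python. This is only an illustration: the list `WHEEL` encodes the standard ordering of Plutchik’s eight basic emotions around the wheel (using the name "anticipation", sometimes rendered as "expect"), and the function name `wheel_distance` is a hypothetical helper rather than part of any published implementation.

```python
# Plutchik's eight basic emotions in their order around the wheel.
# Neighbouring entries are 45 degrees apart; entries 4 positions apart are opposites.
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]


def wheel_distance(a, b):
    """Number of 45-degree steps along the shorter arc between two emotions."""
    steps = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(steps, len(WHEEL) - steps)


adjacent = wheel_distance("anger", "anticipation")  # neighbours on the wheel
opposite = wheel_distance("joy", "sadness")         # diagonal positions
```

Taking the shorter arc is what keeps every distance between 0 and 4, matching the range stated in the text.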
In previous work of EDL, some effective models based on the prior psychological knowledge of human emotions have been proposed [13, 23, 40]. For instance, Zhou et al.  introduced the emotion label constraints into the optimization function of the maximum entropy based EDL model, where the constraints are calculated according to Plutchik’s emotion wheel. Mikels et al. proposed an EDL method based on Mikels’ wheel of emotions  and the convolutional neural network, where Mikels’ emotion wheel is another classical psychological emotion model. The main difference between two emotion wheels is that they contain different emotions. Both studies using wheels achieved excellent results.
For the task of emotion distribution label enhancement, Yang et al.  proposed the method of Mikels’ Wheel-based emotion distribution Label Enhancement (MWLE). The MWLE method calculates the emotion distances based on Mikels’ wheel of emotions and transfers emotion label into emotion distribution by the Gaussian function. However, the information of affective words is ignored, since no text is available in facial emotion analysis. As a result, the performance of MWLE is inferior to that of LLE . Until now, we have not found existing work on emotion distribution label enhancement that considers both psychological and linguistic emotional knowledge.
3.2. The Proposed Method
Just as upgrading a low-resolution image to a high-resolution one requires additional information, label enhancement needs to introduce some external knowledge to effectively transform a single label into a distribution. In addition to the sentence label, we propose using both psychological emotional knowledge and the linguistic information of affective words. Based on the ground-truth sentence label and the emotion labels of affective words, combined with the emotion correlation knowledge learned from Plutchik’s emotion wheel, we propose the Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) method. EWLLE defines the psychological emotion distance according to the corresponding interval angle in Plutchik’s wheel of emotions, then generates discrete Gaussian distributions across all emotion categories for the ground-truth sentence label and for the emotion labels of affective words, respectively, and finally integrates them into a unified emotion distribution by the superposition operation.
In particular, for the sentence $s$, EWLLE extracts all affective words $w_1, w_2, \ldots, w_n$ by looking up the emotion lexicon, where n is the number of affective words in $s$. The value of n is zero when the sentence contains no affective word. Meanwhile, each affective word $w_i$ has several (at least 1) associated emotion labels $e_{i1}, e_{i2}, \ldots, e_{im_i}$, where $m_i$ is the number of emotion labels of $w_i$. The total number of emotion labels of the affective words extracted from the sentence $s$ is $\sum_{i=1}^{n} m_i$.
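The lexicon lookup described above might look as follows in Python. This is a hedged sketch: the toy `LEXICON` entries, the example sentence, and the helper name `extract_affective_labels` are all hypothetical, and a real system would tokenize and stem the sentence before the lookup.

```python
# Toy affective lexicon: word -> set of emotion labels (hypothetical entries).
LEXICON = {
    "lost": {"sadness"},
    "scream": {"fear", "anger"},
    "gift": {"joy", "surprise"},
}


def extract_affective_labels(tokens, lexicon):
    """Return (word, sorted emotion labels) for every token found in the lexicon."""
    return [(tok, sorted(lexicon[tok])) for tok in tokens if tok in lexicon]


tokens = "she lost the gift and began to scream".split()
found = extract_affective_labels(tokens, LEXICON)

n = len(found)                                          # number of affective words
total_labels = sum(len(labels) for _, labels in found)  # total number of emotion labels
```

When a sentence contains no lexicon entry, `found` is empty and both counts are zero, which corresponds to the case of no affective word mentioned in the text.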
For the enhancement from an emotion label α to an emotion distribution, we make two reasonable assumptions. Firstly, the ground-truth emotion label α should have the highest value in the generated distribution to ensure its dominating position. Secondly, the score of other emotions is reduced along with the distance to the label α, which reflects the fact that an emotion similar to the dominating emotion has a higher weight than distanced ones. Since the emotions form a loop in Plutchik’s emotion wheel, the emotion distribution based on psychological distance will be a symmetrical distribution centered on the label α with decreasing values on both sides. The distribution may look like a Gaussian distribution or triangular distribution. From the label enhancement results in the facial age estimation task, Gaussian distribution is a better choice than triangular distribution . We follow this work and use Gaussian distribution.
Based on the above principles, the EWLLE method adopts the discrete Gaussian function to enhance the emotion label $\alpha$ to the distribution $f_\alpha$. Formally, the discrete Gaussian distribution centered on the label $\alpha$ is calculated as follows:

$$f_\alpha(j) = \frac{1}{\sigma\sqrt{2\pi}\,Z} \exp\left(-\frac{|j-\alpha|^2}{2\sigma^2}\right), \quad j = 1, 2, \ldots, C, \tag{3}$$

where $\sigma$ is the standard deviation of the Gaussian function, $Z$ is the normalization factor that ensures $\sum_{j=1}^{C} f_\alpha(j) = 1$, and $|j-\alpha|$ is the distance between the emotion $j$ and the ground-truth emotion α. We use the psychological distance described in Section 3.1 to calculate $|j-\alpha|$. When the standard deviation $\sigma$ is larger, more similar emotions are considered in the generated emotion distribution, since the corresponding Gaussian distribution is flatter. The standard deviation $\sigma$ is set to 1 in our experiments.
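To make the discrete Gaussian enhancement concrete, here is a minimal Python sketch. It assumes the circular wheel distance of Section 3.1 and normalizes the Gaussian weights by their sum, so any constant prefactor of the Gaussian cancels. The names `WHEEL`, `wheel_distance`, `gaussian_distribution`, and `EMOTIONS` are illustrative; `EMOTIONS` lists the six emotions retained in the experimental setup.

```python
import math

# Plutchik's eight basic emotions in their order around the wheel.
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]


def wheel_distance(a, b):
    """Number of 45-degree steps along the shorter arc between two emotions."""
    steps = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(steps, len(WHEEL) - steps)


def gaussian_distribution(center, emotions, sigma=1.0):
    """Discrete Gaussian over `emotions`, centered on the label `center`."""
    weights = [math.exp(-wheel_distance(e, center) ** 2 / (2 * sigma ** 2))
               for e in emotions]
    z = sum(weights)  # normalization so that the scores sum to 1
    return {e: w / z for e, w in zip(emotions, weights)}


EMOTIONS = ["anger", "fear", "joy", "disgust", "sadness", "surprise"]
f_sad = gaussian_distribution("sadness", EMOTIONS)
```

With `sigma=1.0`, the label itself receives the largest score, its wheel neighbours (disgust and surprise for sadness) share the next largest, and the opposite emotion (joy) receives almost nothing.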
Once all emotion labels in the sentence $s$ are obtained, EWLLE generates the Gaussian distributions $f_\alpha$ and $f_{e_{ij}}$ by formula (3) for the sentence label and the affective words’ emotion labels, respectively. Then, in order to combine the two kinds of information, EWLLE interpolates the distributions $f_\alpha$ and $f_{e_{ij}}$ to obtain the emotion distribution $d_s$ by

$$d_s = \lambda f_\alpha + (1-\lambda)\,\frac{\sum_{i=1}^{n}\sum_{j=1}^{m_i} f_{e_{ij}}}{\sum_{i=1}^{n} m_i}, \tag{4}$$

where n is the number of affective words in the sentence $s$, $m_i$ is the number of emotion labels of the $i$-th affective word $w_i$, $e_{ij}$ is the $j$-th emotion label of the affective word $w_i$, α is the ground-truth emotion label of the sentence $s$, $f_{e_{ij}}$ and $f_\alpha$ are the generated Gaussian distributions of the affective words’ emotion labels and the sentence label, respectively, and the weight coefficient λ is used to control the proportion of $f_\alpha$ in the emotion distribution $d_s$. The specific steps of the EWLLE algorithm are shown in Algorithm 1.
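The superposition step can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the authors’ implementation: it assumes that the Gaussian distributions of the affective words’ labels are averaged before being weighted by 1 − λ, and that a sentence without affective words falls back to the sentence-label distribution alone. All function and variable names are hypothetical.

```python
import math

WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]
EMOTIONS = ["anger", "fear", "joy", "disgust", "sadness", "surprise"]


def wheel_distance(a, b):
    """Number of 45-degree steps along the shorter arc between two emotions."""
    steps = abs(WHEEL.index(a) - WHEEL.index(b))
    return min(steps, len(WHEEL) - steps)


def gaussian_distribution(center, emotions, sigma=1.0):
    """Discrete Gaussian over `emotions`, centered on the label `center`."""
    weights = [math.exp(-wheel_distance(e, center) ** 2 / (2 * sigma ** 2))
               for e in emotions]
    z = sum(weights)
    return {e: w / z for e, w in zip(emotions, weights)}


def ewlle(sentence_label, word_labels, emotions, lam=0.8, sigma=1.0):
    """Interpolate the sentence-label Gaussian with the averaged word-label Gaussians.

    `word_labels` holds one list of emotion labels per affective word.
    """
    f_label = gaussian_distribution(sentence_label, emotions, sigma)
    flat = [label for labels in word_labels for label in labels]
    if not flat:  # assumption: no affective words, use the sentence label only
        return f_label
    word_part = {e: 0.0 for e in emotions}
    for label in flat:
        f_word = gaussian_distribution(label, emotions, sigma)
        for e in emotions:
            word_part[e] += f_word[e] / len(flat)  # assumption: simple average
    return {e: lam * f_label[e] + (1 - lam) * word_part[e] for e in emotions}


# Sentence labeled "sadness" with two affective words (hypothetical example).
d_s = ewlle("sadness", [["sadness"], ["anger", "fear"]], EMOTIONS)
```

Because both terms are themselves distributions, any λ between 0 and 1 yields scores that sum to 1, so no further normalization is needed under these assumptions.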
The range of the parameter λ is $[0, 1]$. When λ = 1, the emotion distribution is generated solely from the ground-truth emotion label, and no affective word information is included in EWLLE. In contrast, when λ = 0, EWLLE produces the emotion distribution based only on the affective words and the lexicon, without the help of the annotation. Since manually annotated labels are generally accurate, their emotion discriminating power should be greater than that of the automatically extracted emotional information of affective words. Therefore, we expect the optimal value of the parameter λ to be greater than 0.5, reflecting the fact that the sentence label is more important than the affective words’ labels. We will investigate the effect of the parameter λ in the experiments.
The EWLLE method defines the psychological distance between emotions based on the interval angle on Plutchik’s emotion wheel, generates discrete Gaussian distributions for the ground-truth sentence label and for the emotion labels of affective words, respectively, based on this distance, and finally superposes the two kinds of distributions into a unified emotion distribution. Unlike the existing label enhancement methods, the EWLLE method integrates the psychological and linguistic knowledge of emotions, so the generated label distributions carry more information. Furthermore, the EWLLE method is not a simple combination of the existing MWLE and LLE methods. LLE counts the affective words in a sentence and assigns scores to the secondary emotions according to these counts: the more often the affective words of an emotion appear, the higher its score. The modeling steps of LLE therefore cannot be directly combined with the psychological information used in MWLE. The following experimental results will demonstrate that the EWLLE method obtains better results by combining both psychological and lexical knowledge.
4. Experiments
4.1. Experimental Setup
The experiments were conducted on 4 widely used single-labeled text emotion datasets, i.e., TEC, Fairy Tales, CBET, and ISEAR. Table 1 lists the detailed information of all experimental datasets: the number of sentences of each emotion, the total number of sentences, and the average number of words per sentence.
We preprocessed the text in a standard manner: all numbers and stop words were removed, words were converted to lowercase, and word stemming was performed. Then, the pretrained word2vec word embedding model was used to represent each word as a 300-dimensional vector. Words unseen in the word2vec model were initialized from a random uniform distribution. Then, each sentence was converted into a matrix and fed to the EDL prediction model.
The affective lexicon utilized in EWLLE and LLE is a combination of two classical lexicons, i.e., NRC and Emosenticnet. NRC contains 14,182 affective words and 10 emotions. Emosenticnet includes 13,189 affective words and 6 emotions. We retained the 6 emotions shared by the two lexicons, namely, anger, fear, joy, disgust, sadness, and surprise, and removed the affective words not marked with any retained emotion. The emotion labels of an affective word were set as the union of its original labels in the two lexicons. Finally, we obtained 15,603 affective words, each with 1.31 emotion labels on average.
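The lexicon combination described above can be sketched as a small Python routine. The two input dictionaries are hypothetical toy stand-ins for the real lexicons, and `merge_lexicons` is an illustrative helper name; only the merging rule (keep the retained emotions, take the union of labels, drop words with no retained label) follows the text.

```python
RETAINED = {"anger", "fear", "joy", "disgust", "sadness", "surprise"}


def merge_lexicons(lex_a, lex_b, retained=RETAINED):
    """Union the labels of each word across two lexicons, keeping retained emotions only."""
    merged = {}
    for lexicon in (lex_a, lex_b):
        for word, labels in lexicon.items():
            kept = set(labels) & retained
            if kept:  # words without any retained emotion are dropped
                merged.setdefault(word, set()).update(kept)
    return merged


# Hypothetical toy entries standing in for the two lexicons.
lexicon_a = {"abandon": {"fear", "sadness", "negative"}, "trusty": {"trust"}}
lexicon_b = {"abandon": {"anger"}, "party": {"joy"}}

merged = merge_lexicons(lexicon_a, lexicon_b)
avg_labels = sum(len(labels) for labels in merged.values()) / len(merged)
```

On the toy input, "trusty" is removed because its only label is not among the retained emotions, while "abandon" collects labels from both sources.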
The state-of-the-art EDL model based on multitask convolutional neural network (CNN) was used as the prediction model . The emotion with the highest score in the output emotion distribution is regarded as the predicted emotion. For the setting of CNN framework, we use filter windows of 3, 4, and 5 with 100 feature maps each, dropout rate of 0.5, mini-batch size of 50, and optimization function of SGD algorithm following the same routing in Zhang et al. .
The standard stratified 10-fold cross-validation procedure was applied. We divided each dataset into ten subsets of equal size while preserving the category proportions. Each subset was used in turn as the test set, and the remaining subsets were used as the training set. To make the experimental results comparable, all compared models used the same data division. The final performance was reported as the emotion classification accuracy averaged over the 10 folds, together with the corresponding standard deviation.
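The evaluation protocol can be sketched in plain Python as below. This is an illustrative sketch, not the code used in the experiments: the helper names stratified_folds, train_test_splits, and summarize are hypothetical, the shuffling seed is arbitrary, and the sample standard deviation is an assumption because the text does not state which estimator was used.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    pos = 0  # running counter keeps the fold sizes balanced across classes
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            folds[pos % k].append(idx)
            pos += 1
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        yield train, test


def summarize(accuracies):
    """Mean and sample standard deviation of the per-fold accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

Fixing the seed and reusing the same folds for every label enhancement method gives all compared models the same data division.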
The code was implemented in Python with the PyTorch 1.3.1 machine learning framework and run on a Lenovo PC with an Intel(R) Core(TM) i7-6700 3.40 GHz CPU and 32 GB of RAM.
To evaluate the performance of our proposed method, we conducted the following two sets of experiments: (1) analyzing the effect of the weight coefficient λ on the EWLLE method; (2) comparing the classification accuracy of EWLLE to some state-of-the-art EDL label enhancement methods.
3.2. Effects of the Parameter λ
As described in Section 3, the emotion distribution generated by EWLLE is a combination of the Gaussian distributions of the sentence emotion label and of the affective words’ emotion labels. The parameter λ controls the relative proportion of the two kinds of information. To investigate the effects of λ, we varied its value from 0 to 1 in steps of 0.1 and recorded the corresponding classification accuracy of the CNN-based EDL model. Figure 4 shows the results on all experimental datasets.
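The parameter sweep described here can be written as a short loop. The sketch below is only an illustration: evaluate is a hypothetical callback that would enhance the labels with a given λ, train the prediction model, and return the cross-validated accuracy; lambda_grid and sweep are likewise hypothetical names.

```python
def lambda_grid(step=0.1):
    """Values 0, step, 2*step, ..., 1, rounded to avoid float noise."""
    n = round(1 / step)
    return [round(i * step, 10) for i in range(n + 1)]


def sweep(evaluate, grid):
    """Evaluate every grid value and return (best_value, all_scores)."""
    scores = {lam: evaluate(lam) for lam in grid}
    best = max(scores, key=scores.get)
    return best, scores
```

With a step of 0.1, lambda_grid() yields eleven settings, including the two extremes that use only the affective words (0) or only the sentence label (1).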
As Figure 4 shows, although the absolute classification accuracies differ considerably across datasets, the accuracy curves follow similar trends. As λ increases from 0 to 0.7, the accuracy improves consistently, which indicates that incorporating the sentence emotion label is beneficial in this range. However, once λ exceeds a certain point, the scores generally drop, which shows that relying too heavily on the sentence labels at the expense of the affective words is harmful. The optimal value of λ is 0.8 on all 4 datasets; at this point, the information from the sentence emotion label and that from the affective words reach a balance. Furthermore, the fact that the optimal λ is much greater than 0.5 supports our earlier conjecture that sentence emotion labels are more important than affective words.
Furthermore, we find that the optimal accuracy score with λ = 0.8 is significantly higher than that of λ = 0 (only considering the information from affective words) or that of λ = 1 (only including the information of sentence emotion label) on all 4 datasets. This demonstrates that it is essential to consider both sentence emotion labels and affective words’ emotion labels in the label enhancement process.
3.3. Comparative Results of Different Label Enhancement Methods
We compared the proposed EWLLE method with three existing label enhancement methods: One-hot, MWLE, and LLE. The comparative experiment worked as a pipeline. First, the traditional single-label datasets were converted into emotion-distribution-labeled datasets by label enhancement. Then, the CNN model was trained on the enhanced datasets to predict the emotion distribution, and the emotion with the highest score was selected as the final prediction. Finally, the emotion classification accuracy was recorded as the performance of the corresponding method.
(i) One-hot: the sentence emotion label is represented directly by a one-hot vector, in which the component of the ground-truth label is 1 and all other components are 0. The length of the one-hot vector is the number of all possible emotion labels.
(ii) MWLE: Yang et al. proposed an emotion distribution Label Enhancement method based on Mikels’ Wheel. MWLE calculates emotion distances on Mikels’ wheel and then uses a Gaussian function to transform the sentence label into an emotion distribution. Yang et al. proposed two versions of MWLE, and we used the better one (constraint 1) in our experiments. Note that Mikels’ emotion wheel in MWLE was replaced by Plutchik’s emotion wheel, because its emotions better fit the emotion labels in the experimental datasets.
(iii) LLE: Zhang et al. designed the Lexicon-based emotion distribution Label Enhancement method. In the emotion distribution generated by LLE, the ground-truth label is assigned a certain score, and the remaining score is allocated across all other emotion labels by counting the corresponding affective words. LLE does not include any psychological emotional knowledge.
(iv) EWLLE: our proposed Emotion Wheel and Lexicon-based emotion distribution Label Enhancement method. EWLLE considers both psychological and linguistic information. The weight coefficient λ was set to 0.8.
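To make the two simplest baselines and the prediction rule concrete, a small sketch follows. It is an approximation under stated assumptions and not the original implementations: gt_score stands in for the unspecified score that LLE assigns to the ground-truth label (0.7 is an arbitrary placeholder), the fallback to a one-hot vector when no other emotion has affective words is assumed, and the names one_hot, lle_style, and predict are hypothetical.

```python
def one_hot(label, labels):
    """1 for the ground-truth label, 0 for every other label."""
    return [1.0 if e == label else 0.0 for e in labels]


def lle_style(label, word_labels, labels, gt_score=0.7):
    """Give gt_score to the ground truth; split the rest by word counts."""
    counts = {e: 0 for e in labels if e != label}
    for w in word_labels:
        if w in counts:
            counts[w] += 1
    total = sum(counts.values())
    if total == 0:
        # Assumption: with no affective words for other emotions, stay one-hot.
        return one_hot(label, labels)
    return [gt_score if e == label else (1 - gt_score) * counts[e] / total
            for e in labels]


def predict(distribution, labels):
    """Return the emotion with the highest score in the distribution."""
    best = max(range(len(labels)), key=distribution.__getitem__)
    return labels[best]
```

In this sketch, affective words that carry the ground-truth emotion do not change the distribution, because only the remaining score is redistributed by counts.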
The detailed comparative results of the 4 label enhancement methods are shown in Table 2, which reports the mean classification accuracy ± the standard deviation over the 10-fold cross-validation. The last row of Table 2 lists the score of each method averaged over the 4 datasets. The best score in each row is highlighted in bold.
The results in Table 2 clearly show that EWLLE outperforms the other label enhancement methods on all four datasets. In terms of the accuracy averaged over all datasets, EWLLE scores 0.663, which is 0.017 higher than LLE, 0.027 higher than MWLE, and 0.043 higher than One-hot. Compared with the second-best method, LLE, the performance of EWLLE is significantly improved, which indicates that introducing psychological emotional knowledge into label enhancement is necessary.
In addition, the performance of LLE is superior to that of MWLE, which is consistent with the experimental results of Zhang et al. The performance difference between LLE and MWLE illustrates the discriminative power of affective words, which is beneficial to text-based EDL. Enhancing a single label into an emotion distribution solely on the basis of psychological emotional knowledge, as in MWLE, is not enough. As expected, the One-hot method has the worst performance in the experiment, since it includes neither prior emotional knowledge nor the information of affective words.
In the field of Emotion Distribution Learning (EDL), label enhancement is an important way to address the shortage of datasets annotated with emotion distributions. In this paper, we proposed the Emotion Wheel and Lexicon-based emotion distribution Label Enhancement (EWLLE) method, which effectively enhances the sentence emotion labels in single-labeled datasets into emotion distributions. Unlike existing methods, EWLLE uses both psychological emotional knowledge and the linguistic information of affective words. Based on Plutchik’s wheel of emotions, EWLLE generates discrete Gaussian distributions for the sentence emotion label and for the emotion labels of the affective words, respectively, and then superposes them into a unified emotion distribution. Extensive experimental results showed that EWLLE performs favorably against the state-of-the-art label enhancement methods.
In future work, we will introduce more prior affective knowledge into the EDL label enhancement method and explore different affective modeling methods to make more effective use of prior knowledge. In addition, we will study how to recognize negative and other sophisticated affective words in the label enhancement method.
The raw data required to reproduce these findings are available in the cited references in Section 3.1 of the manuscript.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This research was supported in part by the Natural Science Foundation of China under Grant nos. 61866017, 61866018, and 61966019, in part by the Support Program for Outstanding Youth Talents in Jiangxi Province no. 20171BCB23013, and in part by the Natural Science Foundation of Jiangxi Province under Grant no. 20192BAB207027.
H. Zhou, M. Huang, T. Zhang et al., “Emotional chatting machine: emotional conversation generation with internal and external memory,” in Proceedings of the 32nd AAAI Conference on Artificial Intelligence, pp. 730–739, Louisiana, LA, USA, February 2018.
Y. Zhou, H. Xue, and X. Geng, “Emotion distribution recognition from facial expressions,” in Proceedings of the 23rd ACM international conference on Multimedia, pp. 1247–1250, Brisbane, Australia, October 2015.
M. Abdul-Mageed and L. Ungar, “EmoNet: fine-grained emotion detection with gated recurrent neural networks,” in Proceedings of the 55th annual meeting of the association for computational linguistics, pp. 718–728, Vancouver, Canada, August 2017.
J. Yu, L. Marujo, J. Jiang et al., “Improving multi-label emotion classification via sentiment classification with dual attention transfer network,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1097–1102, Brussels, Belgium, November 2018.
D. Zhou, X. Zhang, Y. Zhou et al., “Emotion distribution learning from texts,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 638–647, Texas, TX, USA, November 2016.
X. Jia, X. Zheng, W. Li et al., “Facial emotion distribution learning by exploiting low-rank label correlations locally,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 9833–9842, Seattle, WA, USA, June 2019.
Z. Zhao and X. Ma, “Text emotion distribution learning from small sample: a meta-learning approach,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 3957–3967, Hong Kong, China, November 2019.
T. He and X. Jin, “Image emotion distribution learning with graph convolutional networks,” in Proceedings of the 2019 on International Conference on Multimedia Retrieval, pp. 382–390, Ontario, Canada, June 2019.
H. Xiong, H. Liu, B. Zhong et al., “Structured and sparse annotations for image emotion distribution learning,” in Proceedings of the 33rd AAAI Conference on Artificial Intelligence, pp. 363–370, Hawaii, HI, USA, January 2019.
J. Liang, R. Li, and Q. Jin, “Semi-supervised multi-modal emotion recognition with cross-modal distribution matching,” in Proceedings of the 28th ACM International Conference on Multimedia, pp. 2852–2861, New York, NY, USA, October 2020.
C. Strapparava and R. Mihalcea, “Semeval-2007 task 14: affective text,” in Proceedings of Fourth International Workshop on Semantic Evaluations (SemEval-2007), pp. 70–74, Prague, Czech Republic, June 2007.
J. Yang, D. She, and M. Sun, “Joint image emotion classification and distribution learning via deep convolutional neural network,” in Proceedings of the 26th International Joint Conferences on Artificial Intelligence, pp. 3266–3272, Melbourne, Australia, August 2017.
Y. Zhang, J. Fu, D. She et al., “Text emotion distribution learning via multi-task convolutional neural network,” in Proceedings of the 27th International Joint Conferences on Artificial Intelligence, pp. 4595–4601, Stockholm, Sweden, July 2018.
Z. Teng, D. T. Vo, and Y. Zhang, “Context-sensitive lexicon features for neural sentiment analysis,” in Proceedings of the 2016 conference on empirical methods in natural language processing, pp. 1629–1638, Texas, TX, USA, November 2016.
S. M. Mohammad, “Emotional tweets,” in Proceedings of the First Joint Conference on Lexical and Computational Semantics, pp. 246–255, Montréal, QC, Canada, June 2012.
C. O. Alm and R. Sproat, “Emotional sequencing and development in fairy tales,” in Proceedings of 1st International Conference on Affective Computing and Intelligent Interaction, pp. 668–674, Beijing, China, October 2005.
A. G. Shahraki, “Emotion mining from text,” University of Alberta, Edmonton, Canada, 2015, M.S. thesis.
K. R. Scherer and H. G. Wallbott, “Evidence for universality and cultural variation of differential emotion response patterning,” Journal of Personality and Social Psychology, vol. 66, no. 2, pp. 310–328, 1994.
N. E. Gayer, F. Schwenker, and G. Palm, “A study of the robustness of KNN classifiers trained using soft labels,” in Proceedings of the 2nd Conference Artificial Neural Networks in Pattern Recognition, pp. 67–80, Berlin, Germany, September 2006.
Y. Li, M. Zhang, and X. Geng, “Leveraging implicit relative labeling-importance information for effective multi-label learning,” in Proceedings of IEEE International Conference on Data Mining, pp. 251–260, Barcelona, Spain, January 2016.
P. Hou, X. Geng, and M. Zhang, “Multi-label manifold learning,” in Proceedings of the 30th AAAI Conference on Artificial Intelligence, pp. 1680–1686, Arizona, AZ, USA, February 2016.
X. Geng and Y. Xia, “Head pose estimation based on multivariate label distribution,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3742–3747, Columbus, OH, USA, June 2014.
X. Geng, Q. Wang, and Y. Xia, “Facial age estimation by adaptive label distribution learning,” in Proceedings of the 22nd International Conference on Pattern Recognition, pp. 4465–4470, Stockholm, Sweden, August 2014.
A. Agrawal and A. An, “Unsupervised emotion detection from text using semantic and syntactic relations,” in Proceedings of 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pp. 346–353, Macau, China, December 2012.
Y. Wang and A. Pal, “Detecting emotions in social media: a constrained optimization approach,” in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 996–1002, Buenos Aires, Argentina, July 2015.
T. Mikolov, I. Sutskever, K. Chen et al., “Distributed representations of words and phrases and their compositionality,” in Proceedings of the 26th Advances in Neural Information Processing Systems, pp. 3111–3119, Nevada, NV, USA, December 2013.
S. M. Mohammad and P. D. Turney, “NRC emotion lexicon,” NRC Technical Report, vol. 2, 2013.