Abstract

With the popularization of Internet applications and the rapid development of e-commerce, online shopping has become a widespread and important pattern of consumption. Online user comments are an important data asset on e-commerce sites and have a great potential value for online users and merchants. However, accurate and effective extraction of the characteristics of products and users’ sentiment evaluation from a tremendous amount of comments is a significant challenge. Based on the concept of the LinLog energy model, this paper proposes an online review attribute-sentiment pair correlation model that evaluates user comments. After preprocessing the comment data of mobile phones and constructing an attribute dictionary, the proposed model conducts a clustering analysis of attributes and sentiment pairs to gain accurate assessment of attributes in order to explore potential information from user comments. Experiments conducted on one real-world dataset with comprehensive measurements verify the efficacy of the proposed model.

1. Introduction

With the popularization of Internet applications and the rapid development of e-commerce, online shopping has become a widespread and important pattern of consumption. Online user comments are an important data asset for e-commerce sites: they form a text collection [1] in which users express their subjective or objective attitudes toward products after shopping online. These data have great potential value for online shoppers and merchants. From the perspective of consumers, the comment data affect their purchasing choices [2]. From the manufacturers’ point of view [3], analyzing the comment data helps them improve existing products and develop new product attributes. Most online consumer product review platforms currently do not consider the needs and preferences of individual consumers, and few organize and present comments in a personalized, product-oriented manner. Consequently, it is virtually impossible for consumers to find the comments that address specific product features among hundreds or even thousands of comments [4].

According to statistics, on average, only 6.5% of new product programs in the market can be transformed into products. Of that number, less than 15% of new products can be successfully commercialized and 37% of new products coming into the market are a failure in business [5]. When analyzing the factors restricting product development, it is difficult to capture the real user demand from only the overall performance of the products. For example, in the mobile phone industry, users are satisfied with the overall performance of iPhone 6 but give low feedback for its battery performance [6]. This paper connects the specific characteristics of mobile phones to the sentiments of users’ evaluation by tapping their comments to deeply explore the comment value.

In previous studies on user comments, researchers tended to analyze user sentiment implied in the comments and proposed numerous models to judge sentiment polarity. However, the models developed only analyze the superficial meaning of comments and fail to connect the sentiments and attributes accurately. Rob et al. [7] utilized the LinLog energy clustering analysis method to cluster music and concluded that the method achieves a better effect than other clustering algorithms, as it distinguishes the edge of each category and specifies the central attributes of the connection. Consequently, in this study, the same method was applied to develop an online review attribute-sentiment pair correlation model based on user comments that mines potential information in the comments to learn user demands and subsequently provides references for shoppers and for manufacturers to improve their goods.

The major contributions of this paper are as follows:
(i) The correlation of mobile phone attributes and user sentiment on a user comment dataset is demonstrated
(ii) An integrated method for constructing a dictionary of mobile phone attributes is presented
(iii) A novel approach for matching user sentiments with mobile phone attributes is proposed
(iv) Results of experiments conducted on a real dataset that demonstrate that the proposed method effectively provides purchasing advice to users are presented and discussed

2. Literature Review

Since Hatzivassiloglou and McKeown [8] proposed opinion mining in 1997, it has gradually become one of the most important research areas in data mining. Current research focuses on the extraction of feature words and opinion words, where the opinion words are the user’s evaluation of the features. A feature word and an opinion word together constitute a fine-grained two-tuple. The goal of mining is to obtain <feature word, opinion word> two-tuples that represent the user’s evaluation of a feature. Because annotated corpora are difficult to obtain, unsupervised methods account for the majority of studies.

Yi et al. [9] used a likelihood test to identify explicit aspects according to the grammatical structure of noun phrases, but this method cannot effectively solve the coverage problem of terminology. On a smaller scale, a dictionary of aspects can be built and matched against the text of comments. Samha et al. [10] used WordNet (http://wordnet.princeton.edu) for explicit aspect extraction by querying the dictionary for the names of related fields. Zhu et al. [11] used synonym information to identify explicit aspects. In 2011, another unsupervised model, multiaspect bootstrapping (MAB), was proposed for explicit aspect mining of Chinese restaurant reviews. Bagheri et al. [12] used a bootstrapping model that extracts aspects with the help of part-of-speech (POS) tagging: the text is first tagged with parts of speech, heuristic rules then filter out the seed vocabulary set required by the aspect model, and the bootstrapping method finally extracts the aspects from the data. Quan and Ren [13] used pointwise mutual information and term frequency-inverse document frequency (TF-IDF) to discover explicit aspects, the related entity classes, and the relationships between them, from which the entity that an aspect belongs to is determined. For example, “photo quality” is closer to the entity class “digital camera” than to the entity class “MP3.” Zhou et al. [14] presented further work on aspect extraction in 2016. Kaur and Bansal [15] proposed a mechanism for opinion mining of text comment data that generates a product review report based on multiple features and reveals the positive and negative points of several products. Kumar et al. [16] proposed a method that automatically extracts comments from websites, classifies them as positive or negative using the Naïve Bayes classifier, logistic regression, and SentiWordNet algorithms, and measures the performance of each algorithm with quality metrics. The CMiner opinion summarization system was the first to be applied to microblog topic comment data. It uses a dynamic programming-based algorithm to segment named entity tags in Chinese microblogs and, on the hypothesis that microblogs on the same topic focus on the same or similar aspects, applies an unsupervised label propagation algorithm to generate aspect candidate sets. Hu and Liu [17] provided feature-based summaries of customer reviews of products such as digital cameras, mobile phones, and MP3 players. The popular Semantic Evaluation workshop (SemEval) provides a dedicated track for aspect-based sentiment detection, for which some participants have written heuristic techniques to mine aspects and emotions [18]. Recently, prior knowledge from several other product areas (e.g., comments on products from electronic categories) has been used to extract aspects of the target product [19]. Prameswari et al. [20] used text mining methods and aspect-based sentiment analysis to obtain hotel user opinions in the form of emotions by applying the recursive neural tensor network (RNTN) algorithm. Aspect-based sentiment analysis can provide information beyond that of typical sentiment analysis. Li and Yang [21] showed that a sentiment mining model with a novel dictionary embedding module has clear performance advantages over logistic regression and support vector machine models. In 2018, Rakesh et al. [22] proposed APSUM, an LDA variant, to extract feature words and viewpoint words simultaneously. Cheng et al. [23] proposed a novel aspect-aware latent factor model that effectively combines reviews and ratings for rating prediction. Zvarevashe and Olugbara [24] designed a sentiment analysis framework to handle hotel customer feedback through opinion mining. Recently, Jiang et al. [25] applied opinion and emotion mining of natural language information to the field of health monitoring and achieved good results in privacy protection and in judging users’ psychological state.

Obtaining annotated data for evaluation in e-commerce review is difficult, which poses a limitation for supervised models. Thus, it is more reasonable to use unsupervised models in practical e-commerce applications. In this study, a feature dictionary, a perspective dictionary generation framework, and a window method were developed to first roughly construct feature words and viewpoint word pairs. Then, the LinLog clustering algorithm was used to optimize the relationship between feature words and viewpoint words, and the results mined to ascertain users’ actual evaluation of products.

3. Constructing Datasets

3.1. Constructing a Dictionary of Mobile Phone Attributes

Internet commentary differs from formal language, and colloquial expression is a characteristic of online data. To accurately extract the mobile phone attributes and user sentiments in the comments, we first constructed a complete mobile phone attribute dictionary and then extracted the noun collections and adjective collections according to the dictionary.

The study was conducted on all user comments related to smartphones running Android or iOS in self-operated stores on http://JD.com, one of the biggest e-commerce websites. For the smartphones preloaded with iOS, the comments on iPhone 4, iPhone 5, and iPhone 6 were collected, resulting in 41,202 comments in total. For smartphones running Android, the mainstream brands in the market were selected: Huawei, ZTE, Xiaomi, and Samsung. A total of 67,885 comments were collected. The cosine similarity algorithm [26] was used to remove duplicate comments before the attribute dictionary was constructed.
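The duplicate-removal step can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the bag-of-words representation, the `deduplicate` helper, and the 0.9 threshold are assumptions, since the text states only that cosine similarity [26] was applied.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(comments, threshold=0.9):
    """Keep a tokenized comment only if it is not too similar to one already kept.

    `threshold` is a hypothetical cut-off, not a value reported in the paper.
    """
    kept = []
    kept_vectors = []
    for tokens in comments:
        vector = Counter(tokens)
        if all(cosine_similarity(vector, other) < threshold for other in kept_vectors):
            kept.append(tokens)
            kept_vectors.append(vector)
    return kept
```

Each comment is compared with every comment kept so far, so the cost is quadratic in the number of comments; a corpus of the size described here would probably need blocking or hashing first.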

An integrated model combining LDA, PageRank, and conditional entropy was used to construct the dictionary. As shown in Figure 1, the available “NLPIR” tokenizer was first used to segment words and tag parts of speech in the given corpus [27]. Then, nouns and noun phrases were extracted as an unordered set. Meanwhile, nouns, noun phrases, and adjectives were extracted as another set, in which the adjectives and nouns keep their original order for future reference. A fraction of the data in the noun set was input into the LDA model. According to Ma et al. [28], LDA performs well in Chinese e-commerce comment mining. Following the evaluation of the model, we found that the best results are obtained when the number of LDA topics is set to 50 and the 500 highest-weighted words under the relevant topics are selected. Topics may overlap in meaning. Within a topic, the higher the weight of a word, the more relevant the word is to the topic [29, 30].

Yan et al. [4] proposed an extended PageRank algorithm that achieved excellent results in e-commerce comment feature extraction, and we employed this algorithm in our study. After the LDA algorithm extracted the candidate keyword set from the corpus, all words in the set were ranked with the PageRank-based algorithm [31] to produce candidate evaluation object set I. The PageRank model converges to stable weights after iteration, regardless of the initial values [32]. It assigns a weight to each word; the higher the weight, the more important the word.
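The convergence property mentioned above can be illustrated with plain power-iteration PageRank on a word cooccurrence graph. This sketch is the standard algorithm, not the extended variant of Yan et al. [4]; the undirected weighted graph, the damping factor of 0.85, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200, initial=None):
    """Power-iteration PageRank on an undirected, weighted cooccurrence graph.

    edges: {(word_a, word_b): positive cooccurrence count}
    initial: optional {word: starting weight} covering every word in the graph.
    """
    weights = defaultdict(dict)
    for (a, b), w in edges.items():
        weights[a][b] = weights[a].get(b, 0.0) + w
        weights[b][a] = weights[b].get(a, 0.0) + w
    nodes = sorted(weights)
    n = len(nodes)
    out_weight = {u: sum(weights[u].values()) for u in nodes}

    # Start from the supplied values (normalized to sum to 1) or from uniform weights.
    rank = dict(initial) if initial else {u: 1.0 / n for u in nodes}
    total = sum(rank.values())
    rank = {u: r / total for u, r in rank.items()}

    for _ in range(max_iter):
        new_rank = {}
        for u in nodes:
            incoming = sum(rank[v] * weights[v][u] / out_weight[v] for v in weights[u])
            new_rank[u] = (1.0 - damping) / n + damping * incoming
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Running the function twice on the same graph with different `initial` values yields the same weights up to the tolerance, which is the behavior the paragraph relies on.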

For the ordered set of nouns and adjectives, all nouns in the set were ranked using the conditional entropy filtering algorithm based on the cooccurrence probability [33], yielding candidate evaluation object set II. The conditional entropy model produces weighted opinion words and weighted feature words at the same time, and the opinion words can be used as the lexicon for the viewpoint dictionary. On the basis of these two steps, the weights from the two candidate evaluation object sets were combined for reordering: the results of the two models were weighted and averaged for each candidate feature word to obtain its score, and the 1000 candidate feature words with the highest scores were selected to construct the feature dictionary.
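The final combination step might look like the sketch below. The paper does not state how the two score sets are scaled or weighted, so the min-max normalization, the equal weighting (`weight_one=0.5`), and the treatment of a word missing from one set as scoring zero there are all assumptions.

```python
def combine_scores(set_one, set_two, weight_one=0.5, top_k=1000):
    """Weighted average of two {word: score} maps, returning the top_k words.

    Scores are min-max normalized per set first (an assumption) so that the
    PageRank-style and entropy-style weights are on a comparable scale.
    """
    def normalize(scores):
        if not scores:
            return {}
        low, high = min(scores.values()), max(scores.values())
        span = high - low
        if span == 0:
            return {word: 1.0 for word in scores}
        return {word: (score - low) / span for word, score in scores.items()}

    norm_one, norm_two = normalize(set_one), normalize(set_two)
    combined = {
        word: weight_one * norm_one.get(word, 0.0)
        + (1.0 - weight_one) * norm_two.get(word, 0.0)
        for word in set(norm_one) | set(norm_two)
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]
```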

After data preprocessing, we had two word libraries: the feature vocabulary (corpus) and the weighted, sorted adjective library. The distribution of the corpus and the adjective library is shown in Figure 2, and examples from the libraries are shown in Tables 1 and 2. The categories were defined manually for easy reading, whereas the content is the actual result obtained. Table 1 shows the feature vocabulary, and Table 2 shows the adjective library.

3.2. Matching User Sentiments with Mobile Phone Attributes

After the investigation and analysis of user comment data on http://JD.com, it was found that about 84% of users described the attribute information of mobile phones with “mobile phone attributes-adjectives” such as “quality-good,” “price-affordable” and “cost-effective.” Thus, we theorized that the adjectives in the comments could be extracted as user sentiments.

Kim and Hovy [34] proposed a method that searches evaluation objects in a fixed-length window of evaluation words. Their proposed method starts matching with the first character of a user comment according to the corpus of mobile phone attributes, to judge whether the terms beginning with the character belong to the dictionary. When the mobile phone attribute matches, the search continues for the adjectives from the location of the matched attribute, with the length being within the X-wide window. If there is only one such adjective, the matching is successful. When there are several adjectives that fit, the adjective closest to the mobile phone attribute is selected as the sentiment word. Finally, the mobile phone attributes and adjectives are extracted from the user comments to record as <attribute, sentiment>.

Based on the dictionary of mobile phone attributes constructed in the previous stage, we adopted the fixed-length window method in this study. The method traverses each sentence after word segmentation. When a feature word is encountered, the method takes the word as the center, takes the clause covered by the window, and checks whether the clause contains a viewpoint word that can form a <feature word, viewpoint word> pair. Every viewpoint word that can form a pair is combined with the feature word. Conversely, when a viewpoint word is encountered, the method looks for a feature word that can form a pair with it. For example, consider the sentence “the phone has a high pixel size and good quality.” If the window size is three, then <pixel, high> and <quality, good> can be extracted from it. To choose the window size, the pairs extracted with different window sizes were compared with manually labeled pairs (used as ground truth), with precision, recall, and F1 as metrics. The results are shown in Table 3.
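A minimal sketch of fixed-window matching is given below. It assumes that the window counts tokens on each side of the feature word and that the English example sentence is tokenized on whitespace; the actual system works on NLPIR-segmented Chinese text. The `nearest_only` flag is a hypothetical switch between keeping only the closest opinion word (the rule described for Kim and Hovy [34]) and keeping every opinion word in the window.

```python
def extract_pairs(tokens, feature_words, opinion_words, window=3, nearest_only=True):
    """Pair each feature word with opinion words found within `window` tokens of it."""
    pairs = []
    for i, token in enumerate(tokens):
        if token not in feature_words:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        # Candidates are (distance, position, word); ties go to the earlier position.
        candidates = [
            (abs(j - i), j, tokens[j])
            for j in range(start, end)
            if j != i and tokens[j] in opinion_words
        ]
        if not candidates:
            continue
        if nearest_only:
            candidates = [min(candidates)]
        pairs.extend((token, word) for _, _, word in candidates)
    return pairs


sentence = "the phone has a high pixel size and good quality".split()
features = {"pixel", "quality"}
opinions = {"high", "good"}
print(extract_pairs(sentence, features, opinions, window=3))
# [('pixel', 'high'), ('quality', 'good')]
```

With `nearest_only=False` the same sentence also yields `('pixel', 'good')`, which illustrates why the raw window pairs are noisy and are refined by clustering afterwards.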

Following preliminary evaluation, the window size was set to three. In this way, we found the collocation of “attribute words-adjectives” in the sentences. An example of matching results obtained from user comments on iPhone 6 is shown in Table 4. In this step, we obtained 118 pairs of user comments on iPhone 6, 1,158 on Huawei Honor, 470 on Xiaomi Note, and 410 on Samsung Galaxy.

4. Correlation Model of Mobile Phone Attribute-Sentiment Pairs

4.1. Overall Framework

This study was conducted to evaluate the efficacy of our proposed attribute-sentiment pair correlation model by using it to determine users’ evaluation of each attribute of a mobile phone so as to discover the advantages and disadvantages of the mobile phone. As shown in Figure 3, we first extracted attributes from user comments, built an attribute dictionary, and then matched the user sentiments with the attributes of the mobile phone according to the attribute dictionary. After attributes and sentiments were associated, <feature word, opinion word> pairs were obtained. Finally, the correlation between the attributes and the sentiments was calculated to conduct clustering.

Clustering using the LinLog model is conducted for two purposes. (1) It classifies the attributes and sentiments into different categories so as to explore the sentiments matched by attribute words. Because the accuracy of the word pairs obtained in the previous step cannot be guaranteed, and the results obtained by simple statistics cannot accurately measure the user’s perception of the mobile phone, the word pairs must be mined further. Using the LinLog clustering algorithm, the viewpoint words most relevant to each feature word can be mined as user evaluations. At the same time, compared with frequent itemset mining algorithms such as Apriori and FP-growth, more low-frequency word pair information is retained. (2) The LinLog model maps feature words and viewpoint words into a vector space when clustering. The vector representation of feature words and viewpoint words not only contains the degree of association between them but can also be considered to contain certain semantic information.

4.2. Attribute-Sentiment Pair Model

When the model of mobile phone attributes and sentiment pairs is represented as a graph, each attribute word and each sentiment word is a node. An edge between two nodes represents a cooccurrence of the attribute word and the opinion word, and the number of cooccurrences is the number of edges between them. Each node both attracts and repels the surrounding nodes, and the nodes cluster according to these forces. Nodes with stronger attraction cluster near each other, while those with stronger repulsion fall into different clusters. In this state, the energy of the whole graph falls to its minimum. Finally, the graph contains many clusters of nodes, each of which consists of both attribute words and sentiment words. We assume that sentiment words match the attribute words in the same cluster more closely. In other words, the sentiment words in a cluster can be used to accurately evaluate the characteristics of the mobile phone in that cluster.

The purpose of the model of mobile phone attributes and sentiment pairs is to cluster the closely connected nodes in the graph and to separate the loosely connected nodes, thus reducing the graph’s energy to a minimum. The lower the energy is, the more accurate the clustering effect is. In the model, we defined the degree of a node as the number of edges connected to that node. The greater the degree of a node is, the stronger its attractive and repulsive forces are. The model calculates the energy of graph $P$ as shown in

$$U(P) = \sum_{\{u,v\} \in E} \lVert p_u - p_v \rVert - \sum_{\{u,v\}:\, u \neq v} \deg(u)\,\deg(v)\,\ln \lVert p_u - p_v \rVert, \tag{1}$$

where $U(P)$ is the energy of graph $P$, $u$ and $v$ are the two nodes in graph $P$, $E$ is the set of edges (one edge per cooccurrence), $p_u$ and $p_v$ are the locations of the two nodes in the graph, $\lVert p_u - p_v \rVert$ is the Euclidean distance of the two nodes, $\deg(u)$ is the degree of node $u$ (which is equal to the number of edges connected to node $u$), and $\deg(v)$ is the degree of node $v$ (which is equal to the number of edges connected to node $v$).
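To make the energy concrete, the sketch below evaluates it for a given two-dimensional layout. It assumes the standard edge-repulsion form of the LinLog energy, in which every edge contributes its length as attraction and every node pair contributes the product of the two degrees times the logarithm of the distance as repulsion. The function name and data layout are hypothetical, and the minimization that actually produces the layout is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def linlog_energy(pairs, positions):
    """Edge-repulsion LinLog energy of a layout (assumed standard form).

    pairs: list of (attribute, sentiment) cooccurrences; repeats are parallel edges.
    positions: {word: (x, y)} with a distinct position for every word in `pairs`.
    """
    edge_counts = Counter(tuple(sorted(pair)) for pair in pairs)
    degree = Counter()
    for (u, v), count in edge_counts.items():
        degree[u] += count
        degree[v] += count

    energy = 0.0
    for u, v in combinations(sorted(positions), 2):
        distance = math.dist(positions[u], positions[v])
        attraction = edge_counts.get((u, v), 0) * distance
        repulsion = degree[u] * degree[v] * math.log(distance)
        energy += attraction - repulsion
    return energy
```

A layout that places cooccurring words close together and unrelated words far apart gives a lower energy than one that mixes them, which is the property the clustering exploits.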

Using this model for clustering, we finally obtained two outputs: (1) a diagram of clustering results, from which the size and distribution locations of each class can be visually observed, and (2) the coordinates of each node in the graph, namely, the locations of the mobile phone attribute words and sentiment words. According to the coordinates, we can calculate the distance between attribute and sentiment words. The closer the distance is, the more accurate the sentiment word’s description of the attribute is. We used the Euclidean distance to calculate the distance between the two nodes in the graph. The Euclidean distance calculation formula for the two-dimensional coordinate system is shown in

$$d(u, v) = \sqrt{(x_u - x_v)^2 + (y_u - y_v)^2}, \tag{2}$$

where $u$ and $v$ represent two nodes in the graph, $(x_u, y_u)$ are the coordinates of node $u$ in the graph, and $(x_v, y_v)$ are the coordinates of node $v$ in the graph. The paper adopts a dictionary method to calculate the distance between nodes in the graph: attribute words are separated from sentiment words, the distance from each attribute word (used as a key) to each sentiment word is calculated, and the sentiment words are then sorted by distance from small to large. The smaller the distance is, the more accurate the sentiment word’s description of the attribute is.
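The dictionary method described above can be sketched as follows, assuming the layout coordinates are available as a mapping from word to an (x, y) tuple. The function name and argument layout are hypothetical.

```python
import math


def rank_sentiments(positions, attribute_words, sentiment_words):
    """Map each attribute word to sentiment words sorted by Euclidean distance.

    positions: {word: (x, y)} coordinates taken from the clustering layout.
    Returns {attribute: [(sentiment_word, distance), ...]}, nearest first.
    """
    ranking = {}
    for attribute in attribute_words:
        distances = [
            (math.dist(positions[attribute], positions[sentiment]), sentiment)
            for sentiment in sentiment_words
        ]
        ranking[attribute] = [(word, d) for d, word in sorted(distances)]
    return ranking
```

For each attribute, the first entries of the returned list are the sentiment words that the layout places closest to it.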

5. Experiment

In this study, two experiments were conducted using the correlation model of mobile phone attribute-sentiment words for performance verification: the first experiment was conducted to verify the effect of the model only; the second was conducted to verify the overall effect of the whole system from term extraction to clustering.

When testing the model separately, the pairs extracted from user comments on iPhone 6 were selected as the dataset and then manually marked with the combination of mobile phone attributes and user sentiment words. The same data were then input into the model of mobile phone attribute-sentiment pairs for clustering. The manually labeled clustering results were manually matched with the model clustering results, such that the two results could be compared. Precision, recall, and F1 values were selected as the evaluation criteria according to the manually marked results. The specific formulas for precision, recall, and F1 values are as follows:

$$P = \frac{TP}{TP + FP}, \tag{3}$$

$$R = \frac{TP}{TP + FN}, \tag{4}$$

$$F_1 = \frac{2PR}{P + R}. \tag{5}$$

Table 5 explains the meanings of TP, FN, FP, and TN in equations (3) and (4).

Optimal matching in the model-based clustering results was conducted for each manually labeled cluster Ci. Finally, the average value of the evaluation marks of each cluster was calculated. The experimental results show that the precision, recall, and F1 values resulting from applying only the mobile phone attribute-sentiment model to clustering were 91.64%, 90.83%, and 91.46%, respectively.

The user comments on iPhone 6 were also used as a dataset to test the overall performance of the system. The mobile phone attributes and sentiment pairs were extracted manually and by machine. The manually extracted pairs were manually marked, and the pairs extracted by machine were input into the online review attribute-sentiment word model to get clustering results. According to the results with manual marks, the precision, recall, and F1 values gained were 74.71%, 84.29%, and 79.21%, respectively.
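The evaluation metrics can be computed with a few lines of code. The sketch below assumes the usual definitions in terms of true positives, false positives, and false negatives, with F1 as the harmonic mean of precision and recall; the function names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


print(round(f1_score(74.71, 84.29), 2))
# 79.21
```

The whole-system F1 value of 79.21% follows directly from the reported precision and recall. The model-only value of 91.46% is slightly higher than the harmonic mean of 91.64% and 90.83%, presumably because the scores were averaged over clusters, and an average of per-cluster F1 values need not equal the F1 of the averaged precision and recall.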

It was found that the word pairs obtained after LinLog clustering agree more closely with the manual labeling than the original word pairs do. In other words, the accuracy of the word pairs improved after clustering. This also shows that after clustering, feature words and opinion words are more closely linked, which better reflects the user’s overall evaluation of the features. It also suggests that the LinLog model can, to a certain extent, map feature words and viewpoint words into a vector space in which the corresponding vectors carry semantic information.

The verification results indicate that the mobile phone attribute-sentiment model performs well. Thus, more of the potential value in user comments can be uncovered when the model is applied.

5.1. Online Review Attribute-Sentiment Model

In this study, product analysis was conducted of four mobile phones—specifically, iPhone 6, Samsung Galaxy Note 4, Huawei Honor 4X, and Xiaomi Note 4—and all comments were extracted from http://JD.com. There were 569 comments on iPhone 6, 892 on Samsung, 2,068 on Huawei, and 1,069 on Xiaomi. The correlative model of mobile phone attribute-sentiment pairs was applied to the comments on the four mobile phones. Owing to space limitations, only the results for Xiaomi and Samsung are presented here. The clustering results and distance between attributes and sentiment words are shown in the ensuing figures.

Figures 4 and 5 show the clustering results for the Xiaomi mobile phone and the Samsung mobile phone, respectively. Each node in the graph represents a mobile phone attribute or a user sentiment. The size of a node reflects how often the word occurs in the comments, and the distance between nodes indicates how closely they are related. Figures 6 and 7 show partial attribute-sentiment distance maps for the Xiaomi mobile phone and the Samsung mobile phone, respectively. Tables 6 and 7 compare the weights of users’ emotions on all aspects of the Xiaomi mobile phone and the Samsung mobile phone, respectively. The two tables were constructed in the following manner. First, the positive, negative, and neutral emotion words were looked up for each feature word. The emotional score is the average distance between the feature word and the emotional words. Then, the feature words were divided into 10 categories by manual labeling, and the average sentiment tendency score of each category was calculated. The fraction column in Tables 6 and 7 gives the emotional score. The smaller the value, the stronger the emotional tendency.
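The table construction described above can be sketched as a two-level average. The data structures here are assumptions: a polarity lexicon that labels each emotion word, a manual mapping from feature words to categories, and precomputed layout distances. The paper does not specify whether the category score averages over feature words or over individual pairs; this sketch averages per feature word first.

```python
from collections import defaultdict
from statistics import mean


def category_scores(distances, polarity, category):
    """Average feature-to-emotion distances per (category, polarity).

    distances: {(feature_word, emotion_word): distance in the layout}
    polarity:  {emotion_word: "positive" | "negative" | "neutral"}
    category:  {feature_word: manually assigned category name}
    """
    per_feature = defaultdict(list)
    for (feature, word), distance in distances.items():
        if word in polarity and feature in category:
            per_feature[(feature, polarity[word])].append(distance)

    per_category = defaultdict(list)
    for (feature, label), values in per_feature.items():
        per_category[(category[feature], label)].append(mean(values))

    return {key: mean(values) for key, values in per_category.items()}
```

A smaller score for a (category, polarity) entry means that emotion words of that polarity lie closer to the category's feature words in the layout.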

The cluster results show the users’ evaluation of the four mobile phones.

For the Xiaomi 1, users are satisfied with its feel, camera, and reaction speed, but are dissatisfied with its screen.

The statistical results for the Samsung Galaxy Note 4 show that users are not satisfied with the screen and reaction speed, but it has a better result than the Xiaomi 1.

These results may be related to the different prices of the mobile phones: consumers often have higher requirements for high-priced mobile phones.

The above analysis results disclose users’ evaluation of each attribute of the mobile phones. These results can provide a reference for users to buy mobile phones and also provide a basis for mobile phone manufacturers to further improve their products.

5.2. Comparative Experiment

Table 8 shows the results of clustering the Xiaomi and Samsung data using the LinLog model and the LDA model, respectively. The first row gives the feature word, and the subsequent rows give the opinion words that belong to the same class as the feature word after clustering. Red indicates a viewpoint word that is not related to the feature word. The feature words were chosen to be representative, and the opinion words are arranged by their correlation with the feature word, from high to low. Viewpoint words with superscripts in the table are different Chinese expressions that share the same English translation. It can be clearly seen from the table that feature word-viewpoint word combinations can be effectively mined using the LinLog model.

6. Conclusion

In this paper, we proposed an online review attribute-sentiment pair correlation model. To utilize the model, we first preprocessed user comment data, built a dictionary of mobile phone attributes and a dictionary of sentiment words, and then used them to match user sentiment with mobile phone attributes. Finally, we conducted experiments on one real-world dataset. Our research results can be used to compare user sentiment across various characteristics of mobile phones. This helps merchants to improve their products and helps users make better decisions. This study is based on mobile phone review information but can also be extended to other types of goods on e-commerce platforms. In the future, we intend to extend our work in the following directions. First, we will consider combining syntactic analysis with window-based collocation to extract word pairs and identify negation words in the text in order to realize more accurate clustering results. Second, we will expand our work to other review datasets such as Amazon and Taobao.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Key Research and Development Program (grant number 2017YFB0803300), the National Natural Science Foundation of China (grant numbers 91546121 and 61702043), the National Social Science Foundation of China (grant number 16ZDA055), and the Open Project Fund of the Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education.