Abstract

To assist in filtering and sorting massive numbers of review messages, this paper examines the determinants of review attraction and helpfulness. Our analysis divides consumers’ reading process into a “notice stage” and a “comprehend stage” and considers the impact of “explicit information” and “implicit information” on review attraction and review helpfulness. A total of 633 online product reviews were collected from Amazon China. A mixed-method approach is employed to test the conceptual model proposed for examining the influencing factors of review attraction and helpfulness. The empirical results show that reviews with negative extremity, more words, and higher reviewer rank gain more attraction, and that reviews with negative extremity, higher reviewer rank, mixed subjective property, and mixed sentiment seem to be more helpful. The research findings provide insights that will help online businesses to encourage consumers to write good quality reviews and to take more active steps to maximise the value of online reviews.

1. Introduction

With the rapid development of e-commerce, consumers have become more engaged in online shopping [1]. A growing number of commodities are available online, providing more options for consumers. The enormous choice of different types of products makes shopping decisions more difficult. To offer better references for shoppers selecting merchandise, many e-business websites (e.g., Amazon) provide online product review functions, which enable consumers to post and communicate their opinions about products. Online product reviews are usually written by consumers stating positive or negative user experience of their purchases. Compared with the commodity descriptions given by sellers, online product reviews contain more specific narrative feedback on product features and user experience, which reflects a more realistic perception of the purchase. Therefore, online product reviews are recognised to have greater indicative value and have become an essential feature of many e-business websites [2].

Searching for and reading online product reviews have become more common in online purchasing behaviour. According to Schlosser [3], 58% of shoppers who purchased online tend to provide product review information and 98% of them read reviews before making purchases. This is a good opportunity for online retailers, as offering online product reviews is a way to attract customers and influence their purchasing decisions. Nonetheless, it also imposes the challenge of information overload. Moreover, the quality of reviews varies between individual comments, which makes it difficult for consumers to assess the usefulness of reviews [4]. To address these issues, several leading online retailers provide a “helpful/unhelpful” voting mechanism that lets consumers express their opinions on historical comments. Empirical evidence [5] has shown that reviews voted as helpful have a greater impact on consumers’ purchase decisions and thus on sales performance.

In order to better understand consumers’ behaviour and improve online product review provision, it is important to study what types of comments attract more attention and are more helpful in assisting consumers’ purchasing decisions. Previous studies have been concerned more with the influencing factors of review helpfulness than with those of review attraction. Our objective is to understand the determinants of online review helpfulness as well as attraction within a Chinese language context. Grounded in a literature review of online product review studies and theories of consumer behaviour, we develop two conceptual models, including research hypotheses about influencing factors. Corresponding to the “notice stage” and the “comprehend stage” of the reading process, review attraction and review helpfulness are analysed, respectively. Using review data collected from a well-known e-business website, we empirically examine the influential factors with a mixed-method approach that includes statistical analysis, machine learning methods, and text analytics. We discuss practical implications of the proposed conceptual models, considering strategies to filter reviews and maximise review helpfulness.

This study complements the existing literature by providing a new perspective on consumer behaviour in reading and writing online product reviews. We divide the reading process into two stages, namely, the notice stage and the comprehend stage, and examine the determinants of review attraction and review helpfulness, respectively. We also distinguish the information contained in an online comment system as explicit or implicit. These distinctions are highly consistent with real world circumstances but have rarely been seen in previous studies. In addition, our research expands variable design in related fields by combining structured and unstructured data. Text analysis methods are used to quantify the width, depth, mixture of objectivity and subjectivity, and mixed sentiment of reviews [19]. Furthermore, this paper attempts to use machine learning models to rank review helpfulness based on its determinants. Practically, this helps explore the extended value of our research findings for designing review systems on C2C and B2C websites, thus improving the consumer online shopping experience.

Our findings indicate that review attractiveness is affected by explicit information, such as review extremity, review reliability, and reviewer credibility, while review helpfulness is affected by both explicit and implicit information, such as review extremity, reviewer credibility, subjectivity, and sentiment orientation. In addition, commodity category can be a moderator that strengthens the effects of review extremity and mixed sentiment. These results are in accordance with our assumptions as well as previous studies (e.g., [4, 11–13, 16, 20]). However, in our examination, two novel variables, review width and depth, are not significant in influencing review helpfulness, possibly because of the increased cost of reading and processing excessive information. We also apply the revised conceptual models and discuss strategies to improve product review provision with a better filter system and prediction of helpfulness, which offers a pragmatic approach to online review system design.

The rest of the paper is organised as follows. Section 2 reviews relevant literature. This is followed by the development of conceptual models and hypotheses. The next section introduces the methods adopted in this study. The following two sections describe the data, variables, and models in the empirical test. Findings and further discussion are then presented in Section 7. The last section concludes the paper.

2. Literature Review

In this section, we draw on two important research streams in the literature: e-WOM and helpfulness of online product review. In addition, as our sample data is Chinese text, we also review relevant literature considering different language characteristics and mining approaches.

Online product review, also termed online review or online consumer review in the literature, is widely recognised as a type of e-WOM communication [21]. Extant studies have documented various definitions of this emerging form of Internet communication. According to Hennig-Thurau et al. [22], e-WOM is “any positive or negative statement made by potential, actual, or former customers about a product or company, which is made available to a multitude of people and institutions via the Internet.” Unlike traditional word-of-mouth, e-WOM has distinct characteristics, such as open access, anonymity, being directed to individuals, efficient storage, and wide spread. These allow a transformation in the mechanism whereby review information is channelled among consumers, sellers, manufacturers, and markets [23].

Online shopping is a typical asymmetric information scenario, where consumers often make decisions with limited knowledge. Utz et al. [24] find that online reviews, as indicators of commodity quality, are more reliable than conventional product descriptions presented by retailers or manufacturers. More consumers prefer to shop on websites that provide review information, as it can help them make rational purchase decisions. This kind of affirmative or negative feedback on commodities from buyers has a powerful impact on purchasing behaviour (e.g., [25, 26]). Previous literature has demonstrated that online consumer reviews significantly influence sales (e.g., [8, 27, 28]). In addition, the rise of online communities strengthens the conformity effect on consumers. To mitigate purchasing risks, an individual tends to comply with the group norm and accepts others’ opinions as true information about the product [29].

Such large-scale information sharing in the virtual community may help build trust between buyers and sellers in the online market by further disclosing information about the quality of the product and the credibility of the seller [5]. The presence of consumer reviews online may positively affect customers’ perception of the usefulness of the website, which has the potential to attract consumer visits and increase the time spent online [6]. However, with more reviews and information overload on the Internet, online retailers have incentives to provide good quality reviews that are more engaging, reliable, helpful, and valuable. Thus, to realise the benefits of consumer reviews in the online network, it is essential to understand the mechanism by which online users participate in the online review system, particularly what makes a review attractive and helpful from the perspective of consumers’ cognitive process and behaviour.

Helpfulness of online product comments reveals how consumers evaluate a review. Based on information economics theories, Mudambi and Schuff [6] define a helpful customer review as “peer-generated product evaluation that facilitates the consumer’s purchase decision process.” They differentiate search and experience goods and indicate the influencing effects of review extremity, review depth, and product type on perceived helpfulness. From a communication and persuasion theory perspective, Peng et al. [7] propose a conceptual model indicating that review rates and length, votes for helpfulness, and Internet use experience are the most influential factors of review helpfulness. Empirical studies (e.g., [12, 13, 20]) have shown that the textual content of a review and reviewer engagement characteristics (e.g., review rating, reviewer’s reputation, and reviewer exposure) do influence review helpfulness. Indeed, from the information processing perspective, Forman et al. [10] illustrate that disclosure of reviewer identity information has a positive association with subsequent sales and peer recognition of reviews. Reviews with identity-relevant information tend to be more helpful and can shape the online community response.

Another relevant research stream mainly focuses on the language features and sentiment of online reviews. Using text mining methods, Cao et al. [4] highlight that the semantic features of a review have a greater impact on the vote for helpfulness. In other words, reviews with extreme opinions seem more helpful than mixed or neutral comments. Hao et al. [16] find a positive correlation between review helpfulness and positive attitude, a high mixture of positive and negative attitudes, a high mixture of subjective and objective expressions, and average sentence length. Sun [17, 18] focuses on the effect of sentiment orientation and demonstrates that the direction and admixture of sentiment orientation have a significant impact on consumer perception of review helpfulness. H. Wang and W. Wang [30] propose an opinion-aware analytical framework for sentiment mining of online product reviews. It is useful for detecting product weaknesses, which leads to a reduction in product defects and to quality improvement. Furthermore, in review helpfulness prediction, a combination of linguistic features and other features (e.g., subjectivity and readability) has great potential for improving accuracy [15].

Overall, e-WOM has received more attention in academia since online consumer communities emerged and gained in scale. Research on online product review is still in progress, and most studies focus on three main areas: (1) the impact of online reviews on consumers’ behaviour and perceptions [6, 25, 26], (2) the dynamic relations between online reviews and sales in the e-marketplace [17, 18, 31, 32], and (3) the motivation and mechanism of online review communication and transmission [33, 34]. Existing studies usually choose search commodities (e.g., mobile phones) and experience commodities (e.g., movies) as research objects, and the majority of them adopt quantitative methods to model and empirically examine the determinants of review helpfulness. In recent years, text mining approaches have been incorporated to evaluate textual data in online reviews [4, 35]. However, the impact factors of review helpfulness need further examination with improved empirical models, mixed methods, and more extensive product types. This paper investigates the influential factors of review attraction and helpfulness.

3. Conceptual Framework and Hypotheses

3.1. Review Attraction and Review Helpfulness

To help consumers choose useful reviews from massive and diverse comments, many e-commerce websites design voting systems to evaluate helpfulness and sort all reviews based on the voting results. To optimise the interactive comment system, it is imperative to understand consumers’ helpfulness perception mechanism. This can be investigated from two aspects. One is the layout of the review module on the website. In a typical review website design, there is a list of comments provided by previous consumers, for which only partial information is displayed on the initial web page. To access more details of a particular review, the system normally directs readers to a new page or unfolds the hidden parts of the review text. This design is closely related to the other aspect, namely, consumers’ reading behaviour and cognitive process. At first, the general information of reviews does not require concentration, but the distinctive characteristics of particular reviews will draw the attention of readers. A more detailed examination of the textual content of a particular review then requires careful reading and understanding. These two processes take place in sequence, with the attractiveness and the helpfulness of a review highlighted at the two stages, respectively. Therefore, we divide the review perception process into two phases.

Phase 1 (notice stage). Consumers browse the partial information displayed in the comment list and identify the reviews that interest them most and merit detailed reading. Generally, the information available at this stage is the most straightforward, which makes it simple for consumers to grasp key messages and judge whether a review is worth reading further.

Phase 2 (comprehend stage). After figuring out the reviews that are most attractive, shoppers will read the texts of those comments carefully. Then they will have their own judgement of review values based on individual understanding of the textual information. Thereafter some customers will give feedback by voting on helpfulness of the reviews, while others may not.

To intuitively illustrate these two phases, we take a typical online product review on https://amazon.cn as an example. It contains several different messages, including review rating, reviewer’s identity, product, and comment details. Consumers usually glance over all messages and decide whether it is worth reading through. This is the first phase where consumers notice the important reviews in a short time. Afterwards shoppers will read, learn, and evaluate the textual statement, which completes the second phase. Accordingly, we discuss the influencing factors of review attraction and review helpfulness.

Existing research usually interprets those factors from two aspects, namely, review text features and reviewer characteristics. Different from prior studies, this paper takes consumer behaviour as an entry point and classifies information as explicit or implicit. Explicit information refers to messages that can be easily captured without careful examination and that influence subsequent reading decisions. Implicit information is hidden in the review text and can be uncovered only with careful reading and interpretation. This classification is in line with the two stages of the perception process. At the notice stage, consumers are able to observe explicit information effortlessly, whereas implicit information can only be acquired at the comprehend stage. On account of these considerations, we assume that (1) explicit information dominates consumer perception at the notice stage; that is, review attraction is influenced by explicit information; and (2) at the comprehend stage, both explicit and implicit information have impacts on consumers; that is, review helpfulness is influenced by explicit and implicit information.

After an extensive survey of the literature and e-commerce websites, we take the following factors into consideration. Explicit information includes review extremity (i.e., numerical star rating), review reliability (i.e., review length), and reviewer credibility (i.e., reviewer ranking). Implicit information includes review width (i.e., product features mentioned in the review), review depth (i.e., number of characters used in describing a single commodity feature), mixture of subjectivity and objectivity (i.e., the review contains both subjective and objective information), and mixed sentiment (i.e., the review contains both positive and negative information). Figure 1 sketches the relationships among the key elements in this paper. We observe that a review normally contains 15–400 characters and mentions from 1 to 5 product feature words. Review length (i.e., review reliability) is therefore very likely to be associated with review width, and review reliability is excluded from the examination of the impact factors of helpfulness. In the next subsection, research hypotheses are developed regarding the influencing factors of review attraction and review helpfulness.

3.2. Hypotheses Development and Conceptual Model
3.2.1. Hypotheses on Review Attraction Determinants

Review extremity manifests consumer’s intense or moderate review attitudes towards products, which is the deviation from the mean or a reasonable value of an attitude scale [6]. Numerical star rating is widely used for reflecting the review extremity, and it typically ranges from one to five stars. In general, a very high rating (five stars) and a very low rating (one star) indicate an extremely positive or negative view of the commodity, respectively, and a moderate view is given by rating three stars. Past research [8] has identified that reviews expressing more extreme and strong feeling and views are more likely to interest and disperse among the online community. Besides, on e-commerce websites, numerical star rating is highly conspicuous in the comment area. Moreover, emphases placed on writing review vary among different consumers. This leads to diversity in review quality. We define this variance as the reliability of online reviews. There are various criteria to assess reliability, among which review length is most intuitive. It is broadly admitted that reviews with more characters have greater potential to be more reliable as more efforts have been put in writing these reviews [6]. From consumers’ view, a longer review has greater attractiveness as it reflects the writer’s sincerity and may contain more useful and genuine information. Past research has also illustrated the important role of source credibility in user’s adoption of online information [36]. Within an e-commerce context, the source credibility can be partially represented by reviewer’s credibility, which includes user’s identity, reputation, and activity level. Reviewer’s information is usually displayed clearly on the websites and its exposure can be assumed to positively affect the attractiveness of product reviews. 
Therefore, we derive the following hypotheses, and Figure 2 illustrates the conceptual framework for the subsequent empirical examination of the influencing factors of review attraction:
Hypothesis: review extremity has a significant positive association with review attraction.
Hypothesis: review reliability has a significant positive association with review attraction.
Hypothesis: reviewer credibility has a significant positive association with review attraction.

3.2.2. Hypotheses on Review Helpfulness Determinants

As stated earlier, review extremity is directly presented as numerical star rating. There is evidence that review extremity can influence consumers’ judgement and perception of the value of comments [9]. To be specific, extremely positive or negative reviews are more helpful than moderate reviews [10, 11]. In addition, the quality of a review depends on many factors such as reviewer’s online shopping experience. Consumers consider reviewer’s perceived reputation and reliability, which may influence message receivers’ attitudes and evaluations of the comment text [14]. Generally, people with more professional knowledge are more likely to earn trust from others in the community. Another important factor is review width, which can be understood as completeness of the review information. For a single commodity, a high degree of review width means the comment text covers more features of that product. It increases efficiency in acquiring information about product features in a short time. Furthermore, most of prior studies consider review depth as the length of review text, which interprets “depth” on the basis of a whole review entry. Here, we try to explain review depth from the product feature perspective. It leads to another interpretation of review depth, that is, the average number of characters in describing a single feature of a particular product. With in-depth summary of a product feature, consumers are able to gain detailed information in specific contexts. The added depth of information is beneficial to consumers to make decisions. On the other hand, it may require additional reading effort and thus alter their perception of review values. 
Thus, the following hypotheses are stated:
Hypothesis: review extremity has a significant positive association with review helpfulness.
Hypothesis: reviewer credibility has a significant positive association with review helpfulness.
Hypothesis: review width has a significant positive association with review helpfulness.
Hypothesis: review depth has a significant positive association with review helpfulness.

Individuals have different ways of expressing opinions and feelings, which leads to varied presentation of online product review text. Pang and Lee [37] classify documents into subjective and objective in the text mining process: subjective extracts are sentiment-oriented, whereas objective ones only list and confirm characteristics of products. Ghose and Ipeirotis [11] find that, for search commodities, the extent of subjectivity in a review significantly influences consumers’ perception of review helpfulness, and that a review is more informative and helpful with a mixture of subjectivity and objectivity. As for experience goods, a highly sentimental description of personal feelings that is not captured in the product introduction is far more valuable than directly advertised information. In addition, advertising label information can carry one-sided or two-sided messages. One-sided information contains either positive or negative messages, whereas a two-sided argument consists of both. Previous evidence shows that two-sided messages enhance perceived credibility in consumer communications [38, 39] and that the proportion of different attitudes in a review is critical to two-sided message effectiveness [40]. Different from most studies on review helpfulness, which consider the tendency and intensity of a single side of attitude, Hao et al. [16] adopt the two-sided concept and identify that mixed sentiment has a positive influence on the review helpfulness of experience goods. From the above discussion, the following hypotheses are derived:
Hypothesis: mixture of subjectivity and objectivity has a significant positive association with review helpfulness.
Hypothesis: mixed sentiment has a significant positive association with review helpfulness.

Consumers’ evaluation ability and search costs vary among different types of product, and this has a great impact on purchase decisions. In a traditional offline shopping environment, commodities are categorised as convenience goods, shopping goods, and specialty goods [41]. With the decrease in search costs in the Internet shopping era, it is hard to clearly distinguish these types. Instead, a widely accepted practice is to classify products into search goods and experience goods, based on how and to what extent consumers learn about product features. Search goods have lower purchase uncertainty, as the characteristics of the product can be easily evaluated before purchase (e.g., cameras and MP3 players). As for experience goods, product features cannot be observed in advance but can be ascertained upon consumption (e.g., skin care, food, books, and music). Furthermore, information needs differ between search goods and experience goods, and so do consumers’ perception and feedback, which has the potential to affect review helpfulness through review extremity and mixed sentiment. Thus, product type may have a moderating effect on these two elements, which are assumed to influence review helpfulness. From a utility viewpoint, the features of search goods that affect perceived utility are relatively easy to measure and verify; hence, extreme reviews are commonly accepted as authentic. By contrast, extreme reviews of experience goods may not win agreement, as they reflect more subjective judgement according to individual experience and preference. In addition, reviews with mixed sentiment expressions may assist more in decisions on buying search goods, as these reviews offer comprehensive information about every aspect of the product. This may not be true for experience goods because of the subjective nature of their reviews: they have a greater propensity to contain both positive and negative opinions, so the mixture may have a reduced effect on consumers’ perception of review helpfulness.
Therefore, we put forward the following hypotheses, and the conceptual framework is developed accordingly (see Figure 3):
Hypothesis: product type moderates the effect of review extremity on review helpfulness, and an extremely negative review of a search product has a stronger impact on review helpfulness.
Hypothesis: product type moderates the effect of mixed sentiment on review helpfulness, and a mixed sentiment review of a search product has a stronger positive impact on review helpfulness.

4. Research Methods

4.1. Data Acquisition

Studies of online reviews require the acquisition of massive data from specific websites and online comment systems. A web crawler is a commonly used tool for this purpose. Also known as a web spider, it is a program that systematically and automatically browses and extracts information from web pages. Many software tools, such as Nutch, JSpider, and WebCollector, have been developed to satisfy users’ web crawling needs. In this paper, LocoySpider is chosen as the web crawler tool. Developed in 2005, LocoySpider is a specialised web crawling and data extraction tool that can download text, images, and other files from web pages at high speed. It can also manage data, for instance, editing and filtering data, importing data into a database, or publishing it to a web backend. This tool is best suited to static web pages with standardised URLs. More details on data collection and samples are provided in Section 5.1.

4.2. Data Preprocess

For the textual data contained in reviews, preprocessing consists of word segmentation and part-of-speech tagging. The original review text is unstructured or semistructured data, which needs to be converted to structured data before useful information can be extracted. Compared with English, where words are separated by spaces, Chinese text has no separators between words. Thus, the quality of word segmentation and part-of-speech tagging is critical to the subsequent steps of information processing. We adopt ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System), a Chinese automatic word segmentation system developed by the Institute of Computing Technology, Chinese Academy of Sciences. It is regarded as one of the best Chinese lexical analysis systems. ICTCLAS supports word segmentation, part-of-speech tagging, and unknown word recognition. More detail about this system is given in Zhang et al. [42].

4.3. Text Analysis
4.3.1. Feature Extraction

Consider the collection of commodities used in the analysis, where each commodity has its own set of product reviews and the number of review entries differs from commodity to commodity. Each review contains specific descriptions of product features, and this is the information required in feature extraction. Product features are extracted from the review text in three main steps, described as follows.

Labelling Noun, Noun Phrase, and Construct Pending Character Words List. In general, nouns and noun phrases are more commonly seen in the description of product features. So the first step in feature extraction is labelling noun and noun phrase and viewing them as pending product features. When marking noun phrase, window size can be used as a control variable to ensure the words in the noun phrase are no more than the number set for window size. Here the window size is set as 3.
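The labelling step can be illustrated with a minimal Python sketch. It assumes the review has already been segmented and tagged, with each token given as a (word, tag) pair and "n" marking nouns; the function name `candidate_features` and the rule that a candidate is any run of at most three tokens ending in a noun are illustrative assumptions, not the exact procedure used in this study.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag), e.g. ("screen", "n")


def candidate_features(tokens: List[Token], window: int = 3) -> List[Tuple[Token, ...]]:
    """Collect pending feature words: every run of at most `window`
    consecutive tokens that ends in a noun (assumed tag "n")."""
    candidates = []
    for end in range(len(tokens)):
        if tokens[end][1] != "n":
            continue  # a candidate must end in a noun
        for size in range(1, window + 1):
            start = end - size + 1
            if start < 0:
                break  # ran past the beginning of the sentence
            candidates.append(tuple(tokens[start:end + 1]))
    return candidates


# Hypothetical tagged fragment (English stand-in for segmented Chinese text).
tagged = [("large", "a"), ("screen", "n"), ("resolution", "n")]
pending = candidate_features(tagged, window=3)
```

The window size bounds the phrase length, so no candidate contains more than three words; unrelated candidates are expected to be removed in the filtering step that follows.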

Filter Character Words with Help of Part-of-Speech Path Template. The phrases describing product features often follow certain part-of-speech patterns, which can help to remove unrelated phrases. The part-of-speech path template is built manually by consulting the text of product descriptions from official websites or reviews. Using mobile phones as an example, the template contains n/n, a/n, a/n/n, a/a/n, v/n, v/n/n, and v/v/n, where n, v, a, and x, respectively, denote noun, verb, adjective, and string. This step increases the accuracy in identifying noun phrases that describe product features.
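A small Python sketch of the template filter is given below. The template set is the mobile phone example listed above; the helper names `pos_path` and `filter_by_template`, and the choice to let single nouns pass the filter, are assumptions made for illustration.

```python
# Part-of-speech path template from the mobile phone example in the text.
TEMPLATES = {"n/n", "a/n", "a/n/n", "a/a/n", "v/n", "v/n/n", "v/v/n"}


def pos_path(phrase):
    """Join the tags of a phrase, e.g. (("large","a"),("screen","n")) -> "a/n"."""
    return "/".join(tag for _, tag in phrase)


def filter_by_template(candidates, templates=TEMPLATES, keep_single_nouns=True):
    """Keep candidates whose tag path matches the template.

    Assumption: a bare noun ("n") is kept as well, because the labelling
    step treats single nouns as pending feature words.
    """
    kept = []
    for phrase in candidates:
        path = pos_path(phrase)
        if path in templates or (keep_single_nouns and path == "n"):
            kept.append(phrase)
    return kept


# Hypothetical candidates; "d" stands for an adverb tag outside the template.
candidates = [
    (("large", "a"), ("screen", "n")),
    (("very", "d"), ("screen", "n")),
    (("battery", "n"),),
]
kept = filter_by_template(candidates)
```

Because the template is a plain set of tag paths, a different product category only needs a different set rather than different code.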

Further Removal of Repeated or False Character Words. Repeated or false character words may remain after filtering, and they affect the accuracy of the quantified data. To optimise the character words list, we construct a character words hierarchy for each commodity and create a synonym list of character words. The final character words are then determined with the assistance of programmatic screening and manual inspection.
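The programmatic part of this screening might look like the following sketch, which maps each character word to a canonical form through a synonym list and drops repeats. The function name `normalise_features` and the sample synonym mapping are hypothetical; the manual inspection described above is not modelled.

```python
def normalise_features(features, synonyms):
    """Map each feature word to its canonical synonym and drop repeats,
    preserving the order of first appearance."""
    seen = set()
    result = []
    for feature in features:
        canonical = synonyms.get(feature, feature)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


# Hypothetical synonym list for one commodity.
synonyms = {"display": "screen", "cell": "battery"}
final_features = normalise_features(["display", "screen", "battery", "screen"], synonyms)
```

Under these assumptions, the number of distinct canonical features found in a single review could serve as a simple count of the product features it mentions.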

4.3.2. Sentiment Analysis

A polarity dictionary is the basis for analysing the evaluative character of a word. In Chinese sentiment analysis practice, HowNet has been widely applied and has become increasingly rich in words and phrases that are distinguished as positive or negative. Apart from HowNet, we also create a field sentiment dictionary that includes words depicting specific commodity features (e.g., fit) and Internet terms conveying consumers’ sentiment (e.g., geilivable, which means that something or someone is awesome). For each type of product, a number of reviews are selected and labelled manually as positive or negative to build the field sentiment dictionary. It is then combined with the HowNet entries, with repeated words deleted, to form an integrated and complete sentiment dictionary.

Some words in the review text cannot be found in the sentiment dictionary. Judging the sentiment orientation of these words normally requires computing the similarity between the unknown words and the words in the sentiment dictionary. Here, the SO-PMI method is adopted. First, two sets of words are identified: $Pwords$ is the set with positive semantic orientation and $Nwords$ is the set with negative semantic orientation. Then the semantic orientation of a specific string $word$ is calculated from the strength of its association with $Pwords$ minus the strength of its association with $Nwords$:

$$\text{SO-PMI}(word) = \sum_{pword \in Pwords} \mathrm{PMI}(word, pword) - \sum_{nword \in Nwords} \mathrm{PMI}(word, nword). \quad (1)$$

PMI between two words is defined as

$$\mathrm{PMI}(word_1, word_2) = \log_2 \frac{p(word_1 \,\&\, word_2)}{p(word_1)\, p(word_2)}. \quad (2)$$

When SO-PMI is positive and greater than a threshold value $\delta$ ($\delta > 0$), the word has a positive semantic orientation (i.e., praise). When SO-PMI is negative and less than $-\delta$, the word has a negative semantic orientation (i.e., criticism). In other circumstances, the word is regarded as semantically neutral. The threshold can be determined by testing the accuracy of measuring praise and criticism under different values. After the above process, we keep the words having either positive (denoted by 1) or negative (denoted by $-1$) semantic orientation as the semantic labels of each review. Whether the review conveys positive or negative information can then be determined by examining the sentiment words in the review text.
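The SO-PMI labelling rule can be sketched as follows. This is an illustrative sketch only: probabilities are estimated from document co-occurrence in a toy corpus, a small additive smoothing constant is introduced to avoid taking the logarithm of zero, and the seed words, corpus, and threshold value are all assumptions rather than the paper's data or settings.

```python
import math


def build_doc_sets(reviews):
    """Map each token to the set of review indices in which it occurs."""
    doc_sets = {}
    for idx, tokens in enumerate(reviews):
        for token in tokens:
            doc_sets.setdefault(token, set()).add(idx)
    return doc_sets


def pmi(word1, word2, doc_sets, n_docs, smoothing=0.01):
    """PMI = log2(P(w1 & w2) / (P(w1) * P(w2))), with assumed smoothing."""
    d1, d2 = doc_sets.get(word1, set()), doc_sets.get(word2, set())
    p1 = (len(d1) + smoothing) / n_docs
    p2 = (len(d2) + smoothing) / n_docs
    p12 = (len(d1 & d2) + smoothing) / n_docs
    return math.log2(p12 / (p1 * p2))


def so_pmi(word, pwords, nwords, doc_sets, n_docs):
    """Association with the positive seeds minus association with the negative seeds."""
    positive = sum(pmi(word, p, doc_sets, n_docs) for p in pwords)
    negative = sum(pmi(word, n, doc_sets, n_docs) for n in nwords)
    return positive - negative


def orientation(score, delta):
    """Return 1 (praise), -1 (criticism), or 0 (neutral) for a threshold delta > 0."""
    if score > delta:
        return 1
    if score < -delta:
        return -1
    return 0


# Toy corpus and seed sets (hypothetical, for illustration only).
reviews = [
    {"geilivable", "good", "fast"},
    {"geilivable", "good"},
    {"slow", "bad"},
    {"bad", "broken"},
    {"good", "fast"},
]
doc_sets = build_doc_sets(reviews)
score_pos = so_pmi("geilivable", ["good"], ["bad"], doc_sets, len(reviews))
score_neg = so_pmi("broken", ["good"], ["bad"], doc_sets, len(reviews))
```

In this toy corpus "geilivable" co-occurs only with the positive seed and "broken" only with the negative seed, so the two scores fall on opposite sides of any reasonable threshold.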

4.3.3. Subjectivity and Objectivity

Little research in the existing text analysis literature studies how to understand and classify subjective and objective messages, especially for Chinese text. Because of the complexity of Chinese grammar and lexical items, unsupervised or semisupervised learning methods seem too complicated and unlikely to achieve the desired outcome. Therefore, an observation method is used to uncover the pattern. Here, objective information refers to messages that describe product characters in an unbiased way with simple language, while the reviewer’s advice, opinion, attitude, and sentiment are presented in subjective information. This paper sets criteria that help identify subjective information; the remaining messages fall into the objective category. We define information as subjective in the following cases:
(a) if it clearly states advice on the purchase decision, such as suggesting that others (not) buy the product or (not) recommending the commodity;
(b) if it conveys the reviewer’s feeling, emotion, or complaint, such as “it is totally a waste,” “I regret to buy this,” “why it is more expensive,” and “no longer believe domestic product”;
(c) if it is not related to the commodity characters, such as “my boyfriend really likes it” and “there is obvious discrimination against the Chinese customers.”

A review text has a mixture of subjectivity and objectivity if it contains both subjective and objective information. To quantify this variable, we manually label each review based on the criteria set above. A value of 0 denotes a review with only subjective or only objective information, and 1 denotes a mixture of both.

4.4. Data Analysis

In this paper, multiple linear regression is adopted to examine the correlation of review attraction and review helpfulness with potential determinants. Support vector machine (SVM) and random forest models are used for classification and prediction of review helpfulness.

Multiple linear regression deals with two or more explanatory variables and models their relationship with a dependent variable. It is more realistic, as a particular problem is rarely influenced by a single factor, and investigating more factors gives the model greater explanatory power. A general form of the multiple linear regression model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon, \tag{3}$$

where $y$ is the dependent variable, $x_1, \ldots, x_k$ are explanatory variables, $\beta_0, \ldots, \beta_k$ are parameters, and $\varepsilon$ is the residual.
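For illustration, the parameters of such a model can be estimated by ordinary least squares. The sketch below solves the normal equations with plain Python; the paper does not state which software or estimator was used, so the estimator choice, the function name, and the toy data are assumptions.

```python
def ols(X, y):
    """Least-squares estimate of [b0, b1, ..., bk] for y = b0 + sum(bi * xi) + e.

    X is a list of rows without the intercept column; y is a list of responses.
    The normal equations (A^T A) b = A^T y are solved by Gauss-Jordan elimination.
    """
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(A, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


# Toy data generated exactly from y = 1 + 2*x1 - 3*x2 (hypothetical).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, -2, 0, 2]
coefficients = ols(X, y)
```

Because the toy responses contain no noise, the estimated coefficients recover the intercept and slopes used to generate them up to floating-point error.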

SVM is a supervised learning model that is widely applied in classification and regression analysis. It performs well especially when applied to small data sets and nonlinear models. An SVM constructs a set of hyperplanes and finds an optimal solution that maximises the margin around the separating hyperplane. A better separator is achieved by the hyperplane that has the largest distance to the nearest training data points (support vectors). It can be applied to both linearly separable and nonlinearly separable sets.

Random forest is a classifier constructed by an ensemble of classification trees. Each classification tree is independent and built by using a random subset of the training data. When there is a new input vector, each tree gives a classification and the forest chooses the classification result with the most votes from each tree. Random forest has some advantages. For example, it is an accurate algorithm and runs efficiently on large datasets. Besides, it can detect variable interactions.

5. Data and Measurement

5.1. Data Collection

We choose Amazon China (http://www.amazon.cn/) as the e-commerce platform for our data source. Amazon is a worldwide online retail market and has an extensive consumer review system [12]. Over the last two decades, the Amazon review system has been developed and improved comprehensively. Moreover, Amazon was the first to provide a helpfulness vote system, which creates great value. Based on the helpfulness votes, Amazon ranks all reviews and reviewers. Therefore, we collect actual consumer review data from Amazon.

In addition, the selection of commodities is critical to the reliability of the empirical test. Our selection is based on two basic rules. First, there should be adequate review data of the chosen product, and the threshold is set as 300 comments. Second, both search commodity and experience commodity should be included, and each covers at least 2 categories. It helps reduce effects of limited selection of commodities. 8 products from 4 categories are chosen as research objects (see Table 1).

5.2. Measurement
5.2.1. Dependent Variable

Review_Attraction. Ideally, review attraction should be measured by number of people who read the review. However, this data cannot be observed and recorded from websites. Amazon provides information of total votes for the reviews, which is generally admitted to have significantly positive relationship to number of views. Thus, we use total votes to measure review attraction and it is a continuous variable that is equal to or greater than zero.

Review_Helpfulness. There are two ways to measure review helpfulness. One is the number of votes for usefulness, and the other uses the ratio of the number of helpful votes to total votes. The latter method has been widely adopted as it considers the effect of votes for useless reviews on review helpfulness. We also adopt this approach, so this variable takes values between 0 and 1. Note also that we remove the reviews whose total votes are less than 0 in order to make the sample data more reasonable.

5.2.2. Independent Variable

Review_Extremity. In previous studies, review extremity is often measured by the scores or star ratings given by consumers. The numerical star rating usually takes integer values between 1 and 5. Thus, we define review extremity as the difference between the score of an individual review and a specific value. We use 3, the middle score, as the specific value, which gives the variable an integer value between $-2$ and 2.

Review_Reliability. In this paper, review reliability demonstrates consumer’s judgement on review authenticity and quality. Number of characters in the review text is a useful indicator. Usually, a longer review shows that more time is spent on writing the review. Therefore, we use number of characters in review text to measure this variable, which is continuous and greater than 1.

Reviewer_Rank. Reviewer’s credibility is reflected by various pieces of identity information such as rank and reputation. Based on Amazon online system, we choose reviewer rank to depict reviewer credibility, which is a continuous variable and is greater than 1.

Review_Width. Review text contains several product features and the number of features covered depicts the width of this review. Considering there are inherent differences of the dimensions of features among various products, we adopt a ratio of number of features mentioned in the review to a specific benchmark. For each commodity, this benchmark is the maximum number of features described in the reviews of that particular product. Thus, it is a continuous variable with a value between 0 and 1.

Review_Depth. It is considered as the number of characters used for describing a single product feature. We use the ratio of the number of characters to the total number of features outlined in the review. It is a continuous variable and is greater than 1.

Review_Object. It is challenging to evaluate the degree of objectivity and subjectivity of a review, as it varies according to the judgement of different consumers. Hence, we use a binary variable. When a review presents a mixture of both subjective and objective information, the variable takes the value 1; otherwise, it takes the value 0.

Review_Sentiment. The review sentiment can be examined from three aspects: sentiment orientation, sentiment intensity, and level of mixed sentiment. The first two are mainly measured by the review extremity. So here we focus on the third aspect. It is a binary variable which takes value of 1 when the review conveys a mixture of positive and negative attitudes towards the product features; otherwise, the variable value is 0.

5.2.3. Moderator Variable

Commodity_Category. A dummy variable is used to depict the moderating effect. In the research hypotheses, we assume search commodity has stronger moderating effects. Hence, the dummy variable takes value of 1 if the product is search commodity and its value is 0 for experience commodity.

Table 2 lists all variables in this study.
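The ratio-based measures defined in this section can be sketched as small functions. The function and argument names are hypothetical, and the handling of reviews with no votes or no identified features is an assumption, since the text does not specify it.

```python
def review_helpfulness(helpful_votes, total_votes):
    """Ratio of helpful votes to total votes, a value in [0, 1]."""
    if total_votes <= 0:
        # Assumption: the ratio is undefined without votes, so such
        # reviews have to be excluded before this measure is computed.
        raise ValueError("helpfulness ratio needs at least one vote")
    return helpful_votes / total_votes


def review_extremity(star_rating, reference=3):
    """Star rating (1..5) minus the reference score 3, an integer in -2..2."""
    return star_rating - reference


def review_width(n_features, max_features_for_product):
    """Features mentioned divided by the per-product maximum, in [0, 1]."""
    return n_features / max_features_for_product


def review_depth(n_characters, n_features):
    """Average number of characters spent per mentioned feature."""
    if n_features <= 0:
        # Assumption: depth is left undefined when no feature is mentioned.
        raise ValueError("depth needs at least one mentioned feature")
    return n_characters / n_features
```

For example, a 1-star review has extremity of -2, and a 120-character review covering 3 features has a depth of 40 characters per feature.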

5.3. Descriptive Analysis

Review data for 8 products is collected from Amazon China using LocoySpider. Reviews that have fewer than two votes or an extremely large number of votes are removed from the sample. In total, 633 valid samples are collected (see Table 1), including 340 entries for search commodity and 293 entries for experience commodity.

Table 3 presents the descriptive characteristics of the sample. In general, review helpfulness is at a high level, with an average value of 0.74 and a small standard deviation. Review extremity has an average value of 0.42, which illustrates that extremely negative reviews are rare. The mean and variance of Review_Reliability, Review_Depth, and Reviewer_Rank are much higher than those of the other variables. Moreover, in view of the moderating effects of commodity category, the samples show several distinct characteristics (see Table 4). First, search commodity has greater review attraction, as its total votes exceed those of experience commodity. Presumably this is due to a difference in the popularity of the commodities themselves, or to consumers perceiving reviews of different categories differently. Second, the mean and deviation of Review_Extremity for search commodity are 0.58 and 1.5, whereas the values are 0.29 and 1.68 for experience commodity. It seems that search commodity tends to have more extremely positive reviews. Third, the rank of reviewers is significantly lower for reviewers writing comments on experience commodity. One possible reason is that the number of total votes has a positive influence on reviewer rank in the Amazon ranking system. Finally, the mean and deviation of Review_Depth for search commodity are 67.15 and 65.09, about double those for experience commodity (mean 39.26 and deviation 28.32). This shows that reviewers tend to provide more information about features of search commodity. Nevertheless, the selection of search and experience commodities may have an impact on the above statistical differences.

6. Analysis and Discussion

6.1. Regression Model

As analysed in the previous section, the value ranges of several variables in the raw data sample differ significantly from the others, which may cause problems and errors in the outcome of the regression analysis. To ensure data comparability and improve the regression fit, the raw data are processed further. To be specific, because of their large scales, four variables, Review_Attraction, Review_Reliability, Review_Depth, and Reviewer_Rank, are exponentially standardised. To minimise information loss, the logarithm of the values is taken for these variables. Following the steps of building conceptual models, determining variables, and processing data, we construct the regression models as follows.

Regression model of influencing factors of review attraction is

Regression model of influencing factors of review helpfulness is
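The displayed equations for the two models are not available in this text. Based on the variable definitions in Section 5.2, the logarithmic transformation described above, and the factors listed for each model in Sections 6.3 and 6.4, a plausible form of the two models, numbered (4) and (5) in the text, is sketched below. The coefficient symbols are assumptions, and the interaction term with Commodity_Category used in the moderation analysis is not shown.

```latex
\begin{align}
\ln(\text{Review\_Attraction}) &= \beta_0 + \beta_1\,\text{Review\_Extremity}
  + \beta_2 \ln(\text{Review\_Reliability})
  + \beta_3 \ln(\text{Reviewer\_Rank}) + \varepsilon, \tag{4} \\
\text{Review\_Helpfulness} &= \gamma_0 + \gamma_1\,\text{Review\_Extremity}
  + \gamma_2 \ln(\text{Reviewer\_Rank})
  + \gamma_3\,\text{Review\_Width}
  + \gamma_4 \ln(\text{Review\_Depth}) \notag \\
  &\quad + \gamma_5\,\text{Review\_Object}
  + \gamma_6\,\text{Review\_Sentiment} + \varepsilon. \tag{5}
\end{align}
```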

6.2. Variable Correlation Analysis

Correlation analysis is employed to examine the relationships among variables and measure their dependence. This study covers 9 variables, including 7 continuous variables and 2 binary variables. Here we use the Spearman correlation coefficient to examine their association. The results of the correlation analysis are presented in Tables 5–7. First, according to the results in Table 5, all correlation coefficients are less than 0.4, which indicates that these 4 variables have little correlation and a small likelihood of multicollinearity. In view of the significance test, however, review attraction has significant correlations with the other three independent variables. Second, the review helpfulness analysis contains 7 variables and all correlation coefficients are less than 0.4, which illustrates a low possibility of multicollinearity. In comparison to other coefficients, statistically significant correlations (absolute value greater than 0.3) are found between helpfulness and reviewer rank and between review depth and review object. Moreover, we also consider the possible correlation between the two dependent variables. The results in Table 7 demonstrate that there is no significant correlation as the coefficient is 0.349; that is, the level of review attraction has no significant influence on review helpfulness.
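As a reminder of how the Spearman coefficient handles the mix of continuous and binary variables, the sketch below computes it as the Pearson correlation of average ranks, so that tied values (which are unavoidable for binary variables) share a rank. It is a generic illustration with hypothetical toy data, not the software used in the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: a binary variable against a continuous one (hypothetical).
mixed_flag = [0, 0, 1, 1]
helpfulness = [0.2, 0.4, 0.6, 0.9]
rho = spearman(mixed_flag, helpfulness)
```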

6.3. Influencing Factors of Review Attraction

Considering consumer’s behaviour of viewing product comments at the notice stage, we examine the possible influence of review extremity, review reliability, and reviewer credibility on the review attraction. As the focus is placed on the effects of above factors on the transient behaviour of viewing, commodity category is not considered in this model. Based on the regression model in (4), regression analysis is conducted on the entire samples and results are presented in Table 8.

The regression model is reliable, as the Sig. is zero and there is no collinearity. The adjusted $R^2$ is 0.258, which illustrates that the three independent variables in the model can explain the dependent variable to a certain degree. To be more specific, first, Review_Extremity is negatively associated with Review_Attraction. The variable Review_Extremity takes values from $-2$ to 2, indicating extreme negative and extreme positive ratings. If the review has a score of 4 or 5 (i.e., the Review_Extremity value is 1 or 2), consumers pay less attention; if the review is rated 1 or 2 (i.e., the Review_Extremity value is $-2$ or $-1$), consumer attraction increases. In other words, extreme negative reviews have higher review attraction, which partially supports the corresponding hypothesis. Second, Review_Reliability is positively associated with Review_Attraction. This result supports the corresponding hypothesis, indicating that review attraction is higher with more characters in the review text. However, the effect may be weak, as the coefficient is relatively small. Third, Reviewer_Rank is negatively associated with Review_Attraction. A higher value of this variable indicates that the reviewer has less experience in writing reviews. Thus, it can be inferred that reviews written by reviewers with higher rank (i.e., a smaller value of Reviewer_Rank) are more attractive to consumers. Therefore, the corresponding hypothesis is supported.

6.4. Influencing Factors of Review Helpfulness
6.4.1. Regression Results

According to the regression model in (5), we examine the influence of 6 factors on review helpfulness, namely, review extremity, reviewer credibility, review width, review depth, mixture of subjectivity and objectivity, and mixed sentiment. Regression results of the whole sample are shown in Table 9.

The adjusted $R^2$ is 0.170 and the $F$ test result is significant, which means the regression model is statistically acceptable. As for the coefficients, Review_Width and Review_Depth have no significant influence on Review_Helpfulness, which may be affected by commodity category and consumer preference. Nonetheless, the other 4 variables are significant. More specifically, Review_Extremity is significantly associated with Review_Helpfulness. The negative coefficient means that extreme negative reviews tend to be more helpful for consumers, which partially supports the corresponding hypothesis. Reviewer_Rank is negatively associated with Review_Helpfulness. A higher ranking of the reviewer (i.e., a smaller value of the variable Reviewer_Rank) indicates greater credibility in providing product comments, which are more helpful for consumers. Therefore, the corresponding hypothesis is supported. Review_Object is positively associated with Review_Helpfulness. If a review involves both subjective and objective information on products, it tends to have greater helpfulness for consumers, which supports the corresponding hypothesis. Review_Sentiment is positively associated with Review_Helpfulness. This demonstrates that consumers prefer to read reviews containing both positive and negative information. Reviews with mixed sentiment may have greater helpfulness, which supports the corresponding hypothesis.

6.4.2. Moderator Effect of Commodity Category

Moderation may weaken, amplify, or even reverse the original relationship [43]. To test moderator effects in multiple regression, two approaches are commonly used. One is analysis of variance (ANOVA), which can be used when both the predictor and the moderator variables are categorical. The other applies when one or both variables are continuous; in this case, an interaction term is introduced by multiplying the two variables together. Here, we analyse the moderator effects of Commodity_Category on the relationships of Review_Helpfulness with Review_Extremity and Review_Sentiment.

First, we examine the moderator effect of Commodity_Category on Review_Extremity by introducing an interaction term, Category × Extremity. The results in Table 9 show that the coefficient of the interaction term is significant, which indicates that the difference in commodity category has an influence on the effect of review extremity on review helpfulness. Furthermore, the negative coefficient indicates that when Commodity_Category is search commodity, extreme negative reviews have a greater effect on review helpfulness. This result supports the corresponding moderation hypothesis.
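The reading of the interaction coefficient can be made explicit with a generic moderated regression. In the sketch below, $E$ stands for Review_Extremity and $C$ for the Commodity_Category dummy; the symbols, the inclusion of a main effect for $C$, and the omission of the other regressors are assumptions made for illustration.

```latex
\begin{align}
\text{Review\_Helpfulness} &= \beta_0 + \beta_1 E + \beta_2 C + \beta_3 (C \times E) + \cdots + \varepsilon, \\
\frac{\partial\, \text{Review\_Helpfulness}}{\partial E} &= \beta_1 + \beta_3 C .
\end{align}
```

For experience commodity ($C = 0$) the slope on extremity is $\beta_1$, whereas for search commodity ($C = 1$) it is $\beta_1 + \beta_3$. When both $\beta_1$ and $\beta_3$ are negative, the slope is more strongly negative for search commodity, which corresponds to the stronger effect of negative extremity described above.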

In addition, to examine the moderator effect of Commodity_Category on Review_Sentiment, we present a two-way ANOVA. As both variables are categorical, $z$-score standardisation is performed on all variables. The variance analysis results are described in Table 10. The coefficient of the interaction term is significant, which means the effect of review sentiment on review helpfulness can be moderated by commodity category. As illustrated in Figure 4, mixed sentiment in the review text has a greater influence on review helpfulness when the product belongs to the search commodity category. Thus, the corresponding moderation hypothesis is supported.

6.5. Revised Conceptual Model

In the above empirical analysis, the regression results show that the explanatory power of the two proposed models is generally satisfactory. However, several issues in the models need to be further investigated.

The review attraction analysis supports the conceptual model in which review attraction is influenced by review extremity, review reliability, and reviewer credibility. Specifically, reviews that convey extreme negative opinions, contain more text characters, and are written by high ranking reviewers are more attractive to consumers. Nonetheless, the above variables do not seem able to sufficiently explain the determinants of review attraction, as the adjusted $R^2$ is relatively small. Other factors may be considered in the regression analysis. One possible element is the total vote at different time intervals, which may affect the length of exposure time of a particular review in the commenting system. This factor has not been taken into account in this study, as it distracts research attention from the review text itself. Besides, the title of the review may also affect its attractiveness, but this factor is difficult to quantify effectively and is therefore left for future work. Moreover, one limitation of this model is that review attraction is approximated by the number of total votes. When total votes are small (e.g., less than 5), it is hard to fully describe the difference in review attraction. This limitation largely stems from the imperfect information of the online review system.

Regarding the review helpfulness analysis, the conceptual model is partially supported by the empirical test. In general, negative extremity, higher reviewer credibility, a mixture of subjectivity and objectivity, and mixed sentiment are found to have a positive influence on review helpfulness. For search commodity in particular, the effects of review extremity and mixed sentiment are even stronger. However, review width and depth are not significant factors affecting review helpfulness. In terms of width, although reviews containing more product features and information tend to be more attractive in theory, consumers are only interested in the description of product features. Other information seems redundant and increases consumers’ reading effort. Likewise, review depth may also require more reading effort, which may not be offset by the marginal value gained. Another way to understand this is that some reviews provide useful information in concise sentences. The information value offered by such a review is worth consumers’ reading effort, even though it lacks thorough details. Overall, the conceptual framework for review helpfulness is revised as shown in Figure 5.

6.6. Model Application
6.6.1. Online Review Filter

This paper discusses review attraction and helpfulness issues within an information overload context. The empirical analysis has identified several influencing factors based on two conceptual models. More importantly, the research findings can be used for online review system optimisation in order to help consumers efficiently target and obtain information from high quality and valuable reviews. Usually helpful reviews impress consumers with greater value, whereas it may not be the case if the review only seems attractive but not useful. Hence, review helpfulness is the primary concern in consumer’s perception of review value. Therefore, two strategies are put forward to filter massive reviews.

The first strategy focuses on review helpfulness. Online retailers can design the filter based on this single measure. In practice, it is suggested to classify all reviews into two broad categories. One comprises reviews whose helpfulness can be computed from consumers’ votes. The other comprises reviews that have no votes but whose helpfulness can be predicted with the help of classification models and information about review extremity, reviewer rank, mixture of subjectivity and objectivity, mixed sentiment, and other factors. Besides, different algorithms should be considered for different commodity categories due to their moderator effect. This strategy is straightforward, but reviews filtered by this approach may not be attractive enough (e.g., short in text characters). There is a possibility that consumers ignore these reviews and keep browsing. Therefore, a better strategy can be considered that also involves review attraction. While review helpfulness is still the priority, attraction factors are used to further filter out reviews with low attractiveness. Ideally, the combined strategy performs better in that consumers may acquire more useful information from a single page visit.
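The combined strategy can be sketched as a two-step filter. This is only an illustration: the record fields, the cut-off values, the use of text length as the single attraction factor, and the stand-in prediction function are all hypothetical and would need to be replaced by a trained classifier and platform-specific settings.

```python
def helpfulness_score(review, predict_helpfulness):
    """Use the vote ratio when votes exist, otherwise a model prediction."""
    if review["total_votes"] > 0:
        return review["helpful_votes"] / review["total_votes"]
    return predict_helpfulness(review)


def filter_reviews(reviews, predict_helpfulness, min_helpfulness, min_characters):
    """Keep helpful reviews first, then drop those with low attractiveness."""
    helpful = [
        r for r in reviews
        if helpfulness_score(r, predict_helpfulness) >= min_helpfulness
    ]
    attractive = [r for r in helpful if len(r["text"]) >= min_characters]
    return sorted(
        attractive,
        key=lambda r: helpfulness_score(r, predict_helpfulness),
        reverse=True,
    )


# Hypothetical review records, for illustration only.
reviews = [
    {"text": "x" * 120, "helpful_votes": 9, "total_votes": 10},  # helpful, long
    {"text": "short", "helpful_votes": 8, "total_votes": 10},    # helpful, short
    {"text": "y" * 80, "helpful_votes": 1, "total_votes": 10},   # long, unhelpful
    {"text": "z" * 90, "helpful_votes": 0, "total_votes": 0},    # no votes yet
]


def stand_in_predictor(review):
    """Placeholder for a trained classifier; returns a fixed score."""
    return 0.8


shown = filter_reviews(reviews, stand_in_predictor,
                       min_helpfulness=0.7, min_characters=50)
```

Keeping helpfulness as the first filter and the sort key reflects the priority stated above; attraction only removes reviews that consumers would be likely to skip.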

6.6.2. Online Review Helpfulness Prediction

Previous analysis has identified 4 factors influencing review helpfulness. Here, we explore whether these factors can be used to predict review helpfulness. We begin with discretisation of the review helpfulness data. The raw helpfulness data is continuous between 0 and 1, and a threshold, $\theta$, can be determined in this interval. When the value of Review_Helpfulness is greater than $\theta$, the review is regarded as helpful and denoted as 1. Otherwise, it is a useless review denoted as 0. To determine the optimal $\theta$ value, we manually label the helpfulness of part of the samples and compare the results with the corresponding collected data. Through this process, 21 alternative $\theta$ values are chosen. Then we examine the Precision and Recall of each $\theta$ value and determine the best $\theta$ based on the $F$ value using

$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \tag{6}$$

Figure 6 displays the $F$ values for different $\theta$ values. As illustrated in Figure 7, in which different values of $\theta$ and $F$ are displayed, the best $\theta$ value is 0.7, as $F$ is maximum at that point. Thus, a review is labelled “helpful” when the ratio of “helpful” votes to all votes is greater than 0.7. Otherwise, the review is useless.
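The threshold search can be sketched as follows, assuming the balanced F-measure (harmonic mean of Precision and Recall) as the selection criterion and 21 evenly spaced candidates from 0 to 1. The candidate grid, the tie-breaking rule, and the toy data are assumptions; the text does not state how the 21 values were chosen.

```python
def precision_recall_f(predicted, actual):
    """Precision, Recall, and balanced F-measure for binary labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


def best_threshold(ratios, manual_labels, candidates):
    """Return the candidate threshold with the highest F value.

    A review is predicted helpful when its vote ratio exceeds the threshold.
    Ties are broken in favour of the earliest candidate (an assumption).
    """
    scored = []
    for threshold in candidates:
        predicted = [1 if r > threshold else 0 for r in ratios]
        scored.append((precision_recall_f(predicted, manual_labels)[2], threshold))
    return max(scored, key=lambda item: item[0])[1]


# Toy data: vote ratios and manual helpfulness labels (hypothetical).
ratios = [0.95, 0.9, 0.8, 0.75, 0.6, 0.5, 0.3, 0.2]
manual_labels = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = [i / 20 for i in range(21)]  # 21 values: 0, 0.05, ..., 1
chosen = best_threshold(ratios, manual_labels, candidates)
```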

Next, two supervised learning methods are adopted to classify review helpfulness. One is SVM, which has been used in prior research to predict review helpfulness. In addition, random forest is an ensemble learning method which is more effective than individual learning methods in many cases. The sample data is divided into training set and test set based on a ratio of 7 : 3, which are, respectively, used for training and evaluating classification models. The accuracy of classification measures the model effectiveness. Here sample division, model training, and classification test are conducted 100 times, and we evaluate the model effectiveness by average accuracy.
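The evaluation protocol (100 random 7:3 splits, averaged accuracy) can be sketched independently of the classifier. In the sketch below, `fit` and `predict` are hypothetical callables into which an SVM or random forest implementation would be plugged; the majority-class baseline is only a placeholder so that the example runs on its own, and the fixed seed is an assumption added for reproducibility.

```python
import random


def repeated_holdout(samples, labels, fit, predict,
                     runs=100, train_share=0.7, seed=0):
    """Average test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(runs):
        rng.shuffle(indices)
        cut = int(train_share * len(indices))
        train, test = indices[:cut], indices[cut:]
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(1 for i in test if predict(model, samples[i]) == labels[i])
        accuracies.append(hits / len(test))
    return sum(accuracies) / runs


def fit_majority(train_samples, train_labels):
    """Placeholder model: remember the most frequent training label."""
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, sample):
    """Placeholder prediction: always return the remembered label."""
    return model


# Hypothetical feature rows and labels, for illustration only.
samples = [[i] for i in range(10)]
labels = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
baseline_accuracy = repeated_holdout(samples, labels, fit_majority, predict_majority)
```

A majority-class baseline of this kind also gives a useful reference point when reading reported accuracies, since any trained classifier should exceed it.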

The SVM presents a good classification result. Both linear and nonlinear classifiers are trained and evaluated. The average accuracy rate is 75.15% under linear SVM and 75.46% under nonlinear SVM. This illustrates that SVM can effectively classify “helpful” and “useless” reviews, and that the nonlinear classifier slightly outperforms the linear classifier. In addition, random forest also achieves good classification results. The best average classification accuracy rate is 76.27% when ntree is set to 500 and mtry to 2. The box plot in Figure 7 shows the classification accuracy of the two models. In particular, random forest slightly outperforms SVM when the 4 influencing factors of review helpfulness identified in the previous analysis are considered.

Furthermore, we examine the impact of explicit or implicit information on review helpfulness classification using the same methods. As defined earlier, review extremity and reviewer credibility are explicit information, and mixture of objectivity and subjectivity and mixed sentiment are implicit information. The test results show that classification accuracy is 68.52% based on explicit information and 66.91% based on implicit information (see box plot in Figure 8). In addition, a classification based on all 4 variables is better than the result using only two of them.

7. Conclusions

This paper attempts to understand consumer behaviour toward online product reviews by exploring the determinants of review attraction and review helpfulness. Review attraction and helpfulness are explored, respectively, at two stages: the notice stage and the comprehend stage. We propose two conceptual models examining the influencing factors of review attraction and helpfulness, based on an empirical test of 633 review samples collected from Amazon China. Our findings indicate that review attraction is mainly influenced by explicit information, such as review extremity, review reliability, and reviewer credibility. Reviews that have extremely negative scores, contain more characters, and are written by high ranking reviewers are more attractive to consumers. These results conform to our expectations and confirm the hypotheses about the influential factors of review attractiveness. As for review helpfulness, we find that both explicit and implicit information affect consumers’ perception of review helpfulness. A review is particularly helpful if it is scored extremely negative, is written by a high ranking reviewer, conveys both subjective and objective information, and mixes positive and negative sentiments. These conclusions are in line with past research exploring the helpfulness of consumer reviews (e.g., [4, 11–13, 16, 20]). Besides, such influence is moderated by commodity category, where the effects of review extremity and mixed sentiment are even stronger for search commodity. However, in contrast to our hypotheses, the width and depth of review messages do not have a significant impact on review helpfulness. A reasonable explanation of this result may be that more effort is required to process the excessive information contained in the review text, which decreases the marginal value of reading a review.

To this end, we discuss strategies to filter product reviews and explore the classification of helpfulness using SVM and random forest methods based on the identified factors, which achieve good accuracy. This leads to a few important managerial implications. First, e-business operators should encourage consumers to write more valuable reviews. For instance, online retailers can provide guidance and tips for consumers to improve the quality of their comments. An incentive (e.g., shopping points and coupons) can also be introduced to reward individuals who write high quality reviews. Second, sellers and manufacturers should pay more attention to extreme negative reviews and take more active actions to address them. These negative reviews are more attractive and influential to consumers. They offer an opportunity for sellers and manufacturers to detect problems in products and services and therefore improve brand image and promote sales. Moreover, an online review system needs interaction among consumers, which helps them communicate more detailed information regarding the reviews and products. It is worthwhile to bring real-time interactive mechanisms to online commenting systems.

There are also several suggestions for future research. First, this paper proposes a two-stage model, comprising notice and comprehension, to explore the determinants of attraction and helpfulness of online product reviews. However, the aspects relating to the two stages can be modelled simultaneously, for example by using attractiveness as a mediator. This is an important future research extension that may offer some interesting insights. Furthermore, an experimental approach can be introduced to study consumer behaviour in a specific context, which may enrich the analysis. Besides, additional factors that potentially influence review attraction and helpfulness may be added to improve the explanatory power of the model. Finally, mathematical modelling approaches from economic and game theory perspectives can be helpful for understanding consumer behaviour in reading and writing reviews, which may lead to some interesting research insights [44, 45].

Competing Interests

The authors declare no conflict of interests.

Acknowledgments

This research is partially supported by National Natural Science Foundation of China (nos. 71432003 and 71272128) and Specialized Research Fund for the Doctoral Program of Higher Education (no. 20130185110006).