Abstract

Online customer reviews show the customer experience and the improvement suggestions that follow from it, both of which are helpful for product optimization and design. However, research on establishing customer preference models from online customer reviews remains limited, and three problems are found in previous studies. Firstly, a product attribute is not included in the modelling if its setting cannot be found in the product description. Secondly, the fuzziness of customers’ emotions in online reviews and the nonlinearity in the models were not appropriately considered. Thirdly, although the adaptive neuro-fuzzy inference system (ANFIS) is an effective way to model customer preferences, the modelling process may fail when the number of inputs is large, because of the complex structure and long computational time. To solve these problems, this paper proposes an approach that combines multiobjective particle swarm optimization (PSO) based ANFIS with opinion mining to build customer preference models by analyzing the content of online customer reviews. Opinion mining is used to analyze customer preferences and product information in the reviews. Based on the mined information, a multiobjective PSO-based ANFIS is proposed for establishing the customer preference model. The results show that introducing multiobjective PSO into ANFIS can effectively overcome this limitation of ANFIS. In a case study of hair dryers, the proposed approach performs better than fuzzy regression, fuzzy least-squares regression, and genetic programming based fuzzy regression in modelling customer preference.

1. Introduction

When researching and designing products, we need to analyze the real needs of customers in depth, such as being easy to use and comfortable to hold, because these needs can be mapped onto specific settings of the product [1]. Through the analysis of customer preferences, the relationship between product specifications and customer preferences can be constructed. In industry, the most commonly used methods for acquiring such information are user interviews and questionnaires. However, these methods have some deficiencies. Data collection by survey is time consuming and expensive, especially when interviews are involved. In addition, customer surveys often contain only formatted tables or targeted interview questions. Therefore, only the answers to the interview questions and the questionnaire scores can be obtained, and the collected data carry little emotional expression. At present, online customer reviews with varied content are available on many online channels. They are important data for product planning and design and for customer preference analysis, and they play an important role in improving the competitiveness of enterprises. Most online customer reviews describe customers’ feelings when using the product or their suggestions for it. The content is written of the customer’s own will rather than in answer to fixed preset questions. In other words, online reviews provide product-centred content that is difficult to obtain with many traditional survey methods. Customer preferences can be understood from the content of customer reviews without conducting a survey, which avoids the associated cost. However, after the data are obtained, a modelling method is still needed to build the customer preference model.

Researchers have applied opinion mining to the content of online reviews for product optimization and design. These studies extract data from online customer reviews, including product specifications, customer needs, and preferences, and determine their ranking and importance. Some studies also establish the relationship between product attributes and customer preferences by rule mining. Three issues are noted in previous studies. First, the information on some product attributes may not be provided in the product’s description, so the corresponding value setting cannot be found. In this circumstance, the product attribute is usually not included in the modelling of the relationships, even if it is strongly related to the customer preference. Second, the fuzziness and nonlinearity existing in the modelling process are not addressed well in previous research. Online reviews reflect the fuzziness of customers’ emotional expression, but many studies do not analyze this aspect in depth. Also, the nonlinear relationships between customer preferences and product attributes are not expressed explicitly in the developed models. The third issue is found in the adaptive neuro-fuzzy inference system (ANFIS). ANFIS is a multilayer feedforward network which combines the learning power of artificial neural networks with the explicit knowledge representation of fuzzy inference systems [2]. ANFIS is widely used in modelling customer preferences to capture the fuzziness and nonlinearity in the modelling. However, if the number of inputs in ANFIS is large, the modelling process may fail because of the complex structure and long computational time.

Building on previous research, this paper proposes a new approach that combines multiobjective particle swarm optimization (PSO) based ANFIS with opinion mining to model customer preferences based on online customer reviews. In this paper, opinion mining is used to conduct sentiment analysis on customer preferences and product attributes. If the value setting of a product attribute cannot be found in the product description, the sentiment score of the product attribute obtained from sentiment analysis is used as its value setting. Thus, the first issue can be addressed. For the second and third issues, a multiobjective PSO-based ANFIS is proposed to model customer preferences based on the collected settings of product attributes and the information mined. ANFIS is an effective way to establish the nonlinear relationship between customer preferences and product attributes, and the generated fuzzy rules can address the fuzziness in the data. However, ANFIS cannot model effectively when there are many inputs. As the number of inputs increases, the number of fuzzy rules grows exponentially and the complexity of the model structure increases greatly, which may lead to longer computation time or training failure. To overcome this limitation of ANFIS and solve the third issue, a multiobjective PSO approach is introduced into ANFIS to determine the optimal inputs and remove the unimportant inputs, which simplifies the structure of ANFIS. The multiobjective PSO approach involves two objectives: minimizing the modelling errors and maximizing the models’ index of confidence (IC). PSO has fast convergence speed and high stability, especially in the search for the optimal solution [3]. Compared with the traditional evolutionary algorithm, the PSO algorithm can obtain a smaller error. Beiranvand et al. [4] found that the multiobjective PSO algorithm is stable in many scenarios and has advantages in generating association rules, performing better than genetic association rules, the rough PSO algorithm, the multiobjective genetic algorithm, and the multiobjective differential evolution algorithm. It is also helpful in determining the best product attributes of new products.

The rest of this paper is organized as follows. Section 2 reviews the related research. Section 3 describes the proposed method for customer preference modelling. In Section 4, the proposed method is applied to a case study to verify its effectiveness and practicability. In Section 5, the results of the validation experiments are analyzed. Section 6 concludes the paper.

2. Related Research

This section reviews opinion mining methods for product design, customer preference modelling in previous studies, and recent optimization algorithms.

2.1. Opinion Mining for Product Design

Opinion mining, a form of sentiment analysis, is the computational study of the emotions and opinions expressed in text. It identifies the keywords and phrases related to features in the text and then determines the emotional polarity and strength of the descriptive words. Through sentiment analysis of the product evaluations published by customers, customers’ emotions can be understood and their preferences for products can be analyzed [5]. By mining the content of online customer reviews, opinion mining can obtain product attributes, customer preferences, and other information. Customer preference information was extracted with good accuracy by establishing a customer demand representation system based on ontology learning [6]. Zimmermann et al. [7] proposed a framework to discover implied product features and evaluate the polarity of customer reviews for multiple types of products. A two-tier model was proposed which uses case analogy reasoning and sentiment analysis to identify potential customer needs [8]. Tuarob and Tucker [9] developed a method to extract product features from social media automatically. In addition, a data mining driven method was proposed to acquire customer preferences and product attribute data from large-scale social media data [10]. Zhang et al. [11] proposed an opinion mining and extraction algorithm that can discover feature opinions, opinion expressions, and features. A Bayesian sampling method which can extract features from large amounts of text data was developed [12]. Zhou et al. [13] developed a hybrid sentiment analysis method, which combines rough set technology and affective lexicons and enhances the feature model by extracting information from online review content.
Subjective and objective features were extracted from online customer reviews with RubE, an approach based mainly on unsupervised rules [14]. A case-based method was proposed which extracts customer preferences by integrating Kansei Engineering and text mining [15]. To avoid the time lag of offline surveys, Trappey et al. [16] designed a system that determines real-time customer demand from online customer comments. Zhang et al. [17] applied opinion mining to extract customer preferences and product features and dealt with the redundancy of feature words by adopting a clustering technology based on semantic similarity. For product design, an ontology-based reasoning system was developed to extract effective information from online customer reviews [18]. At present, there are few papers on modelling the relationship between product attributes and customer preferences based on online reviews. A rule induction framework was developed which forms if-then rules that associate product attributes with customer preferences in online reviews [19]. Jiang et al. [20] developed a multiobjective PSO method to obtain the association rules between product attributes and customers’ emotional preferences. However, if-then rules are not sufficient for determining the optimal product attribute design of new products.

2.2. Modelling Customer Preference

Many methods have been studied for building customer preference models that relate product attributes to customer preferences. On this basis, the optimal attribute design of new products can be better determined. Much research on customer preference modelling is based on statistical techniques, such as partial least-squares analysis [21] and statistical linear regression [22]. A method based on belief rules was designed to determine the settings of the design elements of products [23]. Chen et al. [24] analyzed the relationship between product attribute design and customer preference and proposed an artificial neural network modelling method. However, the fuzziness of customer preferences, which arises from the subjective judgment of the respondents, was not considered in the above methods.

Researchers have proposed many solutions to the problem of fuzziness. Fung et al. [25] adopted a fuzzy inference technique to relate customer preference to the relevant product attributes. Tomasiello et al. [26] applied a variant of ANFIS with fractional Tikhonov regularization to handle problems with fuzziness. A method based on fuzzy rules was proposed for modelling customer preference in office chair design [27]. A possibility regression method based on nonlinear programming was developed to analyze the relationship between product attribute design and customer preference [28]. A customer preference model was built from survey data using a new method based on Tanaka’s fuzzy linear regression [29]. To address both nonlinearity and fuzziness in customer preference modelling, methods based on polynomial modelling and fuzzy regression were proposed. A method based on genetic programming and fuzzy regression was developed to generate model structures with nonlinear and fuzzy terms [30]. Jiang et al. [31] introduced a chaos-based fuzzy regression approach to modelling customer preference for product design. To determine the fuzzy coefficients and polynomial structures of customer preference models, a fuzzy regression approach based on the stepwise method [32] and a fuzzy regression approach based on forward selection [33] were proposed. However, the above studies rely on traditional survey data for modelling customer preferences. A suitable customer preference modelling method based on the information extracted from online customer reviews is still lacking.

2.3. The Recent Optimization Algorithms

PSO was developed based on the social behavior of bird flocks: by working together, the birds gradually converge on the location of food. Some variants of PSO have been proposed in recent studies. Wang et al. [34] introduced a dynamic group learning distributed particle swarm optimization for large-scale optimization and extended it to large-scale cloud workflow scheduling. Zhang et al. [35] proposed a cooperative coevolutionary bare-bones particle swarm optimization with function-independent decomposition for large-scale supply chain network design problems with uncertainties. Xia et al. [36] designed a triple archives PSO, which can select proper exemplars and design an efficient learning model for a particle. A social learning particle swarm optimization with a novel adaptive region search was designed to keep the diversity of the solutions and accelerate convergence [37]. Li et al. [38] proposed a pipeline-based parallel particle swarm optimization which has significant potential applications in time-consuming optimization problems. For multiobjective problems, Zhan et al. [39] proposed a novel coevolutionary technique named multiple populations for multiple objectives, in which PSO is adopted for each population, and developed a coevolutionary multiswarm PSO. Liu et al. [40] proposed a coevolutionary particle swarm optimization with a bottleneck objective learning strategy to improve convergence on all objectives. Some adaptive optimization algorithms have also been introduced in recent years. Zhan et al. [41] proposed an adaptive distributed differential evolution to relieve the sensitivity to strategies and parameters in complex optimization problems. Wang et al. [42] introduced an adaptive granularity learning distributed PSO, which uses machine-learning techniques to address slow convergence in the huge search space and trapping in local optima in large-scale optimization. Some future research directions on using evolutionary computation algorithms to solve complex continuous optimization problems are discussed in [43].

3. Proposed Methodology

Based on online customer reviews, the developed method consists of product feature collection, opinion mining, and customer preference model construction using a multiobjective PSO-based ANFIS method, as shown in Figure 1. In this way, the relationship between product attributes and customer preferences can be determined and the fuzziness in customers’ emotional expression can be handled. Two types of product attributes are used in the modelling. For the first type, the settings can be collected directly from the product description. For the second type, the settings cannot be obtained from the product information, so the sentiment score of the product attribute obtained from sentiment analysis is used as its value setting.

3.1. Opinion Mining from Online Customer Reviews

First, the sample products are determined. Online reviews of the products are obtained with the help of web crawler software, and the useful data are saved to a file, such as an Excel file, as the data source for sentiment analysis. Then, the content of the online customer reviews is analyzed, and the emotional scores of the second type of product attributes and of the customer preferences are calculated.

For the opinion mining of online reviews, this paper adopts Semantria, which provides efficient sentiment analysis and is widely used in industry. By importing the data collected in Excel into Semantria, the polarity of the customer’s emotion can be determined and the emotion can be scored. The process consists of six steps [44]. In the first step, preprocessing removes the noise in the unstructured content, including stop words, punctuation, and HTML characters, to obtain clean text. In the second step, part-of-speech tagging categorizes the opinion-bearing words in the online reviews into adjectives, verbs, adverbs, and nouns. Generally, nouns describe product attributes and customer preferences, while adjectives and adverbs express the emotion toward the nouns. In the third step, the content of the online customer reviews is analyzed to extract the emotional expressions related to product attributes and customer preferences. In the fourth step, feature pruning, which involves compactness pruning and redundancy pruning, removes incorrect and redundant features. In the fifth step, synonymous phrases are grouped together by K-means clustering. For example, the customer preference phrases for hair dryers “excellent quality,” “great quality,” “good product,” and “high quality” were classified into the group “quality.” In the sixth step, SentiWordNet determines the emotional score and semantic polarity of the opinion-bearing words for individual customer preferences or product attributes [45]. The final emotional score of a customer preference or product attribute is then obtained from the scores of its opinion-bearing words.
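The flow of these steps can be illustrated with a minimal Python sketch. It is not Semantria or SentiWordNet: the stop-word list, the word scores, and the use of a plain mean to aggregate word scores are all assumptions made for illustration, and a fixed lookup table stands in for K-means clustering when grouping synonymous phrases.

```python
import re
from statistics import mean

# Hypothetical resources: a tiny stop-word list and assumed word scores.
STOP_WORDS = {"the", "is", "a", "and", "to", "it", "this", "very"}
WORD_SCORES = {"excellent": 0.9, "great": 0.8, "good": 0.6, "high": 0.4}

# Lookup table standing in for K-means grouping of synonymous phrases
# (the phrases are the hair dryer examples quoted in the text).
PHRASE_GROUPS = {
    "excellent quality": "quality",
    "great quality": "quality",
    "good product": "quality",
    "high quality": "quality",
}


def preprocess(text):
    """Remove HTML tags and punctuation, lowercase, and drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text)
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def group_scores(reviews):
    """Return {group: mean score of the opinion words found for it}."""
    found = {}
    for review in reviews:
        tokens = preprocess(review)
        # Scan adjacent word pairs for known opinion phrases.
        for first, second in zip(tokens, tokens[1:]):
            group = PHRASE_GROUPS.get(f"{first} {second}")
            if group is not None and first in WORD_SCORES:
                found.setdefault(group, []).append(WORD_SCORES[first])
    # Assumed aggregation: the mean of the word scores in each group.
    return {group: mean(scores) for group, scores in found.items()}
```

Calling `group_scores` on a list of review strings returns one score per group, which plays the role of the emotional score of a customer preference or product attribute.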

3.2. Modelling Customer Preference Using a Multiobjective PSO-Based ANFIS Approach

The emotional scores of the second type of product attributes and of the customer preferences are computed by opinion mining. Using the settings of the first type of product attributes and the obtained emotional scores, a multiobjective PSO-based ANFIS approach is developed to establish the relationship between product attributes and customer preferences. In this process, the multiobjective PSO method solves a biobjective optimization problem, minimizing the modelling errors and maximizing the IC of the models, to obtain the optimal inputs of ANFIS. Pareto optimal solutions are obtained, and a trade-off solution is selected to generate a customer preference model using ANFIS.

3.2.1. Formulation of Biobjective Optimization Model

The first step is to develop the biobjective optimization model. To minimize the modelling errors, the mean absolute percentage error (MAPE) of modelling is adopted as the first objective function:

$$\min \ \text{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right| \times 100, \tag{1}$$

where $y_i$ is the ith sentiment score of the customer preference in the data sets, $\hat{y}_i$ is the ith sentiment score predicted by the generated model, and $n$ is the number of data sets.

IC is similar to the coefficient of determination ($R^2$) in classical regression. Higher values of IC imply a better prediction of $y$ [46]. The second objective function is maximizing IC, which is defined as follows:

$$\max \ \text{IC} = \frac{\text{SSR}}{\text{SST}} = 1-\frac{\text{SSE}}{\text{SST}}. \tag{2}$$

In (2), SSE = SST − SSR. SSE is the error sum of squares, SST is the total sum of squares, and SSR is the regression sum of squares, which can be calculated by (3)–(5), respectively,

$$\text{SSE} = \sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2, \tag{3}$$

$$\text{SST} = \sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2, \tag{4}$$

$$\text{SSR} = \sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^2, \tag{5}$$

where $\bar{y}$ is the mean of the sentiment scores of the customer preference in the data sets.
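As a small worked illustration of the two objectives, the sketch below computes MAPE and IC for a list of actual and predicted sentiment scores. It assumes the usual definitions of the error and total sums of squares and takes IC as 1 − SSE/SST, which follows from SSE = SST − SSR; the function names and the sample values in the usage note are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (first objective)."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))


def index_of_confidence(actual, predicted):
    """IC computed as 1 - SSE / SST (second objective, assumed form)."""
    y_bar = sum(actual) / len(actual)
    sse = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    sst = sum((y - y_bar) ** 2 for y in actual)
    return 1.0 - sse / sst
```

For example, with actual scores `[0.2, 0.4, 0.6]` and predictions `[0.25, 0.4, 0.54]`, the absolute percentage errors are 25%, 0%, and 10%, so MAPE is about 11.67, and IC is 1 − 0.0061/0.08 = 0.92375. Note that MAPE is undefined when an actual score is zero.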

3.2.2. PSO Algorithm

In PSO, each potential solution of the problem is regarded as a bird in a flock, which corresponds to a “particle” in the algorithm. A particle flies in the D-dimensional search space at a speed that is dynamically adjusted based on its own flight experience and the flight experience of the group. Each particle is also assigned a fitness set, which is determined by the values of the objective functions. A particle’s own flight experience is its individual best position, $P_i$, defined as the position with the best fitness set that the particle has found so far. At the same time, each particle also refers to the global best position, $P_g$, defined as the best of all the individual best positions $P_i$. $P_g$ represents the flying experience of the whole swarm. The optimization search starts from a randomly initialized particle swarm and proceeds through the iterations of PSO.

In the D-dimensional space, the particles form a swarm and search continuously for the optimal solution at a certain speed. Each particle adjusts its position based on the global best position and its own best position, which is the basic idea of swarm intelligence. The position of particle $i$ is $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$, where $i = 1, 2, \ldots, m$ and $m$ is the size of the particle swarm. $D$ is the number of parameters to be determined for the inputs of ANFIS. The speed of particle $i$ is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. In the proposed approach, each particle represents one input set for ANFIS, and hence one ANFIS structure, for modelling customer preference. The particle structure is shown in Table 1. If the value of a product attribute in a particle is 1, the product attribute is selected as an input. Otherwise, it is discarded.

For example, if the number of product attributes is 5 and the particle is $X_i = (0, 1, 1, 0, 1)$, product attributes 2, 3, and 5 are selected as inputs, and product attributes 1 and 4 are discarded.
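The particle encoding in this example can be expressed in a few lines of Python. The function name is hypothetical and is used only for illustration.

```python
def selected_attributes(particle):
    """Return the 1-based indices of the attributes whose value is 1."""
    return [j for j, flag in enumerate(particle, start=1) if flag == 1]
```

For the particle `(0, 1, 1, 0, 1)`, the function returns `[2, 3, 5]`, matching the attributes selected in the example.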

The historical best position of particle $i$ is $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, which has the best fitness set among all the positions the particle has visited. The best position of the whole swarm is $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$, $g \in \{1, 2, \ldots, m\}$. $P_g$ gives the optimal input set of ANFIS. Based on the particle’s speed and position, the distance between the global best position and the current position, and the distance between the particle’s own best position and the current position, the particle’s speed and position are updated using the inertia weight approach [47]:

$$V_i^{k+1} = \omega V_i^{k} + c_1 r_1 \left(P_i - X_i^{k}\right) + c_2 r_2 \left(P_g - X_i^{k}\right), \tag{6}$$

$$X_i^{k+1} = X_i^{k} + V_i^{k+1}, \tag{7}$$

where $X_i^{k}$ and $V_i^{k}$ represent the position vector and the speed vector of particle $i$ at the kth iteration, respectively. $k$ represents the iteration number. $\omega$ represents the inertia weight, which helps the particles balance exploration and exploitation in the search space. $c_1$ and $c_2$ represent the learning factors, and both are set as 2. $r_1$ and $r_2$ represent random values chosen from the range [0, 1].
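A minimal sketch of one inertia-weight update for a single particle is given below. The clamping ranges follow the search ranges reported later for the case study ([0, 1] for position, [−0.5, 0.5] for speed). Drawing separate random numbers for every dimension and rounding a coordinate at 0.5 to obtain the 0/1 attribute flag are assumptions, since the text does not state how these are handled.

```python
import random


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def update_particle(x, v, p_best, g_best, w, c1=2.0, c2=2.0,
                    x_range=(0.0, 1.0), v_range=(-0.5, 0.5), rng=random):
    """Apply the speed update and then the position update once."""
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()  # assumed: drawn per dimension
        vd_next = w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
        vd_next = clamp(vd_next, *v_range)
        new_v.append(vd_next)
        new_x.append(clamp(xd + vd_next, *x_range))
    return new_x, new_v


def to_binary(x, threshold=0.5):
    """Assumed decoding of a continuous position into 0/1 input flags."""
    return [1 if xd >= threshold else 0 for xd in x]
```

A dimension in which the particle, its own best position, and the global best position already agree receives no pull, so with zero initial speed it stays where it is.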

3.2.3. Pareto Dominance

In this method, MAPE and IC are selected as the two objective functions of multiobjective PSO. The data sets for customer preference modelling are obtained from the settings of product attributes and the results of sentiment analysis. Using the position vector of each particle and the data sets, the values of the two objective functions are computed using (1) and (2) and recorded as the fitness set of the particle. A fitness set is expressed as $F = (f_1, f_2)$, where $f_1$ and $f_2$ are the values of the objective functions. Pareto dominance theory is used to solve the multiobjective problem. A solution x1 dominates another solution x2 if two requirements are met. First, x1 is not worse than x2 in all the objective functions. Second, x1 is strictly better than x2 in at least one objective function. For a minimization problem, the two conditions are:

$$f_j(x_1) \le f_j(x_2) \ \text{for all } j \in \{1, 2\} \quad \text{and} \quad f_j(x_1) < f_j(x_2) \ \text{for at least one } j \in \{1, 2\}.$$

For a maximization problem, x2 is dominated by x1 if $f_j(x_1) \ge f_j(x_2)$ for all $j$ and $f_j(x_1) > f_j(x_2)$ for at least one $j$. A solution that is not dominated by any other solution is called a Pareto optimal solution. In each iteration, each particle compares its current position with its own best position. If the solution at its current position dominates its best position, the best position is updated to the current position. Then, based on the above two requirements, the individual best positions $P_i$ are compared with each other, and the best solution is selected as the global best position $P_g$.
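Because MAPE is minimized while IC is maximized, the dominance test has to treat the two components of a fitness set differently. The sketch below shows one way to write it, with the fitness set stored as a `(MAPE, IC)` tuple. The function names are hypothetical, and the numbers in the usage note are invented for illustration only.

```python
def dominates(a, b):
    """True if fitness set a = (MAPE, IC) Pareto-dominates fitness set b."""
    mape_a, ic_a = a
    mape_b, ic_b = b
    # Not worse in any objective: lower MAPE and higher IC are better.
    no_worse = mape_a <= mape_b and ic_a >= ic_b
    # Strictly better in at least one objective.
    strictly_better = mape_a < mape_b or ic_a > ic_b
    return no_worse and strictly_better


def pareto_front(fitness_sets):
    """Return the indices of the fitness sets that no other set dominates."""
    return [
        i for i, f in enumerate(fitness_sets)
        if not any(dominates(g, f) for g in fitness_sets)
    ]
```

With the invented fitness sets `[(5.0, 0.90), (4.0, 0.95), (3.0, 0.80), (6.0, 0.70)]`, the second set dominates the first and the fourth, while the second and third do not dominate each other, so `pareto_front` returns `[1, 2]`.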

3.2.4. ANFIS Structure

Based on the global best position $P_g$, the optimal solution is obtained as the inputs of ANFIS to model customer preference. A typical ANFIS structure is shown in Figure 2. It includes two inputs and one output, and each input has two membership functions.

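For each input, a membership function denotes one linguistic description. $A_i$ represents the membership function for the linguistic description of $x_1$, and $B_j$ represents the membership function for the linguistic description of $x_2$, in which $i = 1, 2$ and $j = 1, 2$. Therefore, the two inputs have four membership functions, which are denoted as the four nodes in the first layer (L1). The triangular membership function is as follows:

$$\mu_{A_i}(x_1) = \max\left(\min\left(\frac{x_1 - a_i}{b_i - a_i}, \frac{c_i - x_1}{c_i - b_i}\right), 0\right). \tag{8}$$

In (8), $(a_i, b_i, c_i)$ represents the triangular fuzzy number of $A_i$, and $\mu_{B_j}(x_2)$ is defined in the same way with the triangular fuzzy number of $B_j$.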

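At L2, each combination of $A_i$ with $B_j$ is denoted by one rule. That is, the total number of rules is 4. The fuzzy rules are described as follows:

$$R_k: \ \text{IF } x_1 \text{ is } A_i \text{ AND } x_2 \text{ is } B_j, \ \text{THEN } f_k = p_k x_1 + q_k x_2 + r_k, \tag{9}$$

where $k = 1, 2, 3, 4$, and $p_k$, $q_k$, and $r_k$ are the parameters of the output $f_k$ of fuzzy rule $R_k$. The outputs of L2 are described as follows:

$$w_k = \mu_{A_i}(x_1)\,\mu_{B_j}(x_2). \tag{10}$$

In (10), $w_k$ is the firing strength of each fuzzy rule. The connection weight of L2 with L3 is the normalized firing strength, defined as

$$\bar{w}_k = \frac{w_k}{\sum_{k=1}^{4} w_k}. \tag{11}$$

A larger value of $\bar{w}_k$ implies that $R_k$ is more important.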

The internal model of $R_k$ in L3 is a first-order Sugeno fuzzy model, described in the following equation:

$$f_k = p_k x_1 + q_k x_2 + r_k. \tag{12}$$

At L4, the total output is denoted by a single node and obtained by summing all the input signals, as shown in the following equation:

$$\hat{y} = \sum_{k=1}^{4} \bar{w}_k f_k. \tag{13}$$

It can be seen from (13) that the single output ($\hat{y}$) of ANFIS, which is the predicted sentiment score of the customer preference, is a linear combination of the internal models of the fuzzy rules weighted by the normalized firing strengths.
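The four layers can be traced with a short forward-pass sketch for the two-input, two-membership-function structure. It assumes a standard three-point triangular membership function with feet `a`, `c` and peak `b`, and it only evaluates a network whose parameters are already given; training of the parameters is not shown. All parameter values in the usage note are invented.

```python
from itertools import product


def triangular(x, a, b, c):
    """Triangular membership degree with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def anfis_output(x1, x2, mfs_a, mfs_b, consequents):
    """Forward pass: memberships, firing strengths, Sugeno models, output."""
    # L1: membership degrees of each input.
    mu_a = [triangular(x1, *params) for params in mfs_a]
    mu_b = [triangular(x2, *params) for params in mfs_b]
    # L2: one rule per (A_i, B_j) pair; firing strength is the product.
    w = [ma * mb for ma, mb in product(mu_a, mu_b)]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    w_bar = [wk / total for wk in w]  # normalized firing strengths
    # L3: first-order Sugeno model of each rule.
    f = [p * x1 + q * x2 + r for p, q, r in consequents]
    # L4: weighted sum of the rule outputs.
    return sum(wb * fk for wb, fk in zip(w_bar, f))
```

For instance, with membership functions `(-1, 0, 1)` and `(0, 1, 2)` on both inputs, inputs `x1 = 0.25` and `x2 = 0.5`, and consequents `[(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]`, the firing strengths are 0.375, 0.375, 0.125, and 0.125, the rule outputs are 0.25, 0.5, 1, and 0.75, and the network output is 0.5.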

3.3. Computational Procedures of the Proposed Methodology for Modelling Customer Preference Based on Online Customer Reviews

Based on online reviews, the procedure for building a customer preference model is as follows:

Step 1: the product attributes of the first type are identified, and their settings are collected from the description of the products.

Step 2: the online customer reviews of the sample products are collected and stored in Excel files, as described in Section 3.1. Semantria conducts opinion mining for all the online reviews. The second type of product attributes and the customer preferences are then defined. Based on the keywords and phrases related to each product attribute and customer preference, opinion mining is conducted again for the online reviews to obtain the emotional scores of the product attributes and customer preferences. Steps 1 and 2 yield the data sets used for modelling customer preferences.

Step 3: based on Section 3.2 and the data sets from step 2, the multiobjective PSO-based ANFIS approach is applied to model customer preference, with the multiobjective PSO algorithm determining the inputs of ANFIS. The swarm size of PSO, the number of iterations, the learning factors, the inertia weight, and the dimension of the search space are initialized. Based on Table 1, the position and speed of each particle are randomly initialized within the corresponding ranges.

Step 4: in the first iteration, the individual best position of each particle and the global best position are initialized. The initial individual best position $P_i$ is set as the initial position of each particle. According to Section 3.2.4, ANFIS is used to model customer preferences, and the predicted output is obtained using (13). The values of the two objective functions, MAPE and IC, are then computed for each particle using (1) and (2), respectively, and they form the initial individual best fitness set $F(P_i)$. The particles are compared with each other on $F(P_i)$ using Pareto dominance theory. The particle that meets the conditions in Section 3.2.3 is the Pareto optimal solution and is set as the initial best particle. Its fitness set is the initial global best fitness set $F(P_g)$, and its position vector is the initial global best position $P_g$.

Step 5: as the iteration advances from k to k + 1, the speed vector and the position vector of each particle are updated according to (6) and (7), respectively. When a value in $X_i^{k+1}$ or $V_i^{k+1}$ is beyond the defined search range, it is set to the range limit. Based on the updated positions of the particles, the predicted output is obtained from ANFIS using (13). The two objective functions are computed using (1) and (2) to obtain the fitness set $F(X_i^{k+1})$ of particle $i$ after the iteration. According to the Pareto dominance described in Section 3.2.3, $F(X_i^{k+1})$ is compared with $F(P_i)$. When $F(P_i)$ is dominated by $F(X_i^{k+1})$, $F(P_i)$ is replaced by $F(X_i^{k+1})$, and the individual best position $P_i$ is updated to $X_i^{k+1}$. Pareto dominance is then applied among all $F(P_i)$. The Pareto optimal solution among them becomes the new global best fitness set $F(P_g)$. The index of the best particle is recorded, and its position is used as the global best position $P_g$.

Step 6: the procedure stops when the preset number of iterations is reached. $P_g$ represents the optimal product attributes for modelling customer preference using ANFIS, and $F(P_g)$ gives the corresponding values of MAPE and IC. Based on (8), (10)–(13), and the selected solution, the customer preference models can be established and the fuzzy rules can be generated using (9).
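Steps 3 to 6 can be summarized as a compact loop. The following is a sketch under several stated assumptions, not the authors' MATLAB implementation: `evaluate` is a hypothetical callback that would train ANFIS on the selected attributes and return `(MAPE, IC)`; a coordinate of at least 0.5 is decoded as "attribute selected"; the inertia weight is redrawn from [0.1, 0.9] in every iteration; and the first non-dominated particle is taken as the global best, because the text does not specify how the trade-off solution is chosen.

```python
import random


def dominates(a, b):
    """True if fitness a = (MAPE, IC) dominates b (lower MAPE, higher IC)."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


def decode(position):
    """Assumed decoding: a coordinate >= 0.5 selects the attribute."""
    return tuple(1 if value >= 0.5 else 0 for value in position)


def first_non_dominated(fitness_sets):
    """Index of the first fitness set that no other set dominates."""
    for i, fit in enumerate(fitness_sets):
        if not any(dominates(other, fit) for other in fitness_sets):
            return i
    raise ValueError("empty swarm")


def run_mopso(evaluate, dim, swarm_size=5, iterations=30, seed=0):
    """Return (selected-attribute mask, (MAPE, IC)) of the global best."""
    rng = random.Random(seed)
    # Step 3: random positions in [0, 1] and speeds in [-0.5, 0.5].
    xs = [[rng.random() for _ in range(dim)] for _ in range(swarm_size)]
    vs = [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
          for _ in range(swarm_size)]
    # Step 4: initial individual bests and global best.
    p_best = [list(x) for x in xs]
    p_fit = [evaluate(decode(x)) for x in xs]
    g = first_non_dominated(p_fit)
    # Step 5: iterate the speed and position updates with clamping.
    for _ in range(iterations):
        w = rng.uniform(0.1, 0.9)  # assumed: redrawn every iteration
        g_best = list(p_best[g])
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]
                     + 2.0 * r1 * (p_best[i][d] - xs[i][d])
                     + 2.0 * r2 * (g_best[d] - xs[i][d]))
                vs[i][d] = max(-0.5, min(0.5, v))
                xs[i][d] = max(0.0, min(1.0, xs[i][d] + vs[i][d]))
            fit = evaluate(decode(xs[i]))
            if dominates(fit, p_fit[i]):
                p_best[i], p_fit[i] = list(xs[i]), fit
        g = first_non_dominated(p_fit)
    # Step 6: stop after the preset number of iterations.
    return decode(p_best[g]), p_fit[g]
```

In practice `evaluate` dominates the running time, since every call trains one ANFIS model; caching the result per attribute mask would avoid retraining for masks that have already been visited.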

4. Implementation

A case study was conducted to evaluate the proposed customer preference modelling method. Hair dryers were selected as the research object, and the study is based on online customer reviews of hair dryers. Ten products, denoted as A∼J, were selected. A total of 10,754 product reviews were collected from Amazon, and the review data were stored in Excel after processing. Opinion mining was then implemented with Semantria. Following the steps in Section 3.1, preprocessing and part-of-speech tagging were conducted first. Then, phrases and keywords were extracted, and the most frequent ones were chosen. Feature pruning was employed to delete the redundant features. After that, words and phrases which are synonymous or related to the same product attribute or customer preference were grouped. For example, the mined phrases “easy to operate,” “separate switch,” “less noise,” “quiet,” and “simple to use” were summarized as the group “easy to use,” which represents an important customer preference. Five categories of customer preferences were summarized: quality, price, weight, easy to use, and performance. This paper takes “easy to use,” denoted as y, as the example for customer preference modelling with the multiobjective PSO-based ANFIS approach. “Drying time,” denoted as x8, is one of the product attributes related to “easy to use.” It is a product attribute of the second type because its settings cannot be found in the product information. The extracted keywords and phrases “quickly,” “faster,” “fast drying,” and “short time” were grouped under the category “drying time.” For products A∼J, the numbers of online reviews that contain comments and opinions on “easy to use” are 304, 140, 149, 103, 88, 49, 50, 165, 81, and 64, respectively, and the numbers for “drying time” are 503, 204, 250, 146, 178, 78, 70, 365, 177, and 119, respectively. The online reviews were analyzed again using the user category analysis of Semantria. Phrases and keywords associated with the product attributes of the second type and with each customer preference were treated as the settings of the “user category.” Through sentiment analysis, the sentiment scores of the product attribute of the second type and of the customer preferences were obtained for each product. Examples of online reviews on “easy to use” and “drying time,” together with the obtained emotional polarity and scores, are shown in Table 2. The sentiment scores of “easy to use” and “drying time” for products A∼J are shown in the last two columns of Table 3 and are used as the values of the customer preference and the settings of the product attribute, respectively.

Among the product attributes, seven attributes of the first type are related to “easy to use.” These are weight, length, width, height, power, heat setting, and speed setting, which are denoted as x1∼x7, respectively. Table 3 shows the attribute settings of the 10 sample products.

With the support of the data set in Table 3, the multiobjective PSO-based ANFIS approach is applied to construct the relationship between customer preference, y, and product attributes x1∼x8. According to Table 1, the number of dimensions D in the search space of the multiobjective PSO algorithm is 8, which equals the number of product attributes. The search ranges of the position and velocity of particles were [0, 1] and [−0.5, 0.5], respectively. After repeated trials, the number of iterations was set to 30 and the size of the particle swarm to 5, which are the smallest settings that still gave high prediction accuracy. The inertia weight is a random value taken from the interval [0.1, 0.9]. c1 and c2 are chosen as 2. The random coefficients in the velocity update (6) are selected from the interval [0, 1]. For ANFIS, the number of membership functions for each input was set as 3. Since the smallest training error was obtained at the 3rd iteration and remained stable in the following iterations, the training epoch number was set as 5. The customer preference model was established from the analyzed online reviews using MATLAB. A laptop with an i7-7500U CPU and 8 GB RAM was used for the experiment. The optimal solutions for “easy to use” and the corresponding values of MAPE and IC were obtained, and some examples are shown in Table 4. In the table, each solution represents an ANFIS structure. If the value of a product attribute is 1, that attribute is used as an input of ANFIS. For example, the values of x2, x4, and x8 are 1 in the 7th solution. Thus, the product attributes x2, x4, and x8 are used as the inputs of ANFIS for modelling customer preference.
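As a small illustration of how a solution in Table 4 maps to an ANFIS structure, the following Python sketch decodes a 0/1 vector into the selected inputs. The function name and the list representation are hypothetical, and the vector for the 7th solution assumes that every attribute other than x2, x4, and x8 has the value 0.

```python
def selected_attributes(solution, names=None):
    """Return the names of the attributes whose value in the solution is 1."""
    if names is None:
        # Default attribute names x1, x2, ... in the order used in the paper.
        names = [f"x{i}" for i in range(1, len(solution) + 1)]
    return [name for name, bit in zip(names, solution) if bit == 1]


# 7th solution as described in the text: x2, x4, and x8 are 1.
solution_7 = [0, 1, 0, 1, 0, 0, 0, 1]
print(selected_attributes(solution_7))  # ['x2', 'x4', 'x8']
```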

It can be seen from the table that IC increases and MAPE decreases as the number of inputs grows. However, based on the ANFIS structure described in Section 3.2.4, the number of terms and fuzzy rules in the model increases exponentially with the number of inputs. For example, the number of fuzzy rules for one to five inputs is 3, 9, 27, 81, and 243, respectively. In other words, the calculation time and the complexity of the model increase significantly with more inputs. The modelling results in Table 4 also show that when the number of inputs is equal to or larger than 2, the modelling errors are very small and IC reaches its largest value of 1. To trade off the complexity of ANFIS against the modelling errors, the ANFIS with two inputs is selected in this study to model customer preference because of its simple structure and good modelling accuracy. Among all the optimal solutions with two inputs, the 5th optimal solution is chosen as it has the smallest MAPE and the largest IC. Therefore, x2 and x4 are the input attributes for ANFIS. Based on (8) and (10)∼(15), the customer preference model for “easy to use” was established using the proposed approach.
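The rule counts quoted above follow from combining every membership function of one input with every membership function of the others. A short Python check, assuming three membership functions per input as set in this study (the function name is hypothetical):

```python
def rule_count(num_inputs, mfs_per_input=3):
    """Number of fuzzy rules when every combination of membership functions forms a rule."""
    return mfs_per_input ** num_inputs


for n in range(1, 6):
    print(n, rule_count(n))
# 1 3
# 2 9
# 3 27
# 4 81
# 5 243
```

With all eight product attributes as inputs, the same formula gives 3^8 = 6561 rules, whereas the selected two-input structure needs only nine.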

Based on the established model, the product development team can predict the customer preference score according to the new settings of product attributes. Moreover, the model can also be used to determine the optimal product attributes for designing new products by maximizing customer preference scores. Nine fuzzy rules were generated by using (9) as follows:

5. Validation

To evaluate the effectiveness of the proposed method, this paper compares its modelling results with those of genetic programming-based fuzzy regression (GP-FR), fuzzy least-squares regression (FLSR), fuzzy regression (FR), and ANFIS. In the process of building a customer preference model, it was found that the ANFIS model could not be built because of its complex structure. To determine the value of the h parameter in FR and FLSR, a number of values in the interval [0, 1] were tested and the h value with the lowest modelling error was selected. The h value of FR was set to 0.1, and the h value of FLSR was set to 0.99. To balance modelling accuracy against computational time, the models based on GP-FR were established with different settings of iteration number and population size. Finally, the number of iterations was set to 200 and the population size was set to 40. The generation gap, the maximum depth of the tree, and the probabilities of mutation and crossover were set to 0.8, 5, 0.3, and 0.7, respectively. The parameter settings of the proposed method are described in Section 4. The developed models for “easy to use” based on the four approaches and their corresponding MAPE and IC are shown in Table 5.

The table shows that all models can deal with the fuzziness in modelling. However, only the models established by GP-FR and the proposed method can capture nonlinearity. In addition, the MAPE of the proposed method is smaller and its IC is larger than those of the other three approaches.

To further verify the effectiveness of the proposed method, thirty validation tests were conducted. In each test, two data sets were randomly selected from all data sets as testing data, and the other eight data sets were used as training data for generating customer preference models. MAPE and the variance of errors (VoE), defined in (1) and (16), are used to compare the modelling results of FR, FLSR, GP-FR, and the multiobjective PSO-based ANFIS.
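The validation protocol can be sketched in Python as follows. This is an illustrative outline only: equations (1) and (16) are not reproduced here, so MAPE is assumed to be the usual mean of the absolute relative errors in percent, and VoE is assumed to be the sample variance of those relative errors around the MAPE. The function names and the random seed are hypothetical.

```python
import random


def relative_errors(actual, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs(p - a) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error (assumed form of equation (1))."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def voe(actual, predicted):
    """Sample variance of the relative errors (assumed form of equation (16))."""
    errors = relative_errors(actual, predicted)
    mean = sum(errors) / len(errors)
    return sum((e - mean) ** 2 for e in errors) / (len(errors) - 1)


def holdout_splits(n_samples=10, n_test=2, n_runs=30, seed=0):
    """Yield (train, test) index lists: 2 random test products, 8 for training."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_runs):
        test = sorted(rng.sample(indices, n_test))
        train = [i for i in indices if i not in test]
        yield train, test
```

For each split, a model would be trained on the eight training products and `mape` and `voe` computed on the two held-out products; averaging over the 30 runs gives the kind of summary reported for the validation tests.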

Figures 3 and 4 show the MAPE and VoE values of the four methods, respectively. The validation results of FR, FLSR, GP-FR, and the proposed method are indicated by the lines with the symbols “+,” “∗,” and “O” and by the solid line, respectively.

With the above experiment settings, the training times of FR, FLSR, GP-FR, and the proposed method are 1.9186, 0.3663, 143.6772, and 1.1449 seconds, respectively. Apart from GP-FR, which is far slower, the training times of the methods do not differ much.

Table 6 lists the mean MAPE and VoE of the 30 validation tests for the four methods. The comparison shows that the proposed method performs better than the other approaches in terms of MAPE, VoE, and their means. The mean MAPE and VoE of the proposed approach are reduced by 105 and 108 times, respectively, in comparison with those of the other three approaches.

6. Conclusions

The traditional way to collect data is to use customer surveys. By collecting and analyzing customer survey data, customers’ real preferences can be understood and a model can then be established. However, this data collection process takes a long time. Moreover, a questionnaire sets its questions in advance, and customers can only answer according to those questions, so the data collected in this way carry little emotional expression. In comparison, online customer reviews contain many emotional expressions of customers, such as comments on products or suggestions for design improvement. In this way, customers’ product preferences can be obtained easily and without the cost of conducting a survey. Some research has been carried out on mining online review content and applying it to the design of new products, but the following issues have been found in previous studies. Firstly, product attributes without setting information were not involved in the modelling of customer preference. Secondly, many studies do not adequately address the fuzziness of emotional expression in the modelling or the nonlinearity in the models. Thirdly, the modelling process of ANFIS fails if the number of inputs is large, as this leads to a complex structure and long computational time. To overcome these limitations, a new methodology is proposed, which involves opinion mining of product attributes and customer preferences from online reviews as well as a multiobjective PSO-based ANFIS approach for establishing customer preference models. In this paper, a case study on hair dryer products is carried out, and the effectiveness and practicability of the proposed method are verified. The proposed method is compared with the ANFIS, FR, FLSR, and GP-FR approaches.
The comparison of the modelling results shows that the model constructed by the proposed method can deal with both nonlinearity and fuzziness. In addition, the multiobjective PSO-based ANFIS approach achieves better values of VoE and MAPE than the other methods. In the future, we will determine the best product attribute settings of new products using the developed customer preference models. In addition, improving the proposed approach through adaptive determination of the parameter settings for PSO and ANFIS will be considered in future work, with reference to the recent studies in Section 2.3.

Data Availability

The data used to support the findings of this study can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 71901149).