Abstract

In order to give full consideration to the consumer’s personal preference in cloud service selection strategies and improve the credibility of service prediction, a preference-aware cloud service selection model based on consumer community (CC-PSM) is presented in this work. The objective of CC-PSM is to select a service meeting a target consumer’s demands and preference. Firstly, the correlation between cloud consumers from a bipartite network for service selection is mined to compute the preference similarity between them. Secondly, an improved hierarchical clustering algorithm is designed to discover the consumer community with similar preferences so as to form the trusted groups for service recommendation. In the clustering process, a quantization function called community degree is given to evaluate the quality of community structure. Thirdly, a prediction model based on consumer community is built to predict a consumer’s evaluation on an unknown service. The experimental results show that CC-PSM can effectively partition the consumers based on their preferences and has good effectiveness in service selection applications.

1. Introduction

Cloud services, which take resource virtualization as the core technology and on-demand resource provisioning as their main characteristic, provide strong processing capacity for all kinds of user applications while saving the cost of software installation and maintenance. With the development of cloud computing technology, more and more IT enterprises package their existing applications as cloud services for users to consume, which forms a pool containing a massive number of services. In the cloud service market, a three-tier service framework is formed, including cloud platform, service provider, and consumer [1, 2]. A service provider provides the consumers with services by renting virtual resources of the cloud platform. Different service providers can adopt different technologies, so they provide services of differing quality. With many competing services offering the same or similar functions, service selection has become a core concern for cloud consumers [3].

How to help a consumer select a satisfactory service from the service pool, so as to improve customer loyalty and attract more consumers, is significant for the cloud platform [4]. Therefore, more and more scholars have attached great importance to the study of cloud service selection methods. At present, the main research directions of cloud service selection can be roughly divided into two categories. One is the service selection method based on objective QoS, which transforms the problem of service selection into a problem of QoS calculation and then recommends an optimal service or service composition to a consumer by building a mathematical model based on the service’s QoS and the consumer’s preference [5–9]. The other is the service selection method based on subjective recommendation, which regards a service as a “commodity” and borrows the reputation mechanism based on user feedback in electronic commerce to solve the problem of service selection [10–14].

Both of the above categories have their strengths and weaknesses. On the one hand, the service selection method based on objective QoS provides a prediction of the consumer’s satisfaction with an unknown service. But this presupposes that all the information, including the service’s QoS and the consumer’s preference, is clearly described. In fact, it is very difficult for the consumer to accurately describe his/her preference. Besides that, due to economic interests, the service provider may give a false QoS description. On the other hand, the service selection method based on subjective recommendation builds a model for selecting a credible service through a reputation mechanism, but it still needs to distinguish between consumers with different preferences; otherwise, it will make a wrong recommendation. How to find trusted users for recommending services has therefore become a key problem of the method. The traditional way is to choose the nearest neighbors, as in collaborative filtering (CF). But this requires searching for similar users in each round, which harms the efficiency and stability of the recommender system. Because of the consumers’ inherent characteristics, such as differences in knowledge structure, professional requirements, and social factors, the consumers’ preferences show some group features, so service preference-oriented consumer communities form naturally in the cloud. If a stable group structure based on the consumers’ preferences is found, we can recommend services through the group structure. For the service requester, the recommendations from consumers similar to him/her are trustworthy [14].

Based on the idea, a service selection model based on community discovery for consumer preference is proposed, which breaks the limitation of traditional service selection methods. It does not depend on the description of the services’ QoS parameters and the consumers’ preferences but helps to improve the efficiency and stability of service recommendation. Our main ideas and contributions are as follows:
(1) The concept of consumer preference similarity is proposed to identify the differences of the consumers’ preferences.
(2) The consumer community structure is discovered based on consumer preference similarity, which contains some stable groups for service recommendation.
(3) A novel service selection model based on consumer community is presented, which explores a method of predicting the consumers’ evaluations through the stable and subjective information.

We conducted a lot of experiments with the simulated data and the data sets from Epinions.com, and the results showed the effectiveness of our work. In addition, we obtained many interesting and useful findings from the experiments.

The remainder of the paper is organized as follows. In Section 2, the related work is introduced and discussed. In Section 3, a community discovery model based on consumer preference is presented, and a consumer community discovery algorithm for the model is described. Then in Section 4, a service selection model based on consumer community (CC-PSM) is presented, and the corresponding algorithm is given, while, in Section 5, the experimental results are reported and analyzed to evaluate the effectiveness and feasibility of CC-PSM. Finally, the paper is concluded and possible directions of future work are highlighted in Section 6.

2. Related Work

In the cloud environment, one of the main obstacles to choosing a cloud service is the diversity and agility of cloud services, which makes it difficult to compare different cloud services. To solve the problem, some scholars have studied parameter matching approaches between services and consumers. Ding et al. [5] designed a service recommendation method based on the historical data of the service performance, which can regulate multiattribute matching between provider solutions and customer demands. However, the consumer’s demand description and the service provider’s performance description often have no unified standard. To address this, Karim et al. [15] explored a mechanism to map the users’ QoS requirements to the right QoS specifications of SaaS and then proposed a set of rules to perform the mapping process.

Godse and Mulik [16] regarded cloud service selection as a multicriteria decision-making (MCDM) problem, so they used the Analytic Hierarchy Process (AHP) technique for prioritizing service features and for expert-led scoring of the services. In [6], the authors gave full consideration to the different criteria of cloud services and the different demands of consumers and proposed a multicriteria cloud service selection methodology. Zhao et al. [7] proposed a service selection algorithm based on the service’s response time, trust degree, and monetary cost to find appropriate services that satisfy the users’ multiple QoS requirements. Nevertheless, in order to improve the accuracy and efficiency of MCDM and reduce the uncertainty of the decision problem, these methods must assume that the cloud service provider not only offers a truthful description of the service but also never changes the service parameters during the decision-making period, which makes it hard to resist malicious or low-quality service providers in the cloud environment. So a cloud service selection method based on parallel multicriteria decision analysis was put forward in [8] to rank all cloud services in different time periods.

In addition, due to the fuzziness and uncertainty of user preference, some scholars have studied the quantification of user preference. In [17], considering the uncertainty of user subjective and objective weight preferences, the authors obtained the user subjective weight by intuitionistic fuzzy set and objective weight by attribute significance of rough set. Finally, the uncertain user QoS preference-aware cloud service selection was transformed to a multiple attribute decision-making problem to select a best service for the user. In [18], the authors transformed a consumer’s qualitative and semiquantitative personalized preferences into quantitative numeric weights based on AHP.

To improve the consumer’s satisfaction is the ultimate goal of service selection method. In [9], the authors focused on QoS-satisfied predictions about the composition of cloud service components and presented a QoS-satisfied prediction model based on a hidden Markov model. Zheng et al. [10] pointed out that a pure QoS prediction was not enough for the consumer to reflect accurate order of the candidate services. So they designed a personalized QoS ranking prediction framework. The framework discovers the users with similar preference through comparing the users’ ranking similarity of the commonly invoked services and then makes QoS ranking prediction of a set of cloud services for the target user based on the similar users’ ranking with him.

To eliminate a sharp trust crisis between cloud consumers and cloud service providers, trust is introduced in the process of service selection. In [19], a novel cloud service-composition method based on the trust span tree was proposed. Through the credible relationship evolution, the trust union of service providers with same or similar function is formed, which helps to exclude uncertain or malicious service from the trust span tree. But trust is a subjective cognition judgment. Therefore, the trust evaluation needs not only objective measurement but also subjective perception. Ding et al. [11] designed a novel framework named CSTrust for conducting cloud service trustworthiness evaluation by combining QoS prediction and customer satisfaction estimation. To reduce the bias caused by unreasonable feedback from unprofessional or malicious cloud users, a method was proposed for filtering the feedback from such users in [12]. After processing, the aggregated result can quantitatively reflect the overall quality of a cloud service.

Whether a service is credible to satisfy the consumer’s personality preference or not is the main consideration in [20]. This paper utilized fuzzy clustering technology to classify services by the requesters’ preferences and gave a service selection algorithm to select a closest classification with the requester’s preference. With the integration of trust evaluation method and CF technique, Wang and Zhang [13] presented a trustworthy service selection model based on collaborative filtering. This model introduces consumer correlation to embody the impact of the requester’s personal characteristics on selection process, computes creditability of recommendation, and employs analytic hierarchy process to decide the weight of each factor in service reputation. Abedinzadeh and Sadaoui [14] presented ScubAA, a novel generic agent trust management framework based on the theory of human plausible reasoning. ScubAA recommends to the user a list of the most trusted services in terms of a single personalized value derived from several types of evidences such as user’s feedback, history of user’s interactions, context of the submitted request, references from third party users as well as from third party service agents, and structure of the society of agents.

As seen from the above literature, selecting the appropriate recommendation users is the key to service selection based on trust. In [21], an innovative idea for selecting reliable recommendation users was explored. The authors argued that sparsity, cold start, and trustworthiness are major issues challenging service recommendation when similarity-based approaches are adopted. With the prevalence of social networks, the user’s characteristics and preferences are, to a certain extent, exposed by the data in blogs and social-networking sites. So a social network-based service recommendation method with trust enhancement was proposed in that paper. The method assesses the degree of trust between users in the social network by matrix factorization, and then recommendation results are obtained by an extended random walk algorithm.

The above literature used different technologies to realize intelligent service selection for consumers from different angles, including quantifying the consumers’ demands or preferences, predicting the quality of service, matching the consumer’s demands and the service’s performance, and recommending services based on trustworthy consumers. Different methods have different merits and limitations. Firstly, the method based on parameter matching between the consumer’s demands and the service’s performance is relatively simple and intuitive. But this type of method mostly requires consumers to explicitly give their QoS demands, and these parameters are too difficult for consumers to provide. So its feasibility is poor in practice. Secondly, the method based on predicting the quality of service only considers the general performance of the service but ignores the differences in QoS demands. Thirdly, the fuzziness and uncertainty of user preference make it difficult to mine the users’ preferences. The methods of quantifying the consumers’ preferences are still under study, and there are still no efficient ways to validate the quantitative results. Finally, the method based on trust recommendation is now widely accepted in the fields of electronic commerce and social networking services. Thus the method has good application prospects for cloud service selection. The key problem of the method is the choice of trustworthy customers for service recommendation. But the existing methods for selecting trustworthy customers are mostly unstable and inefficient, which affects the accuracy of the selection results and the feasibility of the selection process.

From what has been discussed above, we can see that the core problem of service selection is to find a feasible method to identify different consumers’ preferences. To resolve the problem, a cloud service selection model based on simple evaluation information and the idea of trust in a community network is built in this paper. First, we need to consider how to mine the consumers’ preferences from the information available in the cloud so as to improve the feasibility of the method in practical applications. Inspired by [21], we regard a consumer as a mapping of a real person in real society, whose behavior and preference are relatively stable. Due to different professions and backgrounds, the preferences of the consumers show some group features. So we can cluster the consumers with similar preferences to form a stable community, which is the basis of service selection. According to the theory of human plausible reasoning [14], a consumer trusts the consumers who are similar to him/her more than the others. We then design a service selection model based on community trust.

3. Community Discovery Model Based on Consumer Preference

Predicting a consumer’s evaluation on an unknown service mainly relies on the recommendations from other consumers. So whether other consumers’ evaluations on a target service are trustworthy has a large impact on the accuracy of prediction. Therefore, finding the trusted recommendation users is a key step in the process of service selection. Many large-scale distributed complex systems can be described in the form of a complex network, in which some modules or communities can be observed. A community may be a group of nodes based on a certain concept such as node similarity. Community structure can serve as a bridge between a single node and the macro network system. Although there is no direct interaction or contact between cloud consumers, a complex network based on the inner relationships between consumers is still formed through their selection and evaluation of cloud services. Like other complex networks, this network also tends to exhibit an obvious community structure. The study of consumer community structure in the cloud, which reveals consumer groups with common features, has important theoretical significance for the analysis and prediction of consumer behavior. Based on the idea of community discovery in complex networks, this paper discovers consumer communities based on the preference similarity between consumers. In the community structure, the users in one community are similar in preference. In this section, we model the process of community discovery through the following definitions and give an algorithm to implement the model.

The data foundation of our study is the service evaluation information in the cloud, which includes three parts: service, consumer, and service evaluation. If a consumer buys a service according to his/her business requirements, he/she gives a comprehensive evaluation of the service in terms of response time, security, reliability, availability, and other performance aspects based on his/her service experience, which is called the service evaluation. This paper sets the service evaluation on a scale of 1 to 5, where 1 stands for a very poor service and 5 for a completely satisfactory service. Assume that each entity is abstracted as a node; then the concept of a bipartite network for service selection is given as follows.

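Definition 1 (bipartite network for service selection). A bipartite network for service selection, $G$, is a three-tuple, $G = (C, S, E)$.
(1) $C$ is a finite set of consumers, $C = \{c_1, c_2, \ldots, c_m\}$.
(2) $S$ is a finite set of services, $S = \{s_1, s_2, \ldots, s_n\}$.
(3) The set of consumers and the set of services are disjoint, $C \cap S = \emptyset$.
(4) $E$ is a mapping from consumers to services. $\forall c_i \in C$, $\forall s_j \in S$, if $c_i$ has bought $s_j$, there exists an edge between $c_i$ and $s_j$, whose weight is $e_{ij}$. $e_{ij}$ represents the evaluation of $c_i$ on $s_j$.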
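
As an illustration of Definition 1, the bipartite network can be held as two adjacency maps keyed by consumer and by service. This is only a minimal sketch; the function name `build_bipartite` and the sample identifiers are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def build_bipartite(ratings):
    """Build consumer->service and service->consumer maps with edge weights.

    ratings: iterable of (consumer, service, evaluation), evaluation in 1..5.
    """
    by_consumer = defaultdict(dict)
    by_service = defaultdict(dict)
    for consumer, service, evaluation in ratings:
        if not 1 <= evaluation <= 5:
            raise ValueError(f"evaluation out of range: {evaluation}")
        # One weighted edge per (consumer, service) purchase.
        by_consumer[consumer][service] = evaluation
        by_service[service][consumer] = evaluation
    return dict(by_consumer), dict(by_service)

# Hypothetical sample data.
sample = [("c1", "s1", 5), ("c1", "s2", 3), ("c2", "s1", 4)]
by_consumer, by_service = build_bipartite(sample)
```

The neighbor set of a consumer is then simply the key set of `by_consumer[consumer]`.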

3.1. Consumer Preference Similarity

In current research results, there exist a lot of similarity measure methods based on network structure, such as CN, Salton, and Jaccard [22]. It can be seen from these results that common neighbor is an important indicator to describe the similarity between nodes. Obviously, the consumers who are interested in similar business tend to select the same service, so their similarity is much higher. However, it is not enough to judge the similarity between consumers only based on the same services chosen by them. Due to losing their evaluations on the services, it is likely to cause misjudgments. In this paper, a factor of evaluation difference [13] is introduced to compute the preference similarity between consumers. The related definitions are given as follows.

Definition 2 (service selection similarity). Let $N(c_i)$ be the set of neighbor nodes of $c_i$. $\forall c_i, c_j \in C$, the service selection similarity between $c_i$ and $c_j$ is a function of $N(c_i)$ and $N(c_j)$, defined in terms of $|N(c_i)|$, the number of elements in $N(c_i)$, and $|N(c_i) \cap N(c_j)|$, the number of elements in the intersection of $N(c_i)$ and $N(c_j)$.

Definition 3 (service evaluation difference). $\forall c_i, c_j \in C$, the service evaluation difference between $c_i$ and $c_j$ is defined in terms of $|e_{ik} - e_{jk}|$, the absolute difference of the evaluations on a service $s_k$ between $c_i$ and $c_j$, and a normalizing factor $\lambda$ ($\lambda$ is set to 4 because it is the maximum of the evaluation difference between consumers).

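Definition 4 (consumer preference similarity). $\forall c_i, c_j \in C$, the consumer preference similarity $Sim(c_i, c_j)$ between $c_i$ and $c_j$ is a function of the service selection similarity $S(c_i, c_j)$ and the service evaluation difference $D(c_i, c_j)$. $Sim(c_i, c_j)$ is directly proportional to $S(c_i, c_j)$ and inversely proportional to $D(c_i, c_j)$. When $S(c_i, c_j) = 0$, $Sim(c_i, c_j) = 0$; that is, $c_i$ and $c_j$ are completely different in service preference. When $S(c_i, c_j) \neq 0$ but $D(c_i, c_j) = 0$, $Sim(c_i, c_j)$ is determined entirely by $S(c_i, c_j)$. When $S(c_i, c_j) \neq 0$ and $D(c_i, c_j) \neq 0$, $Sim(c_i, c_j)$ is decided by both $S(c_i, c_j)$ and $D(c_i, c_j)$.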
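
The three measures can be sketched in code under explicit assumptions. The formulas below are not taken from the paper: a Salton-style index is assumed for the selection similarity, the mean absolute evaluation gap over commonly chosen services divided by 4 is assumed for the evaluation difference, and the product $S \cdot (1 - D)$ is assumed for the combination because it satisfies the three boundary cases stated in Definition 4. Each consumer is represented as a dictionary from service to evaluation.

```python
import math

MAX_DIFF = 4  # normalizing factor from the text: largest gap on a 1..5 scale

def selection_similarity(a, b):
    """Assumed Salton-style index over the two sets of chosen services."""
    if not a or not b:
        return 0.0
    common = a.keys() & b.keys()
    return len(common) / math.sqrt(len(a) * len(b))

def evaluation_difference(a, b):
    """Assumed form: mean absolute gap on common services, scaled to [0, 1]."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    return sum(abs(a[s] - b[s]) for s in common) / (MAX_DIFF * len(common))

def preference_similarity(a, b):
    """Assumed combination: grows with similarity, shrinks with difference."""
    return selection_similarity(a, b) * (1.0 - evaluation_difference(a, b))

# Hypothetical consumers (service -> evaluation).
alice = {"s1": 5, "s2": 4}
bob = {"s1": 5, "s2": 4}    # same services, same scores
carol = {"s1": 1, "s2": 4}  # same services, opposite view of s1
dave = {"s3": 5}            # no service in common with alice
```

With these inputs, `alice` and `bob` are maximally similar, `carol` is penalized for her diverging evaluation of `s1`, and `dave` has zero similarity because no service is shared.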

3.2. Community Discovery

The general idea of community discovery based on network structure is to translate the problem of community discovery into a clustering problem by using node similarity in the network. Regarding the center of cluster as an abstract node, the following part will present the definitions of similarity between node and cluster or between clusters in the clustering process.

Definition 5 (similarity between node and cluster). , , , the similarity between and is defined by where is for the number of elements in .

Definition 6 (similarity between clusters). , , the similarity between and is defined by
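
Definitions 5 and 6 can be illustrated with a short sketch. It assumes that both similarities are plain averages of pairwise consumer preference similarity (an average-linkage reading of "the center of cluster as an abstract node"), which may differ from the paper's exact formulas. The pairwise function `sim` and the sample values are hypothetical.

```python
from itertools import product

def node_cluster_similarity(node, cluster, sim):
    """Assumed form: mean similarity between one consumer and a cluster."""
    return sum(sim(node, other) for other in cluster) / len(cluster)

def cluster_similarity(cluster_a, cluster_b, sim):
    """Assumed form: mean similarity over all cross-cluster consumer pairs."""
    total = sum(sim(a, b) for a, b in product(cluster_a, cluster_b))
    return total / (len(cluster_a) * len(cluster_b))

# Hypothetical symmetric pairwise similarities.
pairs = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.2,
    frozenset(("b", "c")): 0.4,
}

def sim(x, y):
    return 1.0 if x == y else pairs[frozenset((x, y))]

c_to_ab = node_cluster_similarity("c", ["a", "b"], sim)
a_to_bc = cluster_similarity(["a"], ["b", "c"], sim)
```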

A measurement index is needed to evaluate how well the consumer community structure matches the gold standard groups. So an evaluation function for community structure, serving as the fitness function of the optimization process, is designed to compare the quality of different community partitions and validate different community discovery algorithms. The most widely used evaluation function for community structure, Modularity Q [23], is not suitable for evaluating the quality of consumer community structure in the cloud because it considers only node degree, not edge weight. This paper builds a quantization function suitable for consumer community structure in the cloud, called community degree. The index is used to look for a rigid network structure with strong ties in a highly dynamic environment. The related definitions of community degree are given as follows.

Definition 7 (community cohesion). , , the community cohesion of is defined byOf these, is the total number of pairs of consumers in .

Definition 8 (community separation). , , the community separation of is defined byOf these, is the total number of pairs of consumers, respectively, from inside and outside .
The definition reflects the interdependence between the nodes in a community and the nodes outside it. For each community, the higher its community cohesion and the lower its community separation, the stronger its independence. In other words, a community is a group in which the nodes have strong ties. The definition of community independence is given by normalization as follows.

Definition 9 (community independence). , , the community independence of is defined byThe closer the value is to 1, the better the independence of the community is.
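
The following sketch illustrates Definitions 7 to 9 under stated assumptions. Cohesion is taken as the mean similarity over pairs inside a community and separation as the mean similarity over inside-outside pairs, which matches the "total number of pairs" wording. The normalization used for independence, cohesion divided by the sum of cohesion and separation, is an assumption chosen so that the value approaches 1 for well-isolated communities; the paper's own normalization may differ.

```python
from itertools import combinations

def cohesion(community, sim):
    """Mean pairwise similarity inside the community (0 for a singleton)."""
    pairs = list(combinations(community, 2))
    if not pairs:
        return 0.0
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

def separation(community, everyone, sim):
    """Mean similarity between members and non-members."""
    outside = [c for c in everyone if c not in community]
    if not community or not outside:
        return 0.0
    total = sum(sim(a, b) for a in community for b in outside)
    return total / (len(community) * len(outside))

def independence(community, everyone, sim):
    """Assumed normalization: cohesion / (cohesion + separation)."""
    coh = cohesion(community, sim)
    sep = separation(community, everyone, sim)
    return coh / (coh + sep) if coh + sep > 0 else 0.0

# Hypothetical similarities: strong ties inside {a, b}, weak ties elsewhere.
table = {("a", "b"): 0.8, ("c", "d"): 0.6}

def sim(x, y):
    return table.get((x, y), table.get((y, x), 0.1))

everyone = ["a", "b", "c", "d"]
ab_independence = independence(["a", "b"], everyone, sim)
```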

Definition 10 (community structure). A structure is called a community structure, if and only if satisfies the following conditions:(1);(2);(3), ;(4), , ;(5).

Definition 11 (community degree). For a community structure , the community degree of is defined by Of these, is the number of communities in community structure ; is the number of consumers.
The definition is a typical objective function in clustering that formalizes the goal of attaining high intracommunity similarity (consumers within the same community are similar) and low intercommunity similarity (consumers from different communities are dissimilar). The community degree of a community structure is inevitably affected by the community independence and community cohesion of every community in it. In addition, after many trials, we found that the community size should also be considered as a factor: introducing the community size helps to avoid generating too many small communities. For two different community structures of the same network, the one with the higher community degree is superior. Typically, after calculating the community degree of every community structure produced in the clustering process, the community structure at the peak value is taken as optimal; it represents the best division of the nodes in the network.

3.3. Algorithm Description

Through the definitions and equations above, we have completed the modeling of the whole process of consumer community discovery in the cloud. We now give an algorithm for the implementation of the model. There are many clustering methods based on node similarity, including partitioning, hierarchical, and density-based clustering. Because it is quite difficult to estimate the clustering radius or identify proper cluster centers in the cloud, this paper adopts a hierarchical clustering algorithm to discover consumer communities based on consumer preference similarity. The three primary reasons for this are as follows: (1) In the cloud, there may be consumer communities of different sizes and even complex shapes. The hierarchical clustering method has stronger recognition ability in this respect [22]. (2) For a completely unknown environment, the hierarchical clustering method need not estimate the number of clusters and the clustering radius in advance, so it can flexibly control the clustering size at different levels. (3) The hierarchical clustering method does not depend on the choice of empirical parameters and is not sensitive to noisy data, so it can effectively filter isolated points.

In this paper, an improved agglomerative hierarchical clustering method is proposed, which maps preference similarity to Euclidean distance and then iteratively merges the most similar pairs until a preset termination condition is met. According to the definitions and equations above, once the consumer preference similarities in the bipartite network are provided, the clusters can be iteratively merged by hierarchical clustering until all the consumers belong to one community. The whole process forms a clustering tree, in which each layer corresponds to a community structure. The community structure that maximizes the community degree is the optimal community structure in the cloud.

In order to improve the efficiency and accuracy of traditional hierarchical clustering methods, the algorithm introduces multistep mergence and transitive mergence. In the multistep mergence strategy, one or more pairs of clusters may be merged during each round. A threshold for mergence deviation is set up. In every round of the clustering process, all pairs of clusters whose similarity is not lower than the highest similarity between clusters minus this threshold are selected and merged separately. In transitive mergence, when a consumer has the same preference similarity with several other consumers, all of them are merged into one cluster. A detailed description of the algorithm is given by Algorithm 1.

Input: (bipartite network for service selection),
(, , ) (triple collection of consumer preference similarity)
Output: = (community structure which Maximize ())
(1) Initialize community structure in , let , where , for the number of consumers.
(2) Initialize similarity between clusters to output a triple collection (, , ), where
(, , ) = (, , ).
(3) Select one or more pairs of clusters from the triple collection (, , ), whose similarity satisfies the conditions
of multi-step mergence or transitive mergence in this round, and merge them by making use of the two strategies.
(4) Update partition , , where for the number of merged pairs of
clusters in this round.
(5) Update the similarities of these merged clusters and other clusters.
(6) Calculate community degree .
(7) When > 1, repeat Step 2.
(8) Form a clustering tree,
(9) Select community structure which maximizes from , .
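
The steps of Algorithm 1 can be sketched as follows. This is an illustrative reading, not the authors' implementation: cluster similarity is assumed to be the average pairwise similarity, pairs that share a cluster in the same round are chained together as an interpretation of transitive mergence, and `stand_in_quality` is a hypothetical substitute for the community degree, whose exact formula is not reproduced here. The parameter `delta` plays the role of the mergence deviation threshold.

```python
from itertools import combinations

def avg_link(a, b, sim):
    """Assumed cluster similarity: mean over all cross-cluster pairs."""
    return sum(sim(x, y) for x in a for y in b) / (len(a) * len(b))

def stand_in_quality(partition, sim):
    """Hypothetical substitute for community degree (size-weighted gap)."""
    members = [x for community in partition for x in community]
    score = 0.0
    for community in partition:
        if len(community) < 2:
            continue
        inside = [sim(a, b) for a, b in combinations(community, 2)]
        outside = [sim(a, b) for a in community
                   for b in members if b not in community]
        coh = sum(inside) / len(inside)
        sep = sum(outside) / len(outside) if outside else 0.0
        score += len(community) / len(members) * (coh - sep)
    return score

def discover_communities(consumers, sim, delta=0.05, quality=stand_in_quality):
    partition = [frozenset([c]) for c in consumers]  # one cluster per consumer
    best, best_score = partition, quality(partition, sim)
    while len(partition) > 1:
        scored = [(avg_link(a, b, sim), a, b)
                  for a, b in combinations(partition, 2)]
        top = max(s for s, _, _ in scored)
        # Multistep merge: take every pair within delta of the top similarity.
        # Pairs that share a cluster are chained into one group.
        groups = {c: {c} for c in partition}
        for s, a, b in scored:
            if s >= top - delta and groups[a] is not groups[b]:
                merged = groups[a] | groups[b]
                for c in merged:
                    groups[c] = merged
        unique = {id(g): g for g in groups.values()}.values()
        partition = [frozenset().union(*g) for g in unique]
        score = quality(partition, sim)  # one layer of the clustering tree
        if score > best_score:
            best, best_score = partition, score
    return best

# Hypothetical similarities: two tight groups, weak ties across them.
table = {("a", "b"): 0.9, ("c", "d"): 0.9}

def sim(x, y):
    return table.get((x, y), table.get((y, x), 0.1))

result = discover_communities(["a", "b", "c", "d"], sim)
```

Each pass of the loop produces one layer of the clustering tree, and the layer with the highest quality score is returned.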

4. Service Selection Model Based on Consumer Community

According to human psychological cognitive habits, a consumer’s trust in other consumers who are in the same community is higher than his/her trust in consumers from other communities. Moreover, a consumer’s trust in consumers from different communities varies with the similarity between the communities. The formal definition of community trust is given below.

Definition 12 (community trust). The symbol represents that how much consumer trusts the service evaluation of consumer . Let ; the following property is satisfied:

In this section, we will model the process of community trust-driven service selection and give an algorithm for the model.

4.1. Prediction of Service Evaluation

The consumer community structure forms a stable partition of the consumers in the cloud, which embodies different consumer groups with similar preferences. If the service has been chosen and evaluated by a sufficient number of consumers who are in the same community as the requester, the requester’s evaluation on the service can be accurately predicted from the evaluations of those consumers alone. However, if the bipartite network for service selection is sparse and the predicted service has rarely or never been chosen by the consumers in the requester’s community, the recommendations from that community are not enough to predict his/her evaluation on the service. In this case we need evaluations from a wider range of consumers who have interacted with the target service. When predicting the requester’s evaluation on an unknown service through the consumers in other communities, the similarities between the requester and those communities are used as the weights of the service evaluations from these communities. A new cloud service is assigned an initial evaluation that is less than half of the highest evaluation of this kind of service. This is because a higher initial value can invite attacks from malicious service nodes, while a much lower initial value can prevent the service from ever being chosen. The definitions involved in the process are as follows.

Definition 13 (community public service evaluation). , , the community public service evaluation of community on service is defined bywhere is for the consumers in who have interacted with service , for the number of the consumers in , and for the initial evaluation.

Definition 14 (intracommunity predictive evaluation). , , the intracommunity predictive evaluation of consumer on service is defined by

Definition 15 (intercommunity predictive evaluation). Let be a community structure. , , the intercommunity predictive evaluation of consumer on service is defined bywhere , which is normalized similarity weight between consumer and community .
In order to determine when to use the intracommunity prediction or when to use the intercommunity prediction, the definition of predictive decision parameter is given. If the parameter is greater than a given threshold value, we use the intracommunity prediction; otherwise, we use the intercommunity prediction.

Definition 16 (predictive decision parameter). , , the predictive decision parameter of user to service is defined bywhere is for a set of consumers who has chosen service , for a set of consumers in the community that consumer belongs to, and .
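
Definitions 13 to 16 can be sketched together under explicit assumptions, none of which is confirmed by the text: the community public service evaluation is taken as the mean evaluation of the members who used the service, the predictive decision parameter as the fraction of the requester's community that has chosen the service, and the intercommunity prediction as a similarity-weighted mean over the communities that contain at least one evaluator. The names, the threshold 0.5, and the initial evaluation 2.0 are hypothetical.

```python
INITIAL_EVALUATION = 2.0  # assumed value, below half of the top score 5

def community_evaluation(community, service, ratings):
    """Assumed form: mean evaluation by members who used the service."""
    scores = [ratings[c][service] for c in community
              if service in ratings.get(c, {})]
    return sum(scores) / len(scores) if scores else INITIAL_EVALUATION

def decision_parameter(community, service, ratings):
    """Assumed form: share of the community that has chosen the service."""
    chosen = sum(1 for c in community if service in ratings.get(c, {}))
    return chosen / len(community)

def node_community_similarity(consumer, community, sim):
    others = [c for c in community if c != consumer]
    if not others:
        return 0.0
    return sum(sim(consumer, c) for c in others) / len(others)

def predict(consumer, service, partition, ratings, sim, threshold=0.5):
    home = next(c for c in partition if consumer in c)
    if decision_parameter(home, service, ratings) > threshold:
        # Intracommunity prediction: the requester's own community suffices.
        return community_evaluation(home, service, ratings)
    # Intercommunity prediction: weight communities by normalized similarity.
    rated = [c for c in partition
             if decision_parameter(c, service, ratings) > 0]
    weights = [node_community_similarity(consumer, c, sim) for c in rated]
    total = sum(weights)
    if total == 0:
        return INITIAL_EVALUATION
    return sum(w / total * community_evaluation(c, service, ratings)
               for w, c in zip(weights, rated))

# Hypothetical data: consumer "a" has evaluated nothing yet.
partition = [{"a", "b", "e"}, {"c"}, {"d"}]
ratings = {"b": {"s1": 4}, "e": {"s1": 5}, "c": {"s2": 5}, "d": {"s2": 1}}
table = {("a", "c"): 0.6, ("a", "d"): 0.2}

def sim(x, y):
    return table.get((x, y), table.get((y, x), 0.0))

intra = predict("a", "s1", partition, ratings, sim)
inter = predict("a", "s2", partition, ratings, sim)
```

Here `s1` is predicted from the requester's own community, while `s2` falls back to the two other communities, weighted 0.75 and 0.25.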

4.2. Service Selection Process and Algorithm

The overall process of the service selection model consists of two phases, as shown in Figure 1. The first is the consumer community discovery phase, which is the basis of the second phase. In this phase, the system collects service evaluation information to construct a bipartite network for service selection. Then the consumer preference similarity is obtained by calculating the service selection similarity and the service evaluation difference between consumers based on the network structure and edge weights (service evaluations) of the bipartite network for service selection, corresponding to Step 1. Finally, the system uses an improved hierarchical clustering algorithm to discover the consumer community structure and chooses the optimal community structure by community degree, corresponding to Step 2, Step 3, and Step 4.

The second is the service selection phase, which recommends the most satisfactory service to a requester according to his/her preference. The phase includes the following steps. (1) Return the candidate services that meet the requester’s functional requirements. (2) Compute each service’s predictive decision parameter. If the parameter is higher than a predefined threshold, intracommunity prediction is chosen; otherwise, intercommunity prediction is chosen. (3) Sort the candidate services according to their predictive evaluations, and recommend the service with the highest value to the requester. The implementation of the process is given by Algorithm 2.

Input: the bipartite network for service selection, the consumer community structure,
   the triple collection of consumer preference similarity, the requester, the candidate service list SC
Output: the sorted candidate services
Begin
 For each service in SC
  If (the service is a new service)
   Assign an initial evaluation to the service;
  Else
   Calculate the predictive decision parameter of the requester for the service;
   If (the parameter is greater than the threshold)    // threshold of the predictive decision parameter
    Select intracommunity prediction to calculate the predictive evaluation;
   Else
    Select intercommunity prediction to calculate the predictive evaluation;
 Sort the candidate services according to predictive evaluation
End
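The control flow of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation: the function name `rank_candidates` and all of its parameters are hypothetical, and the predictive decision parameter and the two predictors (Definitions 14 to 16) are passed in as callables rather than implemented here.

```python
from typing import Callable, Hashable, List, Tuple

Service = Hashable


def rank_candidates(
    candidates: List[Service],
    is_new: Callable[[Service], bool],
    decision_parameter: Callable[[Service], float],
    intra_predict: Callable[[Service], float],
    inter_predict: Callable[[Service], float],
    threshold: float,
    initial_evaluation: float,
) -> List[Tuple[Service, float]]:
    """Return (service, predictive evaluation) pairs, highest evaluation first.

    All callables are assumed to be already bound to one requester.
    """
    predicted = {}
    for service in candidates:
        if is_new(service):
            # A new service has no evaluations yet, so it gets the initial value.
            predicted[service] = initial_evaluation
        elif decision_parameter(service) > threshold:
            # Parameter above the threshold: use intracommunity prediction.
            predicted[service] = intra_predict(service)
        else:
            # Otherwise fall back to intercommunity prediction.
            predicted[service] = inter_predict(service)
    # Sort by predictive evaluation in descending order.
    return sorted(predicted.items(), key=lambda item: item[1], reverse=True)
```

The first element of the returned list is the service that would be recommended to the requester.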

5. Experiments and Analysis

The algorithms for the model were implemented in MATLAB. Simulation data sets and public data sets were used to test and analyze the algorithms. The experiments on the simulation data sets validate the effectiveness and accuracy of the community discovery algorithm and the service selection algorithm. The experiments on the public data sets compare CC-PSM with two other methods.

5.1. Data Sets

The simulation data set is generated by emulating the consumers’ service selection and evaluation in cloud environments. Assume that the number of services meeting essential functional requirements of a requester is in the experiments. For the requester, service selection amounts to choosing the most satisfactory of these services. Each service has four QoS parameters ranging from 1 to 5, whose values are generated by a random function. Corresponding to the different QoS parameters, the weightings of the requester’s preference are, respectively, set to ( and ). All the consumers are divided into four groups, and the consumers in the same group have the same preference. Initially, each consumer randomly selects services and evaluates the selected services according to his/her preference. A bipartite network for service selection is thus gradually formed. Specific parameters of the simulation data sets are shown in Table 1.
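The generation procedure described above can be sketched as follows. Several details are assumptions made only for illustration: QoS values are drawn as integers, an evaluation is taken to be the preference-weighted sum of the QoS values, and the function names, group weights, and counts are hypothetical (the actual settings are those of Tables 1 and 2).

```python
import random


def generate_services(n_services, n_qos=4, rng=None):
    """Create services with n_qos QoS values, each drawn from 1..5 (integers assumed)."""
    rng = rng or random.Random()
    return [[rng.randint(1, 5) for _ in range(n_qos)] for _ in range(n_services)]


def evaluate(qos, weights):
    """Assumed evaluation rule: preference-weighted sum of the QoS values."""
    return sum(w * q for w, q in zip(weights, qos))


def simulate(consumers_per_group, group_weights, services, n_choices, rng=None):
    """Build the weighted edges of a bipartite consumer-service network.

    Returns a dict mapping (consumer_id, service_id) to the evaluation.
    Consumers in the same group share the same preference weights.
    """
    rng = rng or random.Random()
    edges = {}
    consumer_id = 0
    for weights in group_weights:
        for _ in range(consumers_per_group):
            # Each consumer randomly selects n_choices distinct services.
            for service_id in rng.sample(range(len(services)), n_choices):
                edges[(consumer_id, service_id)] = evaluate(services[service_id], weights)
            consumer_id += 1
    return edges
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible.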

The public data set, whose structure matches that of service evaluation data in the cloud, is used to test the feasibility of the service selection algorithm in the cloud. We adopted the Epinions dataset, which was collected by Paolo Massa in a 5-week crawl. The dataset contains 49,290 consumers who have rated a total of 139,738 different items at least once.

5.2. Experimental Results and Analysis

Three standard metrics were adopted to evaluate CC-PSM: purity [24], F-measure [25], and MAE [26]. Purity and F-measure assess the quality of the community structure, and MAE (mean absolute error) measures the prediction error.
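For reference, purity and MAE can be computed as in the sketch below. It uses the commonly used definitions of both metrics, which may differ in detail from the formulations in [24] and [26]; the function names and argument layout are illustrative only.

```python
from collections import Counter


def purity(communities, true_group):
    """Fraction of consumers that belong to the majority true group of their community.

    communities: list of lists of consumer ids (the discovered structure).
    true_group: dict mapping consumer id to its actual group label.
    """
    total = sum(len(members) for members in communities)
    majority = sum(
        Counter(true_group[m] for m in members).most_common(1)[0][1]
        for members in communities
        if members  # skip empty communities
    )
    return majority / total


def mae(predicted, actual):
    """Mean absolute error between predicted and actual evaluations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

A purity of 1 means that no community mixes consumers from different actual groups; a lower MAE means more accurate predictions.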

5.2.1. Validation of Consumer Community Discovery Model

In this section, the consumer community discovery algorithm was applied at different consumer scales. First, we randomly generated four kinds of consumer preferences, which divided the cloud consumers into four groups of equal size. The preference weightings of each group are listed in Table 2. At the same time, set the parameter , , with the others as shown in Table 1.

The experimental results are shown in Figure 2 and Tables 3 and 4. Figure 2 shows how the community degree changes with the number of consumer communities. Because of limited display area, only the values of the community degree for clustering numbers from 1 to 20 are given. As seen in Figure 2, the community degree slowly increases as the number of consumer communities decreases during clustering, which reflects the fact that some highly similar consumers have not yet been clustered into one community. When the community degree peaks at a certain clustering number, the consumer community structure is optimal. If the clustering process continues, dissimilar consumers are merged into one community, which inevitably leads to a rapid decline of the community degree. The experimental results also indicate that community degree can accurately evaluate the consumer community structure and is important to the effectiveness of the consumer community discovery algorithm.
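Selecting the optimal structure therefore amounts to locating the peak of the community degree over the clustering numbers visited during merging. A minimal sketch, assuming the degree values have already been recorded in a dictionary (the function name and the sample values in the usage note are hypothetical):

```python
def optimal_cluster_count(degree_by_count):
    """Return the clustering number at which the community degree peaks.

    degree_by_count: dict mapping a number of communities to the community
    degree of the structure with that many communities.
    """
    return max(degree_by_count, key=degree_by_count.get)
```

For example, calling it with `{8: 0.41, 5: 0.58, 4: 0.62, 3: 0.35}` selects the structure with 4 communities.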

Table 3 shows the optimal community structure for each consumer scale. Table 4 compares the members of the optimal community structure with the members of the actual groups for 100 consumers. As the experimental data in Tables 3 and 4 show, the clustering algorithm detects consumer communities well: its accuracy is not affected by consumer scale, and the purity of the discovered community structure is high.

5.2.2. Analysis on Community Discovery Model

(1) Relation between WD and the Accuracy of Community Structure. To observe the sensitivity of the algorithm to the preference difference between consumers, the experiments adjusted the preference weightings in Table 2 so that the preference difference among the four groups gradually increased. In this group of experiments, set , , the value of WD varying from 0.1 to 0.5, and the others the same as in Table 1. The experimental results are shown in Figures 3, 4, and 5.

In Figures 3 and 4, the purity and F-measure of the optimal community, respectively, achieve 93% and 77% when . As WD increases, the accuracy of the community discovery algorithm becomes higher and higher. When , the purity of the community structure achieves 100%. When , the algorithm can detect the original consumer group division. Figure 5 shows how the community degree changes with the clustering number of consumer communities under different WD. As can be seen from Figure 5, the clustering numbers of the optimal community structure differ across the simulation data at different WD. When the preference difference between consumer groups is relatively large, namely, , the value is the same as that in the actual situation, while when the difference is not obvious, the value is slightly higher than the number of the original consumer groups because of the existence of some small communities. For example, the number of clusters is 5 when , while the number of clusters is 7 when .

(2) Relation between and the Accuracy of Community Structure. To observe how the probability affects the accuracy of the final community structure, an experiment on the relation between and the accuracy of community structure was conducted. In the experiment, set , , the value of varying from 0.5 to 0.9, and the others the same as in Table 1. The experimental results are shown in Figures 6, 7, and 8.

Figures 6 and 7 show the purity and F-measure of the optimal community structure under different . As shown in Figures 6 and 7, when , 16% of consumers are clustered with dissimilar consumers and the F-measure is 0.72. With the growth of , the purity and F-measure of the optimal community structure increase steadily. When reaches 0.8, the recognition rate of the algorithm reaches 100%. Figure 8 also shows that the algorithm can discover the optimal consumer community structure at that point. Similar to Figure 5, when the consumers in the same group have chosen fewer services in common, the clustering number of the optimal community structure can be higher than the number of original consumer groups because of some small communities or some wrong partitions.

A comprehensive analysis of the experimental results leads to the following conclusions. (a) The preference difference between consumers affects the accuracy of the consumer community discovery algorithm. On the whole, however, the recognition rate of the algorithm remains high. (b) Whether a consumer’s behavior reflects his/her preference has a greater impact on the accuracy of the community discovery algorithm than the consumers’ preference differences do. Intuitively, a consumer should not choose services that deviate from his/her interests, so CC-PSM is effective when the value of stays at a reasonable level.

5.2.3. Analysis on Service Selection Model

In this section, the accuracy of CC-PSM is compared with that of two other service selection methods: service selection based on public evaluation and service selection based on nearest neighbors.

Service Selection Based on Public Evaluation (SSPE). The public evaluation of service is the average value of the evaluations from the consumers who have interacted with it. Formally,where is for a set of consumers who have chosen ; for the number of the elements in .

Service Selection Based on Nearest Neighbors (SSNN). The method computes the consumer’s predictive evaluation on a service through his/her -nearest neighbors. The predictive evaluation on service of consumer can be expressed as follows:where is for the number of the nearest neighbors; .
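The two baselines can be sketched as follows. SSPE follows the description above directly (a plain average). For SSNN, the sketch assumes a similarity-weighted average over the most similar consumers who have rated the service; this is a common nearest-neighbor formulation and is not confirmed as the exact formula used in the experiments. The function names, the `ratings` layout, and the `similarity` callable are hypothetical.

```python
def sspe(ratings, service):
    """Public evaluation: mean evaluation over consumers who chose the service.

    ratings: dict mapping (consumer, service) to an evaluation.
    Returns None if nobody has evaluated the service.
    """
    values = [e for (_, s), e in ratings.items() if s == service]
    return sum(values) / len(values) if values else None


def ssnn(ratings, similarity, consumer, service, k):
    """Nearest-neighbor prediction (assumed form: similarity-weighted average).

    similarity: callable (consumer_a, consumer_b) -> non-negative similarity.
    Only consumers who have evaluated the service are considered as neighbors.
    """
    raters = [c for (c, s) in ratings if s == service and c != consumer]
    nearest = sorted(raters, key=lambda c: similarity(consumer, c), reverse=True)[:k]
    total_weight = sum(similarity(consumer, c) for c in nearest)
    if total_weight <= 0:
        return None  # no usable neighbor
    weighted = sum(similarity(consumer, c) * ratings[(c, service)] for c in nearest)
    return weighted / total_weight
```

Both functions return `None` when no prediction can be made, so a caller can fall back to an initial evaluation for unrated services.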

(1) Analysis Based on Simulation Data Set. Based on the data set of 400 consumers in Section 5.2.1, the experiment ran 10 simulations on different candidate services with random quality and evaluated the accuracy of the service selection algorithm using MAE. The threshold value of the predictive decision parameter was set to 0.1. The comparison results are shown in Figure 9. As Figure 9 shows, the MAE of CC-PSM is generally lower than that of the other two algorithms. The MAE of SSNN sometimes equals that of CC-PSM, but in most cases it is significantly higher. Overall, the prediction error of CC-PSM remains at a low level.

(2) Analysis Based on Public Data Set. Because of uncertainty factors, the error is inevitably greater when the algorithm is applied to real data than to simulation data. For example, whether a consumer’s preference is constant and whether a consumer’s service evaluation accurately reflects his/her preference both have a great influence on the accuracy of the selection model. In the experiment, CC-PSM was compared with the above two methods on the public data set, with the same threshold value of the predictive decision parameter as above. The experiment randomly selected five datasets with 500 consumers and then randomly divided each dataset into a training set and a test set with partition ratio of . The predictions of the three methods based on the training set were each compared with the actual evaluations in the test set. The experimental results are shown in Figure 10. Figure 10 shows that the error of CC-PSM is lower than that of SSPE and SSNN in most cases, so CC-PSM is feasible in practice.

6. Conclusions

Cloud service selection is an active research topic in cloud computing. The main drawback of current work is its inability to accurately capture the consumers’ preferences. In an actual service environment, the consumers’ selection and evaluation information for services objectively reflects the inner correlation between them. If a service is recommended by a stable group whose consumers are highly similar to a requester, it will win more of the requester’s trust than other services. This viewpoint is consistent with real-world experience and has broad explanatory power. Based on this idea, CC-PSM was built and the corresponding algorithms were designed. The experimental results show that the model and its algorithms effectively detect the consumer community structure and accurately predict the requester’s service evaluation, and that they have a degree of extensibility and adaptability.

However, the method still has several shortcomings. Future research can address the following aspects: (1) the efficiency of the model; (2) the cold-start problem of the model; (3) other factors that affect community trust.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to thank the Natural Science Foundation of China under Grants nos. 61262082, 61262017, and 61462066, the Natural Science Foundation of Inner Mongolia under Grant no. 2015MS0608, and the Research Program of Science and Technology at Universities of Inner Mongolia Autonomous Region under Grant no. NJZY008.