Abstract

Service-oriented architecture (SOA) is widely used, which has fueled the rapid growth and deployment of a tremendous number of Web services over the last decade. Finding proper Web services has become challenging yet crucial because of their ever-increasing number. However, it is infeasible to inspect all Web services to check their quality values, since doing so would consume a great deal of resources. Thus, developing effective and efficient approaches for predicting the quality values of Web services has become an important research issue. In this paper, we propose UIQPCA, a novel approach for hybrid User and Item-based Quality Prediction with Covering Algorithm. UIQPCA integrates information on both users and Web services on the basis of users’ opinions on the quality of coinvoked Web services. After the integration, users and Web services that are similar to the target user and the target Web service are selected. Then, based on the result of the integration, UIQPCA predicts how a target user will appraise a target Web service. Extensive experiments on WS-Dream, a widely used real-world Web service dataset, are conducted to evaluate the reliability of UIQPCA. The experimental results show that UIQPCA substantially outperforms existing approaches, including item-based, user-based, hybrid, and clustering-based approaches.

1. Introduction

Service-oriented architecture (SOA) has become a significant paradigm for building intricate software systems by discovering and composing loosely coupled Web services provided by different organizations [1, 2]. Executed by a system engine, e.g., a BPEL engine [3], the component services of such a service-oriented system (SOS) collectively realize the functionality of the system [4], which is often offered as SaaS (Software-as-a-Service) in the cloud environment [5, 6]. With the help of cloud computing, business models such as e-business and e-commerce, particularly those based on the pay-as-you-go model, are developing quickly and gaining popularity, which stimulates the rapid growth of Web services [7]. The statistics from ProgrammableWeb (http://www.programmableweb.com/), an online Web service directory, demonstrate that the number of published Web services has grown rapidly during the past few years. Because of the widespread use of Web services and SOA, varied SOSs can be built to meet the increasingly sophisticated business needs of various organizations [8, 9].

The quality of an SOS relies on its component services. To ensure the quality of an SOS, we must be able to predict the quality of the Web services used to compose it, e.g., their response time, throughput, etc. Once Web services’ quality can be predicted, Web service recommendation systems can be built and employed to recommend Web services with appropriate quality values that fulfil system engineers’ quality requirements [10, 11]. Thus, quality prediction for Web services has been an extremely active research field in recent years [11–15]. Among many prediction approaches, collaborative filtering has been the most popular and successful one for building recommendation systems [16–18]. Memory-based collaborative filtering, which uses the known preferences of some users to predict the preferences of other users, has been widely employed by many researchers for quality prediction on Web services [12, 13]. It predicts the quality of a given Web service for a target user by using the historical quality values of Web services experienced by other users. Here, the users of a Web service refer to the systems/applications that use the Web service.

In this research, given a user u, we define the users who have similar opinions as user u on the same set of Web services commonly invoked by themselves and user u as the similar users of u, denoted as S(u). Given a Web service i, we define the Web services that have left the same impression on the same group of users as Web service i as the similar Web services of i, denoted as S(i). Accordingly, memory-based collaborative filtering approaches can be further categorized into the following three groups: (1) User-based collaborative filtering: given a user u and a Web service i, user-based collaborative filtering techniques find and use the quality experiences of S(u) on Web service i to predict how user u will like Web service i [19]. (2) Item-based collaborative filtering (in the context of this research, the terms “item” and “service” are interchangeable): given a user u and a Web service i, item-based collaborative filtering techniques find and use the quality of S(i) to predict how user u will like Web service i [20, 21]. (3) Hybrid collaborative filtering: combining user-based and item-based techniques, hybrid collaborative filtering uses both S(u) and S(i) to predict how user u will like Web service i [14, 15].

The fundamental assumption of memory-based collaborative filtering (referred to as collaborative filtering in short hereafter) is that if two users u and u′ rank n items similarly, or do similar things (e.g., buying, watching, listening, and consuming), they will have similar feedback on other items [22, 23]. Thus, the key to accurate predictions based on collaborative filtering is to identify similar users and similar Web services [18, 24]. Several approaches have employed the k-means clustering method to cope with this problem by grouping similar users and similar Web services [25–27]. K-means is an unsupervised method, which partitions the given data points into a manually chosen number (k) of clusters, starting with one randomly selected initial center for each cluster [28]. K-means-based quality prediction methods are mainly limited in two aspects. Firstly, to ensure that the clustering results correctly reflect the similarity between users and between Web services, an appropriate k must be found, which can be very difficult. Secondly, the clustering results depend heavily on the randomly selected initial centers. Thus, the quality predictions of k-means-based methods are unreliable in both stability and accuracy.

Considering this problem, we propose UIQPCA, a novel hybrid User and Item-based Quality Prediction approach for Web services based on a Covering Algorithm, which requires neither a prespecified number of clusters nor initial centers. Based on a set of users and Web services, UIQPCA first uses a covering algorithm to sort users into different groups according to the similarity of their opinions on the quality of coinvoked Web services. Similarly, UIQPCA then partitions Web services into groups according to each user’s appraisal of them. Next, to predict target user u’s opinion on the quality of target Web service i, UIQPCA selects, based on the clustering results, the users and Web services that have the largest similarity to user u and Web service i. Finally, UIQPCA predicts how user u will appraise Web service i.

In summary, the major contributions of this research are as follows: (1) UIQPCA employs an improved covering-based clustering algorithm to partition similar users and similar Web services into clusters both effectively and efficiently. This clustering method can be executed without a prespecified number of clusters or manually selected initial centers. Thus, UIQPCA ensures stable predictions. (2) Based on the clustering results, UIQPCA employs a unique way to quickly identify the users and Web services that have the largest similarity to the target user and the target Web service. Through this approach, we can determine the most appropriate k value by conducting experiments with different k values, which saves a lot of time by avoiding reclustering users and Web services. (3) We have conducted extensive experiments on the WS-Dream dataset to compare UIQPCA with eight existing methods. The results demonstrate UIQPCA’s significant advantages over those approaches in effectiveness and efficiency.

The remainder of this paper is organized as follows. Section 2 formally states the research problem, defines relevant concepts, and presents a motivating example. Section 3 discusses the technical details of UIQPCA. Section 4 experimentally compares UIQPCA with existing approaches in terms of their efficacy in predicting the quality of Web services. Finally, Section 5 reviews related work and Section 6 concludes this paper.

2. Issue Introduction and Motivating Example

First, we give formal definitions of two concepts, similar users and similar Web services.

Definition 1. Similar Users. Given a user u, we define the users who have similar opinions as user u on the same set of Web services commonly invoked by themselves and user u as the similar users of u, denoted as S(u).

Definition 2. Similar Web Services. Given a Web service i, we define the Web services that have left the same impression on the same group of users as Web service i as the similar Web services of i, denoted as S(i).
Given a user u, a Web service i, a set of users U = {u1, u2, …, um}, and a set of Web services I = {i1, i2, …, in}, the process of predicting the quality of Web service i consists of two phases: (1) identify S(u) from U and S(i) from I; (2) predict user u’s opinion on the quality of Web service i with reference to S(u) and S(i). This 2-phase procedure is depicted in Figure 1.
Figure 2 shows an example with 9 users, u1, …, u9, and 9 Web services, i1, …, i9. The table records the response times the users have experienced when invoking the Web services and is called a user-item matrix, represented as an m × n matrix Q, in which m is the number of users and n is the number of Web services. In the matrix, each element qu,i is the one-dimensional quality value of Web service i observed when user u invoked i in the past, 1 ≤ u ≤ m, 1 ≤ i ≤ n. Here, qu,i can be the quality value obtained after one invocation or the average quality value obtained after multiple invocations. An empty cell qu,i indicates that user u has not invoked Web service i, so user u does not know its quality. The set of Web services that user u has invoked in the past is represented by I(u), and the set of users that have invoked a Web service i is represented by U(i). Taking the user-item matrix in Figure 2 as an example, one user has I(u) = {i1, i3, i4, i5} and one Web service has U(i) = {u1, u2, u5, u6}.
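
For readers who prefer code, the following is a minimal Python/NumPy sketch (our own illustration, not part of the original formulation) of how such a user-item matrix and the sets I(u) and U(i) can be represented; the numeric values are made up for illustration only.

import numpy as np

# Illustrative 4 x 5 user-item matrix of response times (seconds);
# np.nan marks services a user has never invoked.
Q = np.array([
    [0.3, np.nan, 0.8, 1.2, np.nan],
    [0.4, 0.9, np.nan, 1.1, np.nan],
    [np.nan, 1.0, 0.7, np.nan, 2.3],
    [0.5, np.nan, np.nan, 1.3, 2.1],
])

def invoked_services(Q, u):
    """I(u): indices of the Web services user u has invoked."""
    return np.where(~np.isnan(Q[u, :]))[0]

def invoking_users(Q, i):
    """U(i): indices of the users that have invoked Web service i."""
    return np.where(~np.isnan(Q[:, i]))[0]

print(invoked_services(Q, 0))   # -> [0 2 3]
print(invoking_users(Q, 3))     # -> [0 1 3]
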
For example, suppose we want to predict how user u9 will appraise the quality of Web service i4, i.e., q9,4. According to Figure 2, I(u4) ∩ I(u9) = {i6, i7, i8, i9}, which means that both users u4 and u9 have used Web services i6, i7, i8, and i9. If users u4 and u9’s past quality experiences on i6, i7, i8, and i9 are similar, they are likely to have similar opinions on the quality of i4. From Figure 2, we also know q4,4, which indicates how user u4 feels about the quality of service i4. Thus, taking into account the similarity between {q4,6, q4,7, q4,8, q4,9} and {q9,6, q9,7, q9,8, q9,9}, together with q4,4, q9,4 can be predicted. Other missing values, e.g., q4,2, can be predicted in the same way. If we look at users u2 and u9, I(u2) ∩ I(u9) = ∅, meaning that the services invoked by users u2 and u9 do not overlap. Based on Definition 1, users u2 and u9 are not similar. Thus, u2 is not helpful and should not be used in the prediction of q9,4. To improve the reliability of predicting q9,4, we need to include as many users similar to u9 as possible. Consequently, identifying the users similar to u9, e.g., u4, is of great importance. In the meantime, excluding the users dissimilar to u9, e.g., u2, is also important. The same reasoning applies when identifying similar Web services.

3. Prediction Algorithm

In this section, we first present the procedure of UIQPCA and detail its four stages. Then, we analyze the computational complexity of UIQPCA.

3.1. Prediction Procedure

As shown in Figure 2, given a user u, a Web service i, a set of users U = {u1, u2, …, um}, and a set of Web services I = {i1, i2, …, in}, the process of UIQPCA, an approach for predicting how user u will feel about the quality of Web service i, denoted as qu,i, consists of four stages, i.e., Stages 1 and 2 in Phase 1 and Stages 3 and 4 in Phase 2: (1) With reference to U’s opinions on the quality of each Web service in I, UIQPCA employs an improved clustering algorithm that follows the minimum covering principle [29] to put similar users and similar Web services into clusters. (2) Based on the clustering results, UIQPCA selects a group of users similar to user u from U, denoted as S(u), and a group of Web services similar to Web service i from I, denoted as S(i). (3) UIQPCA computes a user-based estimate of the quality of Web service i for user u on the basis of S(u), and an item-based estimate on the basis of S(i). (4) UIQPCA predicts qu,i by combining the two estimates.

3.2. Stage 1: Clustering

To identify S(u), UIQPCA first partitions the users in U into different groups according to their opinions on the quality of every individual Web service in I. For every Web service i in I, UIQPCA maps the p-dimensional quality values reported by all the users in U who have formerly used i into a p-dimensional space, in which every data point represents one user’s appraisal of the quality of i. Next, UIQPCA employs a clustering algorithm to sort the data points, denoted as D = {d1, d2, …, d|D|}, |D| ≤ m, into groups, so that the users in the same group are similar to each other and those in different groups are dissimilar. To describe this iterative algorithm, we refer to the group newly formed in each iteration as the current cluster, denoted as Ccr; its center is denoted as ccr and its radius as rcr. The set of data points that do not yet belong to any group is denoted as Duc. The algorithm proceeds as follows:

(1) Determine the centroid of D, represented as cD = (xD,1, …, xD,p), where xD,j is the average of the jth coordinates of all data points in D and xi,j is the coordinate of di on the jth axis of the space.

(2) Determine the data point closest to cD in D and let it be the center of the first cluster C1, i.e., ccr (cr = 1), using (1):

c_{cr} = \arg\min_{d_i \in D} \sqrt{\sum_{j=1}^{p} (x_{i,j} - x_{D,j})^2},  (1)

where xD,j is the coordinate of cD on the jth axis.

(3) Calculate rcr by averaging the distances between ccr and the data points in Duc with the following equation:

r_{cr} = \frac{1}{|D_{uc}|} \sum_{d_i \in D_{uc}} \sqrt{\sum_{j=1}^{p} (x_{i,j} - x_{cr,j})^2},  (2)

where xcr,j is the coordinate of ccr on the jth axis.

(4) Determine the data point furthest from ccr in Duc and let it be the new ccr with

c_{cr} = \arg\max_{d_i \in D_{uc}} \sqrt{\sum_{j=1}^{p} (x_{i,j} - x_{cr,j})^2}.  (3)

(5) Repeat Steps 3 and 4 until no data point can be identified at Step 4, which means that every data point in D has been assigned to a cluster.

Algorithm 1 presents the pseudocode for the aforementioned clustering algorithm. The algorithm is executed once for every Web service i in I, on the quality values reported by the users in U(i), i.e., a total of n executions. The clustering results are recorded in an m × m matrix called the user similarity matrix, denoted as MU. An element xu,u′ of MU, 1 ≤ u, u′ ≤ m, is the total number of times that users u and u′ were assigned to the same cluster across the n executions of the clustering algorithm. Figure 3 shows the user similarity matrix obtained by applying Algorithm 1 to the user-item matrix in Figure 2. Apparently, the user similarity matrix is symmetric, with xu,u′ = xu′,u and 0 ≤ xu,u′ ≤ n for 1 ≤ u, u′ ≤ m.

Input: the data points D derived from U(i)’s quality values on i, and Web service i
Output: A set of clusters C1, C2, …
Begin
(1)cr ⟵ 1;
(2)identify cD
(3)do
(4) identify ccr //(1) when cr = 1 and (3) otherwise
(5)Ccr.center ⟵ ccr
(6) calculate rcr//(2)
(7)Ccr.radius ⟵ rcr
(8)cr ⟵ cr + 1
(9)while (Duc ≠ ∅)
End
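
To make the procedure concrete, the following Python sketch illustrates one possible reading of Algorithm 1. The function name covering_cluster and the exact handling of cluster membership (a new cluster covers the still-uncovered points that fall within its radius) are our own assumptions rather than details fixed by the pseudocode.

import numpy as np

def covering_cluster(D):
    """Cluster p-dimensional data points D (shape: num_points x p) following
    Algorithm 1: start from the point closest to the centroid, use the mean
    distance to the still-uncovered points as the radius, then repeatedly
    pick the farthest uncovered point as the next center."""
    D = np.asarray(D, dtype=float)
    uncovered = np.ones(len(D), dtype=bool)      # membership flags for D_uc
    clusters = []                                # (center, radius, member indices)

    c_D = D.mean(axis=0)                                        # Step 1: centroid of D
    center_idx = int(np.argmin(np.linalg.norm(D - c_D, axis=1)))  # Step 2: closest to centroid

    while uncovered.any():
        center = D[center_idx]
        radius = np.linalg.norm(D[uncovered] - center, axis=1).mean()  # Step 3
        within = np.linalg.norm(D - center, axis=1) <= radius
        members = np.unique(np.append(np.where(uncovered & within)[0], center_idx))
        clusters.append((center, radius, members))
        uncovered[members] = False
        if not uncovered.any():
            break
        # Step 4: the farthest uncovered point becomes the next center
        far = int(np.argmax(np.linalg.norm(D[uncovered] - center, axis=1)))
        center_idx = int(np.where(uncovered)[0][far])
    return clusters
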

An illustrative example is presented in Figure 4 to show the clustering process of Algorithm 1. To cluster the data points, Algorithm 1 performed four iterations and identified four clusters, C1, …, C4, with c1, …, c4 as their centers and r1, …, r4 as their radiuses, respectively. Unlike k-means, our clustering algorithm does not require the number of clusters to be prespecified and, to some degree, resists the impact of preselected starting points.

Next, we analyze the computational complexity of Algorithm 1 based on the 5-step procedure discussed above. At Step 1, the computational complexity is O(m) since there are at most m data points in D. Similarly, the computational complexity of each of Steps 2, 3, and 4 is also O(m). Steps 3 and 4 are repeated until all data points in D are covered. At Step 3, the radius of a cluster is the average distance between the cluster center and all data points not yet covered by any cluster. On average, half of the uncovered data points are covered by each newly created cluster, so O(log m) iterations are needed. Hence, the computational complexity of Algorithm 1 is O(m log m). For n Web services, Algorithm 1 is executed n times. Thus, for all n Web services, the computational complexity of clustering the m users is O(nm log m).

Based on a clustering algorithm analogous to Algorithm 1, UIQPCA also partitions the Web services in I into different groups according to the quality values reported by individual users. The clustering algorithm is executed m times in total, once for each of the m users. The clustering results are recorded in an n × n service similarity matrix denoted as MI. An element of this matrix, namely, yi,j (1 ≤ i, j ≤ n), is the total number of times that Web services i and j were put into the same cluster during the m executions of the clustering algorithm. Like MU, MI is also a symmetric matrix. Similar to Algorithm 1, the computational complexity of the algorithm for clustering the Web services is O(mn log n). As a whole, the computational complexity of the clustering process at Stage 1 in Figure 1 is O(nm log m) + O(mn log n) = O(mn(log m + log n)).
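
As an illustration of how the user similarity matrix MU described above can be accumulated, the following hedged sketch runs the covering clustering once per Web service and counts co-cluster occurrences. It assumes the covering_cluster function from the previous sketch and a user-item matrix Q with np.nan marking unknown entries; the function name is our own.

import numpy as np

def user_similarity_matrix(Q):
    """M_U[u, u'] = number of services for which users u and u' landed in the
    same cluster. Q is the m x n user-item matrix with np.nan for missing
    values. Assumes covering_cluster() from the previous sketch; each user's
    quality value for one service is treated here as a 1-dimensional point
    (p-dimensional in the general case)."""
    m, n = Q.shape
    M_U = np.zeros((m, m), dtype=int)
    for i in range(n):                              # one clustering run per Web service
        users = np.where(~np.isnan(Q[:, i]))[0]     # U(i): users that invoked service i
        if len(users) < 2:
            continue
        points = Q[users, i].reshape(-1, 1)
        for _center, _radius, members in covering_cluster(points):
            idx = users[members]                    # map back to user indices
            for a in idx:
                for b in idx:
                    if a != b:
                        M_U[a, b] += 1
    return M_U

# The service similarity matrix M_I is built symmetrically by applying the
# same procedure to Q.T, i.e., by clustering the services invoked by each user.
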

3.3. Stage 2: Selection

After partitioning similar users and similar Web services into different groups, we choose the users similar to user u and the Web services similar to Web service i. During this selection, the users and Web services that have the greatest similarity to user u and Web service i, respectively, should be prioritized. Two aspects reflect the similarity between two users: (1) common quality experiences, namely, the similarity of their opinions on the Web services that they have both used formerly; (2) common Web services, namely, the number of Web services that they have both used formerly. The algorithm introduced in Section 3.2 addresses the first aspect: given a Web service i that both user u and user u′ have used, they are put into the same group when they have similar opinions on i in the execution of Algorithm 1 for Web service i. If a Web service has not been used by both users, it is not involved in that execution of Algorithm 1. Next, we turn to the second aspect.

If two users have coinvoked a large number of Web services, their quality experiences on those Web services play an important role in computing their similarity, which is a basis of collaborative filtering [22]. This also follows the principle of statistical significance: high statistical significance excludes the impact of chance on the result. For instance, if two users have invoked a good many Web services and have experienced very similar or even the same response times, they will most likely experience similar or even the same response times for other Web services. The fact that they are located in the same regional network and use the same network infrastructure when accessing the same Web services is a possible, but not the sole, explanation. Conversely, if the number of Web services used by both users is quite limited, then even if they have had similar opinions on the quality of these Web services, it is uncertain whether they will react similarly to other Web services. For instance, if the number of Web services coinvoked by two users is as small as one, their similar response times or quality opinions on this Web service can be coincidental.

Therefore, while choosing users similar to user u, UIQPCA first considers the number of Web services that those users and user u have both used. The similarity between users is determined according to the results of Stage 1. Given a set of users, the clustering algorithm is executed once for every coinvoked Web service on the basis of these users’ opinions on the quality of that service. That is, users having similar experiences on each service can be found through UIQPCA. Suppose there are two users, u1 and u2; if they are always partitioned into the same cluster, they are quite similar in their experiences on all the services, which in turn shows a high similarity between them. Thus, we count how often two users are put into the same cluster in Stage 1 in order to measure their similarity.

According to the aforementioned MU, namely, the m × m user similarity matrix in Section 3.2 that records the results of user clustering, an element xu,u′, 1 ≤ u, u′ ≤ m, is the total number of times that users u and u′ have been partitioned into the same cluster. A high value of xu,u′ indicates that users u and u′ have coinvoked a large number of Web services and have had similar quality experiences on those Web services. Take user u1 in Figure 3 as an example. User u2 should have the highest priority during the selection of users similar to u1, and user u5 should be in second place. The priorities of u4, u6, and u8 are lower than those of u2 and u5, but higher than those of u3, u7, and u9. When choosing similar users, different weights can be given to different users according to their similarity to u1 in various ways. For instance, a set of weights w2, w3, …, w9 can be allocated to users u2, u3, …, u9, respectively, to indicate their priorities. UIQPCA selects the top kU (1 ≤ kU ≤ m − 1) users that have been assigned to the same cluster as the target user the most times, as sketched below.
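
A minimal sketch of this selection step is given below; it simply ranks row u of MU by co-cluster count and is our own illustration rather than the exact implementation used in the experiments.

import numpy as np

def top_k_similar_users(M_U, u, k):
    """Return the indices of the k users most often co-clustered with user u,
    i.e., the k largest entries in row u of the user similarity matrix M_U
    (user u itself is excluded). Ties are broken arbitrarily."""
    row = M_U[u].astype(float).copy()
    row[u] = -np.inf                      # never select the target user itself
    order = np.argsort(row)[::-1]         # descending by co-cluster count
    return order[:k]

# With a matrix like the one in Figure 3, top_k_similar_users(M_U, 0, 2) would
# return the indices of u2 and u5, the users most often co-clustered with u1.
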

Given a Web service i, UIQPCA uses an approach similar to the one used for selecting similar users to select the top kI (1 ≤ kI ≤ n − 1) Web services similar to Web service i. In the selection of similar users and similar Web services, k (kU or kI) is a domain-specific parameter. Different applications and datasets are often characterized by their own features and consequently have different optimal k values. On the one hand, if the k value is too small, not all the useful information on the users can be employed when predicting quality. On the other hand, if the k value is too large, dissimilar users will be included and thus degrade the accuracy of quality prediction. Given its importance to accuracy, the value of k should be set for a specific domain based on experience and/or experiments. In Section 4.5.3, the effect of k on prediction accuracy is studied with the real-world dataset.

Now, we discuss the computational complexity of the selection process. In order to select the top k users or Web services, we need to sort the elements of each row in MU and MI. The computational complexity of this process depends on the sorting algorithm used. In this paper, the worst-case complexity of comparison-based sorting algorithms is assumed, namely, O(m log m) for each row of MU and O(n log n) for each row of MI. Overall, the computational complexity of selecting similar users and similar Web services at Stage 2 is O(m log m + n log n).

3.4. Stage 3: Prediction

Given the quality experiences of the selected kU users similar to user u, namely, S(u), UIQPCA uses a user-based approach, i.e., (4), to predict how user u will feel about the quality of Web service i with a weighted average of the quality experienced by the kU users:

q^{U}_{u,i} = \frac{\sum_{u' \in S(u)} x_{u,u'} \cdot q_{u',i}}{\sum_{u' \in S(u)} x_{u,u'}},  (4)

where xu,u′ is the number of times users u and u′ have been partitioned into the same cluster according to the user similarity matrix MU and qu′,i is the quality of Web service i experienced by user u′.

Similarly, UIQPCA uses an item-based approach, i.e., (5), to predict how user u will appraise the quality of Web service i with a weighted average of the quality of the selected kI Web services similar to Web service i, i.e., S(i):

q^{I}_{u,i} = \frac{\sum_{j \in S(i)} y_{i,j} \cdot q_{u,j}}{\sum_{j \in S(i)} y_{i,j}},  (5)

where yi,j is the number of times Web services i and j have been assigned to the same cluster according to the service similarity matrix MI and qu,j is the quality of Web service j experienced by user u.
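
The two weighted averages in (4) and (5) can be sketched as follows, assuming the user-item matrix Q, the similarity matrices M_U and M_I, and the neighbor sets selected at Stage 2; skipping neighbors that never invoked the target service (or services the target user never invoked) is an assumption on our part.

import numpy as np

def predict_user_based(Q, M_U, u, i, similar_users):
    """Equation (4): weighted average of the quality of service i experienced
    by the users in S(u), weighted by their co-cluster counts with u.
    Returns None if no usable data is available."""
    num, den = 0.0, 0.0
    for v in similar_users:
        if not np.isnan(Q[v, i]):
            num += M_U[u, v] * Q[v, i]
            den += M_U[u, v]
    return num / den if den > 0 else None

def predict_item_based(Q, M_I, u, i, similar_services):
    """Equation (5): weighted average of user u's quality on the services in
    S(i), weighted by their co-cluster counts with i."""
    num, den = 0.0, 0.0
    for j in similar_services:
        if not np.isnan(Q[u, j]):
            num += M_I[i, j] * Q[u, j]
            den += M_I[i, j]
    return num / den if den > 0 else None
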

The computational complexity of (4) is O(kU) since there are at most kU users in S(u). Likewise, the computational complexity of (5) is O(kI). As a whole, the computational complexity of the quality prediction at Stage 3 is O(kU + kI).

3.5. Stage 4: Combination

Equations (4) and (5) predict the quality of a Web service based on similar users and similar Web services, respectively. We must also handle the cases where only a few similar users or similar Web services are available because of the limited scale of the user-item matrix. In these cases, predicting quality using only similar users with (4) or only similar Web services with (5) is likely to be less accurate owing to the insufficiency of data in the user-item matrix. Therefore, we use the results of both the user-based and the item-based approaches. To address this issue, we combine (4) and (5) to obtain more accurate predictions:

q_{u,i} = \lambda \cdot q^{U}_{u,i} + (1 - \lambda) \cdot q^{I}_{u,i},  (6)

where λ (0 ≤ λ ≤ 1) is a parameter determined by the user of UIQPCA that indicates how much qu,i relies on similar users versus similar Web services, i.e., on (4) and (5). Parameter λ is a domain-specific parameter. Its optimal value depends on the features of the dataset. In reality, the number of users and Web services that are really similar to the target user and the target Web service may vary significantly from one domain to another. There is no single λ value that can optimize the prediction accuracy for all datasets in all domains. Thus, parameter λ needs to be determined empirically and domain specifically. Different applications and datasets are characterized by their own features, and accordingly different λ values are required to achieve optimal prediction results. A rule of thumb is to specify parameter λ according to the ratio of |S(u)| to |S(i)|. For example, if |S(u)| is much larger than |S(i)|, the λ value should be close to 1.0, putting most weight on similar users. If |S(u)| is much smaller than |S(i)|, the λ value should be close to 0, putting most weight on similar Web services. The impact of λ on the accuracy of UIQPCA is studied experimentally with the real-world dataset in Section 4.5.2. In particular, UIQPCA uses λ = 1 if S(u) ≠ ∅ and S(i) = ∅, in which case (6) is equivalent to (4), and λ = 0 if S(u) = ∅ and S(i) ≠ ∅, in which case (6) is equivalent to (5). To summarize,

q_{u,i} = \begin{cases} \lambda \cdot q^{U}_{u,i} + (1 - \lambda) \cdot q^{I}_{u,i}, & S(u) \neq \emptyset \text{ and } S(i) \neq \emptyset \\ q^{U}_{u,i}, & S(u) \neq \emptyset \text{ and } S(i) = \emptyset \\ q^{I}_{u,i}, & S(u) = \emptyset \text{ and } S(i) \neq \emptyset \\ \text{null}, & \text{otherwise,} \end{cases}  (7)

where null indicates that no prediction is available without similar users and similar Web services.
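
The combination rule in (6) and its fallbacks in (7) amount to the following few lines (a sketch of ours; the null case is represented as None).

def combine_predictions(q_user, q_item, lam):
    """Equations (6) and (7): blend the user-based and item-based predictions
    with weight lam (0 <= lam <= 1); fall back to whichever prediction exists
    when the other is unavailable, and return None when neither exists."""
    if q_user is not None and q_item is not None:
        return lam * q_user + (1 - lam) * q_item
    if q_user is not None:      # no similar services: lam is effectively 1
        return q_user
    if q_item is not None:      # no similar users: lam is effectively 0
        return q_item
    return None                 # prediction not available
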

The complexity of the combination process at Stage 4 is O(1) since it is only a simple weighted addition of the prediction results obtained from Stage 3 using (4) and (5).

3.6. Analysis on Computational Complexity

In this section, we discuss the computational overhead of UIQPCA in comparison with quality prediction approaches based on the Pearson correlation coefficient (PCC) [14, 15, 19, 21] and on k-means [25]. Considering an m × n user-item matrix that contains the quality values of n services used by m users, the computational complexity of the user-based, item-based, and hybrid PCC prediction methods is O(m²n), O(mn²), and O(m²n + mn²), respectively [14]. The computational complexity of the k-means-based prediction approach is reported in [25].

The procedure of UIQPCA consists of four stages. As discussed at the end of Sections 3.2–3.5, the computational complexity of the operations at Stages 1, 2, 3, and 4 is O(mn(log m + log n)), O(m log m + n log n), O(kU + kI), and O(1), respectively. Consequently, the computational complexity of UIQPCA as a whole is O(mn(log m + log n)). The experimental studies on the computational overhead of the different prediction approaches are presented in Section 4.6.

4. Experimental Evaluation

In this section, we analyze how effective (measured by prediction accuracy) and efficient (measured by computational overhead) UIQPCA is. Sections 4.1–4.3 describe the dataset, the compared methods, and the metrics used in the experiments, respectively. Section 4.4 introduces the experimental setup. Sections 4.5 and 4.6 present and discuss the effectiveness and efficiency of UIQPCA. Section 4.7 discusses the potential factors that may negatively affect the validity of our evaluation.

4.1. Dataset Description

Existing Web service recommendation approaches are based on either service quality or service ratings. Accordingly, we have conducted a series of experiments with the real-world dataset, WS-Dream (https://github.com/wsdream).

4.1.1. WS-Dream

This dataset contains the two-dimensional quality values, i.e., response time and throughput, of 5,825 real-world Web services. The data were gathered from 1,974,675 invocations of those Web services by 339 distributed users. The dataset has been used in a lot of important and representative research [21, 25] and has become a de facto benchmark dataset in research on quality prediction for Web services. To compare the performance of our method with existing approaches fairly and objectively, with reproducible experimental results, we also used the WS-Dream dataset in our experiments, like many other researchers. In the experiments, two 339 × 5,825 user-item matrices were built from the records in WS-Dream. The first matrix has 339 rows, each containing the response times experienced by one of the 339 users when invoking the 5,825 Web services. The second matrix has the same structure as the first one but contains the throughput of the Web services experienced by the users.
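
For reference, a hedged loading sketch is shown below. It assumes the file layout of the publicly released WS-Dream dataset #1, in which rtMatrix.txt and tpMatrix.txt are whitespace-separated 339 × 5,825 matrices and negative values mark failed or missing invocations; readers should verify these details against the release they download.

import numpy as np

def load_wsdream_matrix(path):
    """Load one of the WS-Dream dataset #1 matrices (e.g., rtMatrix.txt or
    tpMatrix.txt), assumed to be a whitespace-separated 339 x 5825 matrix in
    which negative values (-1) mark missing or failed invocations."""
    Q = np.loadtxt(path)
    Q[Q < 0] = np.nan          # treat missing invocations as unknown quality
    return Q

# rt = load_wsdream_matrix("rtMatrix.txt")   # response-time matrix
# tp = load_wsdream_matrix("tpMatrix.txt")   # throughput matrix
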

4.2. Comparing Methods

There are two main types of personalized recommendation based on collaborative filtering: user-based and item-based filtering [30]. In both, it is critically important to identify the most similar users or items. Accordingly, we have implemented three prediction methods based on the algorithm introduced in Section 3: (i) UQPCA (User-based Quality Prediction with Covering Algorithm): this user-based method uses the covering algorithm to cluster users, identifies similar users, and then predicts the unknown quality values of unused Web services according to the identified similar users. (ii) IQPCA (Item-based Quality Prediction with Covering Algorithm): this item-based method uses the covering algorithm to cluster Web services, identifies similar Web services, and then predicts the unknown quality values of unused Web services according to the identified similar Web services. (iii) UIQPCA (Hybrid User and Item-based Quality Prediction with Covering Algorithm): this method combines UQPCA and IQPCA to predict the missing quality values of Web services on the basis of both similar users and similar Web services.

To evaluate UIQPCA, we compared UQPCA, IQPCA, and UIQPCA with the following state-of-the-art prediction methods: (i) UPCC (User-based Collaborative Filtering using PCC) [19]: this method identifies similar users using PCC and predicts the missing quality values of unused Web services with the help of the identified similar users. (ii) IPCC (Item-based Collaborative Filtering using PCC) [20]: this method identifies similar Web services (items) using PCC and predicts the missing quality values of unused Web services according to the identified similar Web services. (iii) UIPCC (Hybrid User and Item-based Collaborative Filtering using PCC) [15]: combining UPCC and IPCC, this method predicts the missing quality values of unused Web services by taking both similar users and similar Web services into consideration. (iv) TOSEM [31]: this method improves UIPCC with an enhanced PCC for calculating the similarity between users and between Web services. (v) NIMF [32]: this method employs Neighborhood-Integrated Matrix Factorization for personalized collaborative quality prediction of Web services. (vi) UCAPK [25]: this User-based Credibility-Aware Prediction method employs k-means to cluster users, identifies similar users, and then predicts the missing quality values of unused Web services according to the similar users.

UCAPK follows the same pattern as our prediction method UQPCA, i.e., grouping users in order to identify similar users. To make the comparison sufficiently comprehensive, we also implemented the following two prediction methods: (i) ICAPK: this Item-based Credibility-Aware Prediction method uses k-means to cluster Web services, identifies similar Web services, and predicts the unknown quality values of Web services on the basis of the similar Web services. (ii) UICAPK: combining UCAPK and ICAPK, this method predicts the unknown quality values of unused Web services on the basis of both similar users and similar Web services.

4.3. Metrics

We compare the effectiveness of the different methods, which is reflected in their prediction accuracy. We measure accuracy with two metrics: mean absolute error (MAE) and root mean squared error (RMSE). MAE is defined as follows:

\text{MAE} = \frac{\sum_{u,i} |q_{u,i} - \hat{q}_{u,i}|}{N},

where qu,i denotes the real quality value of service i experienced by user u, q̂u,i represents the predicted quality value, and N is the number of predicted quality values. The MAE value indicates the average difference between a prediction result and the corresponding observed value. All individual differences are weighted equally when calculating MAE.

RMSE is defined as follows:

\text{RMSE} = \sqrt{\frac{\sum_{u,i} (q_{u,i} - \hat{q}_{u,i})^2}{N}}.

When calculating RMSE, the individual differences between the prediction results and the corresponding observed values are squared and then averaged over the sample, and the square root of this average is taken as the final result. Since all errors are squared before being averaged, RMSE is particularly sensitive to large errors. Therefore, if large prediction errors are to be avoided, RMSE is the more useful metric.

Both MAE and RMSE range from 0 to +∞. They are both negatively oriented scores: the lower the value, the higher the prediction accuracy. Both are widely used in research on quality prediction for Web services [14, 15, 21, 25].
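
Both metrics are straightforward to compute; the following sketch (ours, not taken from the original evaluation scripts) returns MAE and RMSE for a set of held-out quality values.

import numpy as np

def mae_rmse(actual, predicted):
    """Compute MAE and RMSE over N held-out quality values.
    `actual` and `predicted` are 1-D arrays of equal length."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - actual
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    return mae, rmse

# Example: mae_rmse([0.3, 1.2, 0.8], [0.35, 1.0, 0.9]) -> (~0.117, ~0.132)
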

To compare the efficiency of UIQPCA with other methods, we measure the computational overheads of all methods.

4.4. Experimental Setup

We employ the cross-validation technique [33] to set up the experiments, which has also been employed in state-of-the-art research [14, 15, 21, 25]. In an experiment on a user-item matrix built from WS-Dream, we randomly removed a certain number of entries from the matrix according to the matrix density, which is the percentage of unremoved entries in the matrix. Different prediction methods are then employed to predict the removed values. The prediction results are evaluated against the original (and real) values of the removed entries to obtain the MAE and RMSE for each prediction method. This way, we can compare the prediction accuracy of our methods, UQPCA, IQPCA, and UIQPCA, with the other methods.
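
The masking step of this protocol can be sketched as follows. The function treats the matrix density as the fraction of the originally known entries that are kept, which matches the WS-Dream matrices where almost every entry is observed (an assumption on our part); the function name is illustrative.

import numpy as np

def mask_matrix(Q, density, rng=None):
    """Randomly keep a `density` fraction of the known entries of Q and hide
    the rest (set to np.nan). Returns the masked matrix and the indices of the
    hidden entries, whose original values serve as ground truth."""
    rng = np.random.default_rng(rng)
    Q_masked = Q.copy()
    known = np.argwhere(~np.isnan(Q))
    hide_count = int(round(len(known) * (1.0 - density)))
    hidden = known[rng.choice(len(known), size=hide_count, replace=False)]
    Q_masked[hidden[:, 0], hidden[:, 1]] = np.nan
    return Q_masked, hidden

# Typical use: mask to density 0.2, run a prediction method on Q_masked, then
# compute MAE/RMSE between Q[hidden[:, 0], hidden[:, 1]] and the predictions.
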

In the experiments on the WS-Dream dataset, the matrix density is varied from 0.2 to 0.4 in steps of 0.05, λ (see (6)) from 0.1 to 0.9, and k (see Section 3.3) from 1 to 5, in order to compare the accuracy of the results obtained by the different prediction methods and to study the impacts of λ and k on UIQPCA. In this way, the ability of the different methods to cope with datasets of various characteristics, simulated by the different parameter settings, is evaluated extensively.

All experiments are conducted on a machine with an Intel Core i7-4970 3.6 GHz CPU and 16 GB RAM, running Ubuntu 14.04 LTS.

4.5. Effectiveness

This section compares the prediction accuracy of UIQPCA with that of existing prediction methods under different parameter settings.

4.5.1. Accuracy of Prediction

Table 1 presents the MAE and RMSE achieved by the different prediction approaches in the experiments on the WS-Dream dataset under different matrix densities with λ = 0.6 and k = 2. The impacts of λ and k on UIQPCA are studied separately in Sections 4.5.2 and 4.5.3; thus, the results under other parameter settings are not presented in this section.

As demonstrated by Table 1, UIQPCA, based on similar users and Web services, obtains better accuracy than all eight existing prediction methods for both response time (RT) and throughput (TP) under all matrix density settings, and the margins are significant. For response time prediction, the average margins are 31.3%, 31.1%, 25.3%, 25.7%, 30.6%, 24.1%, 25.2%, and 19.7% in MAE and 15.3%, 12.9%, 7.7%, 2.5%, 9.7%, 3.7%, 7.9%, and 7.6% in RMSE versus UPCC, IPCC, UIPCC, UCAPK, ICAPK, UICAPK, TOSEM, and NIMF, respectively. For throughput prediction, the average margins are 30.7%, 52.2%, 38.8%, 23.9%, 46.6%, 18.1%, 34.6%, and 3.8% in MAE and 32.3%, 40.0%, 32.5%, 30.7%, 38.4%, 26.8%, 33.2%, and 5.6% in RMSE. As the matrix density increases from 0.2 to 0.4, the performance of UQPCA, IQPCA, and UIQPCA improves in a similar manner to that of the other methods, as indicated by the corresponding decreases from 0.368 to 0.341 in MAE and from 1.221 to 1.161 in RMSE.

4.5.2. Impact of λ

The experimental results presented in Section 4.5.1 have demonstrated that by using information on both similar users and similar Web services, UIQPCA achieves better prediction accuracy than UQPCA and IQPCA. In this section, the impact of parameter λ on UIQPCA is discussed. As discussed in Section 3.5, when λ = 0, UIQPCA is equivalent to IQPCA as it only considers similar Web services. Likewise, when λ = 1.0, UIQPCA is equivalent to UQPCA. Thus, we vary the λ value from 0.1 to 0.9 to evaluate its impact on UIQPCA. In these experiments on the WS-Dream dataset, the matrix density is 0.2 and k = 2. The results under other parameter settings are similar and thus are omitted.

The results are shown in Figure 5. They demonstrate that the λ value has a great influence on the prediction accuracy of UIQPCA. Overall, the prediction accuracy improves as λ increases toward its optimum; afterwards, the prediction accuracy decreases despite the continuing increase of the λ value. This indicates that finding a proper λ value is essential. A proper λ value enables the most accurate prediction because it combines the information on users and Web services appropriately. Figure 5 also shows that the optimal λ value is not static; it varies from one dataset to another. Take RMSE for example: Figures 5(b) and 5(d) show that the RMSE of the prediction results is lowest when λ = 0.5 for the WS-Dream dataset. Our experimental findings confirm that the optimal λ value is domain-specific, as discussed in Section 3.5.

4.5.3. Impact of k

As discussed in Section 3.3, parameter k determines the numbers of similar users and/or similar Web services included in the prediction. Intuitively, the more similar users and/or Web services are considered when making predictions, the more information can be employed and the more accurate the result should be. However, this is not necessarily true. As shown in Figure 6, increasing the k value does not always yield more accurate predictions. In general, as the k value increases, the prediction accuracy first increases until it reaches its maximum and then begins to decline. This tells us that we do not always want to consider as many users and/or Web services as possible to make accurate predictions with UIQPCA. Similar to our findings about λ presented in Section 4.5.2, k must be proper. Firstly, when the k value is too small, the information on similar users and Web services is insufficient, which is shown in Figure 6 by the reduction in MAE and RMSE in both series of experiments as the k value begins to increase from 1. Secondly, when the k value is too large, users and/or Web services dissimilar to the target may be included, which negatively influences the prediction accuracy to a large extent. Take Figures 6(a) and 6(c) for example: the MAE of the predictions increases from its lowest point at 0.355 to its highest point at 0.376 as k increases from 2 to 6 in the experiments. Furthermore, the optimal k value also varies from one dataset to another. As shown in Figure 6, in the experiments on the WS-Dream dataset, the optimal k values are 2 for MAE and 3 for RMSE. Our experimental findings about parameter k confirm our analysis in Section 3.3: its optimal value is domain-specific and varies from one dataset to another.

4.5.4. Impact of Matrix Density

In Section 4.5.1, the prediction accuracy of the different approaches is compared under different matrix density settings. In this section, the impact of matrix density on UIQPCA is discussed. The experimental results in Figure 7 show that matrix density has a great impact on the prediction accuracy of UIQPCA. In all cases, both MAE and RMSE decrease as the matrix density increases. The higher the matrix density of a dataset, the more information on users and Web services is available for prediction, which in turn leads to more accurate prediction results.

4.6. Efficiency

Besides the effectiveness measured by prediction accuracy, the efficiency of a prediction method is also critical to its feasibility: a prediction method may be highly effective, yet its time consumption may render it impractical. We analyzed the computational complexity of UIQPCA in Section 3.6. To evaluate the efficiency of UIQPCA experimentally, we compare the computation time of UIQPCA against UIPCC, UICAPK, and TOSEM. These methods, like UIQPCA, consider both similar users and similar Web services when making predictions and thus are more time consuming than their item-based and user-based versions; hence, the computation time of the user-based and item-based versions is not presented. The computation time of UIQPCA consists of two major components: clustering time and prediction time. Here, the clustering component refers to Stage 1 discussed in Section 3.2 and the prediction component refers to Stage 3 discussed in Section 3.4. Accordingly, we compare the clustering time of UIQPCA with that of UICAPK, the only compared method that has a clustering component (with k-means), in Figure 8(a). In Figure 8, WSD refers to the WS-Dream dataset, RT refers to response time, and TP refers to throughput. We also compare the prediction time of UIQPCA against UIPCC, UICAPK, and TOSEM in Figure 8(b). In the experiments on the WS-Dream dataset, the parameter settings are matrix density = 0.2, k = 2, and λ = 0.6.

Figure 8(a) shows that UIQPCA spends much less time than UICAPK on clustering the data points in the WS-Dream dataset. Specifically, it takes only 87 milliseconds to cluster the data points based on the response time of the Web services, 64% less than UICAPK’s 244 milliseconds. Based on throughput, UIQPCA takes only 83 milliseconds, 66% less than UICAPK’s 245 milliseconds. The results show that UIQPCA is far more efficient than UICAPK in clustering the datasets. In particular, the clustering of a dataset with UIQPCA is a one-off operation unless the dataset is updated; once the clustering is completed, the results can be employed to make predictions for any Web service and any user. Thus, UIQPCA is a practical prediction method, especially in large-scale scenarios where a large number of data points need to be clustered.

Figure 8(b) compares the computation time taken by UIQPCA to make predictions against that of UIPCC, UICAPK, and TOSEM. Overall, all four methods take only 1–3 milliseconds to make a prediction. On average, UIQPCA and UICAPK take a similar amount of time to make a prediction, much less than UIPCC and TOSEM. On the WS-Dream dataset, UIQPCA takes approximately half of the time taken by UIPCC and TOSEM to make a prediction. Thus, in much larger-scale scenarios or when extremely fast predictions are required, UIQPCA is a better choice since it is significantly more efficient than UIPCC and TOSEM.

4.7. Threats to Validity

Here, we discuss the factors that may negatively influence the validity of our evaluation of UIQPCA, including construct validity, external validity, internal validity, and conclusion validity.

4.7.1. Threats to Construct Validity

The choice of prediction approaches used for comparison is a major threat to the construct validity of our evaluation. The chosen prediction approaches are based on the collaborative filtering technique, which is currently the most popular and widely applied. There are other methods, e.g., model-based methods [14] and semantics-based methods [13], which are not employed in the evaluation. This threat, however, is not significant, since UIQPCA can be compared with those methods indirectly by referring to the evaluations reported in the relevant literature, e.g., [13, 14], using the approaches in our evaluation as a reference. The other major threat to the construct validity of our evaluation lies in the fact that aspects such as timeliness [34] and location [12] are not considered during the prediction process. This threat is also not significant because the consideration of such aspects enhances a prediction method but does not change its fundamental mechanism; timeliness and location could be incorporated into UIQPCA in ways similar to [12, 34]. Moreover, a direct comparison between UIQPCA and other methods in the same category of prediction is more straightforward and representative.

4.7.2. Threats to External Validity

The representativeness of the dataset employed in the evaluation, which might not fully reflect real-world applications, is the major threat to the external validity of our evaluation. To minimize this threat, we selected the dataset that is most frequently used in experiments on prediction methods for Web services, namely, the WS-Dream dataset [21, 25]. In this way, we could compare UIQPCA with existing methods in a fair and objective manner. In the experiments, we varied the matrix density and parameter λ to simulate datasets with different characteristics and inspected their impacts on UIQPCA. The experimental results demonstrate the effectiveness of UIQPCA on datasets with similar characteristics, which also helps reduce the threat to the external validity of our evaluation.

4.7.3. Threats to Internal Validity

The main threat to the internal validity of our evaluation is the comprehensiveness of our experiments. Owing to the limited length of this paper, only the results obtained under some parameter settings are presented; we cannot show the results for all combinations of matrix density and λ values. However, this threat is not believed to be significant. The exact MAE and RMSE values achieved under other parameter settings may differ, but the advantages of UIQPCA over the compared methods are similar to those presented in the paper.

4.7.4. Threats to Conclusion Validity

The lack of statistical tests, e.g., chi-square tests, is the major threat to the conclusion validity of our evaluation. Chi-square tests could have been applied to draw conclusions when evaluating UIQPCA. Instead, we repeated the experiments 100 times under each parameter setting and averaged the results, which provides a large number of test cases and thus tends to lead to small values in chi-square tests and to decrease the practical importance of the test results [35]. However, the number of experiment repetitions is quite small compared with the number of observation samples that concern Lin et al. in [35]. Thus, the threat to conclusion validity caused by the lack of statistical tests is not significant.

5. Related Work

Web service recommendation has been a hot research topic in recent years. Collaborative filtering-based prediction is currently the mainstream approach and has been studied intensively. Over the past years, many researchers have used clustering techniques to improve collaborative filtering-based prediction methods. In this section, we review the related work on both fronts.

5.1. Collaborative Filtering-Based Methods

Collaborative filtering-based methods for Web service recommendation have continued to attract many researchers’ attention recently [36–38]. Shao et al. [19] proposed a Web service recommendation method based on the collaborative filtering technique. The method first calculates the similarity between users, measured with PCC, based on their historical quality experiences on Web services. Next, it predicts the quality values of the target services based on the information on the users similar to the target user. This approach is implemented in our experiments as UPCC and compared with UIQPCA. We also implemented a prediction method called IPCC, which is based only on similar Web services. This method was employed by Amazon [39] and has been investigated comprehensively and applied widely [20]. Its principle is to identify similar Web services based on PCC and then predict the unknown quality values of unused Web services on the basis of the identified similar Web services.

If, as in UPCC, predictions are made only on the basis of similar users and the number of similar users is inadequate, the results can be unreliable and inaccurate. The same applies to IPCC when the number of similar Web services is inadequate. To solve this problem, Zheng et al. [15] proposed a hybrid prediction method that combines information on both similar users and similar Web services when making predictions. This method identifies not only similar users but also similar Web services according to their similarity measured by PCC, and it increases the prediction accuracy considerably. Later on, the authors improved their prediction method presented in [15] with an enhanced similarity computation approach that employs a logistic function to reduce the influence of a small set of Web services coinvoked by dissimilar users [14]. This approach is used as TOSEM in our experiments for comparison with UIQPCA. UIPCC [15] and TOSEM [14] have become the most investigated methods in the field of Web service recommendation, and many researchers have tried to improve them in different ways. To name a few, Zhao et al. [34] put forward a time-aware QoS prediction model that integrates the timeliness of historical QoS data into the process of Web service recommendation. The model divides the historical QoS data into time slices, each represented by a 2-dimensional matrix to be processed separately, and then predicts the current QoS values of the target Web services. Yao et al. [13] attempted to unify collaborative filtering and content-based Web service recommendation. Their method considers both the quality values and the semantic content of Web services with the help of a probabilistic generative model. Users’ preferences for Web services are represented as a set of latent variables, which can be estimated statistically; Web services estimated to be preferable to the target user are then used to make predictions. Chen et al. [12] proposed a method that employs both the quality values and the locations of Web services to make predictions. A user’s location is estimated from its IP address, and similar users are selected according to both their similar opinions on the quality of Web services and their locations. In this way, a location-aware Web service recommendation system can be built. Wang et al. [37] proposed a reputation measurement approach that mitigates the impact of quality ratings submitted by biased or malicious users. It first identifies biased and malicious quality ratings using the cumulative sum control chart and then decreases the influence of subjective users’ quality ratings using PCC, thereby improving the prediction accuracy.

To further improve prediction accuracy, the authors of [14, 15] proposed the use of a matrix factorization model that combines the local information of similar users and the global information of the whole user-service matrix in prediction [32]. This method is implemented as NIMF in our experiments for comparison with UIQPCA. Many research studies have followed this work to allow MF-based quality prediction for Web services to accommodate more sophisticated scenarios. For example, Chen et al. proposed an incremental tensor factorization method that considers temporal information [10]. You et al. proposed a method that models high-dimensional QoS data as tensors and uses tensor operations to predict unknown QoS property values [16].

5.2. Clustering-Based Methods

The key to accurate predictions for approaches based on collaborative filtering is to identify similar users and similar Web services. In order to find out similar users and Web services, researchers have tried to sort different users and Web services before making predictions.

Zhu et al. [40] employed the hierarchical clustering technique to put users and Web services into groups according to their quality experiences and quality data, respectively. Their prediction approach requires the collection of real-time QoS data, which is not trivial and thus often impractical. Chen et al. [12] also employed the hierarchical clustering technique to put users into different regional groups according to their locations and opinions on quality; Web services are grouped only on the basis of the similarity of their quality values. Chen et al. [41] employed the agglomerative hierarchical clustering technique to cluster Web services with reference to their historical QoS data, so that Web services within the same cluster are under similar physical environments. At the later stage, where users similar to the target user are identified, the Web services within the same cluster are treated as one Web service. In this way, the accuracy and efficiency of the prediction can be improved. However, their method ignores the information on similar users. Zhang et al. [42] proposed an approach that adopts a fuzzy clustering method to cluster users, making good use of fuzzy clustering as well as PCC. The approach is mainly limited by the initial centroids, which must be selected at the very beginning and strongly influence the clustering results. The k-means algorithm has also attracted many researchers as a way of clustering users and Web services because it is popular and easy to implement. Yu and Huang proposed CluCF [26], an approach that clusters users and Web services on the basis of the k-means algorithm and focuses on reducing the complexity of updating clusters when new users and new Web services are introduced. Similar to [26], Wu et al. [25] also used the k-means algorithm to cluster users, but with a different purpose, namely, to screen out unreliable users. The prediction method presented in [25] is implemented as UCAPK and extended to ICAPK and UICAPK in our experiments for comparison with our prediction methods. The heavy reliance on a prespecified number of clusters and initially selected centroids largely limits these k-means-based prediction methods.

The authors of this paper put forward a new method to predict the quality of Web services—UIQPCA—to solve the problems of existing approaches with a unique covering-based clustering technique. It is highly efficient without the need of specifying the number of clusters and initial centroids in advance. The experimental results demonstrate that it is far more effective and efficient than existing prediction methods.

6. Conclusion

In this paper, we have proposed UIQPCA, a novel approach to quality prediction for Web services with the help of a covering algorithm. UIQPCA clusters users and Web services based on their opinions on quality and their historical quality data, respectively. Given a target Web service and a target user, similar users and similar Web services are identified on the basis of the clustering results. Then, UIQPCA employs the information on both similar users and similar Web services to make predictions. The results of the experiments on the real-world dataset demonstrate that UIQPCA significantly outperforms existing representative prediction methods, including user-based methods, item-based methods, hybrid methods, and clustering-based methods. It is also demonstrated that UIQPCA surpasses existing methods in clustering and prediction efficiency.

With the development of cloud computing and big data analytics [43–48], the Web services ecosystem is becoming larger and more complex [4, 49–56]. In future work, we will study how to ensure and improve the efficiency of UIQPCA in large-scale scenarios on the Spark platform.

Data Availability

The QoS data used to support the findings of this study can be accessed publicly at https://wsdream.github.io/dataset/wsdream_dataset1.html.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61872002) and the Natural Science Foundation of Anhui Province of China (no. 1808085MF197).