Robust and Privacy-Preserving Service Recommendation over Sparse Data in Education

Chen, Xuening; Liu, Hanwen; Xu, Yanwei; Yan, Chao

doi:https://doi.org/10.1155/2019/2401857

Wireless Communications and Mobile Computing


Special Issue: IoT Big Data Analytics

Research Article | Open Access

Volume 2019 | Article ID 2401857 | https://doi.org/10.1155/2019/2401857


Xuening Chen,¹ Hanwen Liu,² Yanwei Xu,³ and Chao Yan²

Guest Editor: Qingchen Zhang

Received 20 Nov 2018

Revised 28 Dec 2018

Accepted 28 May 2019

Published 20 Jun 2019

Abstract

Service recommendation has become one of the most effective approaches to quickly extract insightful information from big educational data. However, the sparsity of educational service quality data (from multiple platforms or parties) used to make service recommendations often leads to few or even no recommended results. Moreover, to protect sensitive business information and comply with the law, preserving user privacy during this multisource data integration process is an important but challenging requirement. To address these challenges, this paper integrates Locality-Sensitive Hashing (LSH) with hybrid Collaborative Filtering (HCF) techniques for robust and privacy-aware data sharing between the different platforms involved in the cross-platform service recommendation process. Furthermore, to minimize the “False negative” recommended results incurred by LSH and improve the success rate of recommendation, we propose two optimization strategies that reduce the probability that similar neighbours of a target user or similar services of a target service are overlooked by mistake. Finally, we conduct a set of experiments on a real distributed service quality dataset, i.e., WS-DREAM, to validate the feasibility and advantages of our proposed recommendation approach. The experimental results show that our proposal performs better than three competitive methods in terms of efficiency, accuracy, and success rate while guaranteeing privacy-preservation.

1. Introduction

With the advent of the Web of Things (WoT), tremendous computing resources or services (e.g., web APIs) are emerging rapidly on the Web [1–4], imposing a heavy burden on the service selection decisions of target users in the education domain. In this situation, various lightweight service recommendation techniques, e.g., Collaborative Filtering (CF), have been proposed to alleviate this service selection burden. Typically, by analysing the historical service usage data (e.g., the quality data of services invoked by users), a recommender system can capture the personalized preferences of a user and output appropriate services to him/her; in this way, complex requirements from the user can be satisfied [5–7].

However, in the big data environment, the recommendation bases for educational decisions are sometimes not centralized but are distributed across multiple platforms [8, 9]. Considering the example in Figure 1, user u₁ invoked web service s₁ from platform P₁ and user u₂ invoked web service s₂ from platform P₂. Thus previous service quality values of s₁ and s₂ are recorded in platforms P₁ and P₂, respectively. In this situation, to make comprehensive and accurate service recommendations to the target user, it is necessary for the recommender system to integrate or fuse the distributed educational data across platforms P₁ and P₂ properly.

However, there are still several challenges in this data integration process. First, to protect sensitive business information [10, 11] and comply with the law, platform P₁ is often reluctant to share its data with P₂ and vice versa [12]. Such a cross-platform data sharing failure severely impedes the subsequent service recommendations. Second, the possible sparsity of the service quality data [13, 14] stored in platforms P₁ and P₂ often leads to few (or even no) recommended results, which significantly decreases the target user’s satisfaction. In other words, the robustness of the recommender system is not as high as expected.

To address these drawbacks, we employ a time-efficient and privacy-preserving neighbour search technique, Locality-Sensitive Hashing (LSH), for cross-platform service recommendations, so that the multiple platforms involved in the distributed recommendation process can share their data with each other efficiently and securely. Furthermore, we combine the LSH technique with hybrid CF (i.e., HCF, including user-based CF and item-based CF) to propose a novel privacy-preserving cross-platform service recommendation approach. Benefiting from the advantages of LSH in terms of search efficiency and privacy-preservation, our proposal can achieve a good trade-off among recommendation efficiency, accuracy, success rate, and privacy-preservation.

In summary, our contributions are three-fold.

(1) We integrate the LSH technique with hybrid CF to guarantee efficient and secure data sharing between different platforms involved in the cross-platform service recommendations in a big educational data environment.

(2) Two solutions are suggested to reduce the probability of “False negative” results (i.e., high-quality recommended results that are overlooked by mistake) incurred by the inherent shortcoming of LSH and thereby increase the success rate of the recommended list.

(3) A wide range of experiments is conducted on a real distributed service quality dataset, i.e., WS-DREAM, to validate the feasibility of our proposal. The experimental results show that our proposed approach outperforms the other state-of-the-art approaches compared.

The remainder of this paper is structured as follows. Related work is presented in Section 2. In Section 3, we introduce the preliminary knowledge of the LSH technique to be used in our approach. In Section 4, we introduce the details of our proposed privacy-preserving cross-platform service recommendation approach. In Section 5, a set of experiments is conducted on the WS-DREAM dataset to validate the feasibility of our proposal. Finally, in Section 6, we conclude the paper and discuss future research directions.

2. Related Work

To the best of our knowledge, the existing privacy-preservation techniques adopted in the field of service recommendations can be divided into the following four categories: K-anonymity, data obfuscation, data decomposition, and Locality-Sensitive Hashing. Next, we introduce the related work from these four perspectives, respectively.

2.1. K-Anonymity

As an effective privacy-preservation technique, K-anonymity is successfully applied in [15] to protect the sensitive data of users. The authors in [16] employ the K-anonymity technique to generalize the location information that users left in the past so as to protect the users’ location privacy when making a recommendation decision. Generally, a larger K value means better privacy-preservation performance. However, as the K value becomes larger, the usability of the anonymized data is reduced significantly, thereby decreasing the accuracy of the recommended results.

2.2. Data Obfuscation

Random data obfuscation has been proposed in various applications, where the real service quality data are replaced by obfuscated data so that the private information hidden in the real service quality data can be protected. However, as the data used to make recommendation decisions have already been obfuscated beforehand, the service recommendation accuracy is reduced accordingly. The Differential Privacy (DP) technique is employed in [17] to obfuscate the sensitive service quality data by noise injection so as to hide the real service quality data when making service recommendation decisions. However, the time complexity of the Differential Privacy technique is often high. Besides, when the service quality data for recommendation decisions are updated frequently, the accumulated noise keeps growing; in this situation, the usability of the data is reduced, which affects the accuracy of the returned results to some extent.

2.3. Data Decomposition

In [18], the authors propose a data decomposition mechanism to achieve the privacy-preservation goal in service recommendations. Concretely, each piece of sensitive quality data is split into multiple segments that carry less private information; afterwards, these service quality segments are sent to different user clients for storage. When a user requests service recommendations, the multiple service quality segments kept by each user client are integrated for the subsequent recommendation decision-making process. As each user only possesses multiple service quality segments from different quality data, instead of the whole service quality data, the sensitive information of users is secured. However, this approach still fails to secure certain private information, for example, the intersection of services executed by different users.

2.4. Locality-Sensitive Hashing (LSH)

As an effective technique for quick neighbour search over massive and high-dimensional data, LSH has recently been introduced into service recommendation for privacy-preservation. In our previous work [19, 20], the LSH technique is combined with user-based CF to protect the sensitive service quality data engaged in the recommendation process. In [21], LSH is employed to build service indices in the distributed environment, so as to reduce the cross-platform data communication cost and improve the recommendation efficiency. However, these LSH-based service recommendation approaches do not consider the low success rate incurred by the possible sparsity of recommendation data. Moreover, they seldom study the “False negative” recommended results or the corresponding resolutions.

From the above analyses, we can conclude that existing privacy-preserving service recommendation approaches either fall short in efficiency and in the capability of privacy-preservation, or they are likely to overlook high-quality recommended results so that the users’ satisfaction is decreased. Considering these drawbacks, we integrate the LSH technique and hybrid CF in this paper to propose a novel privacy-preserving service recommendation approach. The details of our proposal will be introduced in Section 4.

3. Preliminary Knowledge

In Section 3.1, we first formulate the privacy-preserving service recommendation problems to be addressed in this paper. Afterwards, in Section 3.2, we briefly introduce the rationale of the LSH technique to be used in our service recommendation approach.

3.1. Problem Formulation

To facilitate the following discussions, we introduce the symbols used in this paper below. U and WS denote the user set and the service set, respectively; u_target and s_target denote a target user and a target service (i.e., a service preferred by the target user), respectively; q is a quality dimension of web services, e.g., response time or throughput (for simplicity, only one quality dimension is considered in this paper); q_i,j denotes the quality value over dimension q of service s_j (∈ WS) as invoked by user u_i (∈ U). The q_i,j data are often distributed across different platforms in the big data environment.

With the above formulation, our focused privacy-preserving service recommendation problems can be specified more formally as follows: recommend appropriate services from set WS to target user u_target based on the historical q_i,j data across different platforms and meanwhile protect the real value of q_i,j so that the users’ private information hidden in q_i,j data is still secure.

3.2. Locality-Sensitive Hashing

Locality-Sensitive Hashing has been considered one of the most effective techniques for similar neighbour search due to the following two properties [22]. Here, A and B are two points in the original data space, and h(.) denotes an LSH function that is responsible for transforming points A and B into the corresponding hash values h(A) and h(B), respectively.

Property 1. If A and B are close in original data space, then they will be projected into the same bucket (i.e., h(A) = h(B)) after hashing with high probability.

Property 2. If A and B are not close in original data space, then they will be projected into different buckets (i.e., h(A) ≠ h(B)) after hashing with high probability.

Thus, inspired by these two properties, we can utilize the hash values h(A) and h(B) (with little or no privacy) to evaluate the approximation degree of original points A and B, without revealing the details of A and B. This way, the private information of points A and B can be protected.
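For intuition, Properties 1 and 2 can be made quantitative for one common LSH family. The statement below assumes the sign-random-projection family with a hash direction drawn from a spherically symmetric (e.g., Gaussian) distribution; that distributional assumption is ours and is not stated in this section. Under it, the collision probability of two points depends only on the angle between them, so nearly parallel points almost always share a hash value while nearly opposite points almost never do.

```latex
\Pr\bigl[h(A) = h(B)\bigr] \;=\; 1 - \frac{\theta(A,B)}{\pi},
\qquad
\theta(A,B) \;=\; \arccos\!\left(\frac{A \cdot B}{\lVert A\rVert\,\lVert B\rVert}\right)
```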

4. Service Recommendation Based on LSH and Hybrid CF

In this section, we introduce our proposed privacy-preserving service recommendation approach. Concretely, in Section 4.1, we utilize the LSH technique and user-based CF to make service quality predictions; in Section 4.2, we utilize the LSH technique and item-based CF to make service quality predictions. Finally, in Section 4.3, we integrate the predicted results of Sections 4.1 and 4.2 and then make service recommendations accordingly.

4.1. Service Quality Prediction Based on LSH and User-Based CF

In this subsection, we utilize user-based CF and LSH to look for a target user’s similar neighbours (denoted by set Neighbour_set(u_target)) in a privacy-aware and scalable manner, and then the method makes service quality prediction based on the derived similar neighbours in Neighbour_set(u_target).

First, for any u ∈ U, the quality data over dimension q are converted into an n-dimensional quality vector (q_u,1, …, q_u,n). Here, q_u,j denotes the quality value of q of s_j invoked by user u (typically, q_u,j = 0 if user u did not rate s_j in the past) and n is the number of candidate web services. Next, we introduce how to utilize the LSH technique to transform this vector, which carries much private information, into a corresponding user index h(u) with little privacy, based on a pre-selected LSH function h(.).

The concrete form of the LSH function h(.) depends heavily on the “distance” used for user similarity measurement; in other words, different types of similarity “distance” correspond to different kinds of LSH functions. The Pearson Correlation Coefficient (PCC) is often utilized to calculate user similarity in existing recommender systems, so we choose the LSH function corresponding to the PCC distance in this paper. More concretely, the LSH function h(.) in (1) is adopted [23]. Here, v = (v₁, …, v_n) is an n-dimensional vector whose components v_j (1 ≤ j ≤ n) are random values drawn from a fixed range, and (1) is computed from the dot product between the user's quality vector and v. This way, through (1), we can transform the quality vector, which carries much privacy, into a Boolean value h(u) with little privacy.
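The hashing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual sign-of-dot-product form (bit 1 when the dot product is positive, 0 otherwise) and assumes the random components are drawn uniformly from [-1, 1]. The function names `make_hash_vector` and `lsh_bit` are hypothetical.

```python
import random


def make_hash_vector(n, rng):
    """Draw an n-dimensional random vector v.

    Assumption: components are uniform in [-1, 1]; the exact range used
    in the paper is not given in this text.
    """
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def lsh_bit(quality_vector, v):
    """Return the Boolean hash value: 1 if the dot product with v is
    positive, 0 otherwise (assumed sign-random-projection form)."""
    dot = sum(q * x for q, x in zip(quality_vector, v))
    return 1 if dot > 0 else 0


rng = random.Random(0)
v = make_hash_vector(4, rng)

u1 = [0.2, 0.0, 1.5, 0.7]
u2 = [0.4, 0.0, 3.0, 1.4]  # same direction as u1, scaled by 2

# Vectors pointing in the same direction always receive the same bit,
# because scaling by a positive factor does not change the sign.
print(lsh_bit(u1, v) == lsh_bit(u2, v))
```

Only the single bit leaves the platform in this sketch, which is the sense in which the index carries "little privacy" compared with the raw quality vector.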

As LSH is essentially a probability-based similar neighbour search technique, one hash function h(.) is often not enough for finding the similar neighbours of a target user accurately. In view of this observation, we amplify the performance of LSH by adopting r hash functions and L hash tables into the similar neighbour search processes. Concretely, in each hash table, we can build an index for user u, denoted by H(u) = (h₁(u), …, h_r(u)). Furthermore, two users u₁ and u₂ are regarded as similar iff condition in (2) holds, where H_x(u₁) and H_x(u₂) denote the indices of u₁ and u₂ in the x-th hash table (i.e., Table_x), respectively.
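Condition (2) is not reproduced in this text. Read from the description above, it amounts to requiring identical indices in at least one table, which can be written as the first line below. The second line is the standard LSH amplification estimate, given under the assumption (ours) that each of the r hash functions collides independently with probability p for a given pair of users; it shows how r tightens and L loosens the search.

```latex
u_1 \text{ and } u_2 \text{ are similar}
\;\iff\;
\exists\, x \in \{1, \dots, L\} :\; H_x(u_1) = H_x(u_2)

\Pr\bigl[\text{pair found in at least one table}\bigr]
\;=\; 1 - \bigl(1 - p^{\,r}\bigr)^{L}
```

Under the same assumption, the probability of missing a truly similar pair is (1 - p^r)^L, which grows with r and shrinks with L.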

Likewise, for the target user, i.e., u_target, we can calculate his/her user index value H_x(u_target) in Table_x (x = 1, …, L), according to the same LSH functions and LSH tables. Then, through the condition in (2), we can determine the similar neighbours of u_target and put them into set Neighbour_set(u_target). The pseudocode of the above neighbour search process is presented in Algorithms 1 and 2, where Algorithm 1 is used to build the L hash tables for users offline and Algorithm 2 is used to search for the similar neighbours of the target user.

Algorithm 1: Building the hash tables for users (offline)
Inputs:
    L: number of LSH tables
    r: number of LSH functions
Output: Table₁, …, Table_L
For x = 1, …, L do // build hash tables offline
    For k = 1, …, r do
        For i = 1, …, m do
            Build user sub-index h_k(u_i) based on random LSH function h_k(.)
    For i = 1, …, m do
        Build user index H_x(u_i) = (h₁(u_i), …, h_r(u_i))
    Return hash table Table_x constituted by all the “u_i → H_x(u_i)” mappings

Algorithm 2: Neighbouring user search
Inputs: u_target // a target user
Output: Neighbour_set(u_target)
For x = 1, …, L do
    Find the bucket bt corresponding to H_x(u_target) in Table_x
    If u_i ∈ bt and u_i ≠ u_target
        Then put u_i into Neighbour_set(u_target)
Return Neighbour_set(u_target)
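The offline table construction and the online neighbour search can be sketched together in a few lines. This is an illustrative sketch under stated assumptions: the hash bit uses the sign of a dot product with components drawn uniformly from [-1, 1], each table is a dictionary from an r-bit index to a bucket of user identifiers, and the names `signature`, `build_tables`, and `search_neighbours` are hypothetical. The toy vectors (including negative values) are chosen only to make the behaviour visible.

```python
import random
from collections import defaultdict


def signature(vec, funcs):
    """r-bit index H(u): one sign bit per random vector in `funcs`."""
    return tuple(
        1 if sum(q * x for q, x in zip(vec, v)) > 0 else 0 for v in funcs
    )


def build_tables(vectors, L, r, seed=0):
    """Offline step: build L hash tables, each using r random vectors.

    vectors: dict mapping a user id to its quality vector.
    Returns (tables, hash_vectors) so that the same functions can be
    reused at query time.
    """
    rng = random.Random(seed)
    n = len(next(iter(vectors.values())))
    tables, hash_vectors = [], []
    for _ in range(L):
        funcs = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(r)]
        table = defaultdict(set)
        for uid, vec in vectors.items():
            table[signature(vec, funcs)].add(uid)
        tables.append(table)
        hash_vectors.append(funcs)
    return tables, hash_vectors


def search_neighbours(target_id, vectors, tables, hash_vectors):
    """Online step: collect users sharing a bucket with the target
    in at least one table."""
    neighbours = set()
    for table, funcs in zip(tables, hash_vectors):
        bucket = table.get(signature(vectors[target_id], funcs), set())
        neighbours |= bucket - {target_id}
    return neighbours


users = {
    "u1": [1.0, 2.0, 3.0],
    "u2": [2.0, 4.0, 6.0],     # same direction as u1
    "u3": [-1.0, -2.0, -3.0],  # opposite direction to u1
}
tables, hash_vectors = build_tables(users, L=3, r=4)
found = search_neighbours("u1", users, tables, hash_vectors)
print("u2" in found, "u3" in found)
```

In a cross-platform setting, each platform would compute indices for its own users with the shared hash vectors and exchange only the indices; that deployment detail is outside this sketch.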

However, as LSH is a probability-based neighbour search technique, “False negative” search results are inevitable. In other words, some similar neighbours of a target user may be overlooked by mistake in the LSH-based neighbour search process described above. In view of this drawback, we propose two optimization strategies to reduce the “False negative” probability and improve the success rate of neighbour search. Next, we introduce these two strategies, respectively.

Strategy 1 (neighbour propagation (for users)). The neighbour relationship between different users is essentially depicted by the similarity of user preferences, while the latter (i.e., user preference similarity) obeys a kind of propagation rule. Let us consider the example in Figure 2 where three users and four web services are present. The user-service ratings (1~5) are shown in Figure 2(a), according to which we can determine the neighbour relationship between u₁ and u₂ as well as the neighbour relationship between u₁ and u₃. In this situation, we can infer that u₂ and u₃ are possible neighbours (marked with dotted line in Figure 2(b)) as both of them hold the same or similar preferences with u₁, although u₂ and u₃ are not direct neighbours based on the user-service rating data in Figure 2(a). This way, through the neighbour propagation rule illustrated in Figure 2, we can find more possible neighbours of a target user (in an indirect manner) so as to reduce the “False negative” probability.

Figure 2: (a) User-service rating matrix; (b) neighbouring user propagation.

Next, we introduce how to integrate the neighbour propagation strategy (for users) into the abovementioned LSH-based neighbouring user search process. Concretely, if users u_a and u_target are projected into an identical bucket in any of the L hash tables and users u_a and u_b are projected into an identical bucket in any of the L hash tables, then according to the neighbour propagation strategy (for users), we can infer that u_b is a possible neighbour of u_target and put u_b into Neighbour_set(u_target). The pseudocode of Strategy 1 is presented in Algorithm 3.

Algorithm 3: Neighbour propagation (for users)
Inputs:
    u_target // a target user
    Neighbour_set(u_i) // before neighbour propagation (for users)
Output: Neighbour_set(u_target) // after neighbour propagation (for users)
For each u_a ∈ Neighbour_set(u_target) do
    For each u_b ∈ Neighbour_set(u_a) do
        If u_b ∉ Neighbour_set(u_target)
            Then put u_b into Neighbour_set(u_target)
Return Neighbour_set(u_target)
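The propagation rule can be sketched as a one-hop expansion over precomputed neighbour sets. Two choices here are assumptions of this sketch rather than statements of the paper: the loop runs over a snapshot of the direct neighbours (so the expansion stops after one hop), and the target user is never added to its own neighbour set. The function name `propagate_neighbours` is hypothetical.

```python
def propagate_neighbours(target, neighbour_sets):
    """Expand the target's neighbour set with neighbours of neighbours.

    neighbour_sets: dict mapping each user to its direct (LSH-derived)
    neighbour set, i.e. the sets before propagation.
    """
    direct = set(neighbour_sets.get(target, set()))
    expanded = set(direct)
    for a in direct:  # snapshot of direct neighbours: one hop only
        for b in neighbour_sets.get(a, set()):
            if b != target and b not in expanded:
                expanded.add(b)
    return expanded


# Mirrors the situation described for Figure 2: u1 is a direct
# neighbour of both u2 and u3, which are not direct neighbours.
neighbour_sets = {"u1": {"u2", "u3"}, "u2": {"u1"}, "u3": {"u1"}}
print(sorted(propagate_neighbours("u2", neighbour_sets)))  # ['u1', 'u3']
```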

Strategy 2 (condition relaxation for neighbour search (for users)). According to the inherent characteristic of the LSH technique, the number of hash functions (i.e., r) plays an important role in the neighbour search process. Generally, a larger r value often means stricter filtering condition for neighbour search and thereby leads to higher probability of “False negative” search results. Considering this, we relax the search condition for neighbours of the target user to reduce the “False negative” probability. Next, we introduce the concrete condition relaxation process.

According to the neighbour search condition in (2), u_i is regarded as a neighbour of u_target iff H(u_i) = H(u_target) holds in any hash table, where H(u_i) = (h₁(u_i), …, h_r(u_i)) and H(u_target) = (h₁(u_target), …, h_r(u_target)). Namely, all the r bit values in H(u_i) are required to be equal to the corresponding r bit values in H(u_target). Hence, to relax the search condition for neighbours of u_target while still guaranteeing high similarity between u_target and his/her neighbours u_i, a difference of one bit between the indices of u_target and u_i is permitted.

For example, if H_x(u_target) = (1, 1, 1) holds in hash table Table_x, then the neighbour u_i’s index in Table_x is permitted to be (0, 1, 1) or (1, 0, 1) or (1, 1, 0). In other words, any user whose index is equal to (0, 1, 1) or (1, 0, 1) or (1, 1, 0) is a possible neighbour of u_target. This is the main idea of our proposed search condition relaxation strategy (for users). The pseudocode of Strategy 2 is presented in Algorithm 4, where H_x(u_target)_k denotes the relaxed search condition (i.e., k-th bit is different from that of H_x(u_target)) for u_target in Table_x; for example, if H_x(u_target) = (1, 1, 1), then H_x(u_target)₂ = (1, 0, 1) holds.

Algorithm 4: Condition relaxation for neighbour search (for users)
Inputs:
    u_target // a target user
Output: Neighbour_set(u_target) // after condition relaxation for neighbour search (for users)
For x = 1, …, L do
    For k = 1, …, r do
        Set the search index to H_x(u_target)_k // k-th bit of the original H_x(u_target) flipped
        Neighbouring-user-search (u_target, TB) // Algorithm 2
Return Neighbour_set(u_target)
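The one-bit relaxation can be sketched by probing, in every table, the target's own bucket plus the r buckets whose index differs in exactly one bit. The sketch assumes each table is a dictionary from an index tuple to a set of user identifiers, and it always flips bits of the original index rather than of a previously relaxed one. The names `one_bit_variants` and `relaxed_search` are hypothetical.

```python
def one_bit_variants(index):
    """All indices that differ from `index` in exactly one bit."""
    return [
        index[:k] + (1 - index[k],) + index[k + 1:]
        for k in range(len(index))
    ]


def relaxed_search(target_id, target_indices, tables):
    """Probe the exact bucket and every one-bit-different bucket.

    target_indices[x] is the target's index in table x;
    tables[x] maps an index tuple to the set of ids in that bucket.
    """
    neighbours = set()
    for index, table in zip(target_indices, tables):
        for candidate in [index] + one_bit_variants(index):
            neighbours |= table.get(candidate, set())
    neighbours.discard(target_id)
    return neighbours


print(one_bit_variants((1, 1, 1)))  # [(0, 1, 1), (1, 0, 1), (1, 1, 0)]

tables = [{(1, 1, 1): {"t", "a"}, (1, 0, 1): {"b"}, (0, 0, 1): {"c"}}]
print(sorted(relaxed_search("t", [(1, 1, 1)], tables)))  # ['a', 'b']
```

If each hash function is assumed to collide independently with probability p, the per-table match probability rises from p^r to p^r + r·p^(r-1)·(1 - p) under this relaxation, at the cost of r extra bucket probes per table.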

Through Strategies 1 and 2, we can obtain an enlarged set of neighbours of the target user, i.e., Neighbour_set(u_target). Next, we make service quality predictions based on the elements in Neighbour_set(u_target). Concretely, for a web service s_j never invoked by u_target before, its predicted quality over dimension q for u_target, denoted by q_target,j, is calculated by (3).
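Equation (3) is not reproduced in this text. One simple form that is consistent with the description, offered here only as an assumption, is the average of the quality values that the derived neighbours observed for s_j. The set N_j below is notation introduced for this illustration.

```latex
q_{target,j} \;=\; \frac{1}{\lvert N_j \rvert} \sum_{u_i \in N_j} q_{i,j},
\qquad
N_j \;=\; \bigl\{\, u_i \in \mathrm{Neighbour\_set}(u_{target}) \;:\; u_i \text{ has invoked } s_j \,\bigr\}
```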

4.2. Service Quality Prediction Based on LSH and Item-Based CF

Similar to Section 4.1, in this subsection, we first utilize item-based CF and LSH techniques to look for the similar services (named “neighbouring services”) of target service s_target (denoted by set Neighbour_set(s_target)) in a privacy-aware and scalable manner, and the techniques then make service quality predictions based on the elements in Neighbour_set(s_target).

First, for any web service s_j ∈ WS, its historical quality data over dimension q, as observed by the users who invoked it, can be specified by an m-dimensional vector (q_1,j, …, q_m,j), where q_i,j (1 ≤ i ≤ m) denotes the quality value of q of service s_j invoked by user u_i and m is the number of users. Next, we utilize the LSH technique to transform this vector, which carries private information, into a corresponding service index h(s_j) with little privacy, based on the random LSH function h(.) in (1). Here, v = (v₁, …, v_m) is an m-dimensional real vector whose components v_i (1 ≤ i ≤ m) are random values drawn from a fixed range.

This way, we can transform the service quality vector, which carries much privacy, into a Boolean value h(s_j) with little privacy.

Likewise, we amplify LSH by integrating r hash functions and L hash tables. Then, in each hash table, we build an index for service s_j, denoted by H(s_j) = (h₁(s_j), …, h_r(s_j)). Furthermore, two services s₁ and s₂ are regarded as neighbouring services if the condition in (4) holds, where H_x(s₁) and H_x(s₂) denote the indices of services s₁ and s₂ in the x-th hash table (i.e., Table_x), respectively.

Then, through (4), we can find out the neighbouring services of s_target and put them into Neighbour_set(s_target). Note that if multiple target services are present, then it is necessary to repeat the above process for each target service to discover all the qualified neighbours. The pseudocode is presented in Algorithms 5 and 6, where Algorithm 5 is used to build the L hash tables for services offline and Algorithm 6 is used to search for the neighbouring services of the target service (repeat Algorithm 6 if multiple target services are present).

Algorithm 5: Building the hash tables for services (offline)
Inputs:
    L: number of LSH tables
    r: number of LSH functions
Output: Table₁, …, Table_L
For x = 1, …, L do // build hash tables offline
    For k = 1, …, r do
        For j = 1, …, n do
            Build service sub-index h_k(s_j) based on random LSH function h_k(.)
    For j = 1, …, n do
        Build service index H_x(s_j) = (h₁(s_j), …, h_r(s_j))
    Return hash table Table_x constituted by all “s_j → H_x(s_j)” mappings

Algorithm 6: Neighbouring service search
Inputs: s_target // a target service
Output: Neighbour_set(s_target)
For x = 1, …, L do
    Find the bucket bt corresponding to H_x(s_target) in Table_x
    If s_j ∈ bt and s_j ≠ s_target
        Then put s_j into Neighbour_set(s_target)
Return Neighbour_set(s_target)

However, similar to Section 4.1, “False negative” search results are also inevitable; in other words, certain real neighbours of a target service are likely to be deemed non-neighbours. Considering this drawback, Strategies 3 and 4 (variants of Strategies 1 and 2 in Section 4.1) are proposed to reduce the “False negative” probability.

Strategy 3 (neighbour propagation (for services)). Let’s consider the example in Figure 3 where four users and three web services are present. The user-service ratings (1~5) are shown in Figure 3(a), according to which we can determine that s₁ and s₂ are neighbouring services and s₁ and s₃ are neighbouring services. In this situation, we can infer that s₂ and s₃ are possible neighbouring services (marked with dotted line in Figure 3(b)). Thus through the propagation rule illustrated in Figure 3, we can obtain more neighbouring services of a target service so that the “False negative” probability is reduced.

Figure 3: (a) User-service rating matrix; (b) neighbouring service propagation.

Next, we introduce how to integrate the neighbour propagation strategy (for services) into the abovementioned LSH-based neighbouring service search process. Concretely, if services s_a and s_target are projected into an identical bucket in any of the L hash tables and services s_a and s_b are projected into an identical bucket in any of the L hash tables, then according to the neighbour propagation strategy (for services), we can infer that s_b is probably a neighbouring service of s_target and hence put s_b into Neighbour_set(s_target). The pseudocode of Strategy 3 is presented in Algorithm 7.

Algorithm 7: Neighbour propagation (for services)
Inputs:
    s_target // a target service
    Neighbour_set(s_j) // before neighbour propagation (for services)
Output: Neighbour_set(s_target) // after neighbour propagation (for services)
For each s_a ∈ Neighbour_set(s_target) do
    For each s_b ∈ Neighbour_set(s_a) do
        If s_b ∉ Neighbour_set(s_target)
            Then put s_b into Neighbour_set(s_target)
Return Neighbour_set(s_target)

Strategy 4 (condition relaxation for neighbour search (for services)). Similar to Strategy 2, in Strategy 4, we relax the search condition for neighbouring services of the target service to reduce the “False negative” probability of search results. Concretely, according to the neighbouring service search condition in (4), service s_j is regarded as a neighbouring service of s_target iff all the r bit values in H(s_j) are equal to the r bit values in H(s_target), respectively. Therefore, to relax the search condition for neighbouring services of s_target and meanwhile guarantee the high similarity between s_target and its neighbouring services s_j, one bit difference between the indices of s_target and s_j is permitted.

For example, if condition H_x(s_target) = (1, 1, 1) holds in hash table Table_x, then any service whose index is equal to (0, 1, 1) or (1, 0, 1) or (1, 1, 0) is a possible neighbouring service of s_target. This is the main idea of our proposed search condition relaxation strategy (for services). The pseudocode of Strategy 4 is presented in Algorithm 8, where H_x(s_target)_k denotes the relaxed search condition (i.e., the k-th bit is different from that of H_x(s_target)) for s_target in Table_x; e.g., H_x(s_target)₂ = (1, 0, 1) holds if H_x(s_target) = (1, 1, 1).

Algorithm 8: Condition relaxation for neighbour search (for services)
Inputs:
    s_target // a target service
Output: Neighbour_set(s_target) // after condition relaxation for neighbour search (for services)
For x = 1, …, L do
    For k = 1, …, r do
        Set the search index to H_x(s_target)_k // k-th bit of the original H_x(s_target) flipped
        Neighbouring-service-search (s_target, TB) // Algorithm 6
Return Neighbour_set(s_target)

Through Strategies 3 and 4, we can obtain an enlarged set of neighbouring services of the target services, i.e., Neighbour_set(s_target). Next, for each service s_j never invoked by u_target, its predicted quality over dimension q rated by u_target, denoted by q_target,j, is calculated by equation (5) where s_j∈Neighbour_set(s_target). Here, q_{target,target} denotes the real service quality of s_target observed by u_target. Furthermore, if service s_j appears multiple times in Neighbour_set(s_target), then the average predicted quality is adopted.
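Equation (5) is not reproduced in this text. The description suggests, as an assumption on our part, that a neighbouring service inherits the quality that u_target actually observed on the target service, and that the values are averaged when s_j is a neighbour of several target services. The set T_j below is notation introduced for this illustration.

```latex
q_{target,j} \;=\; q_{target,target}
\quad \text{for } s_j \in \mathrm{Neighbour\_set}(s_{target}),
\qquad
q_{target,j} \;=\; \frac{1}{\lvert T_j \rvert} \sum_{s_t \in T_j} q_{target,t}
\quad \text{if } s_j \text{ neighbours several target services } T_j
```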

4.3. Aggregation of Predicted Service Quality and Service Recommendation

We aggregate the two quality values predicted by (3) and (5) into a comprehensive quality value in (6). Here, q_user and q_item denote the q_target,j values predicted in (3) and (5), respectively; α and β (0 ≤ α, β ≤ 1 and α + β = 1) are the aggregation coefficients. Finally, we choose the service s_j whose predicted value (i.e., q_target,j in (6)) is the best and return it to u_target.
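The aggregation and selection step can be sketched as a weighted sum followed by picking the best score. The weighted-sum form α·q_user + β·q_item is assumed from the constraint α + β = 1; equation (6) itself is not shown in this text. Treating "best" as the smallest value is also an assumption that suits response time, so it is exposed as a parameter. The names `aggregate` and `recommend` are hypothetical.

```python
def aggregate(q_user, q_item, alpha=0.5):
    """Combine the user-based and item-based predictions.

    Assumed form: alpha * q_user + beta * q_item with beta = 1 - alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * q_user + (1.0 - alpha) * q_item


def recommend(predictions, alpha=0.5, smaller_is_better=True):
    """Return the best service and all aggregated scores.

    predictions: dict mapping a service id to (q_user, q_item).
    smaller_is_better: True for response time, False for throughput.
    """
    scores = {s: aggregate(qu, qi, alpha) for s, (qu, qi) in predictions.items()}
    pick = min if smaller_is_better else max
    return pick(scores, key=scores.get), scores


predictions = {"s1": (0.8, 0.6), "s2": (0.3, 0.5)}
best, scores = recommend(predictions, alpha=0.5)
print(best)  # s2
```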

5. Experiments

In this section, we conduct a group of experiments to validate the feasibility of our proposed approach in terms of service recommendation efficiency, accuracy, and success rate. Concretely, in Section 5.1, we introduce the experiment dataset and the configurations adopted for the experiments; in Section 5.2, experiment comparison results are presented; in Section 5.3, further discussions are given.

5.1. Experiment Configurations

Our experiments are based on a real distributed web service quality dataset, WS-DREAM [24], which collects real-world service quality data from 339 users on 5825 web services (hosted in different countries). Each country that hosts a group of services is considered to be an individual platform for recommendation scenario simulation. Additionally, part of the real values in the dataset are removed so that they can be predicted. Moreover, only one quality dimension of services, i.e., response time, is considered in our experiments for simplicity. The target user is selected randomly from the user set in WS-DREAM, and the services he/she invoked are regarded as the target services used in Section 4.2.
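One way to simulate a sparse quality matrix of a given density is to keep a random fraction of the entries and hold out the rest as prediction targets. The exact protocol used in the experiments is not described here, so the following is only a sketch under that assumption; the function name `mask_to_density` and the use of 0 for removed entries (matching the convention for uninvoked services in Section 4.1) are choices of this illustration.

```python
import random


def mask_to_density(matrix, density, seed=0):
    """Keep about `density` of the entries; hold out the remainder.

    Returns (train, held_out): `train` has removed entries set to 0.0,
    and `held_out` maps (user, service) positions to the true values.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    keep = set(rng.sample(cells, round(density * len(cells))))
    train = [
        [matrix[i][j] if (i, j) in keep else 0.0 for j in range(len(row))]
        for i, row in enumerate(matrix)
    ]
    held_out = {pos: matrix[pos[0]][pos[1]] for pos in cells if pos not in keep}
    return train, held_out


full = [[0.5, 1.2, 0.9, 2.0],
        [0.7, 0.3, 1.1, 1.8]]
train, held_out = mask_to_density(full, density=0.25)
print(sum(1 for row in train for x in row if x != 0.0), len(held_out))  # 2 6
```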

In order to validate the feasibility of our proposed approach, we test the time cost and MAE of our proposal and compare them with three other state-of-the-art recommendation approaches: UPCC [25], P-UIPCC [17], and PPICF [18]. Concretely, UPCC is the benchmark service recommendation approach based on user-based CF; P-UIPCC utilizes “divide-merge” operations over sensitive service quality data; in PPICF, the real service quality data are transformed into obfuscated data, which are then used to make service quality predictions and service recommendations.

The experiments were conducted on a Dell laptop with a 2.80 GHz processor and 2.0 GB RAM, running Windows XP and Java 1.5. Each experiment was carried out ten times, and the average results are reported.

5.2. Experiment Results and Analyses

In the experiments, five profiles are tested and compared to validate the feasibility of our proposal. Here, the matrix density denotes the density of the user-service quality matrix used to make service recommendations; L and r denote the number of hash tables and the number of hash functions, respectively; α = β = 0.5 holds in (6).

Profile 1 (computational time of the four approaches w.r.t. matrix density). We measure the computational time of the recommendation process and the scalability of the four approaches with respect to the matrix density. The experimental parameters are set as follows: the matrix density is varied from 5% to 25%, L = 10, and r = 14. Experimental results are shown in Figure 4.

As the experimental results in Figure 4 indicate, the computational time of all four approaches increases with the density of the service quality matrix, because all the user-service quality data need to be considered in the four approaches and, therefore, more computational time is required when the quality matrix becomes denser. However, our proposed approach outperforms the other three approaches in terms of recommendation efficiency and scalability, because most jobs in our approach (e.g., building user indices) can be done offline before a service recommendation request arrives, while the time complexity of the remaining jobs (e.g., online neighbour search) is rather small. So, in general, our proposal can satisfy the quick response requirements of target users.

Profile 2 (accuracy of returned results by four approaches w.r.t. matrix density). We test and compare the recommendation accuracy (measured by MAE; the smaller the better) of the four approaches. The experimental parameter settings are as follows: the matrix density is varied from 5% to 25%, L = 10, and r = 14. The comparison results are presented in Figure 5.
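For reference, MAE (mean absolute error) is the average absolute difference between predicted and observed quality values. A minimal Python sketch follows; the function name and the list-based interface are assumptions chosen for illustration only.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and observed values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For example, predictions of 1.0, 2.0, and 4.0 against observed response times of 1.5, 2.0, and 3.0 give absolute errors of 0.5, 0, and 1.0, hence an MAE of 0.5.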

Figure 5 indicates that the accuracy of the results returned by P-UIPCC and PPICF is not high. The reason is that, to protect sensitive user privacy, the service quality data used in the recommendation process have already been obfuscated in P-UIPCC and PPICF. Our approach performs better than the other three approaches in terms of recommendation accuracy because only the “most similar” neighbouring users and neighbouring services are returned by LSH and used to make service recommendations. Therefore, the recommendation accuracy is improved considerably.

Profile 3 (recommendation efficiency of our approach w.r.t. L and r). In this profile, we test the recommendation efficiency of our approach with respect to L and r. The parameters are set as follows: the matrix density is 25%, L is varied from 6 to 14, and r is varied from 8 to 14. Experimental results are shown in Figure 6.

Figure 6: Time cost of our approach with respect to (a) L and (b) r.

As shown in Figure 6(a), the time cost of our proposal increases approximately with the growth of L, as all L hash tables need to be traversed in order to find the similar neighbours of the target user by (2) and the similar neighbouring services of the target service by (4). Figure 6(b) shows that the time cost decreases when r grows. This is because a larger r value often means a stricter search condition for neighbouring users or neighbouring services; fewer search results are therefore obtained when r is large, and less time is needed to evaluate and rank them.
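The roles of L and r can be made concrete with a small index. The sketch below is only an illustration under stated assumptions: it uses sign-of-random-projection hash functions, dense quality vectors with no missing values, and a hypothetical class name `LSHIndex`; the hash family and index layout used in our approach are those defined in Section 3.2, which may differ.

```python
import random
from collections import defaultdict


class LSHIndex:
    """Toy LSH index with L hash tables of r hash functions each (assumed design)."""

    def __init__(self, dim, num_tables, num_functions, seed=0):
        rng = random.Random(seed)
        # planes[t] holds the r random hyperplanes of table t.
        self.planes = [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
                        for _ in range(num_functions)]
                       for _ in range(num_tables)]
        self.tables = [defaultdict(set) for _ in range(num_tables)]

    def _key(self, table_id, vector):
        # The bucket key concatenates r one-bit hashes, so two vectors share
        # a bucket only if they agree on all r hash functions of this table.
        return tuple(1 if sum(w * x for w, x in zip(plane, vector)) > 0 else 0
                     for plane in self.planes[table_id])

    def add(self, item_id, vector):
        """Offline step: insert an item into every one of the L tables."""
        for t, table in enumerate(self.tables):
            table[self._key(t, vector)].add(item_id)

    def query(self, vector, exclude=None):
        """Online step: union of the matching buckets over all L tables."""
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates |= table.get(self._key(t, vector), set())
        candidates.discard(exclude)
        return candidates
```

In this sketch, `query` visits every table, so its cost grows with L, while a larger r lengthens each bucket key and leaves fewer candidates to evaluate and rank, which matches the trends described for Figure 6.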

Profile 4 (recommendation accuracy of our approach w.r.t. L and r). We test the recommendation accuracy of our proposed approach with respect to L and r. The experimental parameter settings are as follows: the matrix density is 25%, L is varied from 6 to 14, and r is varied from 8 to 14. The experimental results are shown in Figure 7.

As Figure 7 shows, the recommendation accuracy of our approach increases (i.e., MAE drops) as L decreases and r grows. This is because a smaller L value or a larger r value often means a stricter search condition for neighbouring users and services; in this situation, only the “most similar” neighbouring users or neighbouring services are returned to make service recommendations. Therefore, the recommendation accuracy is improved accordingly.
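The sense in which L and r control search strictness can be quantified with the standard LSH amplification argument. Assuming independent hash functions and a per-function collision probability p for a given pair (both are simplifying assumptions, and the function name is hypothetical), the probability that the pair shares a bucket in at least one of the L tables is 1 - (1 - p^r)^L:

```python
def retrieval_probability(p, num_functions, num_tables):
    """Probability that a pair collides in at least one of L tables.

    p is the probability that a single hash function maps the pair to the
    same value; num_functions is r and num_tables is L.
    """
    return 1.0 - (1.0 - p ** num_functions) ** num_tables
```

This quantity falls as r grows and rises as L grows, so under these assumptions a large r or a small L admits only pairs with p close to 1, i.e., the most similar neighbours.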

Profile 5 (recommendation successful rate comparison). LSH is essentially a probability-based similar neighbour search technique; therefore, our proposed LSH-based service recommendation approach cannot always guarantee a satisfying recommended result for the target user. In other words, recommendation failure is inevitable. However, as discussed in Section 2, the hybrid CF method can reduce the failure rate to some extent. Therefore, in this profile, we test the recommendation successful rate of our proposal and compare it with the following two benchmark approaches:

(1) the first benchmark (i.e., the DistSR_LSH approach in [26]), which integrates LSH with user-based CF;

(2) the second benchmark, which integrates LSH with item-based CF.

Here, we define the successful rate of a recommendation approach as the ratio of the number of successful recommendations to the total number of recommendations (∈ [0, 100%]). The parameter settings are as follows: the matrix density is 25%, L = 1, and r is varied from 8 to 14. The experimental results are presented in Figure 8.
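This definition translates directly into code. In the sketch below, a failed recommendation is represented by `None`; that representation and the function name are assumptions made for illustration.

```python
def successful_rate(results):
    """Fraction of recommendation requests that returned a result.

    results is a list with one entry per request; None marks a failure
    (no neighbour found, hence no recommended service).
    """
    if not results:
        raise ValueError("at least one recommendation request is required")
    return sum(1 for r in results if r is not None) / len(results)
```

For instance, `successful_rate(["s1", None, "s3", None])` evaluates to 0.5, i.e., a successful rate of 50%.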

As Figure 8 shows, the successful rates of the three recommendation approaches all decrease with the growth of r. This is because a larger r value often means a stricter filtering condition for the search of neighbouring users and neighbouring services; the successful rate of recommendations is therefore reduced accordingly. Namely, there is a trade-off between successful rate and r; specifically, when r is large enough (e.g., when r = 14, 15…), the successful rate approaches 0. However, as Figure 8 shows, our approach still outperforms the other two approaches in terms of successful rate, as it uses hybrid CF for recommendation and thus integrates the advantages of both user-based CF and item-based CF.
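One plausible way in which a hybrid scheme raises the successful rate is by falling back on whichever prediction is available. The Python sketch below is an assumption for illustration and is not a transcription of (6): it supposes that the user-based and item-based predictions are combined as a weighted sum with weights α and β, and that a missing prediction is represented by `None`.

```python
def hybrid_prediction(user_based, item_based, alpha=0.5, beta=0.5):
    """Combine user-based and item-based predictions (assumed rule).

    Returns None only when both component predictions are missing, so the
    hybrid fails less often than either component alone.
    """
    if user_based is None and item_based is None:
        return None
    if user_based is None:
        return item_based
    if item_based is None:
        return user_based
    return alpha * user_based + beta * item_based
```

Under this reading, a request fails only when neither the user-based nor the item-based search finds any neighbour, which is consistent with the hybrid approach achieving the highest successful rate in Figure 8.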

5.3. Further Discussions

Our experiments adopt only one service quality dimension, i.e., response time, without considering the multiple dimensions that may exist [27–37] or their respective weights [38–44]. In future research, we will integrate the dimension and weight information into our approach to make it more comprehensive. Besides, only one type of service quality data is considered in the experiments. In the future, we will therefore further extend our proposal by considering the possible data diversity in the big data environment [45–50].

6. Conclusions

Collaborative service recommendation has become an effective technique to quickly extract insightful information from big educational data. However, traditional service recommendation approaches often assume that the service usage data used to make recommendations are centralized, without considering the multisource property of service usage data or the privacy leakage risks during multisource educational data integration. Besides, existing service recommendation approaches often suffer from low robustness due to possible data sparsity. In view of these drawbacks, we combine the LSH technique and hybrid Collaborative Filtering (HCF) for distributed service recommendations in the big data environment. Furthermore, to minimize the “false negative” recommended results incurred by the inherent shortcoming of LSH, two solutions are introduced in this paper to reduce the probability that similar users and similar services are overlooked by mistake and thereby enhance the success rate. A wide range of experiments on a real-world dataset demonstrates the performance of our approach in terms of efficiency, accuracy, and successful rate while securing sensitive user information.

However, only one quality dimension of web services is considered in the recommendation model, which is often not enough for practical recommendation requirements. In the future, we will further refine our work by considering multiple quality dimensions as well as their linear correlations [51–53] and nonlinear correlations [54–58]. Besides, data type diversity is another challenge in the big data environment. Therefore, in future research, we will continue to extend our proposal by integrating multisource data with diverse data types, e.g., discrete data [59–63], binary data [64], and fuzzy data [65–67].

Data Availability

The web service quality data used to support the findings of this study have been deposited in the WS-DREAM repository (http://inpluslab.com/wsdream/).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This paper is partially supported by the Natural Science Foundation of China (No. 61872219).

References

X. Wang, L. T. Yang, X. Xie, J. Jin, and M. Jamal Deen, “A cloud-edge computing framework for cyber-physical-social services,” IEEE Communications Magazine, vol. 55, no. 11, pp. 80–85, 2017.
Q. Zhang, M. Lin, L. T. Yang, Z. Chen, S. U. Khan, and P. Li, “A double deep q-learning model for energy-efficient edge scheduling,” IEEE Transactions on Services Computing, 2018.
L. Qi, P. Dai, J. Yu, Z. Zhou, and Y. Xu, “‘Time-location-frequency’-aware internet of things service selection based on historical records,” International Journal of Distributed Sensor Networks, vol. 13, no. 1, pp. 1–9, 2017.
Q. Zhang, L. T. Yang, Z. Chen, P. Li, and F. Bu, “An adaptive dropout deep computation model for industrial IoT big data learning with crowdsourcing to cloud computing,” IEEE Transactions on Industrial Informatics, 2018.
X. Wang, W. Wang, L. T. Yang, S. Liao, D. Yin, and M. J. Deen, “A distributed HOSVD method with its incremental computation for big data in cyber-physical-social systems,” IEEE Transactions on Computational Social Systems, vol. 5, no. 2, pp. 481–492, 2018.
K. Dou, B. Guo, and L. Kuang, “A privacy-preserving multimedia recommendation in the context of social network based on weighted noise injection,” Multimedia Tools and Applications, pp. 1–20, 2017.
L. Qi, W. Dou, and J. Chen, “Weighted principal component analysis-based service selection method for multimedia services in cloud,” Computing, vol. 98, no. 1, pp. 195–214, 2016.
L. T. Yang, X. Wang, X. Chen et al., “A multi-order distributed HOSVD with its incremental computing for big services in cyber-physical-social systems,” IEEE Transactions on Big Data, 2018.
X. Wang, L. T. Yang, H. Liu, and M. J. Deen, “A big data-as-a-service framework: state-of-the-art and perspectives,” IEEE Transactions on Big Data, vol. 4, no. 3, pp. 325–340, 2018.
Q. Zhang, L. T. Yang, A. Castiglione, Z. Chen, and P. Li, “Secure weighted possibilistic c-means algorithm on cloud for clustering big data,” Information Sciences, vol. 479, pp. 515–525, 2018.
S. Zhang, G. Wang, M. Z. Bhuiyan, and Q. Liu, “A dual privacy preserving scheme in continuous location-based services,” IEEE Internet of Things Journal, vol. 5, no. 5, pp. 4191–4200, 2018.
L. Qi, X. Zhang, W. Dou, C. Hu, C. Yang, and J. Chen, “A two-stage locality-sensitive hashing based approach for privacy-preserving mobile service recommendation in cross-platform edge environment,” Future Generation Computer Systems, vol. 88, pp. 636–643, 2018.
L. Qi, X. Xu, W. Dou et al., “Time-aware IoE service recommendation on sparse data,” Mobile Information Systems, vol. 2016, Article ID 4397061, 12 pages, 2016.
L. Qi, Z. Zhou, J. Yu, and Q. Liu, “Data-sparsity tolerant web service recommendation approach based on improved collaborative filtering,” IEICE Transactions on Information and Systems, vol. E100D, no. 9, pp. 2092–2099, 2017.
S. Zhang, X. Li, Z. Tan, T. Peng, and G. Wang, “A caching and spatial K-anonymity driven privacy enhancement scheme in continuous location-based services,” Future Generation Computer Systems, vol. 94, pp. 40–50, 2019.
S. Zhang, X. Mao, K. R. Choo, T. Peng, and G. Wang, “A trajectory privacy-preserving scheme based on a dual-K mechanism for continuous location-based services,” Information Sciences, 2019.
J. Zhu, P. He, Z. Zheng, and M. R. Lyu, “A privacy-preserving qos prediction framework for web service recommendation,” in Proceedings of the IEEE International Conference on Web Services (ICWS '15), pp. 241–248, New York, NY, USA, July 2015.
D. Li, C. Chen, Q. Lv et al., “An algorithm for efficient privacy-preserving item-based collaborative filtering,” Future Generation Computer Systems, vol. 55, pp. 311–320, 2016.
Y. Xu, L. Qi, W. Dou, and J. Yu, “Privacy-preserving and scalable service recommendation based on simhash in a distributed cloud environment,” Complexity, vol. 2017, Article ID 3437854, 9 pages, 2017.
W. Gong, L. Qi, and Y. Xu, “Privacy-aware multidimensional mobile service quality prediction and recommendation in distributed fog environment,” Wireless Communications and Mobile Computing, vol. 2018, Article ID 3075849, 8 pages, 2018.
C. Yan, X. Cui, L. Qi, X. Xu, and X. Zhang, “Privacy-aware data publishing and integration for collaborative service recommendation,” IEEE Access, vol. 6, pp. 43021–43028, 2018.
A. Gionis, P. Indyk, and R. Motwani, “Similarity search in high dimensions via hashing,” The VLDB Journal, vol. 99, no. 6, pp. 518–529, 1999.
Data Mining and Query Log Analysis for Scalable Temporal and Continuous Query Answering, 2015, http://www.optique-project.eu/.
Z. Zheng, Y. Zhang, and M. R. Lyu, “Investigating QoS of real-world web services,” IEEE Transactions on Services Computing, vol. 7, no. 1, pp. 32–39, 2014.
J. S. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of predictive algorithms for collaborative filtering,” in Proceedings of the International Conference on Uncertainty in Artificial Intelligence, pp. 43–52, 1998.
L. Qi, H. Xiang, W. Dou, C. Yang, Y. Qin, and X. Zhang, “Privacy-preserving distributed service recommendation based on locality-sensitive hashing,” in Proceedings of the 24th IEEE International Conference on Web Services, ICWS 2017, pp. 49–56, USA, June 2017.
X. Wang, L. T. Yang, L. Kuang, X. Liu, Q. Zhang, and M. J. Deen, “A tensor-based big-data-driven routing recommendation approach for heterogeneous networks,” IEEE Network Magazine, vol. 33, no. 1, pp. 64–69, 2019.
M. Wang and G.-L. Tian, “Robust group non-convex estimations for high-dimensional partially linear models,” Journal of Nonparametric Statistics, vol. 28, no. 1, pp. 49–67, 2016.
X. Wang and M. Wang, “Variable selection for high-dimensional generalized linear models with the weighted elastic-net procedure,” Journal of Applied Statistics, vol. 43, no. 5, pp. 796–809, 2016.
P. Wang and L. Zhao, “Some geometrical properties of convex level sets of minimal graph on 2-dimensional Riemannian manifolds,” Nonlinear Analysis: Theory, Methods & Applications, vol. 130, pp. 1–17, 2016.
X. Wang and M. Wang, “Adaptive group bridge estimation for high-dimensional partially linear models,” Journal of Inequalities and Applications, vol. 2017, article no. 158, pp. 1–18, 2017.
X. Wang, S. Zhao, and M. Wang, “Restricted profile estimation for partially linear models with large-dimensional covariates,” Statistics & Probability Letters, vol. 128, pp. 71–76, 2017.
H. Tian and M. Han, “Bifurcation of periodic orbits by perturbing high-dimensional piecewise smooth integrable systems,” Journal of Differential Equations, vol. 263, no. 11, pp. 7448–7474, 2017.
P. Wang and X. Wang, “The geometric properties of harmonic function on 2-dimensional Riemannian manifolds,” Nonlinear Analysis: Theory, Methods & Applications, vol. 103, pp. 2–8, 2014.
M. Wang and X. Wang, “Adaptive Lasso estimators for ultrahigh dimensional generalized linear models,” Statistics & Probability Letters, vol. 89, no. 1, pp. 41–50, 2014.
X. Wang, M. Wang, and X. Wang, “A note on the one-step estimator for ultrahigh dimensionality,” Journal of Computational and Applied Mathematics, vol. 260, pp. 91–98, 2014.
G.-L. Tian, M. Wang, and L. Song, “Variable selection in the high-dimensional continuous generalized linear model with current status data,” Journal of Applied Statistics, vol. 41, no. 3, pp. 467–483, 2014.
S. Yang, Z.-A. Yao, and C.-A. Zhao, “The weight distributions of two classes of p-ary cyclic codes with few weights,” Finite Fields and Their Applications, vol. 44, pp. 76–91, 2017.
Y.-F. Wang, C.-C. Yin, and X.-S. Zhang, “Uniform estimate for the tail probabilities of randomly weighted sums,” Acta Mathematicae Applicatae Sinica, vol. 30, no. 4, pp. 1063–1072, 2014.
S. Yang, Z.-A. Yao, and C.-A. Zhao, “A class of three-weight linear codes and their complete weight enumerators,” Cryptography and Communications, vol. 9, no. 1, pp. 133–149, 2017.
J. Cai, “An implicit sigma (3) type condition for heavy cycles in weighted graphs,” Ars Combinatoria, vol. 115, pp. 211–218, 2014.
S. Yang and Z.-A. Yao, “Complete weight enumerators of a family of three-weight linear codes,” Designs, Codes and Cryptography, vol. 82, no. 3, pp. 663–674, 2017.
S. Yang and Z.-A. Yao, “Complete weight enumerators of a class of linear codes,” Discrete Mathematics, vol. 340, no. 4, pp. 729–739, 2017.
S. Yang, X. Kong, and C. Tang, “A construction of linear codes and their complete weight enumerators,” Finite Fields and Their Applications, vol. 48, pp. 196–226, 2017.
H. Liu and F. Meng, “Some new generalized Volterra-Fredholm type discrete fractional sum inequalities and their applications,” Journal of Inequalities and Applications, vol. 2016, no. 1, article no. 213, 2016.
P. Li and G. Ren, “Some classes of equations of discrete type with harmonic singular operator and convolution,” Applied Mathematics and Computation, vol. 284, pp. 185–194, 2016.
P. Li, “Singular integral equations of convolution type with Hilbert kernel and a discrete jump problem,” Advances in Difference Equations, vol. 2017, no. 1, article no. 360, 2017.
Y. Bai and L. Liu, “New oscillation criteria for second-order delay differential equations with mixed nonlinearities,” Discrete Dynamics in Nature and Society, vol. 2010, Article ID 796256, 9 pages, 2010.
Y. Wang and C. Yin, “Approximation for the ruin probabilities in a discrete time risk model with dependent risks,” Statistics & Probability Letters, vol. 80, no. 17-18, pp. 1335–1342, 2010.
Q. Feng, F. Meng, and Y. Zhang, “Generalized gronwall-bellman-type discrete inequalities and their applications,” Journal of Inequalities and Applications, vol. 2011, article no. 47, 2011.
G. Guo, W. Shao, L. Lin, and X. Zhu, “Parallel tempering for dynamic generalized linear models,” Communications in Statistics—Theory and Methods, vol. 45, no. 21, pp. 6299–6310, 2016.
L. L. Liu and Y. Li, “Recurrence relations for linear transformations preserving the strong q-log-convexity,” The Electronic Journal of Combinatorics, vol. 23, no. 3, pp. 1–11, 2016.
H. Li and S. Wang, “Partial condition number for the equality constrained linear least squares problem,” Calcolo. A Quarterly on Numerical Analysis and Theory of Computation, vol. 54, no. 4, pp. 1121–1146, 2017.
H. Liu and F. Meng, “Some new nonlinear integral inequalities with weakly singular kernel and their applications to FDEs,” Journal of Inequalities and Applications, vol. 2015, no. 209, pp. 1–17, 2015.
X. Zhang, L. Liu, Y. Wu, and L. Caccetta, “Entire large solutions for a class of Schrödinger systems with a nonlinear random operator,” Journal of Mathematical Analysis and Applications, vol. 423, no. 2, pp. 1650–1659, 2015.
Z. Zong, F. Hu, C. Yin, and H. Wu, “On Jensen’s inequality, Hölder’s inequality, and Minkowski’s inequality for dynamically consistent nonlinear evaluations,” Journal of Inequalities and Applications, vol. 2015, no. 1, pp. 1–18, 2015.
X. Hao, L. Liu, and Y. Wu, “Positive solutions for nonlinear fractional semipositone differential equation with nonlocal boundary conditions,” Journal of Nonlinear Science and Applications, vol. 9, no. 6, pp. 3992–4002, 2016.
X. Hao, L. Liu, and Y. Wu, “Iterative solution for nonlinear impulsive advection-reaction-diffusion equations,” Journal of Nonlinear Science and Applications, vol. 9, no. 6, pp. 4070–4077, 2016.
P. Li, “Two classes of linear equations of discrete convolution type with harmonic singular operators,” Complex Variables and Elliptic Equations, vol. 61, no. 1, pp. 67–75, 2016.
Z. Zheng, “Invariance of deficiency indices under perturbation for discrete Hamiltonian systems,” Journal of Difference Equations and Applications, vol. 19, no. 8, pp. 1243–1250, 2013.
M. Han, X. Hou, L. Sheng, and C. Wang, “Theory of rotated equations and applications to a population model,” Discrete and Continuous Dynamical Systems - Series A, vol. 38, no. 4, pp. 2171–2185, 2018.
J. Cai and H. Li, “A new sufficient condition for pancyclability of graphs,” Discrete Applied Mathematics, vol. 162, pp. 142–148, 2014.
L. L. Liu and B.-X. Zhu, “Strong q-log-convexity of the Eulerian polynomials of Coxeter groups,” Discrete Mathematics, vol. 338, no. 12, pp. 2332–2340, 2015.
B. Zhang, “Remarks on the maximum gap in binary cyclotomic polynomials,” Bulletin Mathematique De La Societe Des Sciences Mathematiques De Roumanie, vol. 59, no. 1, pp. 109–115, 2016.
L. Wang, “Intuitionistic fuzzy stability of a quadratic functional equation,” Fixed Point Theory and Applications, vol. 2010, Article ID 107182, 7 pages, 2010.
X. Du and Z. Zhao, “On fixed point theorems of mixed monotone operators,” Fixed Point Theory and Applications, vol. 2011, Article ID 563136, 8 pages, 2011.
B. Zhu, L. Liu, and Y. Wu, “Local and global existence of mild solutions for a class of nonlinear fractional reaction-diffusion equations with delay,” Applied Mathematics Letters, vol. 61, pp. 73–79, 2016.

Copyright

Copyright © 2019 Xuening Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
