Abstract

The rapid development of mobile social networks is a double-edged sword: providing users with high-quality mobile social network services while protecting their private data remains a serious problem. Most popular social applications exploit user behavior to connect people with similar behavior and thereby improve the user experience. However, many users are unwilling to share certain behavioral information with a recommendation system. In this paper, we design a secure friend recommendation system based on user behavior, called PRUB. The system aims to achieve fine-grained recommendation of friends who share common characteristics without exposing actual user behavior. To evaluate PRUB, we use anonymized data from a Chinese ISP that records user browsing behavior over 3 months. The experimental results show that our system achieves a remarkable recommendation quality while protecting the privacy of user behavior information.

1. Introduction

We are now embracing the era of the mobile social network, which connects people with similar interests and characteristics through mobile devices such as smartphones and tablets. Users on a mobile social network platform can conveniently share their status. The platform is open: it is built on existing social relationships and grows by further expanding each user's social circle. Almost all mobile social networking systems provide a friend recommendation service that suggests new links for each user. Because the size of the network grows exponentially, many service providers and researchers are introducing distributed management into recommendation systems to ease the pressure on centralized servers [1] and, at the same time, improve the user experience.

A recommendation system can achieve promising results by exploiting user behavior information [2]. However, some of this information is private, and users are unwilling to share certain behavioral data with others. They prefer to exchange information with people who share common interests rather than with arbitrary strangers. In distributed systems in particular, the interactive process has no regulator, so privacy becomes a problem. Thus, how to deliver high-quality recommendations while protecting the privacy of user behavior information has become a research hotspot [3–5].

To solve these problems, we design a secure friend recommendation system based on user behavior, called PRUB. The system aims to achieve fine-grained recommendation of friends who share common interests without exposing actual user behavior. PRUB provides a modified matching protocol and an authorization protocol to secure user behavior information under a hybrid management model that combines centralized and distributed management.

PRUB works in two steps. The first step is coarse-grained friend recommendation: the authentication server classifies users with the KNN classification algorithm based on user behavior information, such as browsing records, and returns a candidate list of people who share the same interest categories as the user. The second step is fine-grained friend recommendation: the user runs the matching protocol against each coarse-grained candidate to compute a similarity score, and if the similarity exceeds the user-defined threshold, PRUB adds that person to the fine-grained recommendation list.
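For concreteness, the following sketch illustrates this two-step flow with a k-nearest-neighbor search over 12-category interest vectors for the coarse step and a shared-category threshold for the fine step. The data, category count, threshold, and function names are assumptions made for this example; in PRUB the fine step runs under the privacy-preserving matching protocol of Section 4 rather than on plaintext vectors.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def coarse_candidates(profiles, user_idx, k=50):
    """Coarse step: KNN over interest vectors (one row per user, 12 categories)."""
    knn = NearestNeighbors(n_neighbors=k + 1).fit(profiles)
    _, idx = knn.kneighbors(profiles[user_idx:user_idx + 1])
    return [i for i in idx[0] if i != user_idx]            # drop the user itself

def fine_recommendation(profiles, user_idx, candidates, threshold=6):
    """Fine step: keep candidates sharing at least `threshold` interest categories."""
    user = profiles[user_idx] > 0
    result = []
    for c in candidates:
        shared = int(np.sum(user & (profiles[c] > 0)))      # number of common categories
        if shared >= threshold:
            result.append(c)
    return result

# Example: 2000 users with interest scores 0-10 in 12 categories.
rng = np.random.default_rng(0)
profiles = rng.integers(0, 11, size=(2000, 12))
cands = coarse_candidates(profiles, user_idx=7)
print(fine_recommendation(profiles, 7, cands))
```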

This paper makes the following contributions:
(1) We present a hybrid management architecture suited to the characteristics of mobile social network platforms, easing the pressure on the server and improving the user experience.
(2) We propose a privacy-preserving matching protocol. By utilizing user behavior, it produces personalized rather than blind recommendations. At the same time, the protocol protects personal privacy: users can keep their sensitive information from being exposed to strangers on the platform.
(3) We define a security model and theoretically analyze the security of our protocol, proving that it can defend against attacks mounted from the position of the initiator and of the matching target, respectively.

To evaluate PRUB, we use anonymized data from a Chinese ISP that records user browsing behavior over 3 months. The experimental results show that our system achieves fine-grained recommendation while protecting the privacy of user behavior information.

The rest of the paper is organized as follows. Section 2 discusses related works. Section 3 provides the system overview of PRUB. The secure matching protocol is discussed in Section 4. Section 5 analyzes the security of the system. The experiment result is presented in Section 6. We conclude our work in Section 7.

2. Related Work

Corresponding to the structure of the mobile social network, there are three application patterns for friend recommendation: centralized management, distributed management, and hybrid management.

Centralized Management. In this pattern, all user information is stored on the central server. As a trusted third party, the central server manages the whole system and handles all processing. Users only need to access the server through a mobile client to obtain the desired service; the server completes the characteristic matching and returns the recommended friend list to the user [8]. Throughout the process, there is no interaction between users. Since the central server holds all user characteristic information, privacy can be protected effectively by hardening the server. However, the server may not be reachable at all times, depending on network conditions, so the user experience degrades under unstable networks. Moreover, not all users are willing to hand their behavior information, especially private information, to service providers, and some providers may exploit the information for illegal activities [9].

Distributed Management. In distributed social networks, data are stored and processed on the local clients, and users interact with each other directly. Clients broadcast their own information, receive information from others, and match characteristics locally to discover target users. In this pattern, the whole recommendation process is carried out among clients without any server participation [10]. This certainly relieves the pressure on the server, but clients may unintentionally disclose unnecessary information, even private data [11]. In a distributed system, the interaction process lacks oversight and cannot guarantee the security of the recommendation process. Many researchers have presented secure matching protocols to address this shortcoming; for example, [12] uses Shamir secret sharing to complete the matching.

Hybrid Management. To combine the advantages of centralized and distributed management, some researchers have proposed hybrid management [13]. In this pattern, users interact directly, but the matching process requires some server control, such as storing temporary data and arbitration. Hybrid management reduces server load while still providing a security mechanism. The open problems are how to minimize the information provided by users, how to realize secure matching with as little server involvement as possible, and how to perform accurate arbitration using minimal user information [14].

Our recommendation system PRUB is built on hybrid management. Matching protocols are the core of hybrid management: users can find friends who share common interests, such as common browsing habits, while protecting their private information. The matching process can be regarded as a private set intersection (PSI) problem or a private cardinality of set intersection (PCSI) problem [15]. The popular solutions can be categorized into three types.

Matching Protocol Based on Commutative Encryption Function. Agrawal et al. [7] presented a commutative encryption protocol to solve the PSI/PCSI problem. It uses a pair of encryption functions $f_{e_1}$ and $f_{e_2}$ whose composition is independent of the order of application, that is, $f_{e_1}(f_{e_2}(x)) = f_{e_2}(f_{e_1}(x))$; a typical instance is the power function $f_e(x) = x^e \bmod p$, where $p$ is a safe prime. This protocol is secure under the Decisional Diffie-Hellman (DDH) hypothesis, and only one of the participants learns the intersection while the other learns nothing. However, the protocol cannot defend against malicious attacks.
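A minimal sketch of the commutative property, assuming the power-function instantiation $f_e(x) = x^e \bmod p$ over the quadratic-residue subgroup of a safe prime; the toy parameter sizes and helper names are assumptions for illustration and are far too small for real use.

```python
import hashlib
import secrets
from sympy import isprime, nextprime

def toy_safe_prime(start=10**6):
    """Return a small safe prime p = 2q + 1 (toy size; real use needs ~2048 bits)."""
    q = nextprime(start)
    while not isprime(2 * q + 1):
        q = nextprime(q)
    return 2 * q + 1, q

P, Q = toy_safe_prime()

def hash_to_group(item: str) -> int:
    """Map a property string into the subgroup of quadratic residues mod P."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big")
    return pow(h % P, 2, P)

def f(e: int, x: int) -> int:
    """Commutative encryption f_e(x) = x^e mod P."""
    return pow(x, e, P)

alice_key = secrets.randbelow(Q - 1) + 1
bob_key = secrets.randbelow(Q - 1) + 1
x = hash_to_group("sports-news")

# The order of encryption does not matter, which is what the matching protocol exploits.
assert f(alice_key, f(bob_key, x)) == f(bob_key, f(alice_key, x))
```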

Von Arb et al. [16] presented a social platform, VENETA, based on the algorithm of Agrawal et al. and its commutative encryption construction. Compared with Agrawal et al., they assume that an attack cannot cause serious damage: a victim reveals only a contact he was already willing to share, without receiving that information in return. VENETA lets two users compute the intersection of their characteristics within a certain range; if the matching between two users succeeds, VENETA recommends the stranger to the user.

Xie and Hengartner [6] presented a matching protocol for mobile social networks that adds signature verification to the property elements. As noted there, Agrawal et al. proved that, under the Decisional Diffie-Hellman (DDH) hypothesis, for fixed values of $x$ and $y$ with $x \neq y$, $f_e(x)$ is indistinguishable from $f_e(y)$ when $e$ is not given. Since the certificates are sent across an encrypted channel, a passive eavesdropper cannot learn Alice's or Bob's interests. Attackers therefore cannot easily reorder the segments and properties, which prevents counterfeit and scanning attacks.

In [17], a new efficient solution to Yao's millionaires' problem based on symmetric cryptography is constructed, and its privacy-preserving property is demonstrated with the well-accepted simulation paradigm. The paper also proposes a security measure that quantitatively captures the security levels of different solutions, together with an ideal model with a trusted third party: Alice and Bob hold inputs $x$ and $y$ and privately compute a functionality $f(x, y)$ with the help of the trusted third party; at the end of the protocol Alice (resp. Bob) obtains $f(x, y)$ without leaking $x$ (resp. $y$). The ideal model provides the best possible security for secure multiparty computation, that is, the highest security level any solution can achieve, and the authors use their measure to judge whether one protocol is more secure than another. They conclude that the new solution is as secure as both the ideal solution and Yao's solution. The solution in [17] enables an XOR operation inside the commutative encryption function, but it greatly increases the computational cost and decreases the security of the system.

Matching Protocol Based on Linear Polynomial. The paper [15] presented the FNP protocol, which encodes the property set as the roots of a polynomial and uses homomorphic encryption to operate on the encrypted coefficients. In the exchange process, one side acts as the client and the other as the server. For each input property, the client learns only whether that property belongs to the server's set and nothing else; likewise, the server learns nothing about the client's other inputs.
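The following sketch shows only the representation idea behind such protocols: a client's set is encoded as the roots of a polynomial, and membership is tested by evaluating the polynomial. It deliberately omits the homomorphic encryption of the coefficients that FNP actually applies, so it is an unprotected illustration, not the protocol itself.

```python
def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the monic polynomial whose roots are the set."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                      # coeffs * x
        scaled = [-r * c for c in coeffs] + [0]     # coeffs * (-r)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

client_set = [3, 17, 42]
coeffs = poly_from_roots(client_set)
# Membership test by evaluation: the polynomial is zero exactly at set elements.
print(evaluate(coeffs, 17) == 0)    # True
print(evaluate(coeffs, 5) == 0)     # False
```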

Kissner and Song [18] used polynomials to represent multisets. By combining polynomial representation with an additively homomorphic encryption function, their scheme securely realizes intersection, union, and element reduction. The protocol is secure against semihonest adversaries, and an important feature is that the privacy-preserving multiset operations can be composed, which enables a wide range of practical applications: the authors define a grammar over these operations so that the output of any function expressed in it can be computed over the multisets.

They construct algorithms for computing the polynomial representation of set operations, including union, intersection, and element reduction, and then extend these techniques to encrypted polynomials, allowing a secure implementation without a trusted third party. Dachman-Soled et al. [19] presented a PSI protocol that defends against malicious users; it also represents properties as polynomial coefficients and uses Shamir secret sharing to split the coefficients for higher security.

Lu et al. [20] put forward a secure handshake with symptom matching and deployed the matching protocol in disease monitoring. Their system enables patients who have the same symptoms to communicate and share information; the core of the scheme is a bilinear pairing function. The system model is a typical mobile healthcare social network (MHSN) consisting of a trusted authority (TA) at the eHealth center and a large number of mobile patients. Because personal health information (PHI) is highly sensitive, its privacy should remain under the patient's control in the MHSN environment, so the authors develop a secure same-symptom-based handshake (SSH) scheme consisting of three algorithms: system setup, patient joining, and the patients' same-symptom-based handshaking (PatientsSSH). The scheme employs identity-based encryption (IBE), which must be semantically secure (indistinguishable) under selective-PID-symptom and chosen-plaintext attacks; that is, the advantage of any IND-sPS-CPA adversary against the IBE scheme must be negligible in the security parameter, in the random oracle model. SSH is of vital importance to the success of an MHSN, but the algorithm can only match a single property, and extending it to multiple properties is difficult.

Matching Protocol Based on Pseudorandom Number. This approach was first presented in [21]. Hazay and Lindell designed a PSI protocol that defends against different attacks while preserving efficiency; a pseudorandom function is used for encryption. In contrast to previous protocols, their protocols for securely computing the set intersection functionality are based on secure pseudorandom function evaluation. The paper also proposes a secure pattern matching protocol to address the question of how to securely compute the basic pattern matching functionality.

The protocol is designed for secure computation in the presence of malicious adversaries with one-sided simulatability, and it exploits specific properties of the Naor-Reingold pseudorandom function together with a protocol for obliviously evaluating that function. Instead of the corrupted party computing and sending its set directly, the parties evaluate the Naor-Reingold pseudorandom function on the set elements, which yields high efficiency.
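As a reference point, a plain (non-oblivious) evaluation of the Naor-Reingold pseudorandom function looks as follows; the toy group parameters are assumptions for illustration, and the oblivious-evaluation protocol that [21] relies on is not shown.

```python
import secrets
from sympy import isprime, nextprime

def toy_group(start=10**6):
    """Toy safe prime p = 2q + 1 and a generator of the order-q subgroup (illustration only)."""
    q = nextprime(start)
    while not isprime(2 * q + 1):
        q = nextprime(q)
    p = 2 * q + 1
    g = pow(2, 2, p)                 # squaring maps into the quadratic-residue subgroup
    return p, q, g

P, Q, G = toy_group()
N_BITS = 16
key = [secrets.randbelow(Q - 1) + 1 for _ in range(N_BITS + 1)]   # a_0 .. a_n

def naor_reingold(key, x: int) -> int:
    """f_a(x) = g^(a_0 * prod_{i: x_i = 1} a_i) mod p."""
    exponent = key[0]
    for i in range(N_BITS):
        if (x >> i) & 1:
            exponent = (exponent * key[i + 1]) % Q
    return pow(G, exponent, P)

print(naor_reingold(key, 0b1011001010101100))
```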

Yang et al. [22] designed a distributed mobile social network, E-SmallTalker. It uses a Bloom filter as the storage structure for properties and computes the intersection through several rounds of iteration with a pseudorandom function. E-SmallTalker reduces the storage space effectively and prevents interacting users from learning anything beyond the common properties. It requires no data service such as Internet access: user information is exchanged between two phones over Bluetooth and matched locally. Yang et al. build on the Bluetooth Service Discovery Protocol (SDP) to search for nearby E-SmallTalker users, and their iterative Bloom filter (IBF) encodes user information as a bit string to fit within the SDP attribute size limit. The system architecture includes four software components: context data store, context encoding and matching, context exchange, and user interface (UI). The authors of [22] devise a multiround protocol to achieve the desired false-positive rate with a minimum total amount of transmission under the constraints imposed by the Bluetooth SDP implementation, where the false-positive rate of a Bloom filter with $m$ bits, $k$ hash functions, and $n$ inserted elements is approximately $(1 - e^{-kn/m})^{k}$. Their approach is efficient in both computation and communication.
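A minimal Bloom-filter sketch of the encoding idea (without the iterative, multiround exchange of E-SmallTalker); the sizing formulas follow the standard Bloom-filter analysis, and the property names are made up for the example.

```python
import hashlib
import math

class BloomFilter:
    """Minimal Bloom filter; m bits and k hash functions chosen for a target false-positive rate."""
    def __init__(self, n_items, fp_rate=0.01):
        self.m = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
        self.k = max(1, round((self.m / n_items) * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

# One user encodes its properties; the peer tests its own properties against the filter.
mine = BloomFilter(n_items=4)
for prop in ["news", "sports", "music", "travel"]:
    mine.add(prop)
common = [p for p in ["music", "finance", "news"] if p in mine]
print(common)   # likely ['music', 'news']; false positives occur with rate ~(1 - e^{-kn/m})^k
```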

3. PRUB System Overview

The PRUB system recommends friends with similar behavior and activities based on users' browsing behavior. The system adopts the hybrid management architecture described above and consists of several users, their smartphones, one verification server (VS), and several anchor servers (AS). The basic structure is shown in Figure 1.

3.1. Users

Every user who wants to use our system to find friends with similar behavior needs a smartphone with our application installed. The application collects the user's browsing history and classifies it into several categories; the value of each category is treated as a property, or characteristic, of the user. The user must sign up for an account to use the service.

To analyze the security of our system, we define two kinds of abnormal users. The first kind does not try to break the protocols of the network; they only want to obtain private data of other users by analyzing the information they receive from the recommendation results. This behavior is defined as a passive attack, and such users are called semihonest users. The other kind attacks actively and is called malicious users: to obtain more information about other users, they disobey the protocols and may even try to break down the whole system, for example, by sending fake messages or terminating an agreement before it finishes.

3.2. Smartphones

The smartphones with our application installed are also part of the system. The application stores the user's personal information, including his/her ID, properties, private key, and public key. The smartphone must have enough computing capability to perform the cryptographic operations. The users' smartphones form mobile social networks (MSNs).

3.3. Verification Server (VS)

The user must register on the verification server before obtaining recommendation data. The application sends the user's encrypted personal information to the VS to obtain an ID and a pair of RSA keys. When verifying the user's ID and properties, the application sends the username and the public key to the VS; the VS generates a random number and returns it to the user's application. The application then processes the properties and sends them, together with the ID, to the VS; the VS signs the result with its private key and sends the signature back to the application. By binding each user to its properties with this signature certificate, the VS prevents abnormal users from changing their properties to obtain others' private information.
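A hedged sketch of the binding step, assuming the VS issues an RSA signature over a digest of the user ID and property vector; the digest format and helper names are illustrative assumptions, since the exact certificate format of PRUB is not specified here.

```python
import hashlib
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# VS key pair (generated once, server side).
vs_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
vs_public = vs_private.public_key()

def bind(user_id: str, properties: list) -> bytes:
    """Canonical digest binding a user ID to its property vector (assumed format)."""
    return hashlib.sha256((user_id + "|" + ",".join(map(str, properties))).encode()).digest()

# The VS signs the binding so the user cannot later swap in different properties.
digest = bind("alice", [8, 2, 5, 0, 7, 1, 3, 9, 4, 6, 2, 1])
certificate = vs_private.sign(digest, padding.PKCS1v15(), hashes.SHA256())

# Any party holding the VS public key can verify the binding during matching.
vs_public.verify(certificate, digest, padding.PKCS1v15(), hashes.SHA256())   # raises if invalid
print("certificate verified")
```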

3.4. Anchor Server (AS)

The anchor servers are used to connect several MSNs. The user can register on several different MSNs, and the AS transmits information to the users from different MSNs.

4. Secure Matching Protocol

One-Way Authentication Protocol. At the beginning of establishing the communication channel between the VS and a user, the VS uses the one-way authentication protocol to authenticate the user's identity. The protocol is described in Box 1.

After the exchange of identities and random challenges, the protocol achieves bidirectional key confirmation and one-way entity authentication through verification of hash values. It provides forward secrecy and nonrepudiation and can defend against man-in-the-middle attacks, including replay, reflection, prophecy, and interleaving attacks.

Mutual Authentication Protocol. This protocol is used by the two interacting parties to authenticate each other when establishing the channel. The protocol and the notation used in it are described in Box 2.

Our mutual authentication protocol is developed from the station-to-station (STS) protocol. By adding the parties' identities and a random-number timestamp, mutual identification is completed. This protocol also defends against man-in-the-middle attacks.
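Since Box 2 is not reproduced here, the sketch below shows only a generic STS-style flow under assumed primitives (X25519 for the Diffie-Hellman exchange, Ed25519 for identity signatures, HKDF for key derivation): both sides derive the session key and sign the session transcript so that a man-in-the-middle cannot splice two separate sessions together. It is not the exact message sequence of PRUB's protocol.

```python
import os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def raw(pub):
    return pub.public_bytes(serialization.Encoding.Raw, serialization.PublicFormat.Raw)

# Long-term identity (signing) keys, assumed to be certified by the VS at registration.
alice_id, bob_id = Ed25519PrivateKey.generate(), Ed25519PrivateKey.generate()

# Ephemeral Diffie-Hellman keys and nonces for this session.
a_eph, b_eph = X25519PrivateKey.generate(), X25519PrivateKey.generate()
a_nonce, b_nonce = os.urandom(16), os.urandom(16)

# Both sides derive the same session key from the DH shared secret.
shared_a = a_eph.exchange(b_eph.public_key())
shared_b = b_eph.exchange(a_eph.public_key())
kdf = lambda s: HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=b"sts").derive(s)
assert kdf(shared_a) == kdf(shared_b)

# Each side signs the transcript (both ephemeral keys plus both nonces),
# which binds the key agreement to the authenticated identities.
transcript = raw(a_eph.public_key()) + raw(b_eph.public_key()) + a_nonce + b_nonce
sig_a = alice_id.sign(transcript)
sig_b = bob_id.sign(transcript)
alice_id.public_key().verify(sig_a, transcript)   # Bob checks Alice's signature
bob_id.public_key().verify(sig_b, transcript)     # Alice checks Bob's signature
```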

Matching Protocol. To achieve fine-grained friend recommendation, the similarity of common properties is calculated over the coarse-grained recommendation list. The mutual authentication protocol must be completed before matching. The protocol is presented in Box 3.

In the protocol, $R_u$ denotes the set of users recommended to user $u$ by the coarse-grained step, and $A$ and $B$ denote the property sets of the two interacting parties.

The core of the matching protocol is commutative encryption of the messages. The interacting pair obtains the number of common properties by comparing each property after the two rounds of encryption.

Meanwhile, a shuffling (confusion) operation is added to ensure fairness and security. In the last step, the common property counts computed by both sides are compared; if they do not match, the protocol turns to arbitration.
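A single-process sketch of the cardinality phase, assuming the power-function commutative encryption shown earlier: both property sets are double-encrypted and shuffled so that only the size of the intersection is revealed. The real protocol exchanges these values as messages between the two phones and adds the commitment and arbitration steps, which are omitted here.

```python
import hashlib
import random
import secrets
from sympy import isprime, nextprime

def toy_safe_prime(start=10**6):
    q = nextprime(start)
    while not isprime(2 * q + 1):
        q = nextprime(q)
    return 2 * q + 1, q

P, Q = toy_safe_prime()
to_group = lambda s: pow(int.from_bytes(hashlib.sha256(s.encode()).digest(), "big") % P, 2, P)
f = lambda e, x: pow(x, e, P)

def match_cardinality(props_a, props_b):
    """Both sides double-encrypt and shuffle, so only |A ∩ B| is revealed."""
    a, b = secrets.randbelow(Q - 1) + 1, secrets.randbelow(Q - 1) + 1
    enc_a = [f(a, to_group(x)) for x in props_a]          # Alice -> Bob
    enc_b = [f(b, to_group(x)) for x in props_b]          # Bob -> Alice
    double_a = [f(b, y) for y in enc_a]                   # Bob re-encrypts Alice's values
    double_b = [f(a, y) for y in enc_b]                   # Alice re-encrypts Bob's values
    random.shuffle(double_a)                              # shuffling hides which item matched
    random.shuffle(double_b)
    return len(set(double_a) & set(double_b))

alice = ["news", "sports", "music", "travel"]
bob = ["music", "finance", "news"]
print(match_cardinality(alice, bob))                      # 2 common properties
```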

Common Property Exchange Protocol. This protocol is used by a pair of users to exchange the details of their common properties. The mutual authentication protocol must be completed before the exchange. Box 4 shows this protocol.

The core of this protocol is also commutative encryption, but it does not include the shuffling operation, so the interacting pair can learn which properties are common by their positions. Similarly, the common properties computed by both sides are compared in the last step, which decides whether arbitration is needed.

5. Security Principles

In this section, we evaluate the security of the proposed scheme under the security protocols we proposed in Section 4. Our protocols are designed based on the Dolev-Yao security model [23].

5.1. Definitions

Definition 1 (ignorable (negligible) function). A function $\mu(n)$ is ignorable (negligible) if, for every positive polynomial $p(\cdot)$, there exists an integer $N$ such that $\mu(n) < 1/p(n)$ for all $n > N$.

Definition 2 (probabilistic polynomial (PP)). A language $L$ is in PP if and only if there exists a probabilistic Turing machine $M$ such that $M$ runs in polynomial time on all inputs; for all $x$ in $L$, $M$ outputs 1 with probability strictly greater than $1/2$; and for all $x$ not in $L$, $M$ outputs 1 with probability less than or equal to $1/2$.

Definition 3 (computationally indistinguishable). Probability ensembles $X = \{X_n\}$ and $Y = \{Y_n\}$ are computationally indistinguishable if, for every probabilistic polynomial-time Turing machine $D$, every polynomial $p(\cdot)$, and all sufficiently large $n$, $\lvert \Pr[D(X_n) = 1] - \Pr[D(Y_n) = 1] \rvert < 1/p(n)$. For $D$, $X_n$ and $Y_n$ are then effectively the same: $D$ cannot extract any information about $X$ from $Y$, and vice versa.

Definition 4 (Decisional Diffie-Hellman (DDH) hypothesis). Let $q$ be a prime number, $G$ a cyclic group of order $q$, and $g$ a generator of $G$. For randomly chosen $a, b, c \in \mathbb{Z}_q$, the distributions of $(g^a, g^b, g^{ab})$ and $(g^a, g^b, g^c)$ are computationally indistinguishable.

5.2. Security Analysis

The system security faces a number of threats, such as man-in-the-middle attack, passive wiretapping, property modification, and malicious match by abnormal users. We analyze four threats on our system.

5.2.1. Man-in-the-Middle Attack

The user should register on the VS first. VS matches the properties and sends back the coarse recommendation result. The system implements the one-way authentication protocol to defend against the man-in-the-middle attack. The process is shown in Figure 2.

The fine-grained friend recommendation of our system relies on the matching protocol based on the commutative encryption function. To defend against the man-in-the-middle attack, we implement the mutual authentication protocol when the communication channel is established. The process is shown in Figure 3.

The one-way authentication protocol achieves bidirectional key confirmation and unidirectional entity confirmation, while the mutual authentication protocol achieves bidirectional key and entity confirmation. Both provide forward secrecy and nonrepudiation, which defends against basic man-in-the-middle attacks including replay, reflection, and interleaving attacks.

5.2.2. Passive Wiretapping

Passive wiretapping occurs when attackers try to obtain user information by listening on the communication channel. The registration process, in which the VS matches the properties and sends back the coarse recommendation result, is exposed to passive wiretapping.

To prevent passive wiretapping, the system transmits only encrypted properties: the keys agreed on in the authentication protocols are used to encrypt the property values, which effectively defends against this attack.

The general model of the communication channel, shown in Figure 4, is the classic wiretap channel. The source is discrete and memoryless with entropy $H(S)$; the "main channel" and the "wiretap channel" are discrete memoryless channels with transition probabilities $Q_M$ and $Q_W$, and the source and both transition probabilities are given and fixed. The encoder, as shown in the figure, is itself a channel with the source vector $S^K$ as input and the vector $X^N$ as output; $X^N$ is in turn the input to the main channel. The main channel output, which is also the wiretap channel input, is $Y^N$, and the wiretap channel output is $Z^N$. The decoder associates a vector $\hat{S}^K$ with $Y^N$, and the error probability $P_e$ measures how often $\hat{S}^K$ differs from $S^K$. The source emits a data sequence $S_1, S_2, \dots$ consisting of independent copies of a binary random variable $S$ with $\Pr\{S = 0\} = \Pr\{S = 1\} = 1/2$. The encoder examines the first $K$ source bits and encodes them into a binary vector $X^N$, which is transmitted perfectly to the decoder via the noiseless main channel and transformed back into a binary data stream for delivery to the destination. The wiretapper observes the encoded vector $X^N$ through a binary symmetric channel with crossover probability $p_0$, so that the corresponding output at the wiretap is $Z^N$ with $\Pr\{Z_i \neq X_i\} = p_0$ for $1 \le i \le N$, and the transmission rate is $R = K/N$ source bits per channel input symbol.

As shown in Figure 5, Alice communicates with Bob while Eve listens. The channel transition probability is $P(y, z \mid x)$, and since the channel is memoryless, the transition probability of a length-$n$ sequence is $P(y^n, z^n \mid x^n) = \prod_{i=1}^{n} P(y_i, z_i \mid x_i)$. Alice sends a common message $M_0$ to both Bob and Eve and a private message $M_1$ to Bob only. A code is defined by (1) the two message sets $M_0 \in \{1, \dots, 2^{nR_0}\}$ and $M_1 \in \{1, \dots, 2^{nR_1}\}$; (2) an encoding function that maps $(M_0, M_1)$ to a codeword $x^n$; and (3) two decoding functions, one by which Bob recovers $(\hat{M}_0, \hat{M}_1)$ from $y^n$ and one by which Eve recovers $\hat{M}_0$ from $z^n$.

The attacker's uncertainty about the private message $M_1$ is measured by the equivocation $\frac{1}{n} H(M_1 \mid Z^n)$. A set of rates is achievable if there exists a sequence of codewords such that, for any $\epsilon > 0$ and sufficiently large $n$, the decoding error probability and the leakage $R_1 - \frac{1}{n} H(M_1 \mid Z^n)$ are both smaller than $\epsilon$; the achievable rates are characterized by the capacity region of the broadcast channel with confidential messages.
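For reference (these standard expressions are not stated in the excerpt above), the secrecy capacity of the degraded wiretap channel, and its value for the noiseless-main-channel, binary-symmetric-wiretap special case of Figure 4, can be written as follows, with $h(\cdot)$ the binary entropy function:

```latex
% Secrecy capacity of the degraded wiretap channel (Wyner):
C_s = \max_{p(x)} \bigl[\, I(X;Y) - I(X;Z) \,\bigr]

% Special case of Figure 4: noiseless main channel, BSC(p_0) wiretap, uniform binary input:
C_s = 1 - \bigl(1 - h(p_0)\bigr) = h(p_0)
```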

5.2.3. Property Modification by Abnormal Users

Some malicious users do not follow the protocols: they send fake message streams to obtain more details and more private information than normal users can. To defend against this kind of attack, the matching protocol shuffles the property information sequence, so no information is leaked through property modification.

User behavior analysis is used to find malicious users. Behavior analysis is a technique that shows whether, and how strongly, one user is similar to other users. In our method, we use two types of behavior analysis to find malicious users: (1) behavior analysis of a single user across different products and (2) behavior analysis of multiple user IDs over commonly rated products. To analyze the behavior of users $u$ and $v$, we use the cosine similarity method. If $r_{u,i}$ and $r_{v,i}$ are the rating values of the common user IDs for a common product $i$, the cosine similarity is defined as
$$\operatorname{sim}(u, v) = \frac{\sum_{i} r_{u,i}\, r_{v,i}}{\sqrt{\sum_{i} r_{u,i}^{2}}\, \sqrt{\sum_{i} r_{v,i}^{2}}}.$$

The resulting similarity ranges from 0, usually indicating independence, to 1, meaning exactly the same, with values in between indicating intermediate similarity or dissimilarity.
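A direct implementation of this similarity measure over two users' rating dictionaries (the rating values below are made up for the example):

```python
import math

def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity over the products both users rated (0 = independent, 1 = identical)."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[p] * ratings_v[p] for p in common)
    norm_u = math.sqrt(sum(ratings_u[p] ** 2 for p in common))
    norm_v = math.sqrt(sum(ratings_v[p] ** 2 for p in common))
    return dot / (norm_u * norm_v)

u = {"news": 8, "sports": 2, "music": 5}
v = {"news": 7, "music": 6, "travel": 9}
print(round(cosine_similarity(u, v), 3))
```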

Malicious users and attacks have mostly been considered from a system perspective for particular protocols or algorithms. We use a game-theoretic model to describe abnormal users. The network is modeled as an undirected graph $G = (V, E)$, where each node in $V$ corresponds to one user. An edge $(i, j) \in E$ means that there is a communication link between the users corresponding to nodes $i$ and $j$. The set of neighbors of user $i$, denoted by $N(i)$, is the set of users $j$ such that the edge $(i, j)$ exists; the neighbors of $i$ are also called the nodes adjacent to $i$. Since the graph is undirected, the neighbor relationship is symmetric: $j \in N(i) \Leftrightarrow i \in N(j)$. To model asymmetric links, the undirected-graph assumption can be dropped, but we believe the extension to be straightforward. We denote the set of bad users by $\mathcal{B}$ and the set of good users by $\mathcal{G}$; it holds that $\mathcal{B} \cup \mathcal{G} = V$ and $\mathcal{B} \cap \mathcal{G} = \emptyset$. We use the term type of a user for the property of being good or bad (see Figure 6).

Users choose between two actions: C (cooperate) and D (defect). When all users have chosen their actions, each user receives a payoff that depends on three things: his own action, his neighbors' actions, and his own type (but not his neighbors' types). The payoff is decomposed as a sum of per-link payoffs, where each term depends on the user's own action and on the action and type of the neighbor along that link; note that a user plays the same action against all neighbors. Let $u_i(a, t)$ denote the payoff of user $i$ when $i$'s action is $a$ and $i$'s type is $t$; slightly extending and abusing this notation, let $u_{ij}(a_i, a_j)$ denote the per-link payoff for $i$ when $j$ is a neighbor of $i$ and $j$'s action is $a_j$. The decomposition of $i$'s payoff can then be written as
$$u_i = \sum_{j \in N(i)} u_{ij}(a_i, a_j).$$
Users have both incentives and disincentives to cooperate; we model both in a game-theoretic fashion with appropriate payoffs for cooperation and defection.
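A small sketch of this payoff decomposition on an adjacency-list graph; the concrete payoff values for a good-type user are hypothetical and only illustrate how the per-link terms are summed:

```python
# Hypothetical per-link payoffs u[(my_action, neighbor_action)] for a good-type user.
GOOD_PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}

def total_payoff(graph, actions, node, link_payoff=GOOD_PAYOFF):
    """Sum the per-link payoffs over all neighbors of `node` (payoff decomposition)."""
    return sum(link_payoff[(actions[node], actions[j])] for j in graph[node])

graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}     # undirected adjacency lists
actions = {"A": "C", "B": "C", "C": "D"}
print(total_payoff(graph, actions, "A"))              # 2 + (-1) = 1
```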

5.2.4. Malicious Match by Abnormal Users

During the matching process based on the commutative encryption function, attackers can modify their properties to scan a recommended friend's common browsing behaviors. Because the matching protocol is based on the similarity of the users' properties, that is, their browsing behavior, abnormal users can change their properties to extract the detailed private information of a recommended friend.

To prevent such attacks, the system provides an arbitration protocol to detect abnormal users.

The arbitration protocol is as follows:
(1) In the one-way authentication protocol, each user performs a key agreement with the VS and obtains a shared key.
(2) Both the semihonest user and the recommended friend send their encrypted property values to the VS, and the VS recomputes the corresponding values using its private key.
(3) The hashes of these values, computed with the shared keys on both the server and the clients, are compared; the abnormal user can then be identified.

Then, we analyze the security of the matching protocol based on commutative encryption function.

Let Alice be the protocol initiator and Bob be the person to be recommended.

Theorem 5 (correctness). When Alice and Bob have the same properties, the system will recommend Bob to Alice.

Proof. Let Alice have the property set $A = \{a_1, a_2, \dots, a_m\}$, which represents her browsing behaviors, and secret parameter $\alpha$. Let Bob have the property set $B = \{b_1, b_2, \dots, b_n\}$ and secret parameter $\beta$. Consider arbitrary elements $a \in A$ and $b \in B$.
Alice and Bob compute $f_\alpha(a)$ and $f_\beta(b)$, respectively, and send them to each other. Then Alice computes $f_\alpha(f_\beta(b)) = b^{\alpha\beta} \bmod p$ and Bob computes $f_\beta(f_\alpha(a)) = a^{\alpha\beta} \bmod p$. Because the property sequences are shuffled, each side can only learn the number of matching values; according to the DDH hypothesis, the underlying properties themselves are not revealed.
In the second phase, Alice sends $f_\beta(f_\alpha(a))$ to Bob and Bob sends $f_\alpha(f_\beta(b))$ to Alice. If $a = b$, then $a^{\alpha\beta} = b^{\alpha\beta}$, so common properties can be recognized from the doubly encrypted values. Thus, if Alice and Bob have the same properties, the system believes that Bob and Alice have similar browsing behaviors and recommends Bob to Alice, which shows that the matching protocol based on the commutative encryption function is correct.

Theorem 6. Matching protocol based on commutative encryption function can defend against passive attack.

Proof. Semihonest users want to get sensitive private information of other users, such as private key and properties, by analyzing the information they get from the system.
Assume Alice is a semihonest user who wants to obtain Bob's private key and properties. During the process, Alice can only observe Bob's values $f_\beta(b)$ and $f_\alpha(f_\beta(b))$. According to the DDH hypothesis, recovering $b$ or $\beta$ from $f_\beta(b)$ is hard, so she cannot obtain Bob's properties or secret parameter. For the same reason, if Bob is a semihonest user, he cannot obtain Alice's $a$ or $\alpha$. Thus, in the first phase both users learn only the size of the common property set, and in the second phase they learn only the common property set itself and nothing more. The matching protocol based on the commutative encryption function is therefore effective in defending against passive attack.
To prove that the matching protocol based on the commutative encryption function can defend against active attack, we consider three different scenarios. We analyze attacks from both the initiator and the recommended friend to show that the proposed matching protocol protects user privacy.
Scenario 1. This can occur in the first phase of the protocol. The initiator counterfeits its properties or sends a fake message in the last step, which yields a false intersection ratio. When either side detects the mistake, the protocol can be terminated immediately, and the arbitration protocol identifies the malicious user. Even if the malicious user is not identified immediately, the common set will be incorrect in phase two, and the VS can still find the malicious user through the arbitration protocol.
Scenario 2. This can occur in the second phase of the protocol. Assume Alice is the malicious user and Bob is a normal user. First, Bob sends Alice his doubly encrypted values $f_\alpha(f_\beta(b))$. On receiving them, Alice obtains the intersection $I$. Alice then counterfeits her reply (e.g., she replaces part of $f_\beta(f_\alpha(a))$ with random values), so Bob obtains an intersection $I'$ that is only a subset of $I$, and Alice learns more information than Bob. This kind of attack cannot be detected directly. To prevent such property modification, we require Alice to send a commitment (promise) of her message to Bob. Using the commitment, we ensure that Alice sends the proper information after receiving Bob's values; if Bob receives information from Alice that differs from the commitment, he can terminate the process and report to the VS.
Scenario 3. This can also occur in the second phase of the protocol. Assume Alice is the malicious user and Bob is a normal user. Alice modifies her properties (e.g., replaces a property with a random value). Before Bob sends his values, Alice must send the commitment of her message to Bob. Bob will then believe the protocol is proceeding normally and process the information from Alice; without supervision, Alice could obtain Bob's detailed properties. However, when they exchange the common property set, Bob detects the inconsistency and can report to the VS.

We compare our work with some other protocols, shown in Table 1.

Agrawal et al.'s protocol can only defend against attacks from semihonest users. Xie and Hengartner improved on Agrawal et al.'s work so that the protocol can defend against some attacks from malicious users. PRUB improves on these protocols by adding commitment (promise) information and VS validation, which enhances security.

6. Experiment

In this section, we present the performance evaluation of PRUB. We first show how the user uses the system. After that, we evaluate the recommendation performance using the anonymous data from a Chinese ISP.

6.1. Using Process

The user first registers on the VS and the VS requires some data from the user. Figure 7 shows the register interface of the user.

The users are then asked what kinds of information they are willing to share and the threshold on how many categories must match for a friend recommendation. If a user chooses not to share a category of browsing information, the system will not use that category for recommendation.

After that, the server returns the coarse recommendation result to the client application. The client application then scans the recommended friends using the matching protocol we proposed. If a candidate shares enough common interests to meet the threshold, the application adds the person to the friend list. Figure 8 shows the friend list.

The user then can choose a friend in the list to exchange the common interests. If the protocol fails, either one can ask for arbitration to find out who the abnormal user is. If it succeeds, the system will give a result with users of the same interest. The result is shown in Figure 9.

6.2. Recommendation Performance

We now evaluate the recommendation performance using the anonymized data from a Chinese ISP. The data contain 20000 users' browsing histories over 3 months (October 2013 to December 2013); the users were told that their information would be recorded. The browsing behavior is classified into 12 categories, and for each category we computed the ratio of the browsing history and assigned a score from 1 to 10 to indicate the user's interest. These scores are used in the system as the user's properties. A data example of 30 users is shown in Figure 10.

The data cover only browsing behavior; no friend relationships are recorded in them. According to the Pew Research report on Teens, Social Media, and Privacy [24], teen Facebook users have an average (mean) of 425.4 friends. We therefore assume that every person has about 500 friends and apply $k$-means clustering to form the possible friend relationships of the users from the browsing data obtained from the Chinese ISP. Let $F_u$ denote the resulting set of possible friends for each user $u$.
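A sketch of how such possible-friend sets can be formed by clustering, assuming synthetic interest vectors as a smaller stand-in for the ISP data and sklearn's KMeans; the cluster count is chosen so that clusters average about 500 users.

```python
import numpy as np
from sklearn.cluster import KMeans

def possible_friends(profiles, avg_friends=500, seed=0):
    """Cluster users on their interest vectors; users in the same cluster are
    treated as possible friends (an approximation of the unknown social graph)."""
    n_users = len(profiles)
    n_clusters = max(1, n_users // avg_friends)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(profiles)
    return {u: set(np.flatnonzero(labels == labels[u]).tolist()) - {u}
            for u in range(n_users)}

# Synthetic stand-in: 2000 users with interest scores 1-10 in 12 categories.
rng = np.random.default_rng(0)
profiles = rng.integers(1, 11, size=(2000, 12))
F = possible_friends(profiles)
print(len(F[0]))          # roughly avg_friends possible friends for user 0
```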

Then, we apply our recommendation scheme to the ISP data of user browsing behavior and randomly select query users to obtain their friend recommendation results. Let $R_u$ denote the set of friends recommended to user $u$. The following measurement metrics are used for performance evaluation.

Recommendation Precision. The average, over the query users, of the ratio of the number of recommended friends that are in the query user's set of possible friends to the total number of recommended friends:
$$\text{Precision} = \frac{1}{500} \sum_{u} \frac{\lvert R_u \cap F_u \rvert}{\lvert R_u \rvert},$$
where $\lvert \cdot \rvert$ denotes the number of elements in a set; the denominator 500 appears because the average is taken over the 500 randomly selected query users in one experiment.

Recommendation Recall. The average, over the query users, of the ratio of the number of recommended friends that are in the query user's set of possible friends to the size of that set of possible friends:
$$\text{Recall} = \frac{1}{500} \sum_{u} \frac{\lvert R_u \cap F_u \rvert}{\lvert F_u \rvert}.$$
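Given the recommended sets $R_u$ and possible-friend sets $F_u$, the two metrics can be computed as follows (the toy sets are made up for illustration):

```python
def precision_recall(recommended, possible):
    """Average precision and recall over the query users (R_u vs. F_u)."""
    precisions, recalls = [], []
    for u in recommended:
        hit = len(recommended[u] & possible[u])
        precisions.append(hit / len(recommended[u]) if recommended[u] else 0.0)
        recalls.append(hit / len(possible[u]) if possible[u] else 0.0)
    n = len(recommended)
    return sum(precisions) / n, sum(recalls) / n

# Toy example with two query users.
R = {1: {2, 3, 4}, 5: {6, 7}}
F = {1: {2, 3, 9, 10}, 5: {6, 8}}
print(precision_recall(R, F))    # (mean precision, mean recall)
```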

Using different thresholds, we calculated the average recommendation precision and recommendation recall of the 500 randomly selected users. Figure 11 presents the result.

With a higher threshold, the recommendation recall is comparatively low, since friends sharing more alike interests are harder to find. However, precision increases with the threshold: the more similar the browsing behaviors are required to be, the more precise the recommendation result.

The experiment shows that our system can achieve relatively high recommendation precision and recommendation recall, and the recommendation system receives remarkable recommendation satisfaction.

7. Conclusion

While enjoying the benefits brought by friend recommendation systems on mobile social networks, users and researchers have begun to pay attention to personal privacy protection on social network platforms. In centralized management architectures, all security is ensured by the central server, but in distributed or hybrid architectures users exchange information directly with little or no server involvement, so the security of the network has to be guaranteed by other protocols. In this paper, we presented PRUB, a secure friend recommendation system based on user behavior that adopts a hybrid management architecture to reduce the pressure on servers. PRUB achieves fine-grained recommendation of friends who share the same characteristics without exposing actual user behavior; its modified matching and authorization protocols guarantee privacy. PRUB first uses the KNN classification algorithm for coarse friend recommendation and then uses the matching protocol to realize fine-grained recommendation. To evaluate the security and performance of PRUB, we theoretically proved that our protocol can defend against attacks from the initiator and from the matching target, respectively, and we used anonymized data for a realistic deployment. The experimental results show that PRUB not only realizes fine-grained friend recommendation but also protects users' private information.

Competing Interests

The authors declare that they have no competing interests regarding the publication of this paper.

Acknowledgments

This work is supported by the Fundamental Research Funds for the Central Universities (nos. ZYGX2014J051 and ZYGX2014J066), the China Postdoctoral Science Fund (2015M572464), and the Science and Technology Projects of Sichuan Province (2015JY0178), and by the project sponsored by OATF, UESTC.