Abstract

Multiangle social network recommendation algorithms (MSN) and a new assessment method, called similarity network evaluation (SNE), are proposed. Viewed from six angles, the MSN comprise six algorithms: the user-based algorithm from the resource point (UBR), the user-based algorithm from the tag point (UBT), the resource-based algorithm from the tag point (RBT), the resource-based algorithm from the user point (RBU), the tag-based algorithm from the resource point (TBR), and the tag-based algorithm from the user point (TBU). Compared with the traditional recall/precision (RP) method, the SNE is simpler, more effective, and easier to visualize. The simulation results show that TBR and UBR are the best algorithms, RBU and TBU are the worst, and UBT and RBT fall in between.

1. Introduction

In recent years, social tagging systems have become increasingly popular as a means of classifying large sets of resources on the web. These systems allow users to add metadata in the form of keywords to share resources [1]. Nevertheless, the rapidly growing data in these systems present new technical challenges for recommending resources. The collaborative filtering approach [2] offers a way to meet these challenges by making recommendations based solely on a database of user preferences. The Top-N recommendation method [3] tries to recommend the N top-ranked resources that users may be interested in.

Computing the similarity plays a key role in collaborative filtering, and there are many ways to compute it, such as the Pearson correlation [4], constrained Pearson correlation [5], cosine-based similarity [6], adjusted cosine similarity [7], and Spearman rank correlation [8]. Currently, the user-based collaborative filtering algorithm mainly considers user similarity from the resource perspective [9]. Some papers did not discuss the resource similarity from the user perspective [10]. Mostly, the tag-based recommendation only takes into account the tag similarity based on the resource tag [11]. Some papers comprehensively consider the user similarity and resource similarity but do not consider the tag similarity [12]. Depending on the goal of a recommendation system, assessment indicators are employed to evaluate an algorithm, for instance, recall/precision, mean absolute error (MAE), mean average precision (MAP), and area under curve (AUC) [13]. For the recommended results, there are also auxiliary assessment indicators [14], such as coverage (COV), diversity (DIV), and average popularity (AP).

In this paper, the concepts of the 1-mode network and 2-mode network in social network [15] are addressed for collaborative filtering. For movies, this paper gives recommended explanations of the six algorithms as follows:
UBR: “Users who watched your favorite movies also watch …”
UBT: “Users who used your favorite tags also watch …”
RBU: “Users who watch this movie also watch …”
RBT: “Tags which annotate this movie also annotate …”
TBR: “Other Tags of this movie also annotate …”
TBU: “Other Tags which you used also annotate …”

Comparing the similarity network evaluation (SNE) with recall/precision shows that the two assessment methods give consistent results, but the SNE is simpler, easier to visualize, and more effective, with fewer steps.

2. Multiangle Social Network Recommendation Algorithm (MSN)

2.1. Pretreatment of Social Network

The triple () represents the raw data, which means the user is given the tag on the resource . From this triple, one can deduce three 2-mode networks such as , , and . Each 2-mode network can deduce two 1-mode networks. For instance, can infer two 1-mode networks and ; infers 1-mode networks and ; and infers 1-mode networks and (see Figure 1). In the present paper, the symbol of denotes the transpose of .

The matrix shows the membership of each pair of users who like the common resources, described by . The matrix represents the membership of each pair of resources which have the common users, described by . The matrix is the membership of each pair of resources which have the same tags, described by . The matrix indicates the membership of each pair of tags which have the common resources, with . The matrix means the membership of each pair of users who annotate the common tags, with . The matrix shows the membership of each pair of tags which have the common users, with .
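The projection from a 2-mode network to a 1-mode network can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `project_to_one_mode` and the sample triples are hypothetical, and the edge weight is assumed to be the raw count of shared neighbours. For a binary incidence matrix, these counts are the off-diagonal entries of the matrix multiplied by its transpose.

```python
from collections import defaultdict
from itertools import combinations


def project_to_one_mode(pairs):
    """Project a 2-mode network, given as (row_node, column_node) pairs,
    onto its row nodes.

    Returns a dict mapping a sorted pair of row nodes to the number of
    column nodes they share.
    """
    members = defaultdict(set)
    for row_node, col_node in pairs:
        members[col_node].add(row_node)

    weights = defaultdict(int)
    for row_nodes in members.values():
        # Every pair of row nodes attached to the same column node
        # gains one unit of shared membership.
        for a, b in combinations(sorted(row_nodes), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical raw data: (user, tag, resource) triples.
triples = [
    ("u1", "funny", "m1"),
    ("u2", "funny", "m1"),
    ("u2", "dark", "m2"),
    ("u3", "dark", "m2"),
    ("u1", "classic", "m2"),
]

# 2-mode user-resource network, then its two 1-mode projections.
user_resource = {(u, r) for u, _, r in triples}
user_user = project_to_one_mode(user_resource)
resource_resource = project_to_one_mode({(r, u) for u, r in user_resource})
```

The same function applies to the other two 2-mode networks by choosing a different pair of fields from each triple.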

2.2. User-Based Social Network Algorithm (USN)

The USN algorithm has two steps: first, find other users who are similar to the target user; then, recommend those users’ favorite resources to the target user. The algorithm can be divided into two algorithms, UBR and UBT.

(1) User-Based Algorithm from Resource Perspective (User-Based Resources, UBR). The user similarity from resource (USR) perspective means that if two users like the same resources, they are similar. The element of the USR, denoted by is defined as: Based on the above similarity, the user-resource interest matrix UBR is described by:

(2) User-Based Algorithm from Tag Perspective (User-Based Tag, UBT). The user similarity from tag (UST) perspective indicates two users are similar if they prefer to use the same tags. The element of the UST, denoted by is computed by Based on the similarity, the interest matrix UBT is defined as:
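The two USN steps can be sketched as below. The similarity and interest formulas did not survive in this text, so the sketch assumes a cosine similarity over binary sets and an interest score equal to the sum of the similarities of the users who like a resource; both choices, and the names `cosine_binary` and `user_based_scores`, are assumptions made for illustration only. Passing resource sets as `profiles` corresponds to the UBR idea, and passing tag sets corresponds to the UBT idea.

```python
import math


def cosine_binary(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def user_based_scores(profiles, likes, target):
    """Score resources the target user has not seen yet.

    profiles: dict user -> set used to measure user similarity
              (resources for a UBR-style run, tags for a UBT-style run).
    likes:    dict user -> set of resources the user likes.
    """
    seen = likes.get(target, set())
    scores = {}
    for other, items in likes.items():
        if other == target:
            continue
        sim = cosine_binary(profiles.get(target, set()), profiles.get(other, set()))
        if sim == 0.0:
            continue  # dissimilar users contribute nothing
        for resource in items - seen:
            scores[resource] = scores.get(resource, 0.0) + sim
    return scores


# Hypothetical data.
likes = {"u1": {"m1", "m2"}, "u2": {"m1", "m2", "m3"}, "u3": {"m4"}}
tags = {"u1": {"funny"}, "u2": {"dark"}, "u3": {"funny"}}

ubr_scores = user_based_scores(likes, likes, "u1")  # similarity from resources
ubt_scores = user_based_scores(tags, likes, "u1")   # similarity from tags
```

With this toy data the two perspectives recommend different movies to `u1`, which is the kind of difference the six-way comparison is meant to expose.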

2.3. Resource-Based Social Network Algorithm (RSN)

The RSN algorithm also has two steps: first, find the resources that the target user likes; then, recommend other similar resources to the user. It is divided into two algorithms: RBU and RBT.

(1) Resource-Based Algorithm from User Perspective (Resources-Based User, RBU). The resource similarity from user (RSU) perspective means that if two resources are enjoyed by the same user, they are similar, where is defined by The interest matrix RBU is written as

(2) Resource-Based Algorithm from Tag Perspective (Resources-Based Tag, RBT). The resource similarity from tag (RST) perspective implies that if two resources are annotated with the same tags, they are similar. The element of the RST is defined as The interest matrix RBT is described by

2.4. Tag-Based Social Network Algorithm (TSN)

The TSN algorithm consists of three steps: first, find the tags that the target user uses frequently; next, find other similar tags and merge both groups into a tag set; finally, recommend the resources corresponding to this tag set to the target user. It is divided into two algorithms: TBR and TBU.

(1) Tag-Based Algorithm from Resource Perspective (Tag-Based Resources, TBR). The tag similarity from resource (TSR) perspective means that if two tags annotate the same resources, they are similar. The element of the TSR is computed as The interest matrix TBR can then be written as

(2) Tag-Based Algorithm from User Perspective (Tag-Based User, TBU). The tag similarity from user perspective (TSU) implies that if two tags are used by the same users, they are similar. The element of the TSU is defined as The interest matrix TBU can be written as:
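The three TSN steps can be sketched as follows. This is only an illustration under stated assumptions: the cosine similarity over binary sets, the cut-offs `top_tags` and `min_sim`, and the function name `tag_based_recommend` are all hypothetical and are not taken from the paper. Supplying each tag's resources as `tag_profiles` follows the TBR idea, and supplying each tag's users follows the TBU idea.

```python
import math


def cosine_binary(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def tag_based_recommend(user_tag_counts, tag_profiles, tag_resources, seen,
                        top_tags=1, min_sim=0.5):
    """Recommend resources through an expanded tag set.

    user_tag_counts: dict tag -> how often the target user used it.
    tag_profiles:    dict tag -> set used to measure tag similarity.
    tag_resources:   dict tag -> set of resources annotated with the tag.
    seen:            resources the target user already knows.
    """
    # Step 1: the target user's most frequently used tags.
    frequent = sorted(user_tag_counts,
                      key=lambda t: (-user_tag_counts[t], t))[:top_tags]

    # Step 2: merge them with sufficiently similar tags into one tag set.
    tag_set = set(frequent)
    for tag in frequent:
        for other, profile in tag_profiles.items():
            if other != tag and cosine_binary(tag_profiles.get(tag, set()),
                                              profile) >= min_sim:
                tag_set.add(other)

    # Step 3: recommend the unseen resources annotated with the tag set.
    recommended = set()
    for tag in tag_set:
        recommended |= tag_resources.get(tag, set())
    return sorted(recommended - seen)


# Hypothetical data; tag similarity is measured from shared resources.
tag_resources = {
    "funny": {"m1", "m2"},
    "comedy": {"m1", "m2", "m3"},
    "dark": {"m4"},
}
result = tag_based_recommend({"funny": 3, "dark": 1}, tag_resources,
                             tag_resources, seen={"m1"})
```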

3. Similarity Network Evaluation (SNE)

3.1. Basic Concepts

Definition 1 (similarity network (SN)). A connection matrix is used to store similarity network (SN), whose element is , defined by where is the similarity between the nodes and and . When , there is no edge between nodes and ; is a threshold.

The definition of the similarity of the SN is actually borrowed from the definition of the gene community network (GCN) proposed by [16].

Definition 2 (types of the network). The similarity networks can be divided into six categories in terms of the intensity of similarities.
(1) Perfect correlation network (PCN): when , remove the edges whose weights are lower than 1, and the weights of all edges are 1.
(2) Very strong correlation network (VSN): when , delete the edges with the weights lower than 0.8.
(3) Strong correlation network (SCN): when , remove the edges with the weights lower than 0.6 to form a new network.
(4) Moderate correlation network (MCN): when , delete the edges whose weights are lower than 0.4 to form a similarity network.
(5) Weak correlation network (WCN): when , the values of all the edges are lower than 0.2 in the network.
(6) Uncorrelated network (UCN): when , all nodes are isolated nodes.
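Building a similarity network at a given threshold can be sketched as below. The exact element definition is not legible in this text, so the sketch assumes that an edge is kept when its similarity is at least the threshold and dropped otherwise, which matches the "remove the edges whose weights are lower than" wording above. The function name `similarity_network` and the sample similarities are hypothetical.

```python
def similarity_network(similarities, threshold):
    """Keep only the edges whose similarity reaches the threshold.

    similarities: dict (node_a, node_b) -> similarity value.
    Self-pairs are ignored, so a node is never its own neighbour.
    """
    return {
        pair: weight
        for pair, weight in similarities.items()
        if pair[0] != pair[1] and weight >= threshold
    }


# Hypothetical pairwise similarities.
similarities = {
    ("a", "b"): 1.0,
    ("a", "c"): 0.85,
    ("b", "c"): 0.65,
    ("c", "d"): 0.3,
}

pcn = similarity_network(similarities, 1.0)  # perfect correlation network
vsn = similarity_network(similarities, 0.8)  # very strong correlation network
scn = similarity_network(similarities, 0.6)  # strong correlation network
```

Lowering the threshold can only add edges, so each network in this sequence contains the previous one.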

3.2. Algorithm Principle

The core of the collaborative filtering algorithm is the calculation of similarity. Under the SNE method, an algorithm is rated better if, at a given threshold value, its similarity network has fewer isolated nodes and more small communities. This involves the following requirements.

(1) A Certain Threshold Value. The recommended resources are few relative to the entire set of resources. Therefore, when we evaluate an algorithm, only the PCN network () needs to be considered. However, for a comprehensive analysis, the VSN network () and the SCN network () should be considered as well.

(2) Fewer Isolated Nodes. A recommendation algorithm is based on similarity. An isolated node is not similar to any other node, so the algorithm cannot give a recommendation for it. Therefore, a good algorithm should produce a similarity network with fewer isolated nodes and more nonisolated ones.

(3) More Communities. Each community () represents a different interest of users. The more communities there are, the more detailed the features they can reflect, which brings the recommended results more in line with the user’s taste.

(4) Fewer Nodes in the Largest Community. The largest community contains too many nodes to reflect the user’s taste in detail. It should therefore contain as few nodes as possible.

Taking into account the previous points, and also considering these different networks with different number of nodes, the score of the similarity network evaluation (SNE) is described by where SS is the score function of similarity network, is the threshold, is the number of all nodes in the network, is the number of nonisolated nodes in the network, is the number of communities, and is the number of nodes in the largest community.
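The four quantities that enter the score can be counted as sketched below. The score formula itself is not legible in this text, so the sketch stops at the inputs and does not compute SS. It assumes that a community is a connected component with at least two nodes, which is a simplification chosen for illustration; the function name `network_statistics` and the toy network are hypothetical.

```python
def network_statistics(nodes, edges):
    """Count the quantities used by the SNE score for one similarity network.

    nodes: iterable of all node identifiers.
    edges: iterable of (node_a, node_b) pairs between those nodes.
    """
    nodes = list(nodes)
    parent = {node: node for node in nodes}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)

    component_sizes = {}
    for node in nodes:
        root = find(node)
        component_sizes[root] = component_sizes.get(root, 0) + 1

    # Components of size one are isolated nodes, not communities.
    communities = [size for size in component_sizes.values() if size > 1]
    return {
        "nodes": len(nodes),
        "nonisolated": sum(communities),
        "communities": len(communities),
        "largest_community": max(communities, default=0),
    }


# Hypothetical network: two communities and one isolated node.
stats = network_statistics(
    ["a", "b", "c", "d", "e", "f"],
    [("a", "b"), ("b", "c"), ("d", "e")],
)
```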

3.3. Steps of SNE

The recall/precision rate of an algorithm has nine calculation steps (see Figure 2).

The SNE method only needs to calculate similarity matrices; thus one can directly evaluate the algorithms after constructing similarity networks. It is a simple method, which only has one intermediate step (see Figure 3).

4. Results and Discussions

The dataset is “hetrec2011-movielens-2k” from HetRec 2011 [17], which is an extension of the MovieLens 10M dataset, with 2113 users, 10197 movies, 13222 tags, and 855598 ratings. In this paper, we focus on the users, resources, and tags and ignore the ratings.

4.1. Visual Analysis of the SNE

Visual analysis of the similarity networks offers an intuitive, if rough, way to evaluate the various recommendation algorithms. We use Pajek [18] to show the six PCN networks. For clarity, we show only the upper-left 1/16 of the original screen (see Figure 4).

In the SNE evaluation, an algorithm with more small communities is better than one with more large communities. In Figure 4, we can see the following:
(1) The UST’s big communities are more obvious than the USR’s; therefore, UBR is better than UBT, denoted by UBR > UBT.
(2) The RSU has more big communities than the RST; thus RBT > RBU, indicating RBT is better than RBU.
(3) There are very big communities in TSU. Conversely, TSR has fewer ones. Therefore, TBR > TBU, meaning TBR is better than TBU.
(4) Obviously, the TSU and RSU have the largest communities; thus the algorithms TBU and RBU are relatively poor.
(5) In the same way, the better algorithms are UBR and TBR, and the medium ones are UBT and RBT.

4.2. Quantitative Analysis of SNE

Based on the score function of the SNE, the scores of the different algorithms under the threshold values of 1, 0.8, and 0.6 are displayed in Tables 1–3, respectively.

According to the similarity scores in Tables 1 and 2, the algorithms rank from best to worst as follows: TBR > UBR > UBT > RBT > RBU > TBU. According to Table 3, the ranking is TBR > UBR > UBT > RBT > TBU > RBU. Overall, RBU and TBU are the worst algorithms.

4.3. Comparison with Recall/Precision

The recall/precision rates evaluate the accuracy of the algorithms. Based on the training set, we recommend the top-N resources to users by using the six algorithms. Comparing these recommendations with the test set gives the recall/precision rates for different values of N (see Tables 4 and 5).
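The final comparison against the test set can be sketched as below. This uses the common definitions of recall (hits divided by the number of test-set resources) and precision (hits divided by the number of recommended resources), pooled over all users; the paper's own nine-step procedure in Figure 2 is not reproduced here, and the function name `recall_precision` and the sample data are hypothetical.

```python
def recall_precision(recommendations, test_set):
    """Pooled recall and precision of top-N recommendation lists.

    recommendations: dict user -> list of recommended resources.
    test_set:        dict user -> set of resources held out for that user.
    """
    hits = recommended_total = relevant_total = 0
    for user, relevant in test_set.items():
        recommended = recommendations.get(user, [])
        hits += len(set(recommended) & relevant)
        recommended_total += len(recommended)
        relevant_total += len(relevant)

    recall = hits / relevant_total if relevant_total else 0.0
    precision = hits / recommended_total if recommended_total else 0.0
    return recall, precision


# Hypothetical top-2 recommendations and held-out test data.
recommendations = {"u1": ["m1", "m2"], "u2": ["m3", "m4"]}
test_set = {"u1": {"m1", "m5"}, "u2": {"m4"}}
recall, precision = recall_precision(recommendations, test_set)
```

Increasing N typically raises recall and lowers precision, which is why both rates are reported for several list lengths.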

The recall and precision rates of the different algorithms for different values of N are shown in Figure 5.

The comparative analysis of two evaluation indicators is given in Table 6.

The comparison shows that the results of the two evaluation methods agree closely: both give UBR > UBT, RBT > RBU, and TBR > TBU. The similarity network visualization method can roughly evaluate algorithms. In terms of complexity and visualization, the similarity network assessment method has the advantage.

5. Conclusion and Future Work

In this paper, collaborative filtering algorithms based on social networks are proposed, together with a new assessment method, the similarity network evaluation. Among the six algorithms, TBR and UBR are the best, RBU and TBU are the worst, and UBT and RBT fall in between. From the recommendation results, we can conclude that UBR > UBT, RBT > RBU, and TBR > TBU. Note that, in the actual use of algorithms, accuracy is not the only factor to be considered; the complexity and the maintenance cost of the algorithm have to be taken into account as well.

Future work is encouraged in three directions: recommendation algorithms based on the current similarity calculation method are of interest; hybrid algorithms that combine two recommendation algorithms are of significance; and the SNE assessment could be extended to evaluate algorithms other than collaborative recommendation algorithms.