Abstract
The development of recommendation systems has been accompanied by research on the problems of data sparsity, cold start, scalability, and privacy protection. Although many papers have proposed improved recommendation algorithms to solve these problems, there is still plenty of room for improvement. In a complex social network, we can take full advantage of dynamic information such as a user’s hobbies, social relationships, and historical logs to improve the performance of a recommendation system. In this paper, we propose a new recommendation algorithm based on social users’ dynamic information that addresses the cold start problem of the traditional collaborative filtering algorithm while also accounting for dynamic factors. The algorithm takes the user’s response information, dynamic interest, and the classic similarity measurements of collaborative filtering into account. We compare the proposed algorithm with the traditional user-based collaborative filtering algorithm and present findings from the experiments. The experimental results demonstrate that the proposed algorithm achieves better recommendation performance than the collaborative filtering algorithm in the cold start scenario.
1. Introduction
Social network sites such as Facebook, Twitter, and Sina Weibo have become an indispensable part of Internet users’ online lives. They are also an important channel for sharing and obtaining information. However, as the number of social network users grows explosively, the amount of information those users generate grows with it. When a user’s ability to process information cannot keep up with the speed of this growth, the user faces the problem of information overload [1], which increases the cost of obtaining useful information. Recommendation systems [2, 3] are a technology that can effectively alleviate the information overload problem and provide users with personalized service.
Among traditional personalized recommendation algorithms, collaborative filtering [4, 5] is undoubtedly the most successful. It is based on the similarity of users’ preferences for certain items: if two users have similar interests in some items, they are likely to share an interest in other items as well. One defect of collaborative filtering is that it does not reflect the fact that a user’s preferences are not fixed but change dynamically over time [6–8]. It also does not take the user’s contextual factors into consideration. In these respects, the traditional collaborative filtering algorithm falls short of algorithms that model such factors. Recently published papers show that improved algorithms achieve better recommendation performance by considering users’ dynamic context factors in social network scenarios [9, 10].
In this paper, we propose a recommendation algorithm based on users’ dynamic information in a complex social network to address the above problems. By using the user’s response information and the time factor to reflect the user’s dynamic preferences, we obtain an improved recommendation algorithm combined with a new similarity measurement. The experiments show that the proposed algorithm has better recommendation performance than the traditional collaborative filtering recommendation algorithm.
2. Related Work
The collaborative filtering recommendation algorithm is one of the most successful recommendation algorithms. Its core idea can be divided into three steps: first, calculate the similarity between users from their historical interest information; then, select the nearest neighbors according to that similarity and use them to predict the target user’s preference for particular items; finally, select the items whose predicted scores are sufficiently high and recommend them to the user.
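The second of these steps, neighbor selection, can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper; the function name `nearest_neighbors` and the dictionary layout are assumptions made for the example.

```python
import heapq


def nearest_neighbors(similarities, k):
    """Return the k (user, similarity) pairs with the highest similarity.

    `similarities` maps each candidate user id to its similarity with the
    target user (hypothetical layout chosen for this sketch).
    """
    return heapq.nlargest(k, similarities.items(), key=lambda pair: pair[1])


# Example: pick the 2 most similar users out of 3 candidates.
neighbors = nearest_neighbors({"a": 0.1, "b": 0.9, "c": 0.5}, k=2)
```

Using a heap-based selection avoids sorting the full candidate list when only a small neighborhood is needed.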
2.1. The Traditional Similarity Measurement Method
The key point of collaborative filtering recommendation is the measurement of similarity between different users. The widely adopted approach calculates similarity from users’ common historical rating data. Each similarity calculation method has its own advantages and disadvantages, but the most common methods [2, 5, 11–14] are the Pearson similarity [2, 11] and the Cosine similarity [5, 10].
Pearson similarity method: in Pearson similarity calculation, the similarity between two users is obtained from their preference ratings on commonly rated items. We give a formalized representation here. Let the set of items rated by both user $u$ and user $v$ be $I_{uv}$, and let $r_{u,i}$ and $r_{v,i}$ represent the users’ historical rating scores of item $i$. Furthermore, in order to eliminate the influence of differing rating scales between users, we subtract each user’s average score when calculating user similarity; let $\bar{r}_u$ and $\bar{r}_v$ represent the average rating scores of user $u$ and user $v$, respectively, each computed as the mean of that user’s rating scores. So the Pearson similarity between user $u$ and user $v$ can be defined as
$$\mathrm{sim}(u,v)=\frac{\sum_{i\in I_{uv}}\left(r_{u,i}-\bar{r}_u\right)\left(r_{v,i}-\bar{r}_v\right)}{\sqrt{\sum_{i\in I_{uv}}\left(r_{u,i}-\bar{r}_u\right)^{2}}\,\sqrt{\sum_{i\in I_{uv}}\left(r_{v,i}-\bar{r}_v\right)^{2}}}\qquad(1)$$
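As an illustration, the Pearson computation over co-rated items can be sketched in Python. The sketch assumes ratings are stored as dictionaries from item id to score and that each user’s average is taken over all of that user’s ratings; both are assumptions of this example, not details confirmed by the text.

```python
from math import sqrt


def pearson_similarity(ratings_u, ratings_v):
    """Pearson similarity of two users over their co-rated items.

    Each argument maps item id -> rating score (assumed layout).
    Returns 0.0 when there is no overlap or no rating variance.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    # Assumption: each mean is taken over all ratings of that user.
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    numerator = sum(
        (ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common
    )
    norm_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    norm_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return numerator / (norm_u * norm_v)
```

Subtracting the per-user mean is what makes the measure insensitive to users who rate systematically high or low.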
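Cosine similarity method: in Cosine similarity calculation, we treat a user’s historical rating scores as an $m$-dimensional vector. The similarity between two users $u$ and $v$ is defined as the cosine of the angle between their two vectors. Let the historical rating vectors of user $u$ and user $v$ be $\vec{r}_u$ and $\vec{r}_v$, defined over the rated item set $I$ with $|I| = m$. So the cosine similarity of the users is given by
$$\mathrm{sim}(u,v)=\cos\left(\vec{r}_u,\vec{r}_v\right)=\frac{\vec{r}_u\cdot\vec{r}_v}{\left\|\vec{r}_u\right\|\,\left\|\vec{r}_v\right\|}=\frac{\sum_{i\in I}r_{u,i}\,r_{v,i}}{\sqrt{\sum_{i\in I}r_{u,i}^{2}}\,\sqrt{\sum_{i\in I}r_{v,i}^{2}}}\qquad(2)$$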
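A minimal sketch of the cosine computation follows, assuming both rating vectors are given as equal-length Python sequences with 0 for unrated items (an assumption of this example).

```python
from math import sqrt


def cosine_similarity(vec_u, vec_v):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(vec_u) != len(vec_v):
        raise ValueError("rating vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(vec_u, vec_v))
    norm_u = sqrt(sum(a * a for a in vec_u))
    norm_v = sqrt(sum(b * b for b in vec_v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no ratings is treated as dissimilar
    return dot / (norm_u * norm_v)
```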
Once we obtain the set of nearest neighbor users [15] based on the similarity measures, we name this set $N_u$. Then the predicted rating score $p_{u,i}$ of item $i$ for target user $u$ is
$$p_{u,i}=\bar{r}_u+\frac{\sum_{v\in N_u}\mathrm{sim}(u,v)\left(r_{v,i}-\bar{r}_v\right)}{\sum_{v\in N_u}\left|\mathrm{sim}(u,v)\right|}\qquad(3)$$
where $\mathrm{sim}(u,v)$ is the similarity between target user $u$ and one of the nearest neighbor users $v$, $r_{v,i}$ is the interest rating score of user $v$ for item $i$, and $\bar{r}_u$ and $\bar{r}_v$, respectively, represent the average interest rating scores of users $u$ and $v$ over the item set. $N_u$ is the nearest neighbor set of the target user. The recommendation system can thus predict the target user’s likely degree of interest in items he or she has not yet seen and then select several items with high predicted interest as the recommendation result for the target user.
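The neighbor-based prediction can be sketched as below. The sketch assumes the common mean-centered weighted average normalized by the sum of absolute similarities; the tuple layout for neighbors and the function name are illustrative assumptions.

```python
def predict_rating(target_mean, neighbors, item):
    """Predict the target user's rating of `item` from nearest neighbors.

    `neighbors` is a list of (similarity, ratings, mean_rating) tuples,
    where `ratings` maps item id -> score (assumed layout).
    Falls back to the target user's own mean when no neighbor rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for similarity, ratings, mean_rating in neighbors:
        if item in ratings:
            weighted_sum += similarity * (ratings[item] - mean_rating)
            weight_total += abs(similarity)
    if weight_total == 0:
        return target_mean
    return target_mean + weighted_sum / weight_total
```

For example, with a target mean of 3.0, one neighbor of similarity 1.0 who rated the item one point above their own mean, and another of similarity 0.5 who rated it one point below, the prediction is 3.0 + 0.5 / 1.5.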
2.2. The Traditional Dynamic User Interest Model
The traditional collaborative filtering recommendation algorithm does not take the social user’s context or the user’s dynamic interest pattern into account. Previously proposed methods define a time weight function to represent the user’s dynamic interest pattern and then combine this function with the recommendation algorithm [6, 13, 16]. One simple approach is to assume that the user’s interest is a monotonically decreasing function of time [16] and combine it with the user interest prediction function; another is to divide the user’s dynamic interest more finely into different time segments and construct a corresponding time weight function [6]. All of these methods improved the results of the recommendation algorithm.
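To make the idea of a monotonically decreasing time weight concrete, the following sketch uses exponential decay. The exponential form, the parameter name `decay_rate`, and the helper `time_weighted_mean` are assumptions chosen for illustration; the cited methods may use different functions.

```python
from math import exp


def time_weight(delta_t, decay_rate=1.0):
    """Weight in (0, 1] that shrinks as the elapsed time delta_t grows."""
    if delta_t < 0:
        raise ValueError("elapsed time must be non-negative")
    return exp(-decay_rate * delta_t)


def time_weighted_mean(ratings, now, decay_rate=1.0):
    """Average of (score, timestamp) pairs, favoring recent ratings."""
    weights = [time_weight(now - ts, decay_rate) for _, ts in ratings]
    total = sum(weights)
    if total == 0:
        raise ValueError("no ratings with usable weight")
    return sum(w * score for w, (score, _) in zip(weights, ratings)) / total
```

With such a weight, a rating given long ago contributes almost nothing, so the weighted mean tracks the user’s recent interest.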
3. The New Method
3.1. The Similarity Based on Social User’s Dynamic Information
In a social network, the most common way to represent the relationships of users is a graph $G = (V, E)$. $V$ represents the set of vertices corresponding to users or items, and $E$ is the set of edges connecting different vertices, which denote either friendships between users or the historical social behavior relating a user to an item. In this paper, an edge in $E$ specifically denotes a user’s interest response to an item. We take into account the different types of user response information and the time factor to propose a new recommendation algorithm.
In social network, we can regard the behavior of information forwarding, collection, and other actions as the positive response type. So when the user is at timestamp , the number of positive responses can be defined as where represents the relationship graph of user ; represents the set of social users , where ; is the set of items and ; the means the number of items and ; is the timestamp when user gives the response information to item .
Similarly, if user did not show any interest in the item or even take actions such as shielding, cancelling the attention, and so on we can regard those actions as the negative response information. So when user is at timestamp , the amount of negative response information can be defined as
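Counting a user’s positive and negative responses up to a given timestamp can be sketched as follows. The log layout and the action names (`forward`, `collect`, `comment`, `shield`, `unfollow`) are hypothetical labels introduced for this example, and treating unlisted actions as neutral is likewise an assumption.

```python
from collections import Counter

# Hypothetical action labels; the real log vocabulary is not specified here.
POSITIVE_ACTIONS = {"forward", "collect", "comment"}
NEGATIVE_ACTIONS = {"shield", "unfollow"}


def count_responses(log, user, t):
    """Return (positive, negative) response counts of `user` up to time t.

    `log` is an iterable of (user, item, action, timestamp) tuples.
    """
    counts = Counter()
    for log_user, _item, action, timestamp in log:
        if log_user != user or timestamp > t:
            continue
        if action in POSITIVE_ACTIONS:
            counts["positive"] += 1
        elif action in NEGATIVE_ACTIONS:
            counts["negative"] += 1
    return counts["positive"], counts["negative"]
```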
Based on the above definitions of response types, and considering both the effect of time on user interest preferences and the failure of the traditional collaborative filtering algorithm to capture it, this paper puts forward a new type of user similarity measurement.
We define the user similarity in terms of the users’ response information. The benefit of considering only the response information, without paying much attention to its content, is that it helps ensure the diversity of the recommendation results compared with the traditional collaborative filtering recommendation algorithm [6]. Because different users respond to the same item at different times, their degrees of interest in that item also differ; weighting each response by its time therefore reflects the dynamic nature of the user’s interest pattern more accurately. The user similarity measurement based on users’ positive response can be defined as where and , respectively, represent the set of items which users and gave their positive response to and is the common set of items the users and responded to. The value of is the common item set size. The is used to reflect the dynamic interest of user and detailed information of it will be discussed later.
Similarly, we can get the user similarity measurement based on user negative response from this formula:The meanings of sets are similar with the situation of positive response; is the size of the common item set which users and responded to with their negative feedback or ignored.
To take the dynamic interest and response information of social user into consideration, we design a decreasing time function to model the user’s dynamic interest feature in social network sites and then combine it with the new proposed similarity measurement. The function shows in the part of formulas (6) and (7). The definition of this time weight function is given bywhere means the interval time of two users that give the same type response information. We can tune the parameter to determine the decay rate of user’s dynamic interest. The default value of is 1.8, but you can also change the value to make the recommendation system have the best performance according to your application environment. The represents the number of common items users gave their positive response to, which equals the number of the common sets .
Then, we will use the regulatory factor to combine with these two types of different similarity measurements, so the combination similarity calculation method can be defined as where the value of regulatory factor and and , respectively, represent the weight of positive response and negative response in the equation. We can adopt the optimal weight value to optimize the recommendation result.
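A weighted combination of the two response-based similarities can be sketched as below. The sketch assumes the two weights are a regulatory factor `lam` in [0, 1] and its complement `1 - lam`; this convex form and the parameter name are assumptions of the example rather than the exact definition given in this paper.

```python
def combined_similarity(sim_positive, sim_negative, lam=0.5):
    """Blend positive- and negative-response similarities.

    `lam` weights the positive-response term and (1 - lam) the negative one
    (assumed convex combination).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sim_positive + (1.0 - lam) * sim_negative
```

In this sketch, `lam=1.0` uses only the positive-response similarity, and intermediate values can be tuned on held-out data.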
Finally, we combine the similarity measurement based on social user response with the similarity calculation of the traditional collaborative filtering recommendation algorithm. On the one hand, in the early stage after a new user registers, the traditional recommendation system cannot give good recommendations because no historical data are available for that user. Once the recommendation system has gathered enough historical preference data, its performance becomes much better, but collecting useful user data takes time.
By combination with the user’s response information, we can alleviate the cold start problem of recommendation system. It collects the response information more quickly by just some user clicks and gives the preliminary recommendation to user. On the other hand, it can also guarantee a certain diversity of the recommendation result by only focusing on the amount of response information not the content. The formalized user Pearson similarity method can be defined asAnd the Cosine similarity method is
3.2. The Prediction Score Based on Social User’s Dynamic Information
After calculating the user similarity, the nearest neighbors with the highest similarity to the target user are selected as the base set $N_u$, in order to ensure the quality of the recommendations. Let the target user $u$’s rated item set be $I_u$; the prediction score $p_{u,i}$ that user $u$ may give to any item $i \notin I_u$ can be obtained from the nearest neighbors’ historical rating scores of item $i$. The calculation method is shown below:
$$p_{u,i}=\bar{r}_u+\frac{\sum_{v\in N_u}\mathrm{sim}(u,v)\left(r_{v,i}-\bar{r}_v\right)}{\sum_{v\in N_u}\left|\mathrm{sim}(u,v)\right|}\qquad(12)$$
where $\mathrm{sim}(u,v)$ is the similarity between the target user $u$ and the nearest neighbor user $v$; $r_{v,i}$ represents the historical rating score of user $v$ for item $i$; and $\bar{r}_u$ and $\bar{r}_v$, respectively, denote the average rating scores of users $u$ and $v$. We can thus obtain the target user’s predicted rating of each item and then generate the recommendation list for the target user according to a chosen selection strategy.
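One simple strategy for turning predicted scores into a recommendation list is to keep the highest-scoring items the user has not yet rated. The sketch below illustrates that strategy only; the function name, the dictionary layout, and the tie-breaking by item id are assumptions of the example.

```python
def recommend_top_n(predicted_scores, already_rated, n=5):
    """Return up to n unrated item ids, highest predicted score first.

    `predicted_scores` maps item id -> predicted rating (assumed layout).
    Ties are broken by item id so the output is deterministic.
    """
    candidates = {
        item: score
        for item, score in predicted_scores.items()
        if item not in already_rated
    }
    ranked = sorted(candidates, key=lambda item: (-candidates[item], item))
    return ranked[:n]
```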
3.3. The Time Complexity Analysis of Algorithm
User similarity computation makes up a large part of the recommendation algorithm. Performance becomes especially important when the algorithm must process large volumes of historical data. Here, we analyze the time complexity of the proposed recommendation algorithm. The user similarity computation involves two parts: one is the traditional similarity measure, such as the Pearson similarity or the Cosine similarity, and the other is the similarity measure based on social users’ dynamic response information. The collaborative filtering algorithm itself consists mainly of user similarity calculation and prediction of the target user’s rating scores.
According to Algorithms 1 and 2 process, we can know that the main factors which impact the time complexity of algorithm are the size of social user set and item set , while calculating the similarity. Let the size of user set be and the size of item set . According to the traditional user similarity measurement formulas (1) and (2), the time complexity of computing one pair of users’ similarity is ; then, the time complexity of computing any two different users’ similarity is . Considering the time complexity of traditional similarity calculation, we proposed selecting the nearest neighbor user set and then computing those users’ similarity. Let the time complexity of similarity measurement based on user’s positive response be and the time complexity of similarity measurement based on user’s negative response ; then we can simplify the formula and get the time complexity of the integrated similarity measurement as . Then the time complexity of new proposed user similarity of any two users is . In the part of user prediction score computation, the main factors influencing the time complexity of algorithm are the size of nearest neighbor set and the size of item set .

4. Experimental Evaluation
In order to validate that the proposed recommendation algorithm performs better than the traditional user-based collaborative filtering recommendation algorithm, we collected data from Sina Weibo, a mainstream social network site in China, to carry out the relevant experiments.
Since the data crawled from the original Sina Weibo site contain a lot of redundant information, the raw data must be extracted and transformed, a process commonly known as ETL, to eliminate irrelevant information and obtain exactly the information we need. The data set has about 6040 Sina users, about 3682 pieces of Weibo information, and 100 thousand response information logs. In the Sina Weibo scenario, the user’s negative response information is very difficult to define: the absence of browsing, forwarding, comments, and so forth often indicates that the user has no interest in the information. Therefore, the regulatory factor of the new similarity measurement is set so that only social users’ positive response information is considered. We regard a user collecting, forwarding, or commenting on microblog information as the positive response type.
The data set is divided at random into 10 equal subsets, of which nine are used as the training set and the remaining one is used as the test set. The decay-rate parameter of the decreasing time function, which characterizes the dynamics of user interest, has a great influence on the final recommendation results, so we determine its optimal value before conducting further experiments. We use the mean absolute error (MAE) [6, 11, 17, 18] as the evaluation metric for the performance of the recommendation algorithm. The lower the MAE is, the more accurate the recommendations are. We obtain the MAE of the improved recommendation algorithm with the parameter set to 1.0, 1.5, 1.8, and 2.1. The results are shown in Figure 1.
From Figure 1, we can conclude that the improved algorithm performs best when the decay rate parameter is 1.8. However, the optimal value of this parameter may differ in other settings; it should be tuned according to the specific application environment. One of the main factors influencing the performance of a recommendation algorithm is the nature of the data set.
Once we had determined the decay-rate parameter, we designed a comparative experiment between the new recommendation algorithm based on social users’ dynamic information and the user-based collaborative filtering recommendation algorithm. Collaborative filtering is one of the classic and most successful recommendation algorithms, so it serves as a convincing baseline for comparison. Under the same experimental conditions, the data set is again divided at random into 10 equal subsets, of which nine are used as the training set and the remaining one is used as the test set; the average MAE over 10 experiments is taken as the MAE result of each recommendation algorithm. We vary the number of nearest neighbors over 5, 10, 15, 20, and 25. The experimental results are shown in Figure 2.
Figure 2 shows that the improved algorithm gives better recommendation results than the traditional user-based collaborative filtering algorithm at the same number of nearest neighbors. Furthermore, for each algorithm, the results improve as the number of nearest neighbor users increases; that is, the MAE shows a descending trend. The decline does not continue indefinitely, however, and the MAE eventually levels off at a steady state. The comparative experiment therefore shows that the choice of parameters has a great effect on the results of a recommendation algorithm in a specific personalized recommendation application environment. Further optimization methods may also be needed to obtain the best recommendation performance.
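For reference, the MAE metric is the average absolute difference between predicted and actual ratings. A minimal sketch, with an illustrative function name, is:

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predictions and ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For instance, predictions of 3, 4, and 5 against actual ratings of 2, 4, and 7 give absolute errors of 1, 0, and 2, so the MAE is 1.0.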
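The evaluation protocol described above can be sketched as a 10-fold procedure. The sketch assumes that each of the 10 subsets serves as the test set exactly once, which is one reading of the description; `evaluate` is a hypothetical callback that trains on one split and returns its MAE.

```python
import random


def k_fold_splits(records, k=10, seed=0):
    """Yield (train, test) pairs from k random, near-equal subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def cross_validated_mae(records, evaluate, k=10, seed=0):
    """Average the score returned by `evaluate(train, test)` over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(records, k, seed)]
    return sum(scores) / len(scores)
```

Fixing the seed makes the random partition reproducible, so two algorithms can be compared on identical splits.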
5. Conclusion
In this paper, the theory of the collaborative filtering recommendation algorithm is briefly introduced. Collaborative filtering is a widely used and relatively mature algorithm that gives good recommendation results. However, it is flawed in some respects. For example, the traditional collaborative filtering algorithm does not take into account the temporal characteristics of the user’s interest when computing similarity, so it loses some recommendation accuracy and diversity. Meanwhile, with the rise of social networks, the number of social network users has surged, and users face the problem of information overload on social network sites. Social users nevertheless carry rich contextual information; therefore, this paper takes the dynamic information of social users into account to propose an improved algorithm that alleviates the problem, and a comparative experiment validates that the improved algorithm gives better recommendation results. However, social network users’ dynamic information is not limited to response information and the time factor; it also includes geographical information, social relationship information, and other context information. In our next step of research, we will continue to address the current deficiencies of the proposed algorithm, study how to better model social users’ dynamic information, dig deeper into users’ behavior patterns, and incorporate them into the recommendation algorithm to improve its performance.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by NSFC Grant nos. 61472284, 61472004, and 61202384, by National 863 Programs Grant no. 2012AA062800, by Natural Science Foundation Programs of Shanghai Grant no. 13ZR1443100, and by ISTCP Grant no. 2013FM10100.