Mathematical Problems in Engineering

Volume 2017 (2017), Article ID 1383891, 12 pages

https://doi.org/10.1155/2017/1383891

## SHMF: Interest Prediction Model with Social Hub Matrix Factorization

^{1}Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031, China

^{2}University of Chinese Academy of Sciences, Beijing 100049, China

^{3}Institute of Applied Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230088, China

^{4}University of Science and Technology of China, Hefei, Anhui 230031, China

Correspondence should be addressed to Shu Yan

Received 24 January 2017; Accepted 5 June 2017; Published 22 August 2017

Academic Editor: Zonghua Zhang

Copyright © 2017 Chaoyuan Cui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

With the development of social networks, microblogs have become a major social communication tool. They contain a wealth of valuable information on personal preferences, public opinion, and marketing, so research on user interest prediction in microblogs has practical significance. However, extracting information about a user’s interest orientation from constantly updated blog posts is not easy. Existing prediction approaches based on probabilistic factor analysis use the blog posts published by a user to predict that user’s interests, but these methods are not very effective for users who post little but browse a lot. In this paper, we propose a new prediction model, called SHMF, based on social hub matrix factorization. SHMF constructs the interest prediction model by combining the blog posts published by the user with those published by the direct neighbors in the user’s social hub. The model predicts user interest by integrating the user’s historical behavior, a temporal factor, and the user’s friendships, with the aim of forecasting the user’s future interests accurately. Experimental results on Sina Weibo show the efficiency and effectiveness of the proposed model.

#### 1. Introduction

Online microblog systems such as Sina Weibo, Twitter, and Facebook provide a convenient platform for users to share information. The number of social media users has grown exponentially over the last decade; a recent snapshot of the Facebook friendship network indicated more than 1 billion users. These social networks are becoming not only an effective means for users to connect with friends but also powerful information dissemination and marketing platforms for spreading ideas, fads, and political opinions.

Microblogs contain a vast amount of information, and the topics that users and user groups discuss change over time and with current events at home and abroad. In this context, research on user interest prediction is useful in network marketing, public opinion analysis, and even public security [1]. Generally, interest prediction aims to generate the topics a user is likely to engage with at the next time point, based on his or her historical blog posts. Unfortunately, blog posts are almost always short texts, so both the user-keyword matrix and the user-topic matrix of a microblog are very sparse. Moreover, the contents of these matrices change with many factors, such as time and the friendships in a user’s social hub. Therefore, interest prediction remains a challenging problem.

It should be noted that user interest prediction differs from user interest detection, as the latter mainly focuses on mining users’ current interests. Interest prediction remains a relatively understudied problem that poses two main challenges. First, user interest in microblogs changes over time. In a time-aware prediction model, the user’s temporal preference is an important aspect, and long-term and short-term preferences lead to different prediction results. Second, user interest is dynamic and may shift as the topics in the user’s social hub shift. In practice, capturing a user’s friendships and the topics of those friends is difficult.

Recently, many prediction models have been investigated [2–4]. A typical method exploits the probabilistic matrix factorization (PMF) technique to learn latent features for users and topics. Such algorithms mostly rely on the blog posts published by a user to predict that user’s interests.

We observed two interesting phenomena. First, some users publish few blog posts but browse many; we call them silent type users. Such users may have very explicit interests but be cautious about expressing their ideas, and they do publish their opinions at an appropriate moment. However, existing prediction models often fail to predict their interests. Second, some users expand their social hubs by following new friends whose topics interest them; we call them interactive type users. In other words, the interests of such users can be represented, to some extent, by the interests of the direct neighbors in their social hubs. Prediction models that ignore this interactive property therefore produce incomplete forecasts.

To overcome the shortcomings of existing work, and building on these observations about microblogs, this paper proposes a social hub matrix factorization-based model for user interest prediction in microblogs, called SHMF. SHMF incorporates the impact of the user’s social hub on the user’s interests to improve the quality of prediction. Experimental results on a Sina Weibo dataset show that our approach improves prediction accuracy and efficiency.

The rest of this paper is organized as follows. The related work is discussed in Section 2. Some preliminary knowledge and research are introduced in Section 3. We present our proposed model in Section 4 and give the implementation details in Section 5. In Section 6, we describe the real datasets we used in our experiments. Our experiments are reported in Section 7. Finally, we conclude the paper and present some directions for future work in Section 8.

#### 2. Related Work

For user interest prediction in microblogs, there is a series of mature methods based on probabilistic matrix factorization, which is built on the probabilistic graphical model. A probabilistic graphical model can concisely express complex probability distributions, efficiently compute marginal and conditional distributions, and conveniently learn the parameters and hyperparameters of a probability model [5]. Probabilistic matrix factorization based on this model is often used for user interest prediction and recommendation.

In 2008, Salakhutdinov and Mnih [2] proposed the probabilistic matrix factorization (PMF) method to address large sparse datasets and cold start, which traditional collaborative filtering algorithms cannot handle. Experiments on the Netflix dataset demonstrated the effectiveness of PMF on large, sparse, and imbalanced data. In the same year, Ma et al. [3] applied PMF to social networks and social recommendation and analyzed the complexity and prediction accuracy of the method in detail. In 2010, drawing on the characteristics of social networks, Jamali and Ester [4] proposed a social probabilistic matrix factorization (SocialMF) model that takes into account the social trust relationships between users, which broadened the application of PMF in social recommendation. Sun et al. [6] later proposed a method to model users’ temporal behavior and combined it with SocialMF to predict Weibo users’ interests; their experimental results indicate that this way of modeling is more effective than traditional recommendation algorithms based on label information. Taking into account the fact that user interest changes over time, Bao et al. [7] introduced a temporal and social PMF-based (TS-PMF) method to predict users’ interests in microblogs and reported higher accuracy than previous interest prediction methods.

The above studies neglect the impact that blogs posted by others in a user’s social hub have on the user’s future interests and behavior when they establish the Weibo user interest prediction model. To address this problem, we propose a new PMF-based user interest prediction model (SHMF) that combines the user’s historical behavior, the user’s social trust relationships, and the impact of the information in the user’s social hub on the user’s future interests. We design experiments on a real Sina microblog dataset to show that this prediction model and its algorithm outperform previous prediction models in top-$k$ accuracy [8].

#### 3. Preliminaries

In this section, we give the notation used in the following discussion. In the prediction model, we have a set of $N$ users $\{u_1, \ldots, u_N\}$ and a set of $M$ topics $\{t_1, \ldots, t_M\}$ in a microblog dataset.

The users’ interests are expressed by a user-topic matrix $R = [R_{ij}]_{N \times M}$, where $R_{ij} = 1$ if user $u_i$ has published posts on topic $t_j$. We divide users’ historical data into $K$ time points and construct a set of user-topic matrices $\{R^{1}, \ldots, R^{K}\}$ to represent users’ interests over time. Furthermore, to account for the impact of a user’s social hub on his or her interests, we construct a set of social hub-topic matrices $\{G^{1}, \ldots, G^{K}\}$ from the blogs posted by the friends in the user’s social hub.
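The construction of the per-time-point matrices can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the input format (triples of user, topic, and time index), the function names, and the names `R` and `G` are hypothetical, and the rule used for the hub-topic entries (1 if any followed user posted on the topic at that time point) is an assumption, since the text does not state how those entries are computed.

```python
def build_user_topic_matrices(posts, users, topics, num_time_points):
    """Return one binary user-topic matrix per time point.

    posts: iterable of (user, topic, time_index) triples (hypothetical format).
    """
    user_index = {u: i for i, u in enumerate(users)}
    topic_index = {t: j for j, t in enumerate(topics)}
    matrices = [[[0] * len(topics) for _ in users]
                for _ in range(num_time_points)]
    for user, topic, k in posts:
        # Entry is 1 if the user published on the topic at time point k.
        matrices[k][user_index[user]][topic_index[topic]] = 1
    return matrices


def build_hub_topic_matrices(user_topic_matrices, follows, users):
    """Return one binary hub-topic matrix per time point.

    follows: dict mapping a user to the users he or she follows (the hub).
    Assumption: an entry is 1 if any user in the hub posted on the topic.
    """
    user_index = {u: i for i, u in enumerate(users)}
    hub_matrices = []
    for R_k in user_topic_matrices:
        num_topics = len(R_k[0]) if R_k else 0
        G_k = []
        for u in users:
            friend_rows = [R_k[user_index[f]] for f in follows.get(u, ())]
            G_k.append([1 if any(row[j] for row in friend_rows) else 0
                        for j in range(num_topics)])
        hub_matrices.append(G_k)
    return hub_matrices
```

A silent user with no posts of his or her own still receives a nonzero row in the hub-topic matrix whenever the followed users are active, which is the information the model draws on for such users.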

In a microblog, each user can follow others in whom he or she is interested; users’ friendships can then be described as a user-user matrix $S = [S_{ij}]_{N \times N}$, where $S_{ij} = 1$ denotes that $u_i$ has followed $u_j$. Each user mainly reads the blogs posted by the friends in his or her social hub. Obviously, there are interactions among different users’ social hubs. Users’ social hubs can be described as a hub-hub matrix $H = [H_{ij}]_{N \times N}$. We set $H_{ij} = n_{ij} / n_i$ if the number of users in the intersection of hub $h_i$ and hub $h_j$ is $n_{ij}$ and the number of users in hub $h_i$ is $n_i$. Hub $h_i$ is the set of users who are followed by $u_i$, and we have a set of user social hubs $\{h_1, \ldots, h_N\}$.
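A hub-hub matrix of this kind can be sketched as below. The sketch assumes that each entry is the size of the intersection of two hubs divided by the size of the first hub; this ratio is one reading of the definition above rather than a confirmed formula, and the function name and the `follows` dictionary are hypothetical.

```python
def hub_overlap_matrix(follows, users):
    """Return the hub-hub matrix as nested lists of floats.

    follows: dict mapping each user to the users he or she follows.
    Assumed entry: |hub_i & hub_j| / |hub_i|, and 0.0 when hub_i is empty.
    """
    hubs = [set(follows.get(u, ())) for u in users]
    n = len(users)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if not hubs[i]:
            continue  # a user who follows nobody has an empty hub
        for j in range(n):
            H[i][j] = len(hubs[i] & hubs[j]) / len(hubs[i])
    return H
```

Under this assumption the matrix is not symmetric: a small hub that is fully contained in a large one has overlap 1.0 with it, while the large hub has only a fractional overlap with the small one.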

Generally, a user interest prediction model generates a user-interest matrix for the next time segment. The basic matrix factorization (MF) approach finds a low-rank approximation of the original matrix and uses it as the predictive matrix. It has been proven effective for learning the latent characteristics of users and topics and for predicting scores from these latent characteristics. The conditional probability of the known scores is defined as

$$p(R \mid U, V, \sigma_R^2) = \prod_{i=1}^{N} \prod_{j=1}^{M} \left[ \mathcal{N}\!\left(R_{ij} \mid g(U_i^{T} V_j), \sigma_R^2\right) \right]^{I_{ij}^{R}}. \qquad (1)$$

As shown in (1), $U \in \mathbb{R}^{D \times N}$ and $V \in \mathbb{R}^{D \times M}$ are the latent user and topic feature matrices, with column vectors $U_i$ and $V_j$ representing the $D$-dimensional user-latent and topic-latent feature vectors, respectively; $\hat{R} = U^{T} V$, where $U^{T}$ is the transpose of $U$. $\mathcal{N}(x \mid \mu, \sigma^2)$ is the Gaussian distribution with mean $\mu$ and variance $\sigma^2$, and $I_{ij}^{R}$ is the indicator function that is equal to 1 if the score $R_{ij}$ is known and is equal to 0 otherwise. The function $g(x)$ is a logistic function with the formula $g(x) = 1/(1 + e^{-x})$, which makes it possible to bound $U_i^{T} V_j$ within the range $[0, 1]$.
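The likelihood of the known scores can be evaluated numerically as in the sketch below, which follows the standard PMF formulation. The function names and the storage layout (one latent vector per user and per topic, held as Python lists rather than matrix columns) are illustrative choices, not taken from the paper.

```python
import math


def logistic(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(U, V, i, j):
    """Predicted score g(U_i^T V_j) for user i and topic j."""
    return logistic(sum(a * b for a, b in zip(U[i], V[j])))


def pmf_log_likelihood(R, I, U, V, sigma_r):
    """Log of the Gaussian likelihood over entries whose indicator is 1."""
    total = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            if I[i][j]:
                err = r - predict(U, V, i, j)
                total += (-0.5 * math.log(2.0 * math.pi * sigma_r ** 2)
                          - err ** 2 / (2.0 * sigma_r ** 2))
    return total
```

Working with the logarithm turns the double product into a sum and avoids numerical underflow when the matrix has many known entries.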

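In fact, the relations among users in a social network play an important role in users’ behaviors [9, 10]. Specifically, a user tends to become increasingly similar to his or her friends. The SocialMF model incorporates social influence into the MF approach for prediction by adding the user-user relationship matrix $S$:

$$\begin{aligned} p(U, V \mid R, S, \sigma_R^2, \sigma_S^2, \sigma_U^2, \sigma_V^2) \propto{} & \prod_{i=1}^{N} \prod_{j=1}^{M} \left[ \mathcal{N}\!\left(R_{ij} \mid g(U_i^{T} V_j), \sigma_R^2\right) \right]^{I_{ij}^{R}} \\ & \times \prod_{i=1}^{N} \mathcal{N}\!\left(U_i \,\Big|\, \sum_{v=1}^{N} S_{iv} U_v, \sigma_S^2 \mathbf{I}\right) \\ & \times \prod_{i=1}^{N} \mathcal{N}\!\left(U_i \mid 0, \sigma_U^2 \mathbf{I}\right) \times \prod_{j=1}^{M} \mathcal{N}\!\left(V_j \mid 0, \sigma_V^2 \mathbf{I}\right). \end{aligned} \qquad (2)$$

Figure 1 shows the graphical model corresponding to (2). In Figure 1, the edges among the latent feature vectors of users represent the trust relationships among users, and the degree of trust of user $u_i$ in user $u_j$ is $S_{ij}$.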
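Maximizing a posterior of the SocialMF form is usually carried out by minimizing a regularized squared-error objective, which the sketch below evaluates. It follows the general SocialMF formulation of [4] rather than any implementation described in this paper; the function names, the regularization weights `lambda_u`, `lambda_v`, and `lambda_s` (standing in for ratios of the variances), and the assumption that each row of the trust matrix is already normalized are all illustrative.

```python
import math


def logistic(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def socialmf_objective(R, I, S, U, V, lambda_u, lambda_v, lambda_s):
    """Regularized squared error with a social term on user vectors.

    S[i][v] is the (assumed row-normalized) trust of user i in user v.
    U and V hold one latent vector per user and per topic.
    """
    # Squared prediction error over the known entries only.
    fit = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            if I[i][j]:
                fit += (r - logistic(dot(U[i], V[j]))) ** 2
    # Zero-mean Gaussian priors become squared-norm penalties.
    norm_u = sum(dot(u, u) for u in U)
    norm_v = sum(dot(v, v) for v in V)
    # Social term: each user vector should stay close to the
    # trust-weighted combination of the vectors of trusted users.
    social = 0.0
    dim = len(U[0])
    for i, u_i in enumerate(U):
        weighted = [sum(S[i][v] * U[v][d] for v in range(len(U)))
                    for d in range(dim)]
        diff = [a - b for a, b in zip(u_i, weighted)]
        social += dot(diff, diff)
    return (0.5 * fit + 0.5 * lambda_u * norm_u
            + 0.5 * lambda_v * norm_v + 0.5 * lambda_s * social)
```

The social term is what distinguishes this objective from plain PMF: with `lambda_s` set to 0 it reduces to the usual regularized matrix factorization loss.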