A Prediction Method of Mobile User Preference Based on the Influence between Users

Shi, Yancui; Cao, Jianhua; Xiong, Congcong; Zhang, Xiankun

doi:https://doi.org/10.1155/2018/8081409

International Journal of Digital Multimedia Broadcasting

Research Article | Open Access

Volume 2018 | Article ID 8081409 | https://doi.org/10.1155/2018/8081409

Yancui Shi,¹ Jianhua Cao,¹ Congcong Xiong,¹ and Xiankun Zhang¹

Academic Editor: Yifeng He

Received 18 Apr 2018

Accepted 08 Jul 2018

Published 19 Jul 2018

Abstract

User preference will be impacted by other users. To accurately predict mobile user preference, the influence between users is introduced into the prediction model of user preference. First, the mobile social network is constructed according to the interaction behavior of the mobile user, and the influence of the user is calculated according to the topology of the constructed mobile social network and mobile user behavior. Second, the influence between users is calculated according to the user’s influence, the interaction behavior between users, and the similarity of user preferences. When calculating the influence based on the interaction behavior, the context information is considered; the context information and the order of user preferences are considered when calculating the influence based on the similarity of user preferences. The improved collaborative filtering method is then employed to predict mobile user preferences based on the obtained influence between users. Finally, the experiment is executed on the real data set and the integrated data set, and the results show that the proposed method can obtain more accurate mobile user preferences than those of existing methods.

1. Introduction

The popularity of mobile terminals (e.g., smart phones, tablets) and the improvement of wireless networks (e.g., 3G, 4G, 5G) mean that mobile users can access information or services anytime and anywhere [1, 2]. Hence, mobile users want to obtain personalized information from the “information ocean” in a timely and accurate manner. Because of the extensive growth of mobile network services, however, it is time-consuming and frustrating for mobile users to find the information or services that meet their needs. As an information filtering tool, the personalized recommender system can solve this problem well; such systems have also been applied in e-commerce (e.g., Amazon, eBay, Netflix), information retrieval (e.g., Google, Baidu), e-tourism, and Internet advertising [3]. The key to a personalized recommender system is accurately predicting the user preference.

In the recommender system, the common prediction method of the user’s preference is collaborative filtering (CF), which predicts the target user’s preference according to the preferences of the nearest neighbors, that is, the users whose preferences are similar to those of the target user. However, user preference is impacted not only by the nearest neighbors but also by family, friends, colleagues, and other users. Hence, finding the users who are most influential to the target user is vital. Currently, research on influence measurement methods mainly focuses on the micro-blog application [4–6], while research on the mobile social network is relatively scarce.

The mobile network has unique characteristics. The mobile terminal not only provides the tool for users to telecommute and access information but provides the ability to access the contextual information due to the improvement of the hardware. For example, not only can it obtain the explicit contextual information directly, such as time, location, and weather, but it can obtain the implicit contextual information using statistical analysis and reasoning, such as through user interest and social relations [7]. According to the mobile user behavior recorded by the mobile terminal, it can obtain more realistic social relationships and more accurate user influence, thus improving the accuracy of the predicted mobile user preference.

To summarize, the contributions of the paper are as follows: (1) A new calculation method of the influence of the users is proposed, which considers the topology of the mobile social network and the behavior that the mobile user has when using mobile network services. The mobile social network is constructed by analyzing the communication behavior between users. (2) A new calculation method of the influence between users is proposed, which considers the influence of users, the similarity of mobile user preferences, and the interaction behavior between users under context. When calculating the similarity of mobile user preferences, the context information and the order of user preferences are considered. (3) An improved prediction method of mobile user preference is proposed, which considers the obtained influence between users.

2. Related Work

In the mobile network, users can access information or services anytime and anywhere by smart phone through applications such as WeChat, QQ, micro-blog, Youku, and meituan.com. Take meituan.com, for example: it lists a massive number of items, so accurately finding the items that the user is interested in is very important.

In the recommender system, the common prediction method of the user’s preference is CF. To improve the accuracy of the user preferences predicted by CF, researchers have employed matrix factorization [8], introduced the user’s trust [9], or relied on context information [10]. Although matrix factorization can alleviate the sparsity problem, it cannot find accurate nearest neighbors because it does not consider the social information of the users. The methods that introduce context information can locate user preferences more exactly, but they further aggravate the sparsity problem. Hence, the overall accuracy of the predicted user preferences is lower.

In addition, the user’s preference is also impacted by family, friends, classmates, or followed celebrities. For example, the choices of users on meituan.com will be impacted by family or friends who use it, while users on Youku will be impacted by the celebrities they follow. Hence, it is feasible to introduce the influence of users into the prediction method of user preferences. To find accurate nearest neighbors, some researchers introduced the user’s trust into the prediction model [11, 12]. However, in these methods, the authors only considered the interaction through voice calls and SMS (short message service), so the obtained relationships are neither comprehensive nor fully accurate. Building on these methods, the paper considers the influence between users in the prediction model of mobile user preferences.

Currently, the measurement methods of the user influence are divided into three categories: the method based on network topology, the method based on user behavior, and the method based on interaction behavior [13, 14]. The method based on network topology measures the influence according to the nodes (the centrality, closeness centrality, etc.) or the edges (edge betweenness, common neighbors, etc.) [15, 16]. Muruganatham et al. considered that every calculation method of centrality had its advantages and disadvantages, so a variety of calculation methods of centrality were employed to measure user influence [17]. However, the method did not evaluate the obtained influence, and it only gave the user’s rank according to the influence obtained by different methods. Brown et al. employed the improved k-shell decomposition method to measure the influence of a Twitter user by analyzing the network topology [4]. According to information from the user’s Twitter account, Cossu used the traditional user characteristics and the unique characteristics of the Twitter user to measure the user’s influence [5]. These characteristics included the local topology, the whole topology, and other features associated with the network structure. The method employed an F-score and mean average precision in the evaluation. Whether these methods were based on nodes or on edges, they did not consider the information of the user behavior and simply calculated the influence according to the network topology, so the accuracy of the obtained user influence was not ideal.

To improve the accuracy of the obtained user influence, some researchers measured the user influence by analyzing the user behaviors, which included the login behavior and the information generated by users (e.g., comments, forwarding) [18]. For example, by analyzing the user’s behavior related to economics, Bakshy et al. calculated user influence in social advertising according to the social information [19]. Mao et al. considered the network topology and user behavior to measure the influence of the micro-blog user accurately and employed the Spearman rank correlation coefficient as the evaluation [20]. Similarly, Verenich et al. measured user influence according to the number of users that used some product later than the target user and employed the mean decrease accuracy and the area under the cumulative gains score as the evaluation [21]. Rabiger et al. extracted some user features (community structure, activity, the quality of releasing information, the user centrality) and employed the supervised method to learn about the user’s influence [6]. Tommasel et al. proposed a new formula to measure the user’s influence according to the network topology and user behavior information [22]. However, the calculation methods based on network topology and those based on user behavior both consider only the influence that a user has in the social network as a whole, without considering the specific influence between users.

The calculation method of the influence based on the interaction information considers the interaction behavior between users, such as forwarding and commenting. Anger et al. considered the content of communication between users and mutual information in the constructed model of influence [23]. Cataldi et al. proposed a domain-based calculation method of user influence by analyzing the mutual information between users [24]. Guo et al. employed the maximum likelihood estimation to calculate the influence between users by analyzing the history logs of user behavior [25]. Liu et al. proposed the time-based influence graph (TIG) model and algorithm based on the network topology and time dimension according to the characteristics of the mobile data [1]. The algorithm considered the dynamics of the mobile data, the user behavior, and the mutual behavior between users, but it was incomplete in that it did not consider the impact of the context or the similarity of the user’s preferences. In the algorithm, accuracy is employed to measure the obtained influence.

3. The Proposed Method

In the paper, the variables used are defined as follows: U represents the set of mobile users; N_U = |U| represents the number of mobile users, with u_i ∈ U and i = 1,2,…,N_U; W represents the set of interaction ways that the mobile users use; N_W = |W| represents the number of interaction ways, indexed by l = 1,2,…,N_W; S represents the set of the mobile network services; N_S = |S| represents the number of types of mobile network services, with s_k ∈ S and k = 1,2,…,N_S; C represents the set of the context instances; and N_C = |C| represents the number of the context instances, with C_r ∈ C and r = 1,2,…,N_C. The preference of u_i toward s_k under C_r is an integer from 1 to 5. In most current research, the value of influence is restricted to a fixed interval; the same setting is adopted in the paper.

Definition 1 (the influence of the user). The user u_i generates influence toward the other users due to his behavior in the social network. The influence of the user is used to measure influence macroscopically.

Definition 2 (the influence between users). The user u_i generates influence toward u_j due to his behavior, the interaction behavior with u_j, the common preference with u_j, and other factors, where u_i, u_j ∈ U and j ≠ i. The influence between users is used to measure influence microscopically.

In addition, an important problem that needs to be solved is how to integrate the multiple kinds of influence obtained. The method in [22] calculated the influence based on network topology and on user behavior separately and assigned them the same weight when integrating the influence. The method in [23] extracted many user features, calculated the influence of each feature, and assigned the same weight to each obtained influence. In the paper, multiple factors are considered. Because the importance of each factor toward influence differs, the weight of each factor is set to the optimal value according to the experimental results.

3.1. The Influence of the User

The steps of calculating the influence of the user are as follows.

(1) The Quantification of Interaction Behavior between Users. The interaction between users is achieved not only by the basic means of communication (voice, SMS, etc.) but also by the instant message software installed in the mobile terminal (WeChat, QQ, etc.). Different interaction methods have different measurements. For example, SMS is measured by the number of messages, while a voice call is measured by its duration. In addition, the number of interactions also affects the relationships between users. Suppose u₁ communicates with u₂ for 100 minutes in a single communication, while u₁ communicates with u₃ for 100 minutes in total over 10 communications. Obviously, the relationship between u₁ and u₃ is closer than that between u₁ and u₂. Hence, in the paper, the interaction volume is measured by both the duration (or the number of messages) and the number of interactions.

The voice call is continuous, so the number of communications is computed easily. To avoid counting harassing phone calls, a communication is regarded as effective only when its duration is longer than 10 seconds or the number of communications is more than one. When using SMS or instant message software, a communication is not a single message but the whole process of interacting, which consists of many messages. Take SMS, for example: in a communication, the two parties usually exchange multiple messages. In the paper, we adopted the following rules to determine a communication using SMS or instant message software: (1) the messages cannot all come from one party, and (2) the interval between messages is less than 5 minutes. The quantification value of the interaction behavior between u_i and u_j (u_i, u_j ∈ U, j ≠ i) is computed from four quantities for each interaction way in W: the duration for which u_i interacts with u_j using that way, the mean duration for which all users interact using that way, the number of times u_i interacts with u_j using that way, and the mean number of times all users interact using that way. The weights a₁ and a₂ are applied to the duration and the number of interactions, respectively, with a₁ + a₂ = 1. In addition, the quantification value of interaction behavior is symmetrical; that is, the value for (u_i, u_j) equals the value for (u_j, u_i).
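The two rules for delimiting a communication over SMS or instant messaging can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the message layout (timestamp in seconds, sender id), the function name `count_communications`, and the choice to start a new communication as soon as a gap reaches 5 minutes are all assumptions.

```python
from typing import List, Tuple

MAX_GAP_SECONDS = 5 * 60  # rule (2): messages in one communication are < 5 minutes apart


def count_communications(messages: List[Tuple[float, str]]) -> int:
    """Count communications in a two-party message log.

    `messages` holds (timestamp_seconds, sender_id) pairs (hypothetical format).
    A run of messages counts as one communication only if consecutive messages
    are less than 5 minutes apart and both parties sent at least one message.
    """
    count = 0
    senders = set()
    last_time = None
    for timestamp, sender in sorted(messages):
        if last_time is not None and timestamp - last_time >= MAX_GAP_SECONDS:
            # Gap too long: close the current run and apply rule (1).
            if len(senders) >= 2:
                count += 1
            senders = set()
        senders.add(sender)
        last_time = timestamp
    if len(senders) >= 2:  # close the final run
        count += 1
    return count
```

A run in which only one party writes is discarded, which is how rule (1) filters out one-sided messages such as notifications or spam.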

(2) The Construction of the Mobile Social Network. In the paper, the undirected graph G(V,E) is employed to represent the mobile social network; V represents the set of the nodes in the social network, which is the set of mobile users; and E represents the set of edges in the social network. When the quantification value of the interaction behavior between u_i and u_j exceeds a threshold, e(u_i,u_j) = 1, which indicates that a social relationship exists between u_i and u_j; otherwise, e(u_i,u_j) = 0, which indicates that there is no social relationship between u_i and u_j. Here, e(u_i,u_j) ∈ E, and the threshold is used to judge whether there is a social relationship between users.
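A rough sketch of how the interaction quantification and the thresholded edge construction could fit together is given below. The exact formula is not reproduced here, so the form used (for each interaction way, a₁ times the duration divided by the mean duration plus a₂ times the count divided by the mean count, summed over ways) is an assumption, as are the data layout, the function names, and taking the means over the pairs that actually interacted.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]
# stats[way][(u, v)] = (duration, times); each undirected pair stored once (assumed layout)
Stats = Dict[str, Dict[Pair, Tuple[float, float]]]


def interaction_values(stats: Stats, a1: float) -> Dict[Pair, float]:
    """Assumed form: sum over ways of a1*d/mean_d + a2*n/mean_n, with a2 = 1 - a1."""
    a2 = 1.0 - a1
    values: Dict[Pair, float] = {}
    for per_pair in stats.values():
        if not per_pair:
            continue
        mean_d = sum(d for d, _ in per_pair.values()) / len(per_pair)
        mean_n = sum(n for _, n in per_pair.values()) / len(per_pair)
        for pair, (d, n) in per_pair.items():
            score = 0.0
            if mean_d > 0:
                score += a1 * d / mean_d
            if mean_n > 0:
                score += a2 * n / mean_n
            values[pair] = values.get(pair, 0.0) + score
    return values


def build_edges(values: Dict[Pair, float], threshold: float) -> Set[Pair]:
    """Keep an undirected edge when the interaction value exceeds the threshold."""
    return {pair for pair, value in values.items() if value > threshold}
```

With the 100-minute example from the text (one call with u₂, ten calls with u₃), this form gives the pair (u₁, u₃) the larger value, so a suitable threshold keeps that edge and drops the other.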

(3) The Calculation of the Influence of the User. When calculating the influence of the user, the network topology and user behavior are considered. The centrality is employed to reflect the impact of the network topology; the mean amount of the interaction between the target user and other users and the usage of the mobile network services (video, game, music, news, etc.) by the target user are employed to reflect the impact of user behavior.

The centrality measures the user’s influence on the social network according to the number of the target user’s neighbors. The more neighbors there are, the greater the influence is [14]. On the Internet, a social network consists of users in the virtual world, whereas in the mobile network, a social network is gradually transitioning to one that consists of users in the physical world, and the trust between users also increases. On a mobile social network, the user generally interacts with acquaintances, such as family, friends, colleagues, and classmates, by voice or SMS. However, the contacts on WeChat or QQ could be acquaintances or strangers. When the centrality is employed to measure the influence of the user, considering only the number of the target user’s neighbors introduces some deviation in the obtained influence. Suppose that, in the mobile social network, u₁ and u₂ both have 100 neighbors, that 98 neighbors of u₁ are acquaintances, and that only 2 neighbors of u₂ are acquaintances. When only the number of neighbors is considered, the influence of u₁ is equal to that of u₂. Actually, however, the influence of u₁ is greater than that of u₂.

The current instant messaging software can synchronize with the telephone book. Therefore, whether a contact is an acquaintance can be confirmed by checking whether the contact is in the telephone book: if so, the contact is regarded as an acquaintance; otherwise, it is regarded as a stranger. The centrality follows [13] and is computed for each u_i ∈ U over U_i, the set of u_i’s neighbors. Each neighbor is weighted according to whether it belongs to the set of contacts that u_i’s telephone book includes, with the weighting controlled by the parameter θ, whose value is set in (0,1).
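The acquaintance-aware centrality can be illustrated with a short sketch. The form used here, in which acquaintances count fully, strangers count with weight θ, and the total is divided by the number of other users, is an assumption chosen to match the description; the function and parameter names are hypothetical.

```python
from typing import Set


def weighted_centrality(neighbors: Set[str], phone_book: Set[str],
                        theta: float, num_users: int) -> float:
    """Degree-style centrality in which strangers count with weight theta.

    Assumed form: (#acquaintances + theta * #strangers) / (num_users - 1).
    """
    if num_users < 2:
        return 0.0
    acquaintances = len(neighbors & phone_book)
    strangers = len(neighbors) - acquaintances
    return (acquaintances + theta * strangers) / (num_users - 1)
```

For the example of two users with 100 neighbors each, any θ below 1 ranks the user with 98 acquaintances above the user with 2, while θ = 1 reduces to the plain neighbor count and ranks them equally.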

The greater the amount of interaction between the target user and the other users, the larger the implied contribution that the target user provides and, thus, the greater the user’s influence. In the paper, the mean amount of the interaction between users is employed to measure the influence of the user; for u_i ∈ U, the mean is taken over the neighbors of u_i, where N_i = |U_i| represents the number of u_i’s neighbors.

The more types of mobile services a user uses and the greater the amount of usage is, the greater the influence of the user in the mobile social network is. The usage of the mobile network services by u_i ∈ U is computed from the set of the mobile network services used by u_i, the number of types of services in that set, the duration for which u_i used each service s_k, and the number of times u_i used s_k.

The influence of the user u_i ∈ U is then obtained by combining the topology-based and behavior-based influences, where λ₁ and λ₂ represent the weights of the different influences, respectively, and λ₁ + λ₂ = 1.

3.2. The Influence between Users

The calculation of the influence between users considers not only the interaction behavior but also the similarity of the user’s preferences and the influence of the user. The steps for calculating the influence between users are as follows.

(1) The Influence between Users Is Based on Interaction Behavior. The influence of the user is depicted macroscopically, and it only considers the mobile users that the target user interacts with, while the influence between users analyzes the influence between two mobile users microscopically and concretely. In addition, the interaction behaviors under different contexts have different impacts on the influence between users. For example, the user usually interacts with other users for work during a workday, while he will interact with family or friends on the weekend [26]. Therefore, it is necessary to distinguish the influences under different contexts. The quantification value of the interaction behavior between u_i and u_j under C_r (u_i, u_j ∈ U, j ≠ i, C_r ∈ C) is computed from the duration for which u_i interacts with u_j under C_r using each interaction way and the number of times u_i interacts with u_j under C_r using that way.

The influence between users based on interaction behaviors is calculated for u_i, u_j ∈ U with j ≠ i from these per-context quantification values, where β_r represents the weight value of C_r.

(2) Learn the Mobile User Preference. In the mobile network, explicit user preferences are few, so the proposed method needs to learn the user preferences by analyzing the mobile user’s behavior in relation to the context. The goal is to map the user’s usage of mobile services onto integers from 1 to 5 such that the distribution of the obtained preferences is consistent with Pareto’s law.

The user’s usage of the mobile network services under the context is computed for u_i ∈ U, s_k ∈ S, and C_r ∈ C from the duration for which u_i used s_k under C_r and the number of times u_i used s_k under C_r.

The value of the user’s preferences increases when the user’s usage of the mobile network services increases. Once the usage exceeds a certain value, the growth of the user’s preferences slows down and gradually tends toward a constant value, which is in line with the characteristics of the logarithmic function with a base bigger than 1. Hence, the logarithmic function is employed to calculate the user’s preferences. The user’s preferences are calculated by formulas (10a) and (10b) for u_i ∈ U, s_k ∈ S, and C_r ∈ C, where round() represents the rounding function, the minimum and maximum of the user’s preferences obtained by formula (10a) also enter the calculation, and b is a parameter that is set according to Pareto’s law.
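The idea of compressing usage with a logarithm and then rounding onto a 1 to 5 scale can be illustrated as below. This sketch does not reproduce formulas (10a) and (10b): it assumes a natural logarithm of (1 + usage) followed by min-max scaling and half-up rounding, and the function name is hypothetical. Under this particular assumed form the base of the logarithm cancels out in the scaling, so the parameter b is not modeled.

```python
import math
from typing import List


def usage_to_preferences(usages: List[float]) -> List[int]:
    """Map non-negative usage values to integer preferences in 1..5.

    Assumed form: log(1 + usage), min-max scaled to [1, 5], rounded half up.
    """
    if not usages:
        return []
    raw = [math.log1p(u) for u in usages]
    low, high = min(raw), max(raw)
    if high == low:  # all usages equal: nothing to spread over the scale
        return [1 for _ in raw]
    return [int(math.floor(1 + 4 * (r - low) / (high - low) + 0.5)) for r in raw]
```

Because the logarithm flattens large values, a tenfold increase in heavy usage moves the preference by about one level, which matches the saturation behavior described in the text.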

(3) The Influence between Users Is Based on the Similarity of the User’s Preferences. The Pearson correlation coefficient is employed to calculate the similarity of the user’s preferences. The calculation considers not only the impact of the context but also the order in which the user’s preferences occurred. If u₁ always used the mobile network services later than u₂, then u₁ is impacted by u₂, but u₂ is not affected by u₁. Hence, it is necessary to consider the order in which the user’s preferences occurred when calculating the influence between users based on the similarity of the user’s preferences. The similarity is calculated for u_i, u_j ∈ U with j ≠ i over the set of the mobile network services that u_i used before u_j under C_r, using the mean of each user’s preferences under C_r.

The influence between users based on the similarity of the user’s preferences is then calculated from this similarity.

According to its definition, the similarity is asymmetrical; that is, the similarity computed from u_i to u_j generally differs from the similarity computed from u_j to u_i.
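A minimal sketch of an order-aware Pearson similarity within a single context instance is shown below. The data layout (each preference stored with the time of first use), the function name, and the choice to return 0 when the correlation is undefined are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

# prefs[user][service] = (preference, first_use_time) within one context (assumed layout)
Prefs = Dict[str, Dict[str, Tuple[float, float]]]


def ordered_similarity(prefs: Prefs, ui: str, uj: str) -> float:
    """Pearson correlation restricted to the services that ui used before uj.

    Swapping the two users changes the set of services considered, so the
    result is generally asymmetric.
    """
    shared = [s for s in prefs[ui]
              if s in prefs[uj] and prefs[ui][s][1] < prefs[uj][s][1]]
    if not shared:
        return 0.0
    # Means are taken over all of each user's preferences in this context.
    mean_i = sum(p for p, _ in prefs[ui].values()) / len(prefs[ui])
    mean_j = sum(p for p, _ in prefs[uj].values()) / len(prefs[uj])
    num = sum((prefs[ui][s][0] - mean_i) * (prefs[uj][s][0] - mean_j) for s in shared)
    den_i = math.sqrt(sum((prefs[ui][s][0] - mean_i) ** 2 for s in shared))
    den_j = math.sqrt(sum((prefs[uj][s][0] - mean_j) ** 2 for s in shared))
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (den_i * den_j)
```

Restricting the sum to services that the first user adopted earlier is what makes the measure directional: a user who always follows another contributes to the leader's influence on them, not the reverse.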

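According to the foregoing analysis, the influence between users (u_i, u_j ∈ U, j ≠ i) is calculated by fusing the influence of the user, the influence based on interaction behavior, and the influence based on the similarity of the user’s preferences, with weights λ₃, λ₄, and λ₅, where λ₃ + λ₄ + λ₅ = 1.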

3.3. The Prediction of Mobile User Preference

In this section, the improved CF is employed to predict mobile user preference. The procedure is as follows:

(1) Select the top-K most influential users for the target user according to the obtained influence between users.

(2) Mobile user preference is predicted according to the preferences of the selected K users. The prediction is made for u_i ∈ U and C_r ∈ C over the set of mobile network services that u_i did not use under the context C_r but that his top-K most influential users had used; U_i,K represents the set of the top-K most influential users of u_i.
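The two-step procedure can be sketched as follows. The prediction rule used here, an influence-weighted mean of the preferences of the top-K users who have used the service, is a common CF form assumed for illustration and not necessarily the paper's formula; all names are hypothetical.

```python
from typing import Dict, List, Optional


def top_k_influencers(influence_on_target: Dict[str, float], k: int) -> List[str]:
    """Return the k users with the largest influence on the target user."""
    ranked = sorted(influence_on_target, key=influence_on_target.get, reverse=True)
    return ranked[:k]


def predict_preference(service: str,
                       influence_on_target: Dict[str, float],
                       prefs: Dict[str, Dict[str, float]],
                       k: int) -> Optional[float]:
    """Influence-weighted mean preference of the top-k users for `service`.

    Returns None when none of the top-k users has used the service.
    """
    num = 0.0
    den = 0.0
    for user in top_k_influencers(influence_on_target, k):
        if service in prefs.get(user, {}):
            weight = influence_on_target[user]
            num += weight * prefs[user][service]
            den += weight
    return num / den if den > 0 else None
```

Only services that at least one of the selected users has used receive a prediction, which mirrors the candidate set described in step (2).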

4. Experimental Results and Analysis

This section introduces the used data set, experimental steps, experimental results, and the analysis.

4.1. The Data Set

The simulation experiment is executed on two data sets: (1) the Reality Mining data set of mobile users collected by the MIT Media Lab (RM data set), which includes the interaction behavior and the corresponding context (location, time, etc.) of 94 smart phone users from September 2004 to June 2005 [27]; (2) the data set obtained by integrating the RM data set with the MovieLens data set according to specific rules [12] (RMM data set), which includes 6,040 mobile users.

The RM data set includes the original context information, such as the date, time, base station information, and around-phone information obtained by Bluetooth. Hence, before performing the experiment, the location context (at home, workplace, elsewhere) is obtained using statistical analysis and reasoning according to the rules given in Tables 1, 2, and 3. The main basis of setting the rules is as follows: users usually rest at home at night (00:00~5:59); during the workday, the user usually will work at the workplace, while at noon, the user may go to the workplace, home, or elsewhere to eat or be entertained; on the weekend, the user may be at home or elsewhere for rest or entertainment.

Since the proposed algorithm in the paper considers the order in which user preferences occurred, the leave-one-out method is employed to select the training and test sets. In the paper, the data of the first 5 months is selected as the training data set, and the data of the sixth month is selected as the test data set.

4.2. Evaluation Method

The F-score is employed to evaluate the accuracy of the obtained social relationships or mobile user preferences; the formula is as follows [5, 28]:

$$F = \frac{2QR}{Q + R},$$

where Q represents the precision and R represents the recall, and their calculation formulas are as follows:

$$Q = \frac{N_{tr}}{N_{tr} + N_{fr}}, \qquad R = \frac{N_{tr}}{N_{tr} + N_{fn}},$$

where N_tr represents the number of obtained accurate social relationships or mobile user preferences; N_fr represents the number of obtained false social relationships or mobile user preferences; and N_fn represents the number of omitted social relationships or mobile user preferences. When using the F-score for evaluation, either the value of F alone can be used to evaluate the obtained results, or the values of F, Q, and R can be used together, as in [28]. In this paper, the latter is employed to evaluate the obtained results. The greater the values of F, Q, and R, the better the obtained results.
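For concreteness, the standard precision, recall, and F-score computation from the three counts can be written as a small helper. The function name and the convention of returning 0 when a denominator is zero are assumptions.

```python
from typing import Tuple


def precision_recall_f(n_tr: int, n_fr: int, n_fn: int) -> Tuple[float, float, float]:
    """Return (Q, R, F) from the counts of accurate, false, and omitted results."""
    q = n_tr / (n_tr + n_fr) if n_tr + n_fr else 0.0
    r = n_tr / (n_tr + n_fn) if n_tr + n_fn else 0.0
    f = 2 * q * r / (q + r) if q + r else 0.0
    return q, r, f
```

As a usage example, 69 accurate and 129 excess social relationships give a precision of 69/198, roughly 0.35, whatever the number of omitted relationships is.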

Root-mean-square error (RMSE) is employed to evaluate the accuracy of predicted mobile user preferences; the formula is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_p}\sum_{n=1}^{N_p}\left(p_n - \hat{p}_n\right)^2},$$

where p_n represents the real user’s preference obtained with formulas (10a) and (10b); \hat{p}_n represents the mobile user preference predicted by the proposed method; and N_p represents the number of predicted user’s preferences.
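The RMSE computation can be sketched in a few lines; the function name and the error raised for mismatched or empty inputs are assumptions.

```python
import math
from typing import Sequence


def rmse(real: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between real and predicted preferences."""
    if len(real) != len(predicted) or len(real) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for p, q in zip(real, predicted))
    return math.sqrt(squared / len(real))
```

Since preferences lie on a 1 to 5 scale, the RMSE here is bounded by 4, and lower values indicate more accurate predictions.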

4.3. The Experimental Steps

The benchmark method is the traditional CF, and the comparison methods include several improved CF methods. The experimental steps follow. In step (2) and the subsequent steps that precede the selection of K, the improved CF is employed to predict user preference, where K is set to round(ωN_u), ω is set to 0.1, and if K > 100, K is set to 100. These steps differ only in the influence that is used by the improved CF.

(1) Determine the values of a₁ and the threshold used to judge social relationships. In this step, the quantification value of the interaction behavior is computed on the training set of the RM data set for different values of a₁. Here, a₁ takes values in [0, 1] with a step size of 0.1. When a₁ = 0, only the impact of the number of times is considered. When a₁ = 1, only the impact of the duration is considered. The threshold is also varied with a step size of 0.1. When the threshold is 0, any interaction behavior between users is taken as evidence of a social relationship between them in the physical world. However, this might introduce noise data, such as harassing or wrongly dialed phone calls. When the threshold is too large, the determination of the user’s social relationship becomes very stringent: the precision increases, but part of the social relationships might be lost, resulting in a drop in the recall rate.

The F-score of the edges in the constructed mobile social network is employed to evaluate the results. The mobile social network will be constructed based on the optimum values of the threshold and a₁.
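The parameter sweep in step (1) amounts to a small grid search. The sketch below assumes a caller-supplied `score` function (hypothetical) that builds the network for a given a₁ and threshold and returns the F-score of its edges; the upper bound of the threshold range is also an assumption.

```python
from typing import Callable, Tuple


def grid_search(score: Callable[[float, float], float],
                max_threshold: float = 1.0,
                step: float = 0.1) -> Tuple[float, float, float]:
    """Try a1 in [0, 1] and thresholds in [0, max_threshold]; return the best triple.

    The result is (best_a1, best_threshold, best_score).
    """
    n_a1 = int(round(1.0 / step))
    n_th = int(round(max_threshold / step))
    best = (0.0, 0.0, float("-inf"))
    for i in range(n_a1 + 1):
        a1 = round(i * step, 10)  # avoid accumulated floating-point drift
        for j in range(n_th + 1):
            threshold = round(j * step, 10)
            value = score(a1, threshold)
            if value > best[2]:
                best = (a1, threshold, value)
    return best
```

The same pattern applies to the single-parameter sweeps over θ and λ₁ in the later steps, with a one-dimensional grid instead of two.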

(2) Determine the value of θ. In this step, the impact of network topology is considered when predicting the user’s preferences. First, the centrality is computed in the mobile social network constructed by step (1), and then user preferences are predicted according to the centrality. When computing the centrality, θ is set to values in [0, 1] with a step size of 0.1. When θ = 1, acquaintances and strangers are not distinguished; when θ = 0, the impact of strangers is not considered.

The F-score and RMSE of the predicted preferences are employed to evaluate the results. According to the experimental results, the centrality is computed with θ set to the optimum value.

(3) Determine the values of λ₁ and λ₂. In this step, the influence of the user is considered when predicting the user’s preferences. First, the behavior-based influence is computed on the training set of the RM data set, where a₁ is set to the optimum value obtained in step (1). Then, the influence of the user is computed by fusing the behavior-based influence and the centrality obtained in step (2). Finally, the user’s preferences are predicted according to the influence of the user.

Because λ₂ = 1 - λ₁, only the value of λ₁ needs to be set. Thus, λ₁ takes values in [0, 1] with a step size of 0.1. The F-score and RMSE of the predicted preferences are employed to evaluate the results. According to the experimental results, the influence of the user is computed with λ₁ and λ₂ set to the optimum values.

(4) Determine the value of β_r, that is, the weight of each context instance. The influence between users based on interaction behavior is computed on the training set of the RM data set for different values of β_r, and the user’s preferences are then predicted based on this influence. The date context includes two kinds of context instances, the time context includes five kinds, and the location context includes three kinds. Hence, the number of all context instances is 2 × 5 × 3 = 30. In (8): ① if the impact of the context is not considered, the weights of all context instances are equal; that is, β_r = 1 for r = 1,2,…,30; ② if the impact of the context instances is considered, there are 30 parameters, and each β_r is varied with a step size of 0.1. The genetic algorithm is employed to select the optimum parameters of β_r. The fitness (adaptive) function of the genetic algorithm is set in formula (20).

The F-score and RMSE of the predicted preferences are employed to evaluate the results. According to the experimental results, the influence between users based on interaction behavior is computed with β_r set to the optimum values.

(5) Determine the value of b. In this step, compute the user’s preferences when b is given different values in the training set of the RM data set. Thus, b is given the value which makes the preference more consistent with Pareto’s law.

(6) Determine the values of λ₃, λ₄, and λ₅, that is, the weight of each influence when fusing the obtained influence. In this step, first, compute where β_r is given the optimum values according to step , and b is given the optimum value according to step ; then, compute by fusing , , and . Since λ₅ = 1 - λ₃ - λ₄, simply determine the value of λ₃ and λ₄. Similarity, the genetic algorithm is employed to determine the values of λ₃ and λ₄, and the adaptive function of the genetic algorithm is given in formula (20). Thus, λ₃ is given the values in , and its step size is 0.1; λ₄ is set to the values in [λ₃,1], and its step size is 0.1.

The F-score and RMSE of the predicted preferences are employed to evaluate the results. Additionally, is computed where λ₃, λ₄, and λ₅ are given the optimum values according to the experimental results.

(7) Determine the value of K, the number of nearest neighbors. In this step, first rank the influence values obtained in step (6) in descending order and select the top-K nearest neighbors. Next, predict the given user’s preferences from the preferences of the selected nearest neighbors. K is set to round(ωN_u), with ω set to 0.05, 0.1, 0.15, 0.20, and 0.25 in turn. If K > 100, K is set to 100. This experiment is executed only on the RM data set. Because the RMM data set includes too many users, K is set to 100 when the experiment is executed on the RMM data set.

The F-score and RMSE of the predicted preferences are employed to evaluate the results, and K is set to its optimum value according to the experimental results.
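
The rule for K in step (7) and the top-K selection can be sketched as follows. The function names and the dictionary representation of the influence values are illustrative assumptions. Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding convention the authors used.

```python
def choose_k(omega, n_users, cap=100):
    """K = round(omega * N_u), capped at `cap` (100 in the text)."""
    return min(int(round(omega * n_users)), cap)


def top_k_neighbors(influence, k):
    """Return the k users with the highest influence on the target user.

    `influence` maps a candidate neighbor to its influence value; the
    result is ordered by descending influence.
    """
    ranked = sorted(influence.items(), key=lambda kv: kv[1], reverse=True)
    return [user for user, _ in ranked[:k]]
```

For example, with an assumed N_u = 94 (a value not stated in this section), `choose_k(0.15, 94)` gives 14, which agrees with the K = 14 reported for the RM data set in Section 4.4.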

(8) Compare different prediction methods. Because the proposed method considers the context information, the context is considered in all compared methods. The compared methods are the traditional CF that considers the context information (CCF); the CF using matrix factorization (MFCCF); the CF considering user trust (TCCF), where the method in [11] is used to compute the trust; the method based on the TIG algorithm in [1] (C-TIG); the method based on the k-shell decomposition method in [4] (C-kSD); and the method proposed in this paper (PM).

The experiment is executed on the RM data set and the RMM data set. The F-score and RMSE of the predicted preferences are employed to evaluate the results.
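
Every step above is evaluated with the F-score and the RMSE. The sketch below uses the standard textbook definitions of both measures; the exact formulas of Section 4.2 are not reproduced in this part of the article, so they may differ in detail. The function names and the input representation (sets of preferred items, lists of preference scores) are assumptions.

```python
import math


def precision_recall_f(predicted, actual):
    """Standard precision, recall, and F-score over two item sets."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def rmse(predicted_scores, actual_scores):
    """Root mean squared error between two equally long score lists."""
    if not predicted_scores or len(predicted_scores) != len(actual_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted_scores, actual_scores)]
    return math.sqrt(sum(squared) / len(squared))
```

A higher F-score and a lower RMSE both indicate more accurate predicted preferences, which is how the comparisons in Section 4.4 are read.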

4.4. Experimental Results and Analysis

(1) The impact of a₁ and of the threshold used to judge whether a social relationship exists is examined. The experimental results are shown in Figures 1 and 2.

Figure 1 shows that the constructed mobile social network is best when a₁ = 0.5 and the relationship threshold is 0.1. In most instances, a threshold of 0.1 gives the better results, especially when a₁ = 0.5; only when a₁ = 0.1 or a₁ = 0.6 does a threshold of 0.2 give better results. The reason is as follows: ① When the threshold is small, the restriction on the interaction behavior used to judge whether a social relationship exists is loose, so many accurate social relationships are obtained and the recall is high; however, many excess relationships are introduced at the same time, which makes the precision very low. Thus, the value of F is low. For example, when the threshold is 0, 69 accurate social relationships are obtained, but so are 129 excess social relationships, so the precision is low. ② When the threshold increases, the restriction becomes more stringent. Fewer accurate social relationships are obtained, so the recall falls. However, the number of excess social relationships also decreases as the threshold increases, which improves the precision. Hence, the value of F improves. ③ When the threshold is larger still, even fewer accurate social relationships are obtained, so the recall is lower and the value of F falls as well. Therefore, the threshold is chosen as a compromise between precision and recall that gives the best value of F.

According to the above analysis, the relationship threshold is set to 0.1 in the following experiments.
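
The example with a threshold of 0 can be made concrete. The short sketch below assumes that every detected relationship is either accurate or excess, so that precision is the share of accurate relationships among all detected ones; the function name is illustrative.

```python
def relationship_precision(accurate, excess):
    """Share of detected social relationships that are accurate."""
    return accurate / (accurate + excess)


# Counts quoted in the text for a threshold of 0.
precision_at_zero = relationship_precision(69, 129)
```

Under this assumption the precision is 69/198, or roughly 0.35, so about two out of three detected relationships are excess ones.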

Figure 2 shows that ① when a₁ ≤ 0.5, the change in R is not obvious, and when a₁ > 0.5, the value of R falls; ② the value of Q varies with a₁ roughly as a single-peaked curve and is best when a₁ = 0.5; ③ similarly, the value of F varies with a₁ roughly as a single-peaked curve and is best when a₁ = 0.5.

This is because the user’s influence is affected by both the interaction duration and the number of interactions. When a₁ = 0, only the number of interactions is considered and the interaction duration is ignored, so part of the impact of the interaction behavior is lost; when a₁ = 1, only the interaction duration is considered and the number of interactions is ignored. Hence, a₁ is set to 0.5 in the following experiments.
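
The role of a₁ described above can be illustrated with a weighted combination of the two interaction measures. The linear form below is an assumption chosen to match the two endpoints discussed in the text (a₁ = 0 uses only the number of interactions, a₁ = 1 uses only the duration); the paper's own formula from Section 3 is not reproduced here, and the function name and the normalization of both inputs to [0, 1] are illustrative.

```python
def interaction_strength(duration, times, a1):
    """Assumed convex combination of normalized duration and count.

    duration, times: values already normalized to [0, 1].
    a1: weight of the duration term; (1 - a1) weights the count term.
    """
    if not 0.0 <= a1 <= 1.0:
        raise ValueError("a1 must lie in [0, 1]")
    return a1 * duration + (1.0 - a1) * times
```

With a₁ = 0.5 both measures contribute equally, which is the setting selected in the experiment.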

(2) The impact of θ is examined. The experimental results are shown in Figure 3. The experiment shows that the best results are obtained when θ = 0.8. This indicates that the impact of a stranger is smaller than that of an acquaintance.

(3) The weights of the different context instances are shown in Table 4.

Table 4 shows that the different context instances have different impacts on the influence. This is because there are different user behaviors under different context instances. For example, the user usually spends weekends with family or friends, so the behavior under these contexts (e.g., weekend, at night, at home, or elsewhere) has a greater impact. The user spends workdays in the office with colleagues, and the contacts are usually clients or colleagues due to the work.

(4) The optimal value of b and the experimental results are shown in Table 5. The number of preferences learned by the linear function is 5,089, and the corresponding percentage of learned preferences is 45.61%.

Table 5 shows that, for the smaller values of b, the number of learned preferences declines and the percentage of learned preferences increases as b increases. When b > 200, the number of learned preferences decreases and the percentage of learned preferences does not change much. When b = 100, the percentage of learned preferences is the same as when b = 200, and the number of obtained user preferences is larger. Hence, in this paper, b is set to 100.

In addition, compared with the results obtained by the linear function, when b > 70, the results obtained by the logarithmic function are better, except when b = 400 and b = 700. This indicates that the logarithmic function is superior to the linear function when b > 70.

(5) The impact of λ_i is discussed. The obtained optimal values are shown in Table 6.

Table 6 shows the following: ① The user’s behavior in using mobile network services plays a more important role than the network topology in the influence of the user. This is because a smartphone is not just a communication tool; it also provides a wide range of applications, such as games, music, news, and shopping. Therefore, the influence of the user now needs to be measured from many aspects, including traditional communication behavior, interactive behavior, and app usage behavior. ② The similarity of user preferences has the greatest impact on the influence between users, followed by the interaction behavior; the influence of the user has the least impact. This is because users more readily accept the views of users whose preferences are similar to their own, which is the basic idea of the recommender system based on CF [29]. Next comes the influence of acquaintances: even without similar preferences, the target user may accept some views of acquaintances to a certain extent. The influence of the user measures influence at the macroscopic level, such as that of celebrities and leaders, and the target user may be affected by such users in some respects.

(6) The impact of K is discussed. The experimental results are shown in Figure 4.

Figure 4 shows that the best results are obtained when ω is set to 0.15. The reason is that when ω is small, although the selected nearest neighbors have preferences very similar to those of the target user, there are too few of them to cover the target user’s preferences, so the accuracy is lower; when ω is large, the predicted preferences cover more, but some excess preferences are introduced. Hence, as ω increases, the recall increases, the precision first increases and then falls, and the RMSE first falls and then increases.

In summary, K is set to round(0.15N_u), which gives K = 14 in the RM data set.

(7) The results obtained by the different prediction methods are compared. The results are shown in Figures 5 and 6.

Figures 5 and 6 show the following:

The results obtained by the proposed method are the best on both the RM data set and the RMM data set, followed by those of the TCCF, MFCCF, CCF, C-TIG, and C-kSD.

① The results obtained by the C-kSD are the worst. When calculating the influence, it considers only the network topology and measures the influence according to the user’s role in the mobile social network. The method mainly measures the influence of the user. Hence, it cannot find accurate nearest neighbors for the target user, which lowers the accuracy of the predicted mobile user preferences. Compared with the C-kSD, the C-TIG method considers not only the network topology but also the interaction behavior and the dynamics of the user’s behavior when calculating the influence. Therefore, the results obtained by the C-TIG are better than those obtained by the C-kSD.

② The results obtained by the CCF are superior to those obtained by the C-TIG. This is because the CCF considers the similarity of user preferences, while the C-TIG considers the network topology and, incompletely, the interaction behavior. According to the weights reported in Table 6, the similarity of user preferences plays a more important role in the influence than the network topology and the interaction behaviors. Hence, these experimental results are consistent with the weights in Table 6.

③ Compared with the results obtained by the CCF, those obtained by the MFCCF improve F by 4.84% and reduce the RMSE by 0.1048 on the RM data set, and improve F by 1.06% and reduce the RMSE by 0.0199 on the RMM data set. This is because the MFCCF alleviates the sparsity problem through matrix factorization and improves the accuracy of the obtained similarities of user preferences. Hence, the selected nearest neighbors are more accurate, and the accuracy of the predicted mobile user preferences is better.

④ The results obtained by the TCCF are superior to those obtained by the MFCCF. The reason is that the TCCF introduces trust information into the CCF, so it considers the impact of users who have preferences similar to those of the target user and whom the target user trusts. The introduction of trust not only alleviates the sparsity problem but also helps find more accurate nearest neighbors for the target user.

⑤ The results obtained by the proposed method are the best. Compared with the CCF, the value of F improves by 11.46% and the RMSE falls by 0.1494 on the RM data set, and the value of F improves by 7.47% and the RMSE falls by 0.1673 on the RMM data set. The proposed method considers the impact of many factors, including the similarity of user preferences, the influence of the user, the interaction behaviors between users, the context information, and the order in which user preferences occurred. Hence, this method can find more accurate nearest neighbors for the target user than the other methods, and the accuracy of the predicted user preferences is the best.

5. Conclusion

Due to the characteristics of the mobile network, mobile users want to get personalized mobile network services in a timely manner, anytime and anywhere. How to obtain accurate user preferences has therefore become a hot issue. User preferences are affected by other users, such as family, friends, or users who have preferences similar to those of the target user.

In this paper, a prediction method of user preferences is proposed. To improve the accuracy of the predicted user preferences, the proposed method considers the impact of many factors, including the similarity of user preferences, the influence of the user, the interaction behaviors between users, the context information, and the order in which user preferences occurred. The experimental results show that the proposed method is superior to the existing methods in the accuracy of the predicted user preferences. In addition, according to the experimental results, we can conclude that the similarity of user preferences plays the most important role in the prediction of user preferences, followed by the user interaction behavior, and that the influence of the user has the least impact on user preferences.

In this paper, the propagation and the dynamics of influence are not considered. In addition, public data sets that include both mobile user behavior and context information are scarce; the MIT data set is the only one available, and it includes few mobile users. In future work, we will seek newer data sets to verify the proposed method.

Data Availability

The data used in this article includes the following: (1) the reality mining of mobile users collected by the MIT Media Lab (RM data set), which can be downloaded from website; (2) the data set integrated using the RM data set and the data set of MovieLens, which can be downloaded from website (RMM data set). The readers can access the data supporting the conclusions of the study according to the given formula and rules in the article using the above data sets.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (NSFC) (Grants nos. 61402331, 61702367, 61402332, and 61502338) and the Research Plan Project of Tianjin Municipal Education Commission (2017KJ035 and 2017KJ033).

References

[1] Z. P. Liu and D. C. Pi, “Mining social influence of nodes from mobile datasets,” Journal of Computer Research and Development, vol. 50, pp. 244–248, 2013.
[2] L. B. Chen, S. J. Li, and G. Pan, “Smartphone: pervasive sensing and applications,” Chinese Journal of Computer, vol. 38, no. 2, pp. 423–438, 2015.
[3] Z. H. Huang, J. W. Zhang, C. Q. Tian, S. L. Sun, and Y. Xiang, “Survey on learning-to-rank based recommendation algorithms,” Journal of Software, vol. 27, no. 3, pp. 691–713, 2016.
[4] P. Brown and J. Feng, “Measuring user influence on twitter using modified k-shell decomposition,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 2011.
[5] J. Cossu, N. Dugue, and V. Labatut, “Detecting Real-World Influence through Twitter,” in Proceedings of the Second European Network Intelligence Conference (ENIC), pp. 83–90, IEEE, Karlskrona, Sweden, September 2015.
[6] S. Räbiger and M. Spiliopoulou, “A framework for validating the merit of properties that predict the influence of a twitter user,” Expert Systems with Applications, vol. 42, no. 5, pp. 2824–2834, 2015.
[7] X. Hu, T. H. S. Chu, V. C. M. Leung, E. C.-H. Ngai, P. Kruchten, and H. C. B. Chan, “A Survey on mobile social networks: Applications, platforms, system architectures, and future research directions,” IEEE Communications Surveys & Tutorials, vol. 17, no. 3, pp. 1557–1581, 2015.
[8] X. Liu, C. Aggarwal, Y. F. Li, X. N. Kong, and X. Y. Sun, “Kernelized matrix factorization for collaborative filtering,” in Proceedings of the SIAM Conference on Data Mining, American Statistical Association, Miami, Florida, USA, 2016.
[9] L. H. Wu and W. F. Chen, “Personalized Recommendation Based on Trust and Preference,” Applied Mechanics and Materials, vol. 713-715, pp. 2288–2291, 2015.
[10] C. C. Wu and M. J. Shih, “A context-aware recommender system based on social media,” in Proceedings of the Conference on Computer Science, Data Mining Mechanical Engineering (ICCDMME), Bangkok, Thailand, 2015.
[11] X. Hu, X. W. Meng, Y. J. Zhang, and Y. C. Shi, “Recommendation algorithm combing item features and trust relationship of mobile users,” Journal of Software, vol. 25, no. 8, pp. 1817–1830, 2014.
[12] H. Geng, X. W. Meng, and Y. C. Shi, “A mobile user preference prediction method based on trust and link prediction,” Journal of Electronics Information Technology, vol. 35, no. 12, pp. 2972–2977, 2013.
[13] X. D. Wu, Y. Li, and L. Li, “Influence analysis of online social networks,” Chinese Journal of Computer, vol. 37, no. 4, pp. 735–752, 2014.
[14] D. Huang, Y. Du, and Q. He, “Migration algorithm for big data in hybrid cloud storage,” Journal of Computer Research and Development, vol. 51, no. 1, pp. 199–205, 2014 (Chinese).
[15] J. M. Sun and J. Tang, “A survey of models and algorithms for social influence analysis,” Social Network Data Analytics, pp. 177–214, 2011.
[16] D. Wei, X. Deng, X. Zhang, Y. Deng, and S. Mahadevan, “Identifying influential nodes in weighted networks based on evidence theory,” Physica A: Statistical Mechanics and its Applications, vol. 392, no. 10, pp. 2564–2575, 2013.
[17] A. Muruganantham and G. Meera Gandhi, “Ranking the influence users in a social networking site using an improved TOPSIS method,” Journal of Theoretical and Applied Information Technology, vol. 73, no. 1, pp. 1–11, 2015.
[18] G. F. Zhu, Y. Yang, Z. R. Zhou, Z. Y. Ying, and F. J. Han, “A method of calculating the influence of micro-blog users based on domain,” Journal of Southwest University (Natural Science Edition), vol. 36, no. 3, pp. 145–151, 2014.
[19] E. Bakshy, D. Eckles, R. Yan, and I. Rosenn, “Social influence in social advertising: evidence from field experiments,” in Proceedings of the 13th ACM Conference on Electronic Commerce, p. 4, ACM, Valencia, Spain, 2012.
[20] J. X. Mao, Y. Q. Liu, M. Zhang, and S. P. Ma, “Social influence analysis for micro-blog user based on user behavior,” Chinese Journal of Computer, vol. 37, no. 4, pp. 791–800, 2014.
[21] I. Verenich, R. Kikas, M. Dumas, and D. Melnikov, “Combining propensity and influence models for product adoption prediction,” in Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ACM, Paris, France, 2015.
[22] A. Tommasel and D. Godoy, “A novel metric for assessing user influence based on user behaviour,” in Proceedings of the 1st International Workshop on Social Influence Analysis, Buenos Aires, Argentina, 2015.
[23] I. Anger and C. Kittl, “Measuring influence on Twitter,” in Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, ACM, Graz, Austria, 2011.
[24] M. Cataldi, M. Nupur, and M. A. Aufaure, “Estimating domain-based user influence in social networks,” in Proceedings of the 28th Annual ACM Symposium on Applied Computing, ACM, Salamanca, Spain, 2013.
[25] J. Guo, Y. N. Cao, C. Zhou, P. Zhang, and L. Guo, “Influence weights learning under linear threshold model in social networks,” Journal of Electronics Information Technology, vol. 36, no. 8, pp. 1804–1809, 2014.
[26] Y. C. Shi, X. W. Meng, Y. J. Zhang, and M. Xiao, “A trust calculating algorithm based on mobile phone data,” in Proceedings of the Global Communications Conference, IEEE, Los Angeles, USA, 2012.
[27] N. Eagle, A. Pentland, and D. Lazer, “Inferring friendship network structure by using mobile phone data,” Proceedings of the National Academy of Sciences of the United States of America, vol. 106, no. 36, pp. 15274–15278, 2009.
[28] Z. Bu, Z. Wu, J. Cao, and Y. Jiang, “Local Community Mining on Distributed and Dynamic Networks from a Multiagent Perspective,” IEEE Transactions on Cybernetics, vol. 46, no. 4, pp. 986–999, 2016.
[29] C. D. H. Nguyen, N. Arch-Int, and S. Arch-Int, “A semantically hybrid framework of personalizing news recommendations,” International Journal of Innovative Computing, Information and Control, vol. 11, no. 6, pp. 1947–1963, 2015.

Copyright

Copyright © 2018 Yancui Shi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
