Special Issue: Deep Learning Driven Wireless Communications and Mobile Computing
SignRank: A Novel Random Walking Based Ranking Algorithm in Signed Networks
Social networks have become an indispensable part of modern life. Signed networks, a class of social networks with positive and negative edges, are becoming increasingly important, and many social networks have adopted them to model like (trust) and dislike (distrust) relationships. Consequently, how to rank nodes from both positive and negative views has become an open issue in social network data mining. Traditional ranking algorithms usually split the signed network into a positive graph and a negative graph and rank positive and negative scores separately. However, such methods lose much of the global information in the signed network, e.g., the influence of a friend’s enemy. In this paper, we propose a novel ranking algorithm that computes a positive score and a negative score for each node in a signed network. We introduce a random walk model for signed networks in which the walker carries a positive or negative emotion. The steady-state probability of the walker visiting a node with positive or negative emotion gives the node's positive or negative score, respectively. To evaluate our algorithm, we apply it to the sign prediction problem, and the results show that it achieves higher prediction accuracy than several well-known ranking algorithms.
A signed network is a social network whose edges carry positive or negative signs. Positive edges denote friendship, trust, or agreement, while negative edges denote enmity, distrust, or disagreement. Online social network websites have become a favorite place for many people to share their opinions, and signed networks are very useful for understanding the relationships between their users.
Measuring the importance of nodes in a social network has always been an important issue in many data mining fields, such as collaborative filtering, community detection, and link prediction. Signed networks make the node ranking problem more complex because the signs on the edges carry additional information about the nodes, and this information is significant in many scenarios. For example, internet celebrities (users who are famous in an online social network) can gain a lot from word-of-mouth marketing. However, a person can become famous either because many people like him or because many people dislike him, and an unwelcome celebrity will make the marketing fail. It is therefore crucial to distinguish positive influence from negative influence. Furthermore, the two kinds of influence do not necessarily cancel each other out: a person with many supporters and the same number of opponents is still more influential than an ordinary person. It is thus attractive to rank nodes from both positive and negative views.
A classical node ranking method is PageRank, which measures a node from a global view: if there is a path from node A to node B, A can be viewed as supporting B. HITS is another classical ranking algorithm, which measures the authority of websites by their external links. These methods assume that the network contains only positive edges, so a node with many friends is indistinguishable from a node with many enemies because the signs on the edges are not considered. With the development of signed networks, there is growing interest in ranking nodes by a criterion that evaluates their trustworthiness. Some modified ranking algorithms [7, 8] separate the signed network into a positive subgraph and a negative subgraph and then compute the corresponding ranks using PageRank or HITS. Some important global information may be lost in these modified methods. For example, if node A has a friend B and B has an enemy C, these methods ignore the relationship between A and C. SRWR can compute positive and negative scores of nodes in a signed network, but these scores are computed from a personal view.
In this paper, we propose a novel ranking method for signed networks. Our contributions are as follows:
A novel random walk model for signed networks: this model simulates the human activity of visiting an online social network website. A walker visits nodes with a positive or negative emotion, which denotes that the walker likes (trusts) or dislikes (distrusts) the node. When the walker visits a node and turns negative, he leaves with a certain probability, just as one would close an unpleasant website. The steady-state probabilities of the walker visiting a node with positive or negative emotion represent the node's positive or negative score, respectively.
A ranking algorithm referred to as SignRank: we propose an iterative algorithm to rank the nodes and prove that it converges.
Experiment: we compare our method with some well-known classical algorithms. Because it is difficult to show directly whether one ranking is better than another, we apply these methods to a classic problem, namely, sign prediction in signed networks. The experiment shows that our method achieves higher accuracy than the other methods.
This paper is organized as follows. Section 2 reviews related work. Section 3 describes the SignRank method for signed networks. Section 4 presents the experiments, and Section 5 gives conclusions.
2. Related Works
In recent years, signed networks have attracted more and more attention for their ability to specify trust or distrust relationships between nodes. Network modeling and network topology analysis are important foundations for the study of signed networks. BSCL is a signed network generative model that can learn parameters automatically and generate signed networks from a given real network. The generated network preserves some key properties, especially the distribution of balanced and unbalanced triangles. M. Ludwig et al. proposed an evolutionary model based on balance theory [12, 13], which adds or removes edges in a social network until it reaches a steady state. IB is inspired by balance theory and ant behavior and assumes that interactions between individuals can cause edges to change. Measuring node relevance on signed networks is also becoming an important and attractive issue. Tyler Derr et al. have proposed a series of methods from both local and global perspectives, such as signed common neighbors (SCN) and the Signed Jaccard Index (SJI).
PageRank and HITS are the most popular ranking algorithms for unsigned networks (positive edges only). Their focus is on the global topology of the social network. Personalized ranking is another area of ranking research, which ranks nodes from the view of a specified node, as in Personalized PageRank and Bayesian Personalized Ranking.
To make traditional methods applicable to negative edges, researchers have proposed some new methods. Modified PageRank separates the signed network into positive and negative subgraphs and then computes a PageRank score on each separately. This method ignores much of the information in the global topology. PageTrust is also an extension of PageRank. It uses a random walk model in which the walker chooses a negative edge with lower probability, so it can rank how trustworthy nodes are. Another trust ranking algorithm is TrollTrust, which scores each node with its probability of being trustworthy. Prestige subtracts the number of negative edges from the number of positive edges and normalizes the result. In these methods, a positive or trust score can be counteracted by a negative or distrust score. However, positive and negative scores do not cancel each other out in some cases. For example, a famous rock star may have many supporters and the same number of opponents, but we do not think he has the same influence as an ordinary person. SRWR is a ranking method based on a random walk with restart model, which can rank nodes from a personal view in a signed network, but it cannot be applied to large-scale networks. SWR is another random walk based method, in which the walker chooses a negative edge with smaller probability than a positive edge.
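The cancellation issue raised above can be illustrated with a small sketch. The function name `prestige` and the choice to normalize by the total number of signed edges are assumptions made for this illustration only; the text says just that the difference is normalized.

```python
def prestige(num_positive: int, num_negative: int) -> float:
    """Net-vote score: (positive - negative) / (positive + negative).

    Dividing by the total edge count is an assumed normalization,
    chosen only to illustrate how opposite votes cancel.
    """
    total = num_positive + num_negative
    if total == 0:
        return 0.0  # a node with no incoming votes gets a neutral score
    return (num_positive - num_negative) / total


# A rock star with many supporters and equally many opponents ...
rock_star = prestige(10_000, 10_000)
# ... is indistinguishable from an ordinary person with a few of each.
ordinary_person = prestige(3, 3)
```

Both calls return the same neutral score, which is the situation a ranking with separate positive and negative scores is meant to avoid.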
3. Ranking Method
3.1. Random Walking Model for Signed Network
At first, a signed network is denoted by a weighted graph . and are defined as follows:
We introduce a novel random walk model for signed networks (RWSN for short) that simulates the behavior of users accessing online social network sites. RWSN supposes that a walker randomly visits a user’s home page and then visits one of the neighbors of this page.
The walker could have a positive or negative emotion when accessing social networks, and the edges of social networks have positive or negative signs. The reasons why the walker has positive emotion may be the following ones:
The walker trusts the visited node.
The walker agrees with the visited node’s political views.
Some external factors are the reason.
In contrast, the walker gets into negative emotion because of the following:
The walker distrusts the visited node.
The walker disagrees with the visited node’s political views.
Some external factors are the reason.
In our model, a walker who travels through a negative link flips his/her emotion, whereas a walker who travels through a positive link keeps it unchanged. We define these rules according to the structural balance theory of signed networks.
For example, Figure 1 shows a walker named Alice visits an online social network with emotion. At first Alice visits node A with positive emotion; then she has two choices denoted by actions 1 and 2. In the case of action 1, she visits A’s friend B through a positive edge and keeps positive emotion. In action 2, she visits A’s enemy C through a negative edge and turns to negative emotion.
Figure 2 shows another example, in which Alice starts walking with negative emotion, and she has three choices. In action 1, she visits A’s friend B and remains unhappy. In action 2, she visits A’s enemy C and becomes happy. In action 3, Alice is tired of them and leaves; then she will visit a node in the network randomly with random emotion.
For each node i and each time step, we consider two probabilities: the probability of Alice visiting node i with positive emotion and the probability of Alice visiting it with negative emotion. The probability of Alice visiting node i with positive emotion at the next time step is calculated from the walkers that arrive with positive emotion through a positive link, the walkers that arrive with negative emotion through a negative link, and the walkers that arrive by a random jump.
In this calculation, the subscripts denote nodes in the signed network. A node belongs to the positive in-neighbor set of node i if there is a positive link going from it to node i, and to the negative in-neighbor set of node i if there is a negative link going from it to node i. The transition probability is the probability of Alice accessing one node after accessing another without taking Alice’s emotion into account, and it is computed from the out-degree of the node she leaves. The number of nodes in the network also enters the calculation, together with the probability of a random jump due to a bad mood of Alice, which we name the tiredness probability. The probability of Alice visiting node i with negative emotion at the next time step is calculated analogously, with the roles of positive and negative links exchanged.
Figure 3 shows an example of a trap: B treats only C as a friend, and vice versa. If Alice visits B with positive emotion, she will circulate between B and C forever, so this trap problem must be solved.
We use a hopping probability to solve the trap problem: after visiting a node, Alice jumps to a random node with this probability, no matter what emotion she has. The equations for updating the positive and negative visiting probabilities are corrected accordingly.
We can then use an iterative approach to update the two probabilities until they converge.
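As a sketch, the walk described in this section can be written in one self-consistent form. All symbols here are assumed notation and not necessarily the paper's own: $p_i^{+}(t)$ and $p_i^{-}(t)$ are the probabilities of visiting node $i$ with positive and negative emotion at time $t$, $N_i^{+}$ and $N_i^{-}$ are the sets of nodes with a positive or negative link to $i$, $d_j$ is the out-degree of node $j$, $n$ is the number of nodes, $\alpha$ is the tiredness probability, and $\beta$ is the hopping probability. The sketch further assumes that every node has at least one out-link, that tiredness applies only to a walker with negative emotion who has not already hopped, and that a jump lands on every (node, emotion) pair with equal probability.

```latex
\begin{aligned}
p_i^{+}(t+1) &= (1-\beta)\Big[\sum_{j \in N_i^{+}} r_{ji}\, p_j^{+}(t)
              + (1-\alpha) \sum_{j \in N_i^{-}} r_{ji}\, p_j^{-}(t)\Big] + \frac{J(t)}{2n},\\
p_i^{-}(t+1) &= (1-\beta)\Big[(1-\alpha) \sum_{j \in N_i^{+}} r_{ji}\, p_j^{-}(t)
              + \sum_{j \in N_i^{-}} r_{ji}\, p_j^{+}(t)\Big] + \frac{J(t)}{2n},\\
r_{ji} &= \frac{1}{d_j}, \qquad
J(t) = \beta + (1-\beta)\,\alpha \sum_{j} p_j^{-}(t).
\end{aligned}
```

Here $J(t)$ is the total probability mass that jumps at time $t$. Summing both updates over all nodes gives $(1-\beta)\big(1 - \alpha \sum_j p_j^{-}(t)\big) + J(t) = 1$, so total probability is conserved under these assumptions.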
3.2. Convergence Proof
where is a vector expressed as . In the above equation, P is a probability matrix, and the calculation method of P is provided as follows:
where represents the sum of row in and represents the sum of row in . We can figure out that the sum of each row in is 1.
According to Markov’s convergence theorem, the iteration is convergent.
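The row-sum property used in this argument can be checked numerically on a small example. The sketch below builds a transition matrix over (node, emotion) states; the state layout, the function name `transition_matrix`, and the way tiredness and hopping are combined are assumptions made for illustration and may differ from the matrix P defined in the paper.

```python
def transition_matrix(n, pos_edges, neg_edges, tiredness, hop):
    """Row-stochastic matrix over 2n states (hypothetical layout).

    State i      : walker at node i with positive emotion.
    State n + i  : walker at node i with negative emotion.
    """
    size = 2 * n
    out = [[] for _ in range(n)]
    for src, dst in pos_edges:
        out[src].append((dst, +1))
    for src, dst in neg_edges:
        out[src].append((dst, -1))

    P = [[0.0] * size for _ in range(size)]
    for src in range(n):
        for emotion, row in ((+1, src), (-1, n + src)):
            # Probability of following an out-edge instead of jumping.
            follow = 1.0 - hop
            if emotion < 0:
                follow *= 1.0 - tiredness  # a negative walker may also leave
            if not out[src]:
                follow = 0.0  # assumption: dangling nodes always jump
            for dst, sign in out[src]:
                # A negative edge flips the emotion, a positive edge keeps it.
                col = dst if emotion * sign > 0 else n + dst
                P[row][col] += follow / len(out[src])
            # The remaining mass jumps to a uniformly random state.
            for col in range(size):
                P[row][col] += (1.0 - follow) / size
    return P


# Toy network with nodes 0..3: 0 and {1, 2} are mutual enemies,
# 3 and {1, 2} are mutual friends.
neg = [(0, 1), (0, 2), (1, 0), (2, 0)]
pos = [(3, 1), (3, 2), (1, 3), (2, 3)]
P = transition_matrix(4, pos, neg, tiredness=0.5, hop=0.15)
row_sums = [sum(row) for row in P]
```

With a hopping probability greater than zero every entry of this matrix is strictly positive, so the chain is irreducible and aperiodic, which is the setting in which a unique steady state exists.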
3.3. SignRank Algorithm
The scores can be calculated by adjacency matrix operations according to the previous equations, but processing the sparse matrices costs a lot of time and memory. We therefore introduce a fast ranking algorithm referred to as SignRank. The input of SignRank includes the following: the positive edge set, the negative edge set, the tiredness probability, the hopping probability, the maximum number of iterations, and the stop threshold.
First, we initialize the positive and negative scores with equal values (lines 1-3). Then, during each iteration, we perform the following operations.
Each positive edge is accessed, and the scores of the source node are added to the corresponding scores of the destination node (lines 5-8).
Each negative edge is accessed, and the scores of the source node are added to the destination node with the signs exchanged: the positive score of the source node is added to the negative score of the destination node, and vice versa (lines 9-12).
Finally, we calculate the error. If it is less than the stop threshold, the algorithm terminates.
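The edge-by-edge iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sign_rank`, its parameter names, the rule that only a negative walker gets tired, the uniform redistribution of jumped mass over all (node, emotion) pairs, and the L1 error measure are all assumptions.

```python
from collections import defaultdict


def sign_rank(nodes, pos_edges, neg_edges, tiredness=0.5, hop=0.15,
              max_iter=1000, tol=1e-10):
    """Return (positive_scores, negative_scores) as dicts keyed by node."""
    n = len(nodes)
    out_deg = defaultdict(int)
    for src, _ in list(pos_edges) + list(neg_edges):
        out_deg[src] += 1

    # Initialize every positive and negative score with the same value.
    pos = {v: 1.0 / (2 * n) for v in nodes}
    neg = dict(pos)

    keep_pos = 1.0 - hop                      # positive walker follows an edge
    keep_neg = (1.0 - hop) * (1.0 - tiredness)  # negative walker may also leave

    for _ in range(max_iter):
        new_pos = dict.fromkeys(nodes, 0.0)
        new_neg = dict.fromkeys(nodes, 0.0)
        # Positive edges keep the walker's emotion.
        for src, dst in pos_edges:
            new_pos[dst] += keep_pos * pos[src] / out_deg[src]
            new_neg[dst] += keep_neg * neg[src] / out_deg[src]
        # Negative edges flip the walker's emotion.
        for src, dst in neg_edges:
            new_neg[dst] += keep_pos * pos[src] / out_deg[src]
            new_pos[dst] += keep_neg * neg[src] / out_deg[src]
        # Mass that did not follow an edge jumps to a random node and emotion.
        jumped = 1.0 - sum(new_pos.values()) - sum(new_neg.values())
        for v in nodes:
            new_pos[v] += jumped / (2 * n)
            new_neg[v] += jumped / (2 * n)
        error = sum(abs(new_pos[v] - pos[v]) + abs(new_neg[v] - neg[v])
                    for v in nodes)
        pos, neg = new_pos, new_neg
        if error < tol:
            break
    return pos, neg


# Toy network: A and {B, C} are mutual enemies, D and {B, C} are mutual
# friends (the edge directions are an assumption for this example).
nodes = ["A", "B", "C", "D"]
neg_edges = [("A", "B"), ("A", "C"), ("B", "A"), ("C", "A")]
pos_edges = [("D", "B"), ("D", "C"), ("B", "D"), ("C", "D")]
pos, neg = sign_rank(nodes, pos_edges, neg_edges, tiredness=0.5, hop=0.15)
```

Working on edge lists avoids building a dense matrix, which is the motivation given above for an edge-based algorithm. On this toy network the sketch gives node A a larger negative than positive score, and calling it with `tiredness=0.0` makes the two scores of every node equal.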
4. Experimental Results
In the experiments, we first use a simple example to verify the effectiveness of the algorithm and then compare it with other ranking algorithms to show that SignRank performs better.
An example of a signed network is shown in Figure 4. There are four nodes in the network. Node A is hostile to nodes B and C, which are also hostile to A, whereas node D is friendly with B and C. From PageRank’s point of view, the four nodes are the same. The results of SignRank are shown in Figures 5, 6, and 7, in which the tiredness probability is 0.9, 0.5, and 0, respectively. Figures 5 and 6 reveal that the negative rank of node A is significantly higher than its positive rank. Note that node A does not have any positive edge, yet its positive rank is not 0, because people who hate B and C may bring a positive rank to A.
In Figure 7 the tiredness probability is set to 0, which means that the walker never gets tired and never runs away. In this case, SignRank degenerates into a PageRank algorithm: positive and negative signs no longer have any influence on the ranking, and as a result the positive and negative scores of every node are equal.
4.2. Evaluation Method
It is very difficult to prove directly that one ranking algorithm is better than another, so we adopt an indirect method that many researchers have used to demonstrate the superiority of their algorithms [7, 9, 19]: the results of the ranking algorithm are used for sign prediction, and the quality of the prediction is used to evaluate the ranking algorithm. Sign prediction is an important field of research on signed networks. When a signed network contains an edge with an unknown sign, we predict the sign from the features of the edge. To implement sign prediction, we use the ranking scores to generate features for each edge. Here rep is the abbreviation of reputation, which represents the popularity of a node in the network, and the reputations of the two endpoints of the edge are used as features. Similarly, opt is the abbreviation of optimism, which quantifies the voting pattern of a node in the network, and the optimism values of the two endpoints are used as features. After the features are extracted for each edge, a classification model can be used for sign prediction.
Reputation and optimism can be calculated through the ranking score of nodes.
represents a set of nodes that have a positive edge pointing to , and represents a set of nodes that have a negative edge to .
represents a set of nodes pointed from i through positive edges.
In this paper, each node is measured by both a positive and a negative score, so we can extend rep to a positive and a negative variant, which show the popularity and unpopularity of the node. They are calculated as follows:
Correspondingly, opt is extended to a positive and a negative variant.
Therefore, in this paper, we generate eight features, denoted by the vector v, for each edge and then use logistic regression for sign prediction.
4.3. Evaluation Metrics
We choose accuracy, recall, precision, and F1 to evaluate the quality of our method and the comparative methods. They are defined as follows:
(i) Accuracy is the proportion of edges whose signs are predicted correctly.
(ii) Recall is the proportion of actually positive edges that are predicted correctly.
(iii) Precision is the proportion of predicted positive edges that are predicted correctly.
(iv) F1 is the harmonic mean of precision and recall.
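The four metrics can be computed from true and predicted edge signs as in the following sketch. The function name `sign_prediction_metrics`, the encoding of signs as +1 and -1, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def sign_prediction_metrics(true_signs, predicted_signs):
    """Accuracy, recall, precision, and F1 with positive edges as the target class."""
    pairs = list(zip(true_signs, predicted_signs))
    tp = sum(1 for t, p in pairs if t > 0 and p > 0)  # positive, predicted positive
    tn = sum(1 for t, p in pairs if t < 0 and p < 0)  # negative, predicted negative
    fp = sum(1 for t, p in pairs if t < 0 and p > 0)  # negative, predicted positive
    fn = sum(1 for t, p in pairs if t > 0 and p < 0)  # positive, predicted negative

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Five edges: three truly positive, two truly negative.
metrics = sign_prediction_metrics([1, 1, 1, -1, -1], [1, 1, -1, 1, -1])
```

In this small example three of the five signs are predicted correctly, and two of the three positive edges are recovered with one false positive, so recall, precision, and F1 coincide.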
4.4. Comparative Methods
To study the performance of our algorithm, we apply it to sign prediction and compare it with the following ranking algorithms.
(i) PageRank is a page scoring algorithm proposed by Larry Page of Google. We calculate the PageRank value for each node and consider that .
(ii) HITS is an algorithm for analyzing the link topology of web pages. We use the authority value as the score of a node and consider that .
(iii) Modified PageRank divides the signed network into two subgraphs, one containing only the positive edges and one containing only the negative edges, and uses the PageRank algorithm to calculate the node scores for each subgraph.
(iv) TrollTrust treats the positive edges in the signed network as trust and the negative edges as distrust. The calculated value represents the reliability of each node.
4.5. Experimental Results
Epinions: Epinions.com is a consumer review website where members present their opinions toward each other, and these opinions can be trusted or distrusted. Epinions records these trust or distrust relationships.
Slashdot: slashdot.org is a technology-related news website where users could tag each other as friend or foe. Slashdot records these friend or foe relationships.
Wiki-RFA: Wikipedia is a free online encyclopedia. If a Wikipedia editor wants to become an administrator, a request for adminship (RfA) must be submitted. Any Wikipedia member may cast a supporting, neutral, or opposing vote. Wiki-RFA records these supporting or opposing relationships.
We execute SignRank and comparison methods (PageRank, Hits, MPR, and TrollTrust) on Slashdot, Epinions, and Wiki-RFA datasets. Then we calculate features according to (17), (18), (19), and (20) for each edge based on node scores. At last, we train classifiers for sign prediction. Our experiments use 10-fold cross-validation and all results are the average of 10 repeated calculations.
Figures 8, 9, and 10 show the performance comparison of the five algorithms on Slashdot, Epinions, and Wiki-RFA, respectively. The prediction accuracies of SignRank on the three datasets are 91%, 97%, and 90%, which are better than those of the other algorithms on all datasets. SignRank is also the top performer in prediction precision, with 93%, 97%, and 90%, respectively. Our recall is a little lower than that of the comparison algorithms; however, our F1 scores are better. When precision and recall pull in opposite directions, the F1 score is the most important measure. Therefore, SignRank performs better in sign prediction.
5. Conclusions and Summary
This paper presents a novel random walk model for signed networks. It simulates the action of visiting online social network websites with emotion: when the visitor feels unhappy, he/she leaves. In this way, the ranking score has a clear semantic interpretation in our model, namely, the steady-state probability of the walker visiting the node with a given emotion. Furthermore, this paper presents an iterative algorithm named SignRank, described in Algorithm 1, to calculate these probabilities for each node. We also apply our method to sign prediction, and the results show that it performs better than the compared methods.
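Algorithm 1: SignRank.
|Input: positive edge set, negative edge set, tiredness probability, hopping probability, maximum number of iterations, stop threshold|
|1-3 initialize the positive and negative scores of all nodes with equal values|
|4 while the maximum number of iterations has not been reached do|
|5 for each positive edge do (lines 5-8: add the scores of the source node to the corresponding scores of the destination node)|
|9 for each negative edge do (lines 9-12: add the positive score of the source node to the negative score of the destination node, and vice versa)|
|14 for each in do|
|17 for each in do|
|21 if the error is less than the stop threshold then|
|22 return the positive and negative scores;|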
The datasets (Epinions, Slashdot, and Wiki-RFA) used to support the findings of this study are open access and they can be downloaded from http://snap.stanford.edu/data/.
Conflicts of Interest
The authors declare no conflicts of interest.
This work was supported by the National Natural Science Foundation of China under Grants 61702089 and 61501102 and the Basic Scientific Research Operating Foundation of central universities under Grant N182304021, and the Science and Technology Support Program of Northeastern University at Qinhuangdao (XNK201401).
J. Tang, Y. Chang, C. Aggarwal, and H. Liu, “A survey of signed network mining in social media,” ACM Computing Surveys, vol. 49, no. 3, December 2016.
X. Zhao, B. Yang, X. Liu, and H. Chen, “Statistical inference for community detection in signed networks,” Physical Review E, vol. 95, no. 4, Article ID 042313, 2017.
L. Page, “The pagerank citation ranking: bringing order to the web,” Stanford Digital Libraries Working Paper, vol. 9, no. 1, pp. 1–14, 1999.
M. Shahriari and M. Jalili, “Ranking nodes in signed social networks,” Social Network Analysis and Mining, vol. 4, no. 1, p. 172, 2014.
P. Symeonidis and C. Perentis, “Link prediction in multi-modal social networks,” in Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 147–162, 2014.
J. Jung, W. Jin, L. Sael, and U. Kang, “Personalized ranking in signed networks using signed random walk with restart,” in Proceedings of the 16th IEEE International Conference on Data Mining, ICDM 2016, pp. 973–978, Spain, December 2016.
D. Cartwright and F. Harary, “Structural balance: a generalization of heider’s theory,” Psychological Review, vol. 63, no. 5, pp. 9–25, 1977.
T. Derr, C. Wang, S. Wang, and J. Tang, “Signed network node relevance measurements,” Physics and Society, 2017.
C. de Kerchove and P. van Dooren, “The pagetrust algorithm: how to rank web pages when negative links are allowed?” in Proceedings of the 8th SIAM International Conference on Data Mining, pp. 346–352, April 2008.
Z. Wu, C. C. Aggarwal, and J. Sun, “The troll-trust model for ranking in signed networks,” in Proceedings of the 9th ACM International Conference on Web Search and Data Mining, WSDM 2016, pp. 447–456, USA, February 2016.
K. Zolfaghar and A. Aghaie, “Mining trust and distrust relationships in social web applications,” in Proceedings of the 2010 IEEE 6th International Conference on Intelligent Computer Communication and Processing, ICCP10, pp. 73–80, Romania, August 2010.