Ranking Spreaders in Complex Networks Based on the Most Influential Neighbors

Yi, Zelong; Wu, Xiaokun; Li, Fan

doi:https://doi.org/10.1155/2018/3649079

Discrete Dynamics in Nature and Society


Research Article | Open Access

Volume 2018 | Article ID 3649079 | https://doi.org/10.1155/2018/3649079

Zelong Yi, Xiaokun Wu, and Fan Li

Academic Editor: Seenith Sivasundaram

Received 21 Apr 2018

Revised 05 Jul 2018

Accepted 09 Aug 2018

Published 23 Aug 2018

Abstract

Identifying influential spreaders in complex networks is crucial for containing virus spread, accelerating information diffusion, and promoting new products. In this paper, inspired by the effect of leaders on social ties, we propose the most influential neighbors’ k-shell (MINK) index, which for each node is the weighted sum of the products of its own k-shell value and the k-shell values of the nodes with the maximum k-shell value. We apply the classical Susceptible-Infected-Recovered (SIR) model to verify the performance of our method. The experimental results on both real and artificial networks show that the proposed method can quantify the node influence more accurately than degree centrality, betweenness centrality, closeness centrality, and the k-shell decomposition method.

1. Introduction

Identifying influential spreaders is of theoretical and practical significance in understanding the dynamics of spreading in a complex network. It is conducive to containing epidemic spread, studying information dissemination, and controlling virus diffusion [1–14]. To this end, researchers have developed various methods, such as degree centrality (DC) [15], betweenness centrality (BC) [16], closeness centrality (CC) [17], and semilocal centrality [18], to identify the most influential spreaders in a network.

Kitsak et al. [1] argue that the most efficient spreaders are located at the core of a network. By using the k-shell decomposition (KS) analysis [19–21], the location of each node is defined as an integer index, the k-shell value, according to successive layers in a network. A small k-shell value represents the periphery of the network, while a large k-shell value identifies the most influential nodes; however, many nodes with an identical k-shell value have different spreading influence. In other words, the k-shell method has relatively poor monotonicity. To address this issue, many methods have been proposed to improve the effectiveness of the k-shell method. For example, Zeng et al. [22] designed a mixed degree decomposition method to rank spreaders according to the links connecting to both the remaining nodes and the removed nodes. Bae et al. [23] defined a coreness centrality index by summing all neighbors’ k-shell values and found that this method provides a more monotonic ranking list than other ranking methods. Ma et al. [24] proposed a gravity model by considering the k-shell value of each node as its mass and the shortest distance between two nodes in a network as their distance.

In this paper, we propose a novel influence measure, the most influential neighbors’ k-shell (MINK) method, to quantify the spreading capability of a node. Here, we refer to the nodes with the largest k-shell value as the most influential neighbors. Inspired by the effect of leaders on social ties, we identify a spreader’s influence by focusing on its interaction with the most influential neighbors. By using the k-shell method to define the most influential neighbors, our proposal also takes into account a node’s complex integration with other nodes in the network. It is worth noting that, unlike the gravity model, which considers the interaction with a node’s neighbors within a given distance value r, MINK does not depend on subjectively chosen parameters. According to structural holes theory [25], structural equivalence occurs among a node’s neighbors in networks that are lacking in structural holes (“structural holes” are network gaps between unconnected nodes, creating opportunities for unique information access and control), which is to say that the neighbors tend to have strong ties with each other and bring redundant information to the node. The MINK index originates from the idea that an influential neighbor generally has more unique information and therefore exerts more influence on a node’s spreading capacity than other neighbors. In this case, the MINK index, which excludes repeated information from a node’s less influential neighbors, is more refined and less costly in computation.

The rest of this paper is organized as follows. We briefly review previous studies and present our method in Section 2. In Section 3, we apply the Susceptible-Infected-Recovered (SIR) model to evaluate the performance of our proposed method in both real and synthetic example networks. Section 4 concludes the paper.

2. The MINK Index

Normally, a network $G$ with $n$ nodes and $m$ edges can be described by an adjacency matrix $A = (a_{ij})$, where $a_{ij} = 1$ if node $i$ is connected to node $j$, and $a_{ij} = 0$ otherwise. For the sake of simplicity, $G$ is viewed as an undirected, unweighted, and simple network.

The degree centrality, based on local information, defines the influence of a node as the number of its adjacent vertices. The degree of a node $i$ can be expressed as follows:

$$k_i = \sum_{j} a_{ij},$$

where $a_{ij}$ is the adjacency matrix entry for nodes $i$ and $j$.

The betweenness centrality measures the fraction of all shortest paths between each node pair that pass through the considered node $i$. It can be described as

$$BC(i) = \sum_{s \neq i \neq t} \frac{g_{st}^{i}}{g_{st}},$$

where $g_{st}$ denotes the total number of shortest paths between vertex $s$ and vertex $t$ and $g_{st}^{i}$ stands for the number of shortest paths from $s$ to $t$ travelling through vertex $i$. The higher the betweenness score, the more likely a node is a hub vertex, which acts as an information transfer station in a network.

The closeness centrality is introduced to measure how long it takes for the information possessed by a node to propagate through a network. The closeness centrality of node $i$ is defined as the reciprocal of the average of shortest distances to all the other nodes:

$$CC(i) = \frac{n - 1}{\sum_{j \neq i} d_{ij}},$$

where $n$ is the number of all nodes and $d_{ij}$ stands for the geodesic distance between vertex $i$ and vertex $j$.
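The degree and closeness definitions above can be illustrated with a minimal sketch. It assumes the network is stored as a dictionary mapping each node to the set of its neighbors, and that the graph is connected with at least two nodes; the function names and the toy path network are hypothetical and are not taken from the paper.

```python
from collections import deque


def degree_centrality(adj):
    """Degree of each node: the number of its adjacent vertices."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def bfs_distances(adj, source):
    """Geodesic (shortest-path) distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness_centrality(adj):
    """Reciprocal of the average distance to all other nodes.

    Assumes a connected graph with at least two nodes.
    """
    n = len(adj)
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(d for u, d in dist.items() if u != v)
        result[v] = (n - 1) / total
    return result


# Hypothetical toy network: the path 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
degrees = degree_centrality(path)
closeness = closeness_centrality(path)
```

On this toy path the two inner nodes have both the larger degree and the larger closeness, which is why the two measures often agree on small, regular graphs.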

Bao et al. [18] put forward a semilocal centrality that accounts simultaneously for the shortest distance, the number of shortest paths, and the transmission rate. For a node $i$, it is computed from the number of shortest paths between nodes $i$ and $j$, the average degree of the network, and the neighborhood set of nodes whose distance to node $i$ is less than or equal to a coverage radius. The value of the coverage radius is set as in [18].

The k-shell decomposition method [1] endows all nodes with a corresponding k-shell value by removing nodes iteratively as follows. First, we start with removing all nodes with degree $k = 1$ and continue dropping the remaining nodes until no node with $k \le 1$ exists in the network. All nodes removed are assigned the k-shell value $ks = 1$. Secondly, we iteratively remove all nodes with degree $k \le 2$ until no node with $k \le 2$ exists in the network. All of these removed nodes are assigned $ks = 2$. Next, we repeat this process until all nodes are removed and assigned a corresponding k-shell value. In the end, each node is defined by the KS index, according to its relative topological location in the network.
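The pruning procedure described above can be sketched as follows. This is an illustrative implementation under the assumption that the network is given as a dictionary from each node to the set of its neighbors; the function name and the toy graph are hypothetical. Isolated nodes, which the text does not discuss, receive the value 0 here.

```python
def k_shell(adj):
    """Return {node: k-shell value} for an undirected simple graph."""
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    shell = {}
    k = 0
    while remaining:
        # Move k up to the smallest degree still present in the graph.
        k = max(k, min(len(nbrs) for nbrs in remaining.values()))
        # Keep stripping nodes of remaining degree <= k until none is left.
        while True:
            low = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
            if not low:
                break
            for v in low:
                shell[v] = k
                # Detach v from every neighbor that is still in the graph.
                for w in remaining.pop(v):
                    remaining[w].discard(v)
    return shell


# Hypothetical toy graph: a triangle (0, 1, 2) with a tail 0 - 3 - 4
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
shells = k_shell(toy)
```

Nodes 3 and 4 lie on the tail and are peeled off first, while the triangle survives to the next layer. Both tail nodes share the same k-shell value even though one is closer to the core, which is the resolution problem the following paragraph addresses.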

However, with the k-shell decomposition, too many nodes with different spreading influences are assigned the identical KS index. To improve the monotonicity of the k-shell decomposition, we propose the most influential neighbors’ k-shell (MINK) index, which is inspired by the effect of leaders on social ties. Specifically, leaders in social networks share information, provide advice, assign work, and collaborate with other members in the networks. The influence of other members on the network is largely determined by their connection to the leaders. Therefore, we measure the spreading ability of a node based on its interaction with its most influential neighbors, characterized by the largest k-shell values. On the one hand, a node will have a greater influence if its most influential neighbor or the node itself has a higher KS value; on the other hand, the effect increases as their distance shortens. In this way, the influence of node $i$ is measured by a distance-weighted sum, taken over every node $j$ in the set of spreaders with the maximum k-shell value, of the product of the k-shell values of $i$ and $j$, where the weight decreases with $d_{ij}$, the shortest path distance between node $i$ and node $j$.
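A rough sketch of such an index is given below. The exact distance weighting is not recoverable from the text, so the code assumes an inverse-square weight borrowed from the gravity formula; that choice, the decision to skip the term where a node would be paired with itself, the connected-graph requirement, and all function names are assumptions rather than the authors' definition.

```python
from collections import deque


def k_shell(adj):
    """k-shell value of every node, by iterative pruning."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    shell, k = {}, 0
    while remaining:
        k = max(k, min(len(nbrs) for nbrs in remaining.values()))
        while True:
            low = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
            if not low:
                break
            for v in low:
                shell[v] = k
                for w in remaining.pop(v):
                    remaining[w].discard(v)
    return shell


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mink(adj, weight=lambda d: 1.0 / d ** 2):
    """MINK-style score: sum of ks(i) * ks(j) * weight(d_ij) over the top shell.

    The default inverse-square weight is an assumption, not taken from the paper.
    The graph is assumed to be connected.
    """
    ks = k_shell(adj)
    ks_max = max(ks.values())
    top = [v for v in adj if ks[v] == ks_max]  # nodes with the maximum k-shell value
    scores = {}
    for i in adj:
        dist = bfs_distances(adj, i)
        # Skip j == i, where the distance is zero (an assumption).
        scores[i] = sum(ks[i] * ks[j] * weight(dist[j]) for j in top if j != i)
    return scores


# Hypothetical toy graph: a triangle (0, 1, 2) with a tail 0 - 3 - 4
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
scores = mink(toy)
```

On the toy graph, nodes 3 and 4 share the same k-shell value but receive different scores because node 3 sits closer to the core, which is the kind of tie-breaking the index is meant to provide.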

3. Empirical Results

The Susceptible-Infected-Recovered (SIR) model [26] is a simulation process that mimics epidemic spreading. It is widely used by scholars to identify the spreading capacity of nodes and is also adopted in the study of vaccination strategy and infection control [27–29]. In principle, the SIR model detects the influential vertices because key nodes are more likely to play an indispensable role in information and viral transmission, and thereby an effective ranking is supposed to stand the test of real spreading coverage.

Therefore, we employ the standard SIR model herein to evaluate the performance of our proposed model. It starts by setting one node as infected and the remaining nodes as susceptible. At each step, every infected node infects each of its susceptible neighbors at the spreading rate and then recovers with the recovery probability. The process continues until all infected nodes have recovered and no infected nodes are left in the network. The spreading influence of a node is obtained by counting the nodes that have been infected by the end of the process. In this paper, both the spreading rate and the recovery probability are held fixed. By using a relatively small infection probability, we avoid the situation where most nodes of a network are infected easily, so that the different influence of each node cannot be detected.
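The simulation loop described above might be sketched as follows. The parameter names `beta` (spreading rate) and `mu` (recovery probability), the averaging over repeated runs, and the toy network are assumptions made for illustration; the parameter values used in the paper are not reproduced here. The loop only terminates if `mu` is greater than zero.

```python
import random


def sir_spread(adj, seed_node, beta, mu, rng):
    """Run one SIR process and return how many nodes were ever infected."""
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in infected:
            for w in adj[u]:
                susceptible = (
                    w not in infected
                    and w not in recovered
                    and w not in newly_infected
                )
                if susceptible and rng.random() < beta:
                    newly_infected.add(w)
        # After trying to infect its neighbors, each infected node
        # recovers with probability mu.
        still_infected = set()
        for u in infected:
            if rng.random() < mu:
                recovered.add(u)
            else:
                still_infected.add(u)
        infected = still_infected | newly_infected
    return len(recovered)


def spreading_influence(adj, seed_node, beta, mu, runs=100, seed=0):
    """Average final outbreak size over several independent runs."""
    rng = random.Random(seed)
    total = sum(sir_spread(adj, seed_node, beta, mu, rng) for _ in range(runs))
    return total / runs


# Hypothetical toy network: the path 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
full_outbreak = sir_spread(path, 0, 1.0, 1.0, random.Random(0))
no_outbreak = sir_spread(path, 0, 0.0, 1.0, random.Random(0))
```

With `beta = 1` the whole path is infected and with `beta = 0` only the seed is counted; the interesting regime lies between these extremes, where the outbreak size depends on the position of the seed node.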

To check the performance of our proposed method, six real networks are introduced in this paper, including Dolphins (friendship) [30], USAir97 (US air flights network), C.elegans (neural) [31], Email (communication) [32], PGP (an encrypted communication network) [33], and Internet (router level). For simplicity, we view these networks as simple, undirected, and unweighted networks. The statistical properties of the six real networks are listed in Table 1, including the number of nodes, the number of edges, the degree heterogeneity, the degree assortativity, the clustering coefficient, and the average shortest path length.

Next, applying these real networks, we compare the effectiveness of our proposed method with degree centrality, the k-shell method, betweenness centrality, and closeness centrality. Both the resolution and the correctness of these ranking methods are studied.

First, following the literature [24], we define the monotonicity index $M(R)$ to quantify the resolution of different ranking methods, as follows:

$$M(R) = \left[ 1 - \frac{\sum_{r \in R} n_r (n_r - 1)}{n (n - 1)} \right]^{2},$$

where $R$ is the ranking list, $n$ is the size of the network, and $n_r$ is the number of the nodes with the same ranking result $r$ when implementing an algorithm. By definition, a ranking method with the monotonicity index closer to 1 has a higher resolution to distinguish nodes’ different influence. If $M(R) = 1$, the ranking method is perfectly monotonic, and each node is identified by a different index value. The monotonicity indexes for different ranking methods are summarized in Table 2. The results suggest that our proposed method generates higher resolution values than degree centrality, the k-shell method, and betweenness centrality do in all six of the real networks. The monotonicity index of our method is close to 1 in networks C.elegans, Dolphins, and Internet. In addition, we find that although the k-shell method may identify the most influential spreaders, its resolution is relatively low in these six networks, implying that the different influences of spreaders are not distinguished. This means that it is necessary to develop alternative methods to overcome the disadvantage of the k-shell method.
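As a small illustration of the monotonicity index, the sketch below counts tied ranks and applies the definition directly. It assumes the ranking is given as a dictionary from node to score, that ties are detected by exact equality, and that the network has at least two nodes; the function name and the sample scores are hypothetical.

```python
from collections import Counter


def monotonicity(scores):
    """Monotonicity index of a ranking given as {node: score}.

    Returns 1.0 when every node has a distinct score and 0.0 when all
    nodes share the same score. Requires at least two nodes.
    """
    n = len(scores)
    ties = Counter(scores.values())  # n_r for each distinct rank value r
    tied_pairs = sum(c * (c - 1) for c in ties.values())
    return (1 - tied_pairs / (n * (n - 1))) ** 2


all_distinct = monotonicity({"a": 4, "b": 3, "c": 2, "d": 1})
all_equal = monotonicity({"a": 1, "b": 1, "c": 1, "d": 1})
one_tie = monotonicity({"a": 1, "b": 1, "c": 2, "d": 3})
```

In the last case one pair of nodes shares a score, so the index drops to $(1 - 2/12)^2 = 25/36$, which shows how even a single tie lowers the resolution of a ranking on a small network.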

Secondly, Kendall’s tau rank correlation coefficient [34] is used to quantify the correctness of the ranking methods. Let $(x_i, y_i)$ and $(x_j, y_j)$ be a pair of joint observations that are randomly selected from ranking lists $X$ and $Y$. The observations $(x_i, y_i)$ and $(x_j, y_j)$ are concordant if both $x_i > x_j$ and $y_i > y_j$ or if both $x_i < x_j$ and $y_i < y_j$. They are said to be discordant if $x_i > x_j$ and $y_i < y_j$ or if $x_i < x_j$ and $y_i > y_j$. If $x_i = x_j$ or $y_i = y_j$, the pair is neither concordant nor discordant. Kendall’s tau coefficient is defined as

$$\tau = \frac{n_c - n_d}{0.5\, n (n - 1)},$$

where $n_c$ is the number of concordant pairs, $n_d$ is the number of discordant pairs, and $n$ is the size of a network. Kendall’s tau lies within $[-1, 1]$, and larger values imply a higher level of correlation between the SIR model and the compared method. Kendall’s tau is affected by the network infection rate. In this paper, we vary the infection rate to derive Kendall’s tau under different infection rates. Note that the infection rate cannot be too large because, with a large infection rate, the whole network will be easily infected so that the influences of different nodes cannot be distinguished. The average values of Kendall’s tau over these infection rates for different ranking methods are summarized in Table 3. The results indicate that our proposed model generally outperforms existing models and is especially effective in networks USAir97 and C.elegans.
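The pair counting behind Kendall’s tau can be sketched directly from the definition above. The function below compares every pair of positions in two equally long score lists, so it runs in quadratic time and is meant only as an illustration; the function name and the sample lists are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau between two equally long score lists.

    Tied pairs count as neither concordant nor discordant.
    Requires at least two observations.
    """
    n = len(x)
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (0.5 * n * (n - 1))


same_order = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])
reverse_order = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau([1, 2, 3], [1, 3, 2])
```

In the setting of this section, `x` would hold the scores produced by a ranking method and `y` the spreading influence of the same nodes measured by the SIR simulation.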

We also show how Kendall’s tau changes with the infection rate for different methods in Figure 1. As shown in Figure 1, in most cases, our proposed method achieves a better performance than other methods. As the infection rate increases, Kendall’s tau for the k-shell method is generally positively correlated with the value for our method, but the former is smaller than the latter. This implies that our method yields higher correctness than the k-shell method does.

Besides real networks, we also check the effectiveness of our method on a typical synthetic network generated by the Barabási-Albert (BA) model [35]. Creating the BA network starts with a small initial network. Then, at each step, a new node is added to the network and connected to a fixed number of existing nodes according to the preferential attachment mechanism. In this paper, the size of the initial network and the number of links per new node are held fixed.
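The growth process can be sketched as below. The text does not specify the structure of the initial network, so the code assumes a complete graph on `m0` nodes; the parameter names `m0` and `m`, the function name, and the example values are likewise assumptions and are not the settings used in the paper.

```python
import random


def barabasi_albert(n, m0, m, rng):
    """Grow a BA-style network with n nodes as {node: set of neighbors}.

    Assumes a complete graph on m0 nodes as the seed network; each new
    node then attaches to m distinct existing nodes chosen with
    probability proportional to their degree.
    """
    if not (2 <= m0 < n and 1 <= m <= m0):
        raise ValueError("need 2 <= m0 < n and 1 <= m <= m0")
    adj = {v: set() for v in range(m0)}
    for u in range(m0):
        for v in range(u + 1, m0):
            adj[u].add(v)
            adj[v].add(u)
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw of a node.
    stubs = [v for v, nbrs in adj.items() for _ in nbrs]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend((new, t))
    return adj


# Hypothetical example values, not those used in the paper
graph = barabasi_albert(50, 3, 2, random.Random(1))
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
```

Because early nodes accumulate stubs faster than late ones, the resulting degree distribution is strongly skewed, which is the property that makes BA networks a common synthetic benchmark.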

For the BA network, we calculate Kendall’s tau rank correlation coefficients for DC, CC, BC, and our proposed model. Figure 2 shows that MINK performs better than CC and much better than DC and BC. Note that all nodes in the BA network are assigned the same k-shell value, so we do not consider the k-shell method in our comparison. The average tau values using different methods are listed in Table 4. The results indicate that our model outperforms existing models.

4. Conclusion

In this paper, we propose the MINK index to measure the ability of spreaders in complex networks using the neighbors with the largest k-shell values. Our method is based on the facts that a node's spreading ability is proportional to the k-shell values of itself and its most influential neighbors and decreases with the distances between itself and these neighbors. By using real networks and a synthetic network generated by the BA model, we compare our method with the degree centrality, the betweenness centrality, the closeness centrality, and the k-shell decomposition method. The empirical results suggest that our method produces a more monotonic ranking than the degree centrality, the k-shell method, and the betweenness centrality in all six real networks. Moreover, in most cases, the ranking result of our method is more highly correlated with the epidemic spreading range than those of other well-known methods.

Some limitations of our method need to be addressed. First, we only investigated the performance of our method in some typical networks, and the classical SIR model was used to mimic the epidemic spreading process. In practice, the structure of a network and the spreading dynamics can be different. Thus, the effectiveness of this method needs to be tested more generally. Second, our MINK index is weighted by the distance between a node and its most influential neighbors, but this distance cannot be calculated if these nodes are not connected. Therefore, our method is not appropriate for identifying spreaders’ influence in an unconnected network.

Data Availability

The Dolphins and Internet network data used to support the findings of this study are available from Mark Newman's network data repository (http://www-personal.umich.edu/~mejn/netdata/). The C.elegans, Email, and PGP network data used to support the findings of this study are available from Alex Arenas' data sets (http://deim.urv.cat/~alexandre.arenas/data/welcome.htm). The USAir97 network data used to support the findings of this study are available from the Pajek datasets of Vladimir Batagelj and Andrej Mrvar (2006) (http://vlado.fmf.uni-lj.si/pub/networks/data/).

Disclosure

Any errors in this work are our own; the funders bear no responsibility for them.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful to Tian Bian for kindly providing help. Financial support from the National Natural Science Foundation of China (no. 71771154 and no. 71702109), the Natural Science Foundation of Guangdong Province (no. 2017A030310304 and no. 2017A030310566), and the Research Foundation of Shenzhen University (no. CCSEZR1810) is also gratefully acknowledged.

References

[1] M. Kitsak, L. K. Gallos, S. Havlin et al., “Identification of influential spreaders in complex networks,” Nature Physics, vol. 6, no. 11, pp. 888–893, 2010.
[2] P. Basaras, D. Katsaros, and L. Tassiulas, “Detecting influential spreaders in complex, dynamic networks,” The Computer Journal, vol. 46, no. 4, pp. 24–29, 2013.
[3] L. Lü, Y.-C. Zhang, C. H. Yeung, and T. Zhou, “Leaders in social networks, the delicious case,” PLoS ONE, vol. 6, no. 6, Article ID e21202, 2011.
[4] J. Borge-Holthoefer and Y. Moreno, “Absence of influential spreaders in rumor dynamics,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 85, no. 2, 2012.
[5] K. Klemm, M. Á. Serrano, V. M. Eguíluz, and M. S. Miguel, “A measure of individual role in collective dynamics,” Scientific Reports, vol. 2, 2012.
[6] D. Wei, X. Deng, X. Zhang, Y. Deng, and S. Mahadevan, “Identifying influential nodes in weighted networks based on evidence theory,” Physica A: Statistical Mechanics and Its Applications, vol. 392, no. 10, pp. 2564–2575, 2013.
[7] J.-G. Liu, Z.-M. Ren, and Q. Guo, “Ranking the spreading influence in complex networks,” Physica A: Statistical Mechanics and Its Applications, vol. 392, no. 18, pp. 4154–4159, 2013.
[8] D. Chen, R. Xiao, A. Zeng, and Y. Zhang, “Path diversity improves the identification of influential spreaders,” EPL (Europhysics Letters), vol. 104, no. 6, Article ID 68006, 2013.
[9] Z. Ren, A. Zeng, D. Chen, H. Liao, and J. Liu, “Iterative resource allocation for ranking spreaders in complex networks,” EPL (Europhysics Letters), vol. 106, no. 4, Article ID 48005, 2014.
[10] L.-F. Zhong, J.-G. Liu, and M.-S. Shang, “Iterative resource allocation based on propagation feature of node for identifying the influential nodes,” Physics Letters A, vol. 379, no. 38, pp. 2272–2276, 2015.
[11] X. Zhao, B. Huang, M. Tang, H. Zhang, and D. Chen, “Identifying effective multiple spreaders by coloring complex networks,” EPL (Europhysics Letters), vol. 108, no. 6, Article ID 68005, 2014.
[12] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, “Critical phenomena in complex networks,” Reviews of Modern Physics, vol. 80, no. 4, pp. 1275–1335, 2008.
[13] Y. Li, P. Pin, C. Wu, and A. Barrat, “A Network Centrality Method for the Rating Problem,” PLoS ONE, vol. 10, no. 4, Article ID e0120247, 2015.
[14] S. Derrible and P. Holme, “Network Centrality of Metro Systems,” PLoS ONE, vol. 7, no. 7, Article ID e40575, 2012.
[15] P. Bonacich, “Factoring and weighting approaches to status scores and clique identification,” Journal of Mathematical Sociology, vol. 2, pp. 113–120, 1972.
[16] L. C. Freeman, “A set of measures of centrality based on betweenness,” Sociometry, vol. 40, no. 1, pp. 35–41, 1977.
[17] G. Sabidussi, “The centrality index of a graph,” Psychometrika, vol. 31, pp. 581–603, 1966.
[18] Z.-K. Bao, C. Ma, B.-B. Xiang, and H.-F. Zhang, “Identification of influential nodes in complex networks: Method from spreading probability viewpoint,” Physica A: Statistical Mechanics and Its Applications, vol. 468, pp. 391–397, 2017.
[19] B. Bollobás, “Theory and Combinatorics,” in Proceedings of the Cambridge Combinatorial Conference in Honour of Paul Erdös, Academic Press, London, UK, 1984.
[20] S. B. Seidman, “Network structure and minimum degree,” Social Networks, vol. 5, no. 3, pp. 269–287, 1983.
[21] S. Carmi, S. Havlin, S. Kirkpatrick, Y. Shavitt, and E. Shir, “A model of Internet topology using k-shell decomposition,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 27, pp. 11150–11154, 2007.
[22] A. Zeng and C.-J. Zhang, “Ranking spreaders by decomposing complex networks,” Physics Letters A, vol. 377, no. 14, pp. 1031–1035, 2013.
[23] J. Bae and S. Kim, “Identifying and ranking influential spreaders in complex networks by neighborhood coreness,” Physica A: Statistical Mechanics and Its Applications, vol. 395, pp. 549–559, 2014.
[24] L.-L. Ma, C. Ma, and H.-F. Zhang, “Identifying influential spreaders in complex networks based on gravity formula,” Physica A: Statistical Mechanics and Its Applications, vol. 451, pp. 205–212, 2016.
[25] R. Burt, Structural Holes: The Social Structure of Competition, Harvard University Press, USA, 1992.
[26] Y. Moreno, R. Pastor-Satorras, and A. Vespignani, “Epidemic outbreaks in complex heterogeneous networks,” The European Physical Journal B, vol. 26, no. 4, pp. 521–529, 2002.
[27] H.-F. Zhang, P.-P. Shu, Z. Wang, M. Tang, and M. Small, “Preferential imitation can invalidate targeted subsidy policies on seasonal-influenza diseases,” Applied Mathematics and Computation, vol. 294, pp. 332–342, 2017.
[28] H. Zhang, J. Zhang, C. Zhou, M. Small, and B. Wang, “Hub nodes inhibit the outbreak of epidemic under voluntary vaccination,” New Journal of Physics, vol. 12, no. 2, Article ID 023015, 2010.
[29] R. Pastor-Satorras and A. Vespignani, “Immunization of complex networks,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 65, no. 3, Article ID 036104, 2002.
[30] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, “The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations,” Behavioral Ecology and Sociobiology, vol. 54, no. 4, pp. 396–405, 2003.
[31] J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 72, no. 2, Article ID 027104, 2005.
[32] R. Guimerà, L. Danon, A. Díaz-Guilera, F. Giralt, and A. Arenas, “Self-similar community structure in a network of human interactions,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 68, no. 6, Article ID 065103, 2003.
[33] M. Boguñá, R. Pastor-Satorras, A. Díaz-Guilera, and A. Arenas, “Models of social networks based on social distance attachment,” Physical Review E, vol. 70, no. 5, Article ID 056122, 2004.
[34] M. G. Kendall, “A New Measure of Rank Correlation,” Biometrika, vol. 30, no. 1-2, pp. 81–93, 1938.
[35] A. Barabasi and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.

Copyright

Copyright © 2018 Zelong Yi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
