Abstract

Assessing and measuring the importance of nodes in a complex network is of great theoretical and practical significance for improving the robustness of real systems and designing efficient system structures. Classical local centrality measures take only the number of a node's neighbors into consideration and ignore the topological relations and interactions among those neighbors. Global centrality measures, because of their computational complexity, cannot be applied to the analysis of large-scale complex networks. The k-shell decomposition method regards the core nodes located in the center of the network as the most important, but it considers only the residual degree and neglects the interaction and topological structure between a node and its neighbors. To identify important nodes efficiently and accurately, this paper proposes a local centrality measure based on the topological structure and interaction characteristics of nodes and their neighbors. Building on the k-shell decomposition method, the proposed method introduces two further properties, structural holes and degree centrality, and thereby jointly considers the network location, topological structure, and neighborhood scale of nodes and their neighbors, as well as the interaction between their different coreness layers. Selective attacks are carried out on four real networks, and the average descending ratio of network efficiency under our approach is compared with that under seven other indices. The experimental results show that our approach is valid and feasible.

1. Introduction

In recent years, research on node importance ranking has attracted increasing attention, not only because of its theoretical significance but also because of its extensive practical value [1, 2]. In complex networks, identifying the most important nodes can help us to prevent network attacks [3], to obstruct the spread of computer viruses [4], to prevent epidemics of infectious diseases in a population [5], to inhibit the spread of gossip in society [6], and to guide the dissemination of information in social networks [7, 8]. The commonly used centrality measures include degree centrality [9], closeness centrality [10], betweenness centrality [11], and Katz centrality [12], but these indices are highly dependent on the topology of the network.

Kitsak et al. [13] found that, in social networks, a node with a high betweenness or degree value is not necessarily the most important node. The k-shell decomposition method was proposed for this purpose; it decomposes a network into hierarchically ordered shells by recursively pruning the nodes with degree lower than or equal to k. The method determines the location of nodes in the network, and the nodes in the core layer are considered the most important node set [14]. Because of its low computational complexity, the k-shell decomposition method is widely used to mine and analyze important nodes in biological networks, scientific cooperation networks, friendship networks, communication networks, and so on. However, the approach has some limitations [1]. It considers only the influence of the residual degree during the decomposition, so the ranking results are too coarse-grained and many nodes cannot be told apart. It is also not suitable for tree graphs, regular networks, and BA networks [15]. Many scholars have extended and improved the k-shell decomposition method. Zeng et al. [16] evaluated the residual degree and the exhausted degree simultaneously and proposed the mixed degree decomposition. Liu et al. [17] jointly considered the k-shell value of a target node and its distance from the nodes with the maximum k-shell value, which overcomes the defect that node importance cannot be accurately measured when a large number of nodes share the same coreness value. Garas et al. [18] applied the k-shell decomposition method to weighted networks. Liu et al. [19, 20] found that the core obtained by k-shell decomposition is not always the true core, because small groups of nodes that are too tightly connected to each other may exist in the network. Based on the definition of entropy in information theory, they proposed the connection entropy to measure the diversity of network shell connections.
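The recursive pruning behind the k-shell decomposition can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the cited works; the function name `k_shell` and the adjacency representation (a dictionary mapping each node to the set of its neighbors) are assumptions made for the example.

```python
def k_shell(adj):
    """Return {node: coreness} for an undirected graph given as {node: set_of_neighbors}."""
    # Residual degrees are tracked separately so the input graph is not modified.
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    coreness = {}
    k = 0
    while remaining:
        # Nodes whose residual degree is at most k are peeled into shell k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1          # nothing left to prune at this level, move to the next shell
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue    # already pruned via another neighbor
            remaining.discard(v)
            coreness[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)   # pruning v pushed u into the same shell
    return coreness


# Hypothetical toy graph: a triangle (0, 1, 2) with a two-node tail (3, 4) attached to node 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
shells = k_shell(toy)
```

In the toy graph the triangle forms the innermost shell and the tail nodes fall into the outer shell, which also illustrates the coarse granularity discussed above: all three triangle nodes receive the same coreness value.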

Identifying important nodes has theoretical significance for the study of the structure, propagation, and synchronization of complex networks. It also has practical value for understanding and controlling the spread of information, disease, and rumors, and for marketing new products. To identify important nodes efficiently and accurately, this paper combines the local environment of a node, its location, and its influence on network function to characterize node importance. The main contributions are as follows. (1) The outward links diversity assessment index is proposed. It resolves the defect that node importance cannot be accurately measured when a large number of nodes share the same coreness value after the k-shell decomposition. The index considers not only the position of a node in the network but also its interaction with neighbor nodes in different coreness layers. (2) An importance metric based on multiattribute evaluation and node deletion is put forward. It considers the topological structure of a node and its neighborhood and the interaction characteristics between nodes, so it can both find important nodes in core positions and identify key nodes in structural hole positions. (3) Deliberate attack simulation experiments are carried out on real data sets, and node importance is quantified by the descending ratio of network efficiency before and after the attack. The experimental results show that the proposed method performs better in identifying important nodes and is well suited to large-scale quantitative analysis of important nodes.

The remainder of this paper is structured as follows. Section 2 proposes and describes our method. Section 3 briefly reviews seven typical centrality indices for comparative analysis and, by means of the monotonicity index and the descending ratio of network efficiency, verifies that the proposed method performs better than these seven indices. Section 4 summarizes the paper and discusses future research directions.

2. Method

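Consider an undirected network $G = (V, E)$ with $N$ nodes and $M$ edges. Given the adjacency matrix $A = (a_{ij})$ of the network G, where $a_{ij} = 1$ if nodes i and j are connected and $a_{ij} = 0$ otherwise, the degree of node i can be expressed as

$$k_i = \sum_{j=1}^{N} a_{ij}. \tag{1}$$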

Then the sum of the adjacency degrees of node i is defined as

$$Q_i = \sum_{w \in \Gamma(i)} k_w, \tag{2}$$

where $\Gamma(i)$ denotes the neighbor set of node i and $k_w$ is the degree of neighbor w. Degree centrality considers only the neighborhood information of a node and ignores the topological relations between neighbors and the location of the node in the network. Therefore degree centrality cannot reflect the interaction between neighbor nodes, and its result is not accurate enough.
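The two quantities introduced above, the degree of a node and the sum of the degrees of its neighbors, can be computed directly from the adjacency matrix. The following sketch assumes a symmetric 0/1 matrix stored as a list of lists; the function names are illustrative and do not come from the paper.

```python
def degrees(A):
    """Degree of each node: the row sums of the 0/1 adjacency matrix A."""
    return [sum(row) for row in A]


def neighbor_degree_sums(A):
    """For each node i, the sum of the degrees of the neighbors of i."""
    k = degrees(A)
    n = len(A)
    # A[i][j] is 1 exactly when j is a neighbor of i, so it selects the neighbor degrees.
    return [sum(A[i][j] * k[j] for j in range(n)) for i in range(n)]


# Hypothetical example: the path graph 0 - 1 - 2.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
k = degrees(path)
q = neighbor_degree_sums(path)
```

Both quantities use only local information, which is why they are cheap to compute but blind to how the neighbors are connected among themselves.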

The k-shell decomposition method can determine the position of nodes in the network, but it considers only the influence of the residual degree during the decomposition, which makes the ranking results too coarse-grained and the nodes hard to distinguish. In an intentional attack simulation, a deleted node from the innermost core can easily be replaced by other nodes if its local structure is too tightly connected and it has too few outward links. That is to say, such a node rarely interacts with nodes outside its own layer, and deleting it cannot paralyze the system, so its importance is reduced. As shown in Figure 1, node B, which is in the innermost layer, does not have any outward links. When node B is deliberately attacked, other nodes in the same layer can replace it, and the system is not paralyzed. To overcome the defect that node importance cannot be measured accurately when a large number of nodes share the same coreness value after the k-shell decomposition, we propose the outward links diversity assessment index.

Each node is assigned an index ks to represent its coreness. The outward links diversity assessment index is built from the maximum coreness value in the network, the number of links from node i to the nodes of each coreness value, the number of nodes with coreness equal to ks, and the normalized coreness value of node i's neighbors in each layer. It therefore considers not only the proportion of links between node i and its neighbors in each layer but also their core attributes. In other words, the index fully considers the network location of a node and also its interactions with neighboring nodes in different layers.

In addition to the number of multilevel neighborhood nodes and the interaction between nodes at different coreness levels, the topological relations between a node and its neighbors need to be considered. That is to say, node importance ranking cannot be limited to the core nodes of the network, and the nodes located at structural holes cannot be ignored. Burt [22, 23] proposed the network constraint coefficient to measure the constraint that a node experiences in forming structural holes. The larger the network constraint coefficient, the fewer neighbors the node has and the more tightly connected those neighbors are. Such a node is disadvantaged in competition because it lacks easy access to new relational resources. Conversely, the smaller the network constraint coefficient, the greater the chance that structural holes are formed and the easier it is to obtain new relational resources. From the perspective of complex networks, the network constraint coefficient uses a local attribute of the network to evaluate node importance: the smaller the constraint coefficient, the greater the importance of the node. The network constraint coefficient can be defined as

$$C_i = \sum_{j \in \Gamma(i)} \Bigl( p_{ij} + \sum_{m \neq i, j} p_{im}\, p_{mj} \Bigr)^2,$$

where $\Gamma(i)$ is the neighbor set of node i, $p_{ij}$ denotes the weight proportion of node j in all the adjacencies of node i, node m is a common neighbor of node i and node j, and the value of the indirect term $\sum_{m} p_{im} p_{mj}$ is determined by the common neighbors m of node i and node j. The more tightly the nodes are connected, the more closed triangles they form, and the larger the value of $C_i$, the smaller the chance that structural holes are formed. The calculation of $C_i$ thus takes into account both the degree of a node and the topology of its neighbors. The larger the network constraint coefficient, the fewer structural holes are formed and the less important the node is.
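As an illustration of how the constraint coefficient reacts to closed triangles, the sketch below computes Burt's constraint for an unweighted, undirected graph. It assumes that the weight proportion of neighbor j in the adjacencies of node i is simply 1 divided by the degree of i, which is the usual choice for unweighted graphs but may differ from the weighting used in this paper; the function name and graph representation are likewise assumptions.

```python
def burt_constraint(adj):
    """Burt's network constraint for each node of an unweighted graph {node: set_of_neighbors}."""

    def p(i, j):
        # Assumed weight proportion: every neighbor of i receives an equal share 1 / k_i.
        return 1.0 / len(adj[i]) if j in adj[i] else 0.0

    constraint = {}
    for i in adj:
        total = 0.0
        for j in adj[i]:
            # Indirect investment through common neighbors m of i and j (closed triangles).
            indirect = sum(p(i, m) * p(m, j) for m in adj[i] & adj[j])
            total += (p(i, j) + indirect) ** 2
        constraint[i] = total   # an isolated node gets 0.0 in this sketch
    return constraint


# Hypothetical examples: a star (hub 0 bridges three unconnected leaves) and a triangle.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
c_star = burt_constraint(star)
c_triangle = burt_constraint(triangle)
```

The hub of the star spans structural holes between its leaves and therefore has a small constraint, whereas every node of the triangle is tightly bound to neighbors that also know each other and has a large one.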

In this paper, a new local centrality metric (labeled as VKC) is proposed on the basis of the interaction between nodes of different coreness layers, the number of multilevel neighborhood nodes, and the constraint on a node in forming structural holes.

Here α is a tunable parameter that adjusts the influence of the constraint coefficient, $\langle k \rangle$ represents the average degree of the network nodes, and α is set to one. It is easy to see from (8) that the VKC index comprehensively considers the neighborhood size of a node, its topological structure, its network location, and the interaction between nodes.

Taking the nodes A, B, C, and D in the innermost core of Figure 1 as an example, their outward links diversity assessment values are calculated first, using the neighbor sets of the four nodes, the maximum coreness of the network, and the number of nodes at each coreness value. The sums of their neighbors' outward links diversity assessment values are then obtained. The calculation shows that node A has a higher diversity of outward links and node B a lower one, which conforms to Figure 1. Next, the network constraint coefficient of each of the four nodes is calculated.

The calculation of the constraint on forming structural holes shows that the constraint value of node A is the minimum and that of node B is the maximum, which indicates that node A forms structural holes easily and is advantageous for spreading information. Finally, the VKC values of the four nodes are obtained. Although node B is located in the innermost core like node A, it lacks diversity of outward links and interacts poorly with other nodes, and its large constraint value makes it difficult to form structural holes, so the importance of node B decreases significantly.

3. Experimental Studies

In this section, selective attack simulation experiments are conducted on four real networks. First, we use the monotonicity of the ranking list to measure whether each index can clearly distinguish between nodes. We then compare the VKC index with representative indices in measuring the importance of single nodes. Finally, we deliberately attack a certain percentage of nodes and compare the effect of each index on network robustness to verify the validity and applicability of the VKC index.

3.1. Network Efficiency

Invulnerability is an important issue in complex network research and has produced many results. These results show that networks with different structures exhibit different invulnerability to different types of attack. For example, a scale-free network has high invulnerability to random attacks but is vulnerable to deliberate attacks: the failure of the top 5% to 10% most important nodes will paralyze the whole network [24]. Therefore, we consider the connectivity of the network before and after node deletion and take the importance of a node to be equivalent to the damage done to the network when the node is deleted. The worse the network connectivity becomes after a node is deleted, the more important the node is, and the closer a ranking is to the actual ranking, the more accurate the method is [21]. Network efficiency [25–28] is an index that measures the effect of removing nodes on network connectivity. The better the connectivity of the network, the higher the network efficiency. When a node under attack is removed, all the edges connected to it are removed at the same time. This may interrupt some paths between the other nodes and lengthen the shortest paths between some node pairs, thereby increasing the average path length of the entire network and weakening network connectivity. Network efficiency is expressed as

$$\eta = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}}, \tag{9}$$

where $d_{ij}$ represents the length of the shortest path between node i and node j, with $1/d_{ij} = 0$ when no path exists, and N represents the number of network nodes. The value of network efficiency lies within $[0, 1]$. If $\eta$ equals one, the network connectivity is the best; if $\eta$ equals zero, the network consists of isolated nodes. $\eta$ is normalized by its largest possible value $N(N-1)$, which is attained by a totally connected graph having $N(N-1)/2$ edges.
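For an unweighted network, the efficiency described above can be evaluated with one breadth-first search per node. The sketch below is a plain-Python illustration under that assumption; the function names and the dictionary-of-sets graph representation are chosen for the example and are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def network_efficiency(adj):
    """Mean of 1/d over all ordered pairs of distinct nodes; unreachable pairs contribute 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += 1.0 / d
    return total / (n * (n - 1))


# Hypothetical examples: a fully connected triangle and the path 0 - 1 - 2.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
eta_triangle = network_efficiency(triangle)
eta_path = network_efficiency(path)
```

The triangle reaches the upper bound because every pair is at distance one, while the path loses efficiency through the single pair at distance two.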

In this paper, we select a certain proportion of the top-ranked nodes of the network for deliberate attack simulations and calculate the descending ratio of network efficiency before and after the attack to quantify the accuracy of the various indices. Let $\eta_0$ be the network efficiency before the attack and $\eta$ the network efficiency after a certain proportion of important nodes has been deleted. The descending ratio of network efficiency is expressed as

$$e = 1 - \frac{\eta}{\eta_0}. \tag{10}$$

The range of e is $[0, 1]$. When e equals one, the network efficiency has dropped to zero after the attack; that is, the network consists of isolated nodes. When e equals zero, the efficiency of the entire network has not changed after the attack. It can be seen from (10) that the higher the value of e, the worse the network efficiency becomes after the selected nodes are deleted and the more accurately the index has identified important nodes.
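A deliberate attack and the resulting descending ratio can be sketched as follows. The example assumes that an attacked node stays in the network as an isolated node, so the number of nodes used for normalization does not change; this matches the description that an attack removes the edges of the node, but it is an interpretation rather than a detail confirmed by the text. All function names are illustrative.

```python
from collections import deque


def efficiency(adj):
    """Mean inverse shortest-path length over all ordered pairs of distinct nodes."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    total += 1.0 / dist[u]
                    queue.append(u)
    return total / (n * (n - 1))


def attack(adj, removed):
    """Copy of the graph in which every node in `removed` has lost all of its edges."""
    removed = set(removed)
    return {v: (set() if v in removed else adj[v] - removed) for v in adj}


def descending_ratio(adj, removed):
    """Relative drop in efficiency caused by attacking the nodes in `removed`."""
    eta0 = efficiency(adj)
    if eta0 == 0.0:
        return 0.0  # nothing left to destroy
    return 1.0 - efficiency(attack(adj, removed)) / eta0


# Hypothetical example: a star whose hub is node 0.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
e_hub = descending_ratio(star, [0])
e_leaf = descending_ratio(star, [3])
```

Attacking the hub disconnects every pair and drives the ratio to its maximum, whereas attacking a leaf costs far less, which is the behavior a good importance ranking should reproduce.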

3.2. Datasets

Because different types of social networks exhibit different topological properties, we selected four real, openly available social network data sets for analysis and comparison. Zachary's karate club [29] is a social network of friendships among 34 members of a karate club at a US university in the 1970s. The dolphin social network [30] is an undirected social network of frequent associations among 62 dolphins in a community living off Doubtful Sound, New Zealand. Books about US politics is a network of books about US politics published around the time of the 2004 presidential election and sold by the online bookseller Amazon. The network was compiled by Krebs and is unpublished but can be found on Krebs' website. The neural network [31, 32] represents the neural network of C. elegans. The basic structural properties and network efficiency of these four networks are given in Table 1. As can be seen from Table 1, in the neural network the maximum degree of a node differs greatly from the maximum k-shell value. The degree assortativity coefficients of the four networks are all less than zero, which indicates that nodes of larger degree tend to connect to nodes of smaller degree. All four networks have small-world features.

Table 1 shows the structural properties and the network efficiency of the real networks studied in this work: the number of nodes and edges, the average degree, the maximum degree, the maximum k-shell value, the average path length, the degree assortativity, the clustering coefficient, and the network efficiency of the initial network.

3.3. Contrast Centrality Indices

Here we briefly review the definitions of the seven centrality indices that are compared in this work. The k-shell decomposition method determines the location of nodes in the network; nodes are assigned ks values according to their remaining degree, which is obtained by successively pruning nodes with degree smaller than the ks value of the current layer [13]. Clustering coefficient [31, 33] represents the degree of node aggregation in the network. Centola [34] found that behavior spreads faster in highly clustered networks, so the importance of a node for propagation is related to its clustering. Bae and Kim [35] proposed the coreness centrality, a node importance measure that reflects a node's influence by summing the ks values of the node's neighbors. The neighborhood coreness is labeled $C_{nc}$, and $C_{nc+}$ denotes the extended neighborhood coreness, which is the second-order neighborhood coreness. Liu et al. [36] proposed a weight degree centrality to measure the spreading influence of a node; a tuning parameter regulates the weight between the degree and the spreading ability, and an extended weight degree centrality is also defined. This method performs well in most experiments, and here the tuning parameter is set equal to the absolute value of the degree assortativity coefficient r. Based on the classical formula of gravitation, Ma et al. [37] put forward a gravity model that takes the ks value of node i as its mass and the shortest distance between two nodes in the network as their distance; this model is used to evaluate node importance and is labeled G. G+ is an extended gravity index that also considers the nearest neighborhood of node i. Chen et al. [38] proposed a semilocal centrality measure (labeled SL) as a tradeoff between the low-relevance degree centrality and other time-consuming measures. It considers both the nearest and the next nearest neighbors. The above indices are described in Table 2.

3.4. Experiment Results

The higher the resolution of an importance evaluation index, the more easily the differences between nodes can be distinguished. To quantify the resolution of different indices, the monotonicity M of the ranking list I is adopted, and the formula is as follows [35]:

$$M(I) = \left[ 1 - \frac{\sum_{i \in I} n_i (n_i - 1)}{n (n - 1)} \right]^2, \tag{11}$$

where $n$ represents the number of nodes in the selected proportion of the ranking list and $n_i$ represents the number of nodes with the same index value i. If $M = 1$, the ranking method is completely monotonic and each node is assigned a different index value. On the contrary, $M = 0$ indicates that all nodes have the same index value, so the importance of the nodes is completely indistinguishable. Table 3 lists the resolution values of the indices when the selected proportion of nodes is about 25%. The M value of k-shell is zero in all four networks, indicating that the ks value alone cannot distinguish the importance of these nodes. It is not difficult to see from Table 3 that the M values of three of the indices are one in all four networks; that is to say, these indices distinguish between nodes better.
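The monotonicity of a ranking list depends only on how many nodes share each index value, so it can be computed from a simple tally. The sketch below follows the monotonicity definition of Bae and Kim cited above; the function name and the convention of returning one for lists with fewer than two entries are assumptions made for the example.

```python
from collections import Counter


def monotonicity(scores):
    """Monotonicity M of a list of index values: 1 if all are distinct, 0 if all are equal."""
    n = len(scores)
    if n < 2:
        return 1.0  # a single node cannot be tied with anything (assumed convention)
    ties = Counter(scores)
    # Each group of c nodes sharing one value contributes c * (c - 1) tied ordered pairs.
    tied_pairs = sum(c * (c - 1) for c in ties.values())
    return (1.0 - tied_pairs / (n * (n - 1))) ** 2


# Hypothetical index values for four selected nodes.
m_distinct = monotonicity([0.9, 0.7, 0.4, 0.1])
m_partial = monotonicity([3, 3, 2, 1])
m_equal = monotonicity([2, 2, 2, 2])
```

A coarse-grained index such as the coreness value tends toward the last case among top-ranked nodes, because many of them fall into the same shell.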

First, we select the four indices with higher resolution and compare them with VKC to analyze the importance of single nodes. As mentioned in Section 3.1, the lower the network efficiency after a node is deleted, the more important the node is. Figures 2(a)–2(d) show the experimental results of VKC against each of these four indices, respectively, in the four complex networks.

It can be seen from Figure 2(a) that, for the compared index, the importance of the nodes does not increase with the index value. The most important nodes are mainly distributed in the middle range of the index values, which is especially obvious in the dolphin social network and the books about US politics network. The VKC index tracks node importance better and finds the most important nodes earlier. In Figures 2(b) and 2(c), the values of the VKC index and of the two compared indices follow a monotonically increasing curve. The color coordinate value of each node decreases as the index value increases; that is to say, these three indices track node importance well and can find the most important nodes in different networks. Figure 2(d) shows that, for the compared index, the most important nodes appear in the middle range of the index values, especially in the karate network and the dolphin social network. In other words, that index cannot identify the most important nodes as early as the VKC index. The experimental results show that the VKC index is significantly superior to the other four indices in identifying the top-ranked important nodes. VKC has strong universality, which verifies the rationality and universality of the proposed model.

Finding a set of important nodes in a complex network plays a crucial role in understanding the structure and function of the network. Next, we select the top 25% of nodes as ranked by each of the eight indices and carry out deliberate attack simulation experiments in the four networks. The experimental results are shown in Figures 3(a)–3(d). When a method such as k-shell assigns the same value to several nodes, the result of a simulation changes depending on which of the tied nodes is selected in each round. Therefore, n random simulation experiments are carried out and the arithmetic mean of the descending ratio of network efficiency is taken, where n is the number of nodes that have the same value as the selected node.

In Figure 3(a), several indices, including k-shell, find important nodes in the network earlier than the other three indices. However, the later performance of the k-shell index is weak, and its effect in discovering important nodes is not obvious. Two of the curves are close to each other, with descending ratios of network efficiency of 91.59% and 89.51%, respectively, when 17.65% of the nodes are removed. These two curves then begin to separate, and the later performance of VKC is relatively better. A sudden increase in the slope of a curve indicates that the corresponding index has identified a node of higher importance. For example, the sudden rise of e in one curve when 5.88% of the nodes are removed indicates that the index identifies top important nodes. Two of the indices perform relatively poorly. In Figure 3(b), two of the indices follow each other closely overall: when 24.19% of the nodes are removed, their corresponding e values are 74.13% and 73.97%, respectively. The e value of one index is 53.14% both when 17.74% and when 19.35% of the nodes are removed, which indicates that this index is unstable in identifying important nodes. k-shell and one other index perform relatively poorly. In Figure 3(c), the curves and trends of the indices are roughly the same, and k-shell and one other index behave relatively poorly. In Figure 3(d), three of the indices have a relatively poor ability to identify important nodes compared with the other five. The performance of three indices is relatively stable, which is consistent with their M values in Table 3; that is, the nodes have different index values and the ranking results are more accurate. Figures 3(a)–3(d) show that, across networks of different scales and structural properties, VKC is more stable and measures node importance more accurately than the other seven indices, because VKC is based on a multiattribute evaluation that combines degree centrality, structural holes, and the k-shell decomposition method, while the other seven indices are mainly based on single-attribute evaluation.

Table 4 shows the average increase in e achieved by the VKC index compared with the other seven indices, after selecting ten different proportions of nodes for deliberate attack simulations. In each network, the eight indices can be ordered from strong to weak by their average ability to identify important nodes. In the karate network, k-shell ranks fifth. In the dolphin social network, G+ ranks second, SL fifth, k-shell seventh, and C last. In the books about US politics network, G+ ranks second and k-shell seventh. In the neural network, k-shell ranks fifth. In different networks these eight indices have their own advantages and disadvantages, but VKC has the best overall effect and versatility in distinguishing the most important nodes.

When the structure changes as the network suffers from deliberate attacks, the ranking of node importance changes as well. The method proposed in this paper is also suitable for recognizing important nodes in dynamic networks; that is, the VKC values of the nodes are recalculated after each round of network attacks.

4. Conclusions

Accurate assessment and measurement of node importance are of great significance for improving the robustness of real systems and for designing system structures. On the one hand, once the important nodes have been accurately identified, they can be protected to improve the reliability and survivability of the entire network. On the other hand, the entire network can be destroyed by deliberately attacking these important nodes. The method presented in this paper takes the local characteristics of nodes and their neighbors into consideration; it not only finds the important nodes in core positions but also identifies the key nodes located at structural holes, which overcomes theoretical defects of the k-shell decomposition method such as its coarse granularity, its exclusive reliance on the residual degree, and its unsuitability for the BA model. We select the top 25% of nodes in four real-world networks, calculate the monotonicity M, and find that VKC can distinguish between the important nodes. We compare VKC with four other indices after removing single nodes and then compare the average descending ratio of network efficiency under the eight indices after deleting a certain proportion of nodes in the four networks. The experimental results show that the proposed method has a better overall effect and strong versatility in identifying the most important nodes. In addition, some scholars have found that node importance is related not only to the network structure but also to the dissemination mechanism [39, 40]. Future research will therefore focus on measuring node importance by combining dynamic characteristics with network structure.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The work was supported by the National Natural Science Foundation of China under Grant nos. 61370083, 61672179, and 61402126, the Heilongjiang Province Natural Science Foundation of China under Grant no. F2015030, the Youth Science Foundation of Heilongjiang Province of China under Grant no. QC2016083, and the Postdoctoral Support of Heilongjiang Province of China under Grant no. LBH-Z14071.