Abstract

Measuring node importance in complex networks has great theoretical and practical significance for network stability and robustness. A variety of network centrality criteria have been presented to address this problem, but each of them focuses only on certain aspects and results in loss of information. Therefore, this paper proposes a relatively comprehensive and effective method to evaluate node importance in complex networks using a multicriteria decision-making method. This method not only takes into account degree centrality, closeness centrality, and betweenness centrality, but also uses an entropy weighting method to calculate the weight of each criterion, which can overcome the influence of the subjective factor. To illustrate the effectiveness and feasibility of the proposed method, four experiments were conducted to rank node importance on four real networks. The experimental results showed that the proposed method can rank node importance more comprehensively and accurately than a single centrality criterion.

1. Introduction

In recent years, the study of complex network theory has received sustained attention in various academic fields, such as aviation networks [1, 2], power networks [3–5], social networks [6–8], and biological networks [9–11]. By collecting and analyzing data from actual networks, researchers have studied the statistical characteristics [12, 13] and the dynamic behavior of networks [14]. The results showed that the status of nodes in actual complex networks is unequal. Moreover, scale-free characteristics [15–17] indicate that the effect on function and structure of different nodes may vary greatly and that the cascading effect [18–20] and propagation effect [21] of the network will be affected by a few important nodes.

The term “important nodes” refers to certain nodes that can affect network structure and function to a greater extent than other nodes in the network [22]. Therefore, evaluating node importance is of great theoretical and practical significance [23, 24]. For example, it can be used to prevent the spread of disease [25, 26], stop the diffusion of rumors [27, 28], ensure the smooth flow of aviation networks [1], prevent power grids from being powered off [29, 30], and keep communication networks connected [31, 32].

Current methods of node importance analysis include social network methods and systems science methods [33]. Social network methods assume that node importance is equivalent to significance and do not destroy network connectivity. In contrast, systems science methods assume that node importance is equivalent to the destructiveness of removing a node from the network, so that node importance can be determined by analyzing connectivity indicators of the network after the node has been deleted. Systems science methods damage the network system, and the resulting changes in network topology lose information about the relationships between nodes, whereas social network methods keep the original appearance of the network. Hence, this paper adopts social network methods based on indicators of node centrality.

Although existing centrality criteria have been widely used, each has shortcomings. Degree centrality (DC) [34] assumes that the more adjacent nodes a node has, the greater its influence. This is a valid expression of node importance, but it considers only local information about a node and ignores global network structure. Closeness centrality (CC) [35] is the reciprocal of the sum of the distances between the given node and the other nodes in the network. It can be treated as a measurement of how long it takes information to spread from a given node to the others, but it fails on networks with disconnected components. Betweenness centrality (BC) [36] measures node importance as the ratio of the number of shortest paths passing through a node to the number of all shortest paths. It compensates for the limitations of degree centrality and closeness centrality, but problems still exist: every node that does not lie on a shortest path between other node pairs has a betweenness centrality of zero, so such nodes cannot be distinguished from one another [37]. Other node importance ranking methods are also widely used, such as eigenvector centrality [38], subgraph centrality [39], cumulative nomination [40], PageRank [41], and others. In this regard, Liu et al. [42], Lü et al. [43], and Sun et al. [44] have provided excellent summaries. Many researchers have proposed measures that evaluate node importance from different perspectives, but each measure reflects only one aspect of node importance, and in an actual network, describing node importance with a single indicator gives one-sided results. To illustrate the problem, the Kite network [45] is shown in Figure 1, and the values of degree centrality, closeness centrality, and betweenness centrality are given in Table 1. Diane is the most important node according to degree centrality, and the most influential nodes according to closeness centrality are Fernando and Garth. However, Fernando, Garth, and Diane are ranked second, fourth, and fifth, respectively, according to betweenness centrality. This shows that using different centrality criteria to identify important nodes in a network produces different results.

To overcome the shortcomings of node importance ranking using a single criterion, a range of criteria that affect node importance from different perspectives must be considered. Therefore, this paper proposes a multicriteria decision-making (MCDM) method to rank node importance effectively. MCDM has been widely used in many fields [46–50]. Among the various MCDM methods, the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) [51] has been successfully applied in different fields [52, 53], while the entropy weighting (EW) method [54], an important objective weighting method, has also been applied in many fields [55–57].

Therefore, the TOPSIS method and the entropy weighting method were combined to propose a novel method called EW-TOPSIS. The proposed method is based on degree centrality, closeness centrality, and betweenness centrality and computes node importance through integrated computation of these criteria. Because the measurement of node importance takes into account multiple factors that impact node importance without a one-sided emphasis on any single factor, the measurement is more accurate compared to when using a single criterion. Moreover, compared with other multicriteria decision-making methods, the proposed method uses an entropy weighting method to calculate the weight of each criterion, overcoming the disadvantage of the original TOPSIS using equal weights. This research solves the problem of partial and inaccurate ranking of node importance. It provides a beneficial supplement to network node importance measurement, enriches existing research on complex networks, and has great academic value. To demonstrate the effectiveness of the proposed method, four real networks (Zachary’s karate club, the dolphin social network, American college football, and jazz musicians) were used as experimental data. The susceptible-infected (SI) model [58] was chosen to examine the spread of the influence of the nodes ranked by the proposed method and by a single criterion. The experimental results reveal the superiority of the proposed method.

The primary contributions of this paper can be summarized as follows:

(i) A novel method of node importance ranking in complex networks based on multicriteria decision making is proposed. It comprehensively combines the advantages of various criteria from different perspectives and makes the measurement more accurate and universal.

(ii) An entropy weighting method is proposed to calculate the weight of each criterion. It can overcome the influence of subjective factors and obtain an objective result.

(iii) Four experiments on four real networks have been conducted, and the experimental results show that the proposed method has superior performance in identifying important nodes in complex networks.

The rest of this paper is organized as follows. Section 2 briefly introduces the definitions of a graph and of some centrality criteria. The proposed method to rank node importance using the entropy weighting method and TOPSIS is illustrated in Section 3. Section 4 evaluates the performance of the proposed method based on four real networks. Finally, Section 5 presents conclusions.

2. Node Importance Criteria

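An undirected network can be denoted as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ and $E$ are the sets of nodes and edges, respectively. $A = (a_{ij})_{n \times n}$ is the adjacency matrix of the network; if there is a connection between node $i$ and node $j$, $a_{ij} = 1$; otherwise, $a_{ij} = 0$.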

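Definition 1 (degree centrality). The degree of node $i$ denotes the number of its neighbor nodes, expressed as [34]

$$k_i = \sum_{j=1}^{n} a_{ij}, \tag{1}$$

where $a_{ij}$ is the entry of the adjacency matrix for nodes $i$ and $j$, and $n$ is the number of nodes. Then degree centrality can be denoted as in the following equation:

$$DC(i) = \frac{k_i}{n - 1}. \tag{2}$$

Degree centrality reflects the ability of a node to communicate directly with other nodes. The greater the value of degree centrality, the more important the node.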
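As a minimal sketch of Definition 1, degree centrality can be computed directly from the adjacency matrix. The function name and the three-node path graph below are illustrative assumptions, not part of the original study, and the normalization by the number of other nodes is the usual convention, assumed here.

```python
def degree_centrality(adj):
    """Return k_i / (n - 1) for every node of an undirected, unweighted graph.

    adj is a square 0/1 adjacency matrix given as a list of lists.
    """
    n = len(adj)
    if n < 2:
        raise ValueError("degree centrality needs at least two nodes")
    # The row sum of the adjacency matrix is the degree k_i.
    return [sum(row) / (n - 1) for row in adj]


# Hypothetical toy graph: the path 0 - 1 - 2.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
print(degree_centrality(path))  # [0.5, 1.0, 0.5]
```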

Definition 2 (closeness centrality). The closeness centrality of node $i$ is defined as the reciprocal of the sum of the shortest distances to all other nodes, expressed as [35]

$$CC(i) = \frac{1}{\sum_{j \neq i} d_{ij}}, \tag{3}$$

where $d_{ij}$ is the distance between node $i$ and node $j$. If there is no reachable path between node $i$ and node $j$, then $d_{ij} = \infty$. Closeness centrality can be treated as a measurement of a node’s importance through the average spread time of information in the network. The greater the value of closeness centrality, the more important the node.
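The following sketch illustrates Definition 2 under the assumption that closeness is the unnormalized reciprocal of the sum of shortest distances, as the wording of the definition suggests. Hop distances are found by breadth-first search; a node that cannot reach every other node has an infinite distance sum and is assigned a closeness of zero. The adjacency-list format and function names are illustrative choices.

```python
from collections import deque


def bfs_distances(neighbors, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(neighbors):
    """Reciprocal of the sum of shortest distances; 0.0 if any node is unreachable."""
    n = len(neighbors)
    result = {}
    for i in neighbors:
        dist = bfs_distances(neighbors, i)
        if n < 2 or len(dist) < n:
            # Infinite distance sum (or no other node at all).
            result[i] = 0.0
        else:
            result[i] = 1.0 / sum(dist.values())
    return result


# Hypothetical toy graphs: a connected path and a graph with an isolated node.
path = {0: [1], 1: [0, 2], 2: [1]}
split = {0: [1], 1: [0], 2: []}
print(closeness_centrality(path))
print(closeness_centrality(split))
```

The second call shows the failure mode on disconnected networks: every node receives the same value, so the criterion cannot discriminate between them.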

Definition 3 (betweenness centrality). The betweenness of node $i$ is defined as the fraction of shortest paths between all node pairs that pass through node $i$, as given by [36]

$$b_i = \sum_{s \neq i \neq t} \frac{g_{st}^{i}}{g_{st}}, \tag{4}$$

where $g_{st}^{i}$ represents the number of shortest paths between node $s$ and node $t$ which pass through node $i$, and $g_{st}$ is the number of all possible shortest paths between node $s$ and node $t$.
The betweenness centrality of node $i$ is the normalization of $b_i$. For an undirected network with $n$ nodes, the maximum possible number of node pairs that exclude node $i$ is $(n - 1)(n - 2)/2$, and betweenness centrality can be expressed as

$$BC(i) = \frac{2 b_i}{(n - 1)(n - 2)}. \tag{5}$$

Betweenness centrality can be understood as the ability of a node to control the network flow traveling along the shortest paths in the network. The greater the value of betweenness centrality, the more important the node.
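Betweenness can be computed without enumerating every shortest path. The sketch below uses Brandes' accumulation scheme, which is a standard algorithm and not one the paper itself names, and then divides by the number of node pairs, $(n-1)(n-2)/2$, to normalize. Function and variable names are illustrative.

```python
from collections import deque


def betweenness_centrality(neighbors):
    """Normalized betweenness of every node in an undirected, unweighted graph."""
    nodes = list(neighbors)
    n = len(nodes)
    if n < 3:
        return {v: 0.0 for v in nodes}
    total = {v: 0.0 for v in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in neighbors[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                total[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so total[v] is
    # twice the betweenness; dividing by (n - 1)(n - 2) normalizes it.
    return {v: total[v] / ((n - 1) * (n - 2)) for v in nodes}


# Hypothetical toy graph: a star whose center 0 lies on every shortest path.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(betweenness_centrality(star))
```

On the star graph all three leaves receive a value of zero, which illustrates why betweenness alone leaves many nodes tied.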

The above criteria can measure node importance, but they are a one-sided way to rank node importance with a single criterion. Therefore, to conduct a more comprehensive and objective ranking of node importance, a novel method is proposed here, integrating the above criteria based on the entropy weighting method and TOPSIS.

3. Proposed Method

TOPSIS is a common method to solve multicriteria decision-making problems. Original TOPSIS has been used to identify important nodes [33, 59, 60]. However, original TOPSIS simply gives equal weight to each criterion, ignoring that different criteria play different roles in the decision-making process.

3.1. Constitute the Weighted Decision Matrix

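If the set of nodes in a network is $V = \{v_1, v_2, \ldots, v_n\}$ and the set of centrality criteria is $C = \{c_1, c_2, \ldots, c_m\}$, then $x_{ij}$ represents the value of the $j$th criterion for the $i$th node. The decision matrix can be obtained as

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix}. \tag{6}$$

Because the dimension is different for each criterion, it is necessary to eliminate the dimensional differences among criteria and standardize the decision matrix. The criteria can be divided into benefit criteria (the higher the criterion, the more important the node) and cost criteria (the higher the criterion, the less important the node). The above-mentioned three criteria are all benefit criteria.

For cost criteria, the standardization process can be expressed as

$$r_{ij} = \frac{x_j^{\max} - x_{ij}}{x_j^{\max} - x_j^{\min}}. \tag{7a}$$

Similarly, for benefit criteria it is expressed as

$$r_{ij} = \frac{x_{ij} - x_j^{\min}}{x_j^{\max} - x_j^{\min}}, \tag{7b}$$

where $x_j^{\max} = \max_{1 \le i \le n} x_{ij}$, $x_j^{\min} = \min_{1 \le i \le n} x_{ij}$.

The normalized decision matrix can be denoted as $R = (r_{ij})_{n \times m}$.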
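A short sketch of the standardization step follows, assuming it is min-max scaling per criterion with the direction reversed for cost criteria. The handling of a constant column (mapped to zeros) is an assumption of this sketch; the text does not discuss that case. The function name and the sample matrix are hypothetical.

```python
def normalize(matrix, benefit):
    """Min-max standardize each column of an n x m decision matrix.

    benefit[j] is True for a benefit criterion and False for a cost criterion.
    """
    n, m = len(matrix), len(matrix[0])
    out = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            continue  # constant column: left as zeros (assumption)
        for i in range(n):
            if benefit[j]:
                out[i][j] = (matrix[i][j] - lo) / span
            else:
                out[i][j] = (hi - matrix[i][j]) / span
    return out


# Hypothetical data: column 0 is a benefit criterion, column 1 a cost criterion.
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
r = normalize(x, [True, False])
print(r)
```

After this step the best value of every criterion maps to 1 and the worst to 0, so all columns can be treated as benefit-oriented in the later steps.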

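The entropy weighting method, which is used to calculate the weight of each criterion, determines the weight according to the variability of the criterion. With $r_{ij}$ denoting the normalized value of the $j$th criterion for the $i$th node, $n$ the number of nodes, and $m$ the number of criteria, the information entropy of the $j$th criterion is denoted as

$$E_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}, \tag{8}$$

where $p_{ij} = r_{ij} / \sum_{i=1}^{n} r_{ij}$.

When $p_{ij} = 0$, then $p_{ij} \ln p_{ij} = 0$.

Then the weighting coefficient of the $j$th criterion can be calculated as

$$w_j = \frac{1 - E_j}{\sum_{k=1}^{m} (1 - E_k)}. \tag{9}$$

Multiplying the columns of the normalized decision matrix by the associated weights yields the weighted decision matrix, which can be denoted as

$$Y = (y_{ij})_{n \times m} = (w_j r_{ij})_{n \times m}. \tag{10}$$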
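The entropy weighting step can be sketched as follows, assuming the usual form in which each column is converted to proportions, the entropy is scaled by $\ln n$, and the weight is the normalized complement of the entropy. The fallbacks for an all-zero column and for zero total diversity are assumptions of this sketch, and all names are illustrative.

```python
import math


def entropy_weights(r):
    """Entropy weights for the columns of a normalized n x m matrix (entries >= 0)."""
    n, m = len(r), len(r[0])
    if n < 2:
        raise ValueError("entropy weighting needs at least two rows")
    diversity = []
    for j in range(m):
        total = sum(row[j] for row in r)
        entropy = 1.0  # all-zero column: treated as carrying no information
        if total > 0:
            entropy = 0.0
            for row in r:
                p = row[j] / total
                if p > 0:  # convention: p * ln(p) = 0 when p = 0
                    entropy -= p * math.log(p)
            entropy /= math.log(n)
        diversity.append(1.0 - entropy)
    s = sum(diversity)
    if s <= 0:
        return [1.0 / m] * m  # fall back to equal weights (assumption)
    return [d / s for d in diversity]


def weighted_matrix(r, w):
    """Scale column j of r by weight w[j]."""
    return [[w[j] * row[j] for j in range(len(w))] for row in r]


# Hypothetical data: column 0 varies across nodes, column 1 is uniform.
r = [[0.0, 0.5], [0.5, 0.5], [1.0, 0.5]]
w = entropy_weights(r)
print(w)
print(weighted_matrix(r, w))
```

The uniform column has maximal entropy and therefore receives a weight of approximately zero, which is the intended behavior: a criterion that does not distinguish the nodes contributes nothing to the ranking.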

3.2. Calculate the Distance to the Ideal Solution

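For the weighted decision matrix $Y = (y_{ij})_{n \times m}$, the positive ideal solution and the negative ideal solution are defined as follows:

$$Y^{+} = \{y_1^{+}, y_2^{+}, \ldots, y_m^{+}\}, \quad y_j^{+} = \max_{1 \le i \le n} y_{ij}, \tag{11a}$$

$$Y^{-} = \{y_1^{-}, y_2^{-}, \ldots, y_m^{-}\}, \quad y_j^{-} = \min_{1 \le i \le n} y_{ij}. \tag{11b}$$

Thus, the distance of each scheme to the positive ideal solution and the negative ideal solution can be calculated by the following equations, respectively:

$$D_i^{+} = \sqrt{\sum_{j=1}^{m} \left(y_{ij} - y_j^{+}\right)^2}, \tag{12a}$$

$$D_i^{-} = \sqrt{\sum_{j=1}^{m} \left(y_{ij} - y_j^{-}\right)^2}. \tag{12b}$$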
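As an illustration of this step, the sketch below takes the column-wise maxima and minima of a weighted matrix as the positive and negative ideal solutions and measures Euclidean distances to them. It assumes all criteria are benefit-oriented after standardization; the function name and sample values are hypothetical.

```python
import math


def ideal_distances(y):
    """Return (d_pos, d_neg): distance of each row to the positive and
    negative ideal solutions of the weighted matrix y."""
    m = len(y[0])
    y_pos = [max(row[j] for row in y) for j in range(m)]
    y_neg = [min(row[j] for row in y) for j in range(m)]
    d_pos = [math.sqrt(sum((row[j] - y_pos[j]) ** 2 for j in range(m))) for row in y]
    d_neg = [math.sqrt(sum((row[j] - y_neg[j]) ** 2 for j in range(m))) for row in y]
    return d_pos, d_neg


# Hypothetical weighted matrix with two nodes and two criteria.
y = [[0.0, 0.0], [0.3, 0.4]]
d_pos, d_neg = ideal_distances(y)
print(d_pos, d_neg)
```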

3.3. Rank Node Importance

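With $D_i^{+}$ and $D_i^{-}$ denoting the distances of node $i$ to the positive and negative ideal solutions, the closeness degree to the ideal solution can be calculated as

$$Z_i = \frac{D_i^{-}}{D_i^{+} + D_i^{-}}. \tag{13}$$

The closeness degree is the measurement of a node’s importance, and the vector of node importance can be simply denoted as

$$Z = [Z_1, Z_2, \ldots, Z_n]. \tag{14}$$

When $Z$ is ranked in descending order based on the value of each node’s importance, the resulting node importance ranking is obtained as

$$Z' = [Z'_1, Z'_2, \ldots, Z'_n], \tag{15}$$

where $Z'_1 \ge Z'_2 \ge \cdots \ge Z'_n$.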

Combined with the above theoretical analysis, the specific steps of node importance ranking in complex networks can be given as in Algorithm 1.

Input: Decision matrix $X$.
Output: The result of ranking $Z'$.
Step 1. Constitute the weighted decision matrix.
(i) Calculate the normalized decision matrix by Equations (7a) and (7b).
(ii) Determine the weights of the criteria using Equation (9) based on the entropy weighting method.
(iii) Substitute the weight of each criterion into Equation (10) to construct the weighted decision matrix.
Step 2. Calculate the distance to the ideal solution.
(i) Calculate the positive ideal solution and the negative ideal solution by Equations (11a) and (11b), respectively.
(ii) Obtain the distance of each scheme to the positive and negative ideal solutions by Equations (12a) and (12b), respectively.
Step 3. Output the result of ranking.
(i) Calculate the closeness degree to the ideal solution by Equation (13).
(ii) Rank the nodes’ importance in descending order using Equation (15) to obtain the node importance ranking $Z'$.
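The steps of Algorithm 1 can be sketched end to end as below. This is a compact illustration under stated assumptions, not the authors' implementation: standardization is taken to be min-max scaling, all criteria are treated as benefit criteria, and degenerate cases (a constant column, zero total diversity) fall back to neutral values. The sample values standing in for (DC, CC, BC) are hypothetical.

```python
import math


def ew_topsis(x):
    """Rank the rows of decision matrix x (benefit criteria only).

    Returns (z, order): closeness degrees and row indices sorted from
    most to least important.
    """
    n, m = len(x), len(x[0])
    if n < 2:
        raise ValueError("need at least two nodes")

    # Step 1(i): min-max normalization of each criterion.
    r = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in x]
        lo, span = min(col), max(col) - min(col)
        for i in range(n):
            r[i][j] = (x[i][j] - lo) / span if span else 0.0

    # Step 1(ii): entropy weight of each criterion.
    diversity = []
    for j in range(m):
        total = sum(row[j] for row in r)
        entropy = 1.0  # a column without variation carries no information
        if total > 0:
            entropy = -sum(
                (row[j] / total) * math.log(row[j] / total)
                for row in r if row[j] > 0
            ) / math.log(n)
        diversity.append(1.0 - entropy)
    s = sum(diversity)
    w = [d / s for d in diversity] if s > 0 else [1.0 / m] * m

    # Step 1(iii): weighted decision matrix.
    y = [[w[j] * r[i][j] for j in range(m)] for i in range(n)]

    # Step 2: ideal solutions and Euclidean distances to them.
    y_pos = [max(row[j] for row in y) for j in range(m)]
    y_neg = [min(row[j] for row in y) for j in range(m)]
    d_pos = [math.sqrt(sum((row[j] - y_pos[j]) ** 2 for j in range(m))) for row in y]
    d_neg = [math.sqrt(sum((row[j] - y_neg[j]) ** 2 for j in range(m))) for row in y]

    # Step 3: closeness degree and descending ranking.
    z = [dn / (dp + dn) if dp + dn > 0 else 0.0 for dp, dn in zip(d_pos, d_neg)]
    order = sorted(range(n), key=lambda i: z[i], reverse=True)
    return z, order


# Hypothetical (DC, CC, BC) values for three nodes; not data from the paper.
x = [
    [3.0, 0.5, 0.6],
    [1.0, 0.3, 0.0],
    [2.0, 0.4, 0.1],
]
z, order = ew_topsis(x)
print(order)  # [0, 2, 1]
```

In this toy input the first node is best on every criterion and the second is worst on every criterion, so they receive closeness degrees of 1 and 0, with the third node in between.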

4. Simulation and Analysis

4.1. Experimental Data

This section describes the four actual networks used to verify the feasibility and effectiveness of the proposed method. (i) Zachary’s karate club [61] is a social network that Zachary observed among the 34 members of a karate club at an American university over a few years. Each node represents a member of the club, and an edge represents the connection between two members of the club. (ii) The dolphin social network [62] is a social network that was observed by Lusseau and Newman in a group of New Zealand bottlenose dolphins for seven years. Each node represents a dolphin, and an edge indicates frequent contact between two dolphins. (iii) The American college football network [63] describes American football games between Division IA colleges during the regular season in Fall 2000. Each node represents a team, and an edge represents a regular season game between two teams. (iv) Jazz musicians [64] is a collaboration network between jazz musicians. Each node is a jazz musician, and an edge denotes that two musicians have played together in a band.

4.2. Experimental Analysis
4.2.1. Experiment 1: Compare the Top 10 Nodes between the EW-TOPSIS Method and Centrality Criteria

In this experiment, the EW-TOPSIS method is used to identify the top 10 nodes based on the four actual networks, and the three centrality criteria DC, CC, and BC are also used for comparison. Table 2 shows the comparison results.

According to Table 2, in the karate club network, the top 10 list of the EW-TOPSIS method shares nine nodes with that of DC and nine with that of CC, and it shares eight nodes with that of BC. In the dolphin network, the numbers of nodes shared between the top 10 list of the EW-TOPSIS method and those of the centrality criteria (DC, CC, and BC) are seven, seven, and nine, respectively. In the football network, the top 10 list of the EW-TOPSIS method shares four nodes with that of DC and with that of CC, and it contains the same ten nodes as that of BC. In the jazz musicians network, the top two nodes (node 136 and node 60) are the same under all four methods, which indicates that node 136 and node 60 are the most important nodes and that node 136 is more important than node 60.

By comparing the top 10 nodes using four methods based on four actual networks, it is clear that the ranking results of DC, CC, BC, and the EW-TOPSIS method are different. Different centrality criteria measure node importance from different perspectives. The EW-TOPSIS method comprehensively considers multiple criteria, and the ranking results are more scientific and reasonable.

4.2.2. Experiment 2: Compare the Frequency of Nodes with the Same Ranking

Different nodes may have the same ranking, which makes it impossible to rank nodes with the same ranking accurately. For a node importance ranking method, the higher the frequency of the same ranking, the worse the performance of the ranking method. Therefore, the frequency of nodes with the same ranking can be used as an indicator to measure the performance of the method. The frequency of nodes with same ranking was compared using the four methods, with the results shown in Figure 2. For more comparisons between the existing centrality criteria and the proposed method, the maximum frequencies of nodes with the same ranking are compared in Table 3.
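One way to compute this indicator is sketched below. It assumes that the maximum frequency of nodes with the same ranking means the size of the largest group of nodes sharing an identical score, divided by the number of nodes; this reading is an interpretation of the text. Scores are rounded before grouping so that floating-point noise does not hide ties, and the rounding precision is an arbitrary choice.

```python
from collections import Counter


def max_tie_frequency(scores, ndigits=12):
    """Largest share of nodes that receive exactly the same score."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(round(s, ndigits) for s in scores)
    return max(counts.values()) / len(scores)


# Hypothetical scores: three of five nodes tie at zero.
print(max_tie_frequency([0.0, 0.0, 0.0, 0.2, 0.5]))  # 0.6
# With all scores distinct the value is 1 / n, the lowest possible.
print(max_tie_frequency([0.1, 0.2, 0.3, 0.4]))  # 0.25
```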

According to Figure 2, the frequency of nodes with the same ranking in the EW-TOPSIS method is the lowest, whereas the three centrality criteria generate nodes with the same ranking in varying degrees. From Table 3, for the karate club network, the maximum frequency of nodes with the same ranking as sorted by BC is 55.88 percent, but with the proposed method, it is only 5.88 percent. In the dolphin network, the maximum frequency of nodes when sorted by BC reaches 25.81 percent, but with the proposed method, it is 3.23 percent. The gap is even larger in the football network, where the maximum frequency as sorted by DC is as high as 57.39 percent, but that with the proposed method is 0.87 percent. The difference also exists in the jazz musicians’ network. The conclusion can be drawn that the EW-TOPSIS method is more effective than the three centrality criteria from this perspective.

4.2.3. Experiment 3: Compare the Average Infection Ability of the Top 10 Nodes

In this experiment, the SI model is used to examine the infection ability of the top 10 nodes. The importance of nodes can be regarded as equivalent to infection ability; that is to say, the higher the importance of a node is, the stronger its infection ability will be. Therefore, the average infection ability of nodes can be used as an indicator to evaluate the effectiveness of a ranking method. In the SI model, every node has a susceptible state and an infected state; infected nodes infect susceptible nodes with a certain probability, and nodes cannot be recovered once infected. The infection source node $i$ is chosen from the top 10 list, the number of infected nodes will reach $N_i(t)$ after $t$ intervals of spread, and the average infection ability of the top 10 nodes can be defined as $\bar{N}(t) = \frac{1}{10} \sum_{i=1}^{10} N_i(t)$. In this experiment, the number of spread intervals was set to . To eliminate environmental randomness, 1000 Monte Carlo simulations were used to make the simulation environment more scientific and reasonable. Figure 3 shows the results of this experiment.
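The experiment can be sketched as follows. This is a minimal illustration under assumptions the text does not fix: at every interval each infected node tries to infect each susceptible neighbor independently with probability `beta`, and the average is taken over all source nodes and all Monte Carlo runs. The parameter names, the seed, and the toy graph are hypothetical.

```python
import random


def si_infected_counts(neighbors, source, beta, steps, rng):
    """Run one SI simulation; return the number of infected nodes after each step."""
    infected = {source}
    counts = []
    for _ in range(steps):
        newly = set()
        for u in infected:
            for v in neighbors[u]:
                if v not in infected and rng.random() < beta:
                    newly.add(v)
        # Infected nodes never recover, so the set only grows.
        infected |= newly
        counts.append(len(infected))
    return counts


def average_infection(neighbors, sources, beta, steps, runs=1000, seed=0):
    """Average infected count per step over all sources and Monte Carlo runs."""
    rng = random.Random(seed)
    totals = [0] * steps
    for source in sources:
        for _ in range(runs):
            run = si_infected_counts(neighbors, source, beta, steps, rng)
            for t, count in enumerate(run):
                totals[t] += count
    return [total / (len(sources) * runs) for total in totals]


# Hypothetical toy graph: the path 0 - 1 - 2 - 3, seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(average_infection(path, sources=[0], beta=1.0, steps=3, runs=5))
curve = average_infection(path, sources=[0, 1], beta=0.3, steps=5, runs=200)
print(curve)
```

With `beta=1.0` the spread is deterministic, which makes the first call a convenient sanity check; with a smaller `beta` the averaged curve rises and then levels off as the network saturates.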

In Figure 3, the average number of infected nodes increases with time and finally reaches a stable value. In the karate club network, the curves in Figure 3(a1) almost overlap, and the same is the case in Figure 3(a2). DC and CC showed similar performance to the EW-TOPSIS method, with nine of the same nodes in the top 10 list. Figure 3(a3) shows that the average number of infected nodes with the EW-TOPSIS method is slightly higher than with BC, which indicates that the performance of the EW-TOPSIS method is slightly superior to that of BC. In the dolphin network, Figures 3(b1) and 3(b3) show that the average number of infected nodes with the EW-TOPSIS method is larger than with DC and BC in each time interval, so the EW-TOPSIS method outperforms DC and BC. The EW-TOPSIS method showed similar performance to CC, as shown in Figure 3(b2). In the football network, the curves in Figures 3(c1), 3(c2), and 3(c3) almost overlap; that is to say, the performance of the EW-TOPSIS method is similar to that of DC, CC, and BC. This is also the case for the jazz musicians’ network between the EW-TOPSIS method and DC and CC, as shown in Figures 3(d1) and 3(d2). Figure 3(d3) shows that the EW-TOPSIS method is clearly superior to BC not only because it generates a greater average number of infected nodes, but also because it has more stable mean square errors.

In addition, the spread time to reach a state of 90 percent infected nodes by the EW-TOPSIS method is 11 intervals, but by BC, it is 14 intervals. Clearly, the EW-TOPSIS method has better performance than DC and BC for both the average number of infected nodes and the spread velocity, and the performance of the EW-TOPSIS method is similar to that of CC.

In short, the proposed method has almost the same performance as DC and is slightly better than CC. In the case of BC, it is obvious that the proposed method performs better. Hence, the experimental results illustrate the effectiveness of the proposed method.

4.2.4. Experiment 4: Compare the Average Infection Ability of a Single Node

To compare further the ranking performance of the four methods, the average infection abilities of a single node at the same infection rate are compared, with the results shown in Figure 4.

For the karate club network, from Figure 4(a), it is clear that node 34 $\succ$ node 3 $\succ$ node 33, where “$\succ$” denotes “more important than,” and the simulation is consistent with the EW-TOPSIS method, but contrary to DC, CC, and BC. In the dolphin network as shown in Figure 4(b), it can be concluded that node 2 $\succ$ node 37, which is consistent with the EW-TOPSIS method, CC, and BC, but contrary to DC. Similarly, in the football network, node 7 $\succ$ node 2. In the jazz musicians’ network, node 83 $\succ$ node 168, which can be seen from Figure 4(d) and is consistent with the EW-TOPSIS method and BC, but contrary to DC and CC.

The four experiments described above indicate that the proposed method has better performance than a single centrality criterion. Node importance is ranked with different methods, and the top 10 nodes of the four real networks are obtained. Based on the ranking results, the frequency of nodes with the same ranking is analyzed, and it is discovered that the proposed method has the lowest frequency; that is to say, the proposed method is more effective from this perspective. In addition, an indicator called the average infection ability is defined to describe the infection ability of the top 10 nodes, and the average infection ability of the top 10 nodes is obtained with the SI model. The proposed method performed better in terms of both infection scale and spread velocity. The infection abilities of a single node were also compared, and the results demonstrate the superiority of the proposed method.

Many criteria are used to rank node importance in complex networks, but each considers only one aspect of a network. To solve this problem, a multicriteria decision-making method has been proposed here to perform a comprehensive evaluation of node importance. In this study, an entropy weighting method was used to obtain criterion weights, which overcomes the subjective effect present in other methods [33, 59, 60]. The method proposed here enriches existing research on complex networks and has great academic value.

5. Conclusions

This paper proposes a novel method of node importance ranking based on an entropy weighting method and TOPSIS. The proposed method takes multiple centrality criteria as its decision criteria and uses an entropy weighting method to obtain the corresponding weight of each criterion, thus overcoming the effect of subjective factors. Multiple criteria were chosen from different perspectives on complex networks, and the advantages of each criterion were combined to obtain more objective and reasonable ranking results. To verify the effectiveness of the proposed method, four experiments based on four actual networks were conducted, and the SI model was used to simulate the spread ability of the top 10 nodes. The experimental results show that the proposed method has superior performance.

The proposed method is applicable to undirected and unweighted networks, but real complex networks may be directed and weighted, and extending the method to directed and weighted networks is therefore a subject of future research. Furthermore, Experiment 2 showed that some nodes still share the same ranking, and how to order such nodes remains an open question. In addition, some researchers have found that node importance depends on dissemination mechanisms as well as on network topology. Therefore, future research will focus on combining dynamic characteristics with network structure to measure node importance.

Data Availability

The simulation data used to support the findings of this study are available from the corresponding author upon request.

Additional Points

Highlights. A novel method of node importance ranking in complex networks based on multicriteria decision making is proposed. The proposed method comprehensively combines centrality criteria from different perspectives. An entropy weighting method, which can overcome the influence of subjective factors, is proposed to obtain the weight of each criterion. Experimental results show that the proposed method outperforms a single centrality criterion.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The work described in this paper is supported by the National Natural Science Foundation of China (Grant No. 61472443). We thank International Science Editing (http://www.internationalscienceediting.com) for editing this paper.