Mathematical Problems in Engineering

Volume 2017, Article ID 6108563, 9 pages

https://doi.org/10.1155/2017/6108563

## Combined Heuristic Attack Strategy on Complex Networks

Department of Applied Informatics and Mathematics, University of Ss. Cyril and Methodius, J. Herdu 2, 917 01 Trnava, Slovakia

Correspondence should be addressed to Iveta Dirgová Luptáková; iveta.dirgova@ucm.sk

Received 10 March 2017; Revised 28 June 2017; Accepted 14 August 2017; Published 18 September 2017

Academic Editor: Sebastian Heidenreich

Copyright © 2017 Marek Šimon et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Usually, the existence of a complex network is considered an advantage, and efforts are made to increase its robustness against an attack. However, there also exist harmful and/or malicious networks, from social ones spreading hoaxes, corruption, phishing, extremist ideology, or terrorist support, through computer networks spreading computer viruses or DDoS attack software, up to biological networks of carriers or transport centers spreading disease among the population. A new attack strategy can therefore be used against malicious networks, as well as in a worst-case-scenario test of the robustness of a useful network. A common measure of robustness of networks is their disintegration level after removal of a fraction of nodes. This robustness can be calculated as the ratio of the number of nodes in the largest remaining network component to the number of nodes in the original network. Our paper presents a combination of heuristics optimized for an attack on a complex network to achieve its greatest disintegration. Nodes are deleted sequentially based on a heuristic criterion. The efficiency of classical attack approaches is compared to that of the proposed approach on Barabási-Albert, scale-free with tunable power-law exponent, and Erdős-Rényi models of complex networks and on real-world networks. Our attack strategy results in faster disintegration, which is counterbalanced by its slightly increased computational demands.

#### 1. Introduction

Complex networks have kept attracting an increasing amount of attention during the past couple of decades. This can be illustrated by the number of documents in Scopus search results for this term in the title, abstract, or keywords: from 605 documents in 1986, through 2372 documents in 1996 and 8086 documents in 2006, up to 20256 documents in 2016 [1]. The existence of such networks is being discovered in many areas of nature as well as society, for example, in biology as neural networks, networks of protein reactions, or plant immune signaling networks, in transport, in economics, in sociology as citation or rumor-spreading networks, in computer science as the Internet, and in physics as power grids [2–6]. Mostly, these networks are considered a positive thing. A number of studies are devoted to measuring the robustness of such networks against a malicious attack or against random degradation failures causing deletion of a node or of a connection. Such measures are used to increase the security of complex systems and, where possible, as in computer networks, to increase robustness, for example, by rewiring optimization [7–13].

Only in recent years has more attention been given to malicious networks. This term covers terrorist networks, fake-news spreading networks, malnets or botnets used in DDoS attacks or for spreading worms and viruses, dark networks involved in various criminal activities such as illegal arms selling or child pornography, and so forth [14–20]. Attack strategies on such harmful complex networks (i.e., node deletion and occasionally also edge deletion) are studied in [21–23]. For example, in terrorist networks, a sequence of individuals should be identified whose arrest will result in the maximum breakdown of communication between the remaining individuals in the network. A similar approach should work for disabling Internet access of computers used for illegal activities. Apart from networks where the aim of the involved individuals is malicious, there also exist networks where the harm is unintentional but which should nevertheless be quickly disabled. These involve the spread of disease, where the goal is to design vaccination strategies that restrain the spread of pandemic diseases when mass vaccination is an expensive process and only a specific number of people, modeled as nodes of a graph, can be vaccinated. Another example is cascading failures (blackouts) of an electric power transmission network, where the goal is to prevent a total breakdown of the power network by inhibiting some power transmission points or lines. A similar approach involves, for example, immunization of disease carriers [24, 25] or critical node detection [26–28], where the order of deletion is not taken into account and only the optimum subset of nodes selected for deletion is important. Lately, a more computationally demanding approach to network disintegration and attack, using stochastic and evolutionary optimization, has been applied to smaller-scale networks, providing better results than traditional approaches [29–32].

A topic related to edge-deleting attacks is also the community detection problem, because a network can be most easily dismantled by removing the edges between communities [34]. Therefore, a fast community detection algorithm can be used for an edge attack on a network; and vice versa, approaches useful in edge-removing attacks can be used in community detection.
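This duality can be illustrated with the classical Girvan-Newman algorithm, which detects communities precisely by repeatedly removing the edge with the highest betweenness. Below is a minimal sketch using the networkx library; the barbell graph and its size are our illustrative choices, not taken from the paper:

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two 5-node cliques joined by a single bridge edge (illustrative graph).
G = nx.barbell_graph(5, 0)

# Girvan-Newman repeatedly removes the edge with the highest betweenness;
# the very first split already severs the bridge and recovers the two
# communities, i.e., edge-removal attack and community detection coincide.
first_split = next(girvan_newman(G))
print([sorted(c) for c in first_split])
```

The first yielded partition separates the two cliques, exactly the components an edge-removal attack targeting inter-community edges would produce.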

Practically the same approach to the network disintegration measure as in network attacks can be found in Morone et al. [35–37]. They use the collective influence algorithm to find the minimal set of influencers. Their algorithm has complexity $O(N \log N)$ in [35, 37], where $N$ is the number of nodes, followed by an algorithm with improved results but worse complexity in [37]. A number of related algorithms were inspired by this approach; a good survey of the topic is in [38].

Lately, the Hirsch index and its generalization [39], leading eventually to the coreness value, were proposed as a good indicator of the influence of nodes. This assumption was further critically analyzed in [40].

Attack strategies are important not only as a countermeasure against harmful networks but also for potential improvements in useful engineering networks. It is popular to study the robustness of networks, but the network disruptions are usually chosen either at random or by very simple targeting methods. In engineering, it is very important to know the worst-case scenario for vulnerability analysis, which our paper addresses.

In attack algorithms, apart from the quality of results, measured by the level of network disintegration, the complexity of these algorithms also matters. While the approach of Morone et al. [35, 36] can handle hundreds of millions of nodes within hours of computation, the tabu search approach [29] can handle only a few hundred nodes in the same time, though it should provide better results.

In this paper, we shall describe new heuristics for attack strategies, merge them into a newly designed combination of new and classical attack heuristics, and investigate the effectiveness of this new approach both on model networks and on a couple of real-world collaboration networks. Our approach should find its niche on the Pareto-optimal front somewhere between collective influence approaches [35, 36] and stochastic optimization approaches [29], both in terms of the quality of results and of computational time.

#### 2. Materials and Methods

##### 2.1. Networks, Their Types, Models, and Measures of Robustness

At the beginning of the study of networks, a model network was considered simply as a large random graph, typically rather sparse (i.e., its number of edges is much smaller than that of a complete graph). Random ER (Erdős-Rényi) graphs [41] start from unconnected nodes, which are then connected with a uniform probability. They have a Gaussian bell-shaped distribution of degrees (numbers of connections to other nodes), and pairs of nodes have a short average path; that is, almost any node can be reached from any other node by traversing a relatively small number of edges. Neighbors of any node are not likely to be mutually connected (a low number of triangles corresponds to a low clustering coefficient).

It has been discovered that in most real-world networks the neighbors of a node are likely to be connected to each other, while the property of having short average paths is still satisfied. More exactly, a small-world network is characterized by an average distance growing proportionally to the logarithm of the number of nodes in the network. The first such models, by Watts and Strogatz (1998) [42], were created by rewiring (with a certain probability) connections between the nodes of a regular graph. However, such graphs had a very narrow degree distribution, while most of the discovered small-world networks had a so-called "long tail" distribution, with a few nodes of very high degree. A new type of network, a scale-free one, where zooming in on any part of the distribution does not change its shape, has a degree distribution where the fraction of nodes of degree $k$ asymptotically follows $P(k) \sim k^{-\gamma}$, where the parameter $\gamma$ is usually in the range $2 < \gamma < 3$.

A typical example of a scale-free network is the Barabási-Albert model, starting with a few nodes (e.g., a triangle), where one node is added at a time and connected to a given number of nodes already existing in the network. The new edges are attached to nodes selected pseudorandomly with a probability of attachment proportional to their current degree, $p_i = k_i / \sum_j k_j$, the so-called linear preferential attachment.
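For illustration, both model families discussed above can be generated with the networkx library. The sizes and parameters below are arbitrary choices of ours, not values used in the paper:

```python
import networkx as nx

# Illustrative sketch: generating the two model families discussed above.
N = 1000          # number of nodes (assumed value)
p = 0.01          # ER connection probability (assumed value)
m = 3             # BA edges attached per new node (assumed value)

er = nx.erdos_renyi_graph(N, p, seed=42)       # uniform random attachment
ba = nx.barabasi_albert_graph(N, m, seed=42)   # linear preferential attachment

# The ER degree distribution is narrow (bell-shaped around the mean),
# while the BA one has a long tail with a few high-degree hubs.
max_er = max(dict(er.degree()).values())
max_ba = max(dict(ba.degree()).values())
print(max_er, max_ba)
```

With these parameters the maximum BA degree (a hub) is typically several times larger than the maximum ER degree, reflecting the long-tail distribution.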

The simplest type of attack on a network is to delete the node(s), together with its/their connections, whose removal will cause the greatest damage. However, the number of possible selections of a set of $n$ nodes to be deleted from all $N$ network nodes is the binomial coefficient $\binom{N}{n}$, leading to a combinatorial explosion for larger values.

Network damage can be established by various measures. One possible measure is the probability of the presence of a giant component, for which the Molloy-Reed criterion [43] defines a threshold $\langle k^2 \rangle / \langle k \rangle > 2$, that is, the average of the squared node degrees divided by the average node degree. However, this criterion, while easily calculable, is derived for random graphs and randomly deleted nodes, not for nodes deleted by heuristic methods, which target nodes pseudorandomly or deterministically. Another measure is the average inverse geodesic (the average inverse of the shortest path length over all pairs of nodes) [44]. This measure would be suitable for a slightly damaged network, which is still fully connected as one component. We are interested in a more substantial destruction of the network. Therefore, we use in our paper a measure of network damage, the Unique Robustness Measure ($R$-index) [45, 46], defined as $$R = \frac{1}{N} \sum_{Q=1}^{N} s(Q/N),$$ where $N$ is the number of nodes in the network and $s(q)$ is the fraction of nodes in the largest connected component after removing $qN$ nodes using a given strategy. The variable $q = Q/N$ is the current fraction of deleted nodes relative to the total initial number of network nodes. The $R$-index thus encompasses the whole attack process, not just one moment of damage at a current fraction $q$.
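The $R$-index can be computed directly from its definition. The following minimal implementation (our illustrative code, not the authors') removes nodes in a given order and accumulates the largest-component fraction:

```python
import networkx as nx

def r_index(G, attack_order):
    """Unique Robustness Measure: R = (1/N) * sum over Q = 1..N of s(Q/N),
    where s is the largest-component fraction after Q node removals."""
    N = G.number_of_nodes()
    H = G.copy()
    total = 0.0
    for v in attack_order:
        H.remove_node(v)
        if H.number_of_nodes() > 0:
            s = len(max(nx.connected_components(H), key=len)) / N
        else:
            s = 0.0  # the whole network has been deleted
        total += s
    return total / N

# Example: a random-order attack on a small ER graph (illustrative sizes).
G = nx.erdos_renyi_graph(100, 0.05, seed=1)
print(round(r_index(G, list(G.nodes())), 3))
```

Note that since $s(Q/N) \le (N-Q)/N$, the $R$-index is always below $1/2$; smaller values mean a more effective attack.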

##### 2.2. Classical Attack Strategies

Finding the least number of nodes whose removal would result in unconnected components of the network is proved in graph theory to be an NP-complete problem. A simplified problem is to find such a subset of nodes that after their removal the remaining nodes are isolated. This problem is a reformulation of the node cover of a graph, which is a set of vertices such that each edge of the graph is incident to at least one vertex of the set. Finding a node cover is one of the famous Karp's 21 NP-complete problems [47]. This leads us to the necessity of using heuristics for the network attack.

In the most typical, so-called ID attack, the nodes are deleted in descending order of their degrees [48]. The degrees are calculated and ordered only once, in the original network, which is the least computationally demanding of all the attack strategies. The calculation of degree centrality requires $O(E)$ time in a sparse matrix representation, where $E$ is the number of connections. A more efficient but slightly more demanding strategy, known as RD, recalculates the order of degrees after each removal of a node [49]. Similar couples of approaches, that is, calculating the sequence of nodes to be deleted all at once from the original network or recalculating this sequence after each node removal, can be applied to all other centrality measures, that is, betweenness, closeness, Katz, and eigenvector centrality [50]. After the RD approach, the second most effective measure generally proved to be the betweenness centrality recalculated after the deletion of each node, named RB. Often used is also the betweenness centrality calculated only once for the original network, named IB [50]. The betweenness centrality of a node $v$ is defined as $$b(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}},$$ where $\sigma_{st}$ is the number of shortest paths between nodes $s$ and $t$ and $\sigma_{st}(v)$ is the number of those paths which contain node $v$. Since the test networks do not have weighted connections, Brandes' algorithm [51], requiring $O(NE)$, can be used, where $N$ is the number of nodes.
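The ID, RD, and IB orderings described above can be sketched as follows; this is our illustrative networkx implementation (the betweenness routine uses Brandes' algorithm internally), not the authors' code:

```python
import networkx as nx

def id_attack_order(G):
    # ID attack: rank nodes by degree once, in the original network.
    return sorted(G.nodes(), key=G.degree, reverse=True)

def rd_attack_order(G):
    # RD attack: recalculate degrees after every node removal.
    H = G.copy()
    order = []
    while H.number_of_nodes() > 0:
        v = max(H.nodes(), key=H.degree)
        order.append(v)
        H.remove_node(v)
    return order

def ib_attack_order(G):
    # IB attack: betweenness centrality (Brandes' algorithm) computed once.
    bc = nx.betweenness_centrality(G)
    return sorted(bc, key=bc.get, reverse=True)

# Illustrative comparison on a small BA network (assumed parameters).
G = nx.barabasi_albert_graph(200, 2, seed=7)
print(id_attack_order(G)[:5], rd_attack_order(G)[:5], ib_attack_order(G)[:5])
```

An RB variant would analogously recompute `nx.betweenness_centrality` inside the removal loop, at a correspondingly higher cost per step.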

As in many other areas of optimization of NP problems, there exist first attempts to use metaheuristic optimization for the selection of nodes to be deleted. An example of this approach is the use of tabu search [29]. This approach seems to improve the $R$-index of the RD attack by roughly 15 percent, but the expense is enormous, as with most metaheuristics. It requires tens of thousands of evaluations (it uses local optimization and, as the termination criterion, stops after 1000 iterations without improvement; if an improvement occurs within those thousand iterations, the count starts all over).

##### 2.3. Branch Removal Strategy

Both the degree and the betweenness criterion work very well most of the time to identify the nodes whose removal is most likely to break a component apart, so that the largest remaining component is as small as possible. However, for an already sparse component with a large branch, these centrality measures may not always be ideal. As an example, let us take a simple component of a network containing a tree with 10 nodes, where at the end of a linear "network" the last node has four neighbors; see Figure 1. When the attack strategy is guided by maximum degree, the node with degree 4 is deleted, leaving a largest connected component with six nodes (the "linear part" of the tree above the deleted node). The same node, having the maximum betweenness of 21, would be selected by the betweenness-based attack. However, if we instead evaluate each node by the number of nodes by which the largest connected component would be diminished if we deleted it, then the nodes with value five in the last network of Figure 1 would be selected. This leaves a largest connected component of five nodes, a better result than both the degree and the betweenness attack.
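The branch criterion can be sketched by scoring each node with the decrease of the largest connected component caused by its deletion. The tree below is our hypothetical reconstruction of the layout described for Figure 1 (a 6-node chain attached to a hub carrying three leaves, so the hub has degree 4 and betweenness 21); the exact figure is not reproduced here:

```python
import networkx as nx

def component_reduction(G, v):
    """Number of nodes by which the largest connected component shrinks
    when node v is deleted (the branch criterion described above)."""
    before = len(max(nx.connected_components(G), key=len))
    H = G.copy()
    H.remove_node(v)
    if H.number_of_nodes() == 0:
        return before
    after = len(max(nx.connected_components(H), key=len))
    return before - after

# Assumed 10-node tree: chain p1..p6 ending at hub h with three leaves.
G = nx.Graph()
nx.add_path(G, ["p1", "p2", "p3", "p4", "p5", "p6", "h"])
G.add_edges_from([("h", "l1"), ("h", "l2"), ("h", "l3")])

# The degree/betweenness attacks delete h (degree 4), leaving a 6-node
# component; the branch criterion finds a node whose removal reduces the
# largest component by 5, leaving only 5 nodes.
best = max(G.nodes(), key=lambda v: component_reduction(G, v))
print(best, component_reduction(G, best))
```

On this reconstruction, deleting the hub reduces the largest component by only 4 (from 10 to 6), while the branch criterion's choice reduces it by 5, matching the argument above.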