Abstract

One of the most important assessment indicators of computer virus infections is the epidemic tipping point. Although many researchers have focused on the effects of scale-free network power-law connectivity distributions on computer virus epidemic dynamics and tipping points, few have comprehensively considered resource limitations and costs. Our goals for this paper are to show that (a) contrary to the current consensus, a significant epidemic tipping point does exist when resource limitations and costs are considered and (b) it is possible to control the spread of a computer virus in a scale-free network if resources are restricted and if costs associated with infection events are significantly increased.

1. Introduction

Research on the epidemic dynamics of computer viruses has increasingly incorporated Watts and Strogatz’s [1] description of small-world networks (characterized by tightly clustered connections and short paths between node pairs) and Barabási and Albert’s [2] insights regarding scale-free networks marked by power-law connectivity distributions. The list of researchers using network approaches to computer virus models and analyses also includes Kuperman and Abramson [3], Newman [4, 5], Newman and Watts [6], Pastor-Satorras and Vespignani [7–11], Watts [12], and X. Yang and L. X. Yang [13]. All of these investigators have noted that the topological properties underlying communication networks exert considerable influence on computer virus epidemic dynamics and spreading characteristics and support subtle analyses that non-network-directed approaches cannot.

A central issue for researchers using a network analysis approach is whether or not tipping points exist when computer viruses are spread via the Internet [7–11, 14–17]. According to Pastor-Satorras and Vespignani [7–11], Internet-based viruses and worms do not have positive epidemic tipping points. Other researchers of epidemic dynamics and tipping points in scale-free networks also consistently argue that, regardless of spreading capability, all Internet-based computer viruses have high probabilities of stability and survival [18–21]. Note, however, that although new computer viruses are constantly emerging on the Internet, the majority disappear almost immediately and only a tiny minority achieve epidemic status. This observation serves as our motivation to take a more detailed look at daily interaction and communication process limitations among users of e-mail, instant messaging software, online social network platforms, USB flash drives, and smart phones rather than the topological power-law connectivity distribution properties of scale-free communication networks that have served as the focus of many network-directed epidemic studies published in the past decade.

Resource limitations and interaction costs are two Internet-based daily communication process factors that have been overlooked. We acknowledge the importance of Pastor-Satorras and Vespignani’s work on scale-free networks—their ideas have inspired numerous studies on epidemic tipping points and antivirus strategies. However, closer inspection of their mathematical analyses and simulation results reveals two incorrect assumptions: Internet-based daily interactions and communication processes are cost-free, and the impacts of resource limitations and interaction costs are minimal. Both assumptions are beneficial in terms of mathematical analyses and hypothesis testing and suitable for studying simple scenarios of malicious scripts spread by e-mail attachments sent to large numbers of recipients. However, they lose accuracy in situations where viruses are spread via attachments sent to few recipients, peer-to-peer resource sharing, Internet downloads, multimedia messaging service attachments, or Bluetooth transfers.

For this project we simulate and analyze the influences of resources and costs on computer virus epidemic dynamics and tipping points. Our four main findings are as follows: (a) a significant epidemic tipping point exists when resource limitations and costs are taken into consideration, with the tipping point exhibiting a lower bound; (b) when interaction costs increase or usable resources decrease, epidemic tipping points in scale-free networks grow linearly while steady density curves shrink linearly; (c) regardless of whether Internet user resources obey delta, uniform, or normal distributions, they retain the same epidemic dynamics and tipping points as long as the average value of those resources remains unchanged across different scale-free networks; (d) the spread of epidemics in scale-free networks remains controllable as long as resources are properly restricted and intervention strategy investments are significantly increased. We believe these conclusions can assist computer scientists in their efforts to understand the epidemic dynamics and tipping points of computer virus infections and to identify potential immunization and virus control strategies [22, 23].

2. Agent-Based Epidemic Simulation Model

To simulate behavioral and transformative results arising from agent interactions, we selected the Susceptible-Infectious-Susceptible (SIS) state transfer concept as the core feature of our proposed model (Figure 1). (Our simulations are available as Java applications. For source code and binary executables, please contact the corresponding author.) Two characteristics of an infectious agent (representing computers in a communication network) at time are that it is infected at time () and is capable of infecting others. An agent that is vulnerable to a computer virus but has not yet been infected is considered a susceptible agent. The most common infection mechanism is contact with an infected agent; recovery is determined as a specific probability. A recovered agent immediately becomes susceptible again.

The agent-consumable resources in our proposed model have five reasonable properties: (a) they are finite (e.g., daily CPU/network usage time and communication bandwidth for uploads/downloads); (b) they can be temporarily exhausted (e.g., elapsed time chatting online); (c) they are nonreproducible; (d) they can recover or regenerate; (e) consumption of one kind can entail consumption of other kinds, thus reducing the total amount of available resources (e.g., large attachments require large amounts of upload/download time and communication bandwidth).

Based on these properties, a complex network is formulated consisting of agents and links (indicating interactions and contacts between two agents, with those having direct connections labeled “neighbors”). Only a small number of agents are given infectious status at the beginning of each simulation run; all others are designated as susceptible. Also at the beginning of each time step, usable resources for each agent are reset to , where , meaning that all agents are either renewed and/or receive supplemental resources. In our later experiments, the statistical distribution of usable resources can be delta (fixed value ), uniform, normal, or power-law, as long as the average value of agent resources satisfies the following equation:

Agents randomly interact with multiple neighbors during each time step, with usable resources and costs consumed during each interaction. Each agent interacts with a randomly selected neighbor agent . Regardless of the interaction result, agents and expend interaction costs , , where and , and their resources decrease accordingly. If after an interaction, that agent cannot interact with other neighbors; otherwise, agents continue to randomly select other neighboring agents for interactions until their resources are depleted.

The epidemiological status of every agent is determined at each time step using a combination of behavioral rules, original status, the statuses of neighbors, infection rate , and recovery rate . When an infected agent and adjacent susceptible agent interact, whether or not is infected by is determined by infection rate , and agent recovery and return to susceptibility is determined by recovery rate . Spreading rate is defined as ; generally, and . We defined as the density of infected agents present at time step ; when time step becomes infinitely large, represents a steady infected density. In the interest of robustness, all epidemic dynamics and tipping points discussed in this paper represent average values for 30 runs. A simulation flowchart is presented in Figure 2, and experimental parameters are described in Tables 1, 2, 3, and 4.
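The interaction and status-update rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Java implementation: the function name `sis_step`, its parameter names, the synchronous status update, and the rule that an agent only picks partners who can still pay the interaction cost are all assumptions made for the sketch.

```python
import random

def sis_step(neighbors, infected, resources, cost, infect_p, recover_p, rng):
    """Advance a resource-limited SIS model by one time step.

    neighbors: dict mapping each agent to a list of adjacent agents.
    infected:  set of agents that are infectious at the start of the step.
    Returns the set of agents that are infectious after the step.
    """
    if cost <= 0:
        raise ValueError("cost must be positive so that resources run out")
    # Usable resources are reset for every agent at the start of the step.
    budget = {agent: resources for agent in neighbors}
    newly_infected = set()
    order = list(neighbors)
    rng.shuffle(order)
    for a in order:
        # Keep contacting random neighbors until resources are depleted.
        while budget[a] >= cost:
            # Assumption: only neighbors that can still pay are eligible.
            partners = [b for b in neighbors[a] if budget[b] >= cost]
            if not partners:
                break
            b = rng.choice(partners)
            # Both sides pay the cost regardless of the interaction result.
            budget[a] -= cost
            budget[b] -= cost
            for src, dst in ((a, b), (b, a)):
                if src in infected and dst not in infected:
                    if rng.random() < infect_p:
                        newly_infected.add(dst)
    # Infected agents recover (and become susceptible again) with recover_p.
    recovered = {agent for agent in infected if rng.random() < recover_p}
    return (infected - recovered) | newly_infected
```

Calling `sis_step` repeatedly and recording `len(infected) / len(neighbors)` gives a density trace; averaging the late-time values over several seeded runs would approximate the steady density discussed in the text.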

3. Epidemic Model Analysis

Our proposed model is expressed as where is the minimum value for the ratio between agent’s resources () in relation to interaction costs () and its connectivity (). With the exception of , the symbols used here are consistent with those used by Pastor-Satorras and Vespignani in their discussions of spreading dynamics. is the probability that a node with links is infected at time (neglecting the higher order). is a predetermined constant representing the spreading capability of specific computer viruses, defined as the ratio between the rates at which healthy agents in a population become infected and infected agents recover. The term denotes the set containing all for all positive , as well as the alternative representation . Accordingly is the probability that any given agent will be linked to an infected agent. According to Pastor-Satorras and Vespignani, this probability is proportional to the infection rate and can therefore be reduced to .

In (3.2) we define as the steady state of by solving the stationary condition . Substituting in that equation,

As shown, a trivial solution is . Next, inequality (3.3) is derived based on the possibility that the right-hand side of (3.2) has a nonsingular solution:

Without using a concave function as an alternative proof, we show that (3.3) is a contradiction. Assuming that (3.3) does not hold, it should be expressed as

After defining we observe that a trivial solution for is . Next, note that the first derivative of at 0 with respect to is larger than 0:

However, this implies that nontrivial solutions for do not exist for any , which contradicts inequality (3.4). We therefore obtained as a conclusion regarding epidemic tipping points. By deriving the above conclusion in advance, we obtained a separate conclusion for the lower epidemic tipping point boundary, (as ∞, is at minimum equal to ), which also implies that resources and interaction costs significantly affect epidemic tipping point values.
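For readers who want the argument in symbols, the following is a hedged sketch of how such a lower bound can arise. It assumes the standard Pastor-Satorras and Vespignani heterogeneous mean-field form with the number of contacts per step capped by resources. The symbols $R$ (usable resources per time step), $c$ (interaction cost), and $m_k$ (capped contact number) are notation introduced here for illustration and may differ from the notation and numbered equations of the model itself.

```latex
\begin{align}
\frac{d\rho_k(t)}{dt} &= -\rho_k(t) + \lambda\, m_k \bigl[1-\rho_k(t)\bigr]\Theta(\lambda),
  \qquad m_k = \min\!\left(k, \tfrac{R}{c}\right), \\
\rho_k &= \frac{\lambda m_k \Theta(\lambda)}{1+\lambda m_k \Theta(\lambda)}
  \quad\text{(stationary state)}, \\
\Theta(\lambda) &= \frac{1}{\langle k\rangle}\sum_k k\,P(k)\,\rho_k, \\
\left.\frac{d}{d\Theta}\,\frac{1}{\langle k\rangle}\sum_k k\,P(k)\,
  \frac{\lambda m_k \Theta}{1+\lambda m_k \Theta}\right|_{\Theta=0}
  &= \lambda\,\frac{\langle k\,m_k\rangle}{\langle k\rangle} \ge 1
  \;\Longrightarrow\;
  \lambda_c = \frac{\langle k\rangle}{\langle k\,m_k\rangle} \ge \frac{c}{R}.
\end{align}
```

Under these assumptions the last inequality follows from $m_k \le R/c$, which gives $\langle k\,m_k\rangle \le (R/c)\langle k\rangle$. Letting $R \to \infty$ yields $m_k = k$ and recovers the familiar $\lambda_c = \langle k\rangle/\langle k^2\rangle$, which vanishes when the second moment of the degree distribution diverges.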

Since represents the tipping point at which a computer virus becomes epidemic, managing its value should be a primary concern for computer scientists and antivirus experts. In summary, the lower bound of epidemic tipping point decreases when interaction cost decreases or average resource increases. Accordingly, agent’s available resources increase when decreases, thereby enhancing its ability to contact most other agents via underlying communication networks. This result supports existing knowledge about immunization and antivirus strategies: restricting a computer’s resources increases the epidemic tipping point. Neglecting resources makes infinitely large, meaning that they are inexhaustible and that the epidemic tipping point will continue to approach 0 as long as the average number of links is sufficiently large. Our proposed model is therefore identical to Pastor-Satorras and Vespignani’s model in that a computer virus has the potential to achieve epidemic proportions even when the number of infected agents is very small.
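The qualitative claims in this paragraph can be checked numerically under an assumed closed form. The sketch below assumes a mean-field tipping point of the form ⟨k⟩ / ⟨k · min(k, R/c)⟩ on a truncated discrete power-law degree distribution. That formula, the function name `tipping_point`, and the parameters `k_min` and `k_max` are assumptions made for illustration and are not taken from the paper.

```python
def tipping_point(resources, cost, gamma, k_min=2, k_max=1000):
    """Assumed mean-field tipping point <k> / <k * min(k, R/c)>.

    Degrees follow P(k) proportional to k**(-gamma) on k_min..k_max
    (a truncated power law chosen only for this illustration).
    """
    cap = resources / cost  # most contacts an agent can afford per step
    weights = {k: k ** -gamma for k in range(k_min, k_max + 1)}
    norm = sum(weights.values())
    mean_k = sum(k * w for k, w in weights.items()) / norm
    mean_k_capped = sum(k * min(k, cap) * w for k, w in weights.items()) / norm
    return mean_k / mean_k_capped

# Compare the assumed tipping point with the lower bound cost / resources.
for resources in (8, 16, 40):
    value = tipping_point(resources, cost=1, gamma=2.4)
    print(resources, round(value, 4), "lower bound:", 1 / resources)
```

In this sketch the computed value never falls below cost / resources, and it decreases as resources grow, which matches the direction of the effects described above.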

Since an infection event requires sufficient resources, controlling the ratio can increase the epidemic tipping point and decrease the steady-state density . In contrast, computer viruses can spread very quickly via small e-mail attachments distributed to a large number of recipients because they can be simultaneously transmitted to many sites. Affected areas can be very large over a short time period, with disastrous results in terms of lost data, work delays, and money. Initially designed to slow the spread of a computer virus, a throttling strategy [24] for containing virus infections places restrictions on uploads/downloads from remote servers (e.g., one gigabyte per day)—in other words, resources are purposefully limited in order to increase the epidemic tipping point. Another throttling strategy is charging upload/download fees for exceeding daily limitations—that is, increasing communication costs.

4. Experimental Results

Toward the goals of determining the reliability and robustness of our results and ensuring the applicability of our conclusions to scale-free networks whose connectivity distribution probabilities satisfy where , we built 8 scale-free (Table 5) and 8 small-world networks (Table 6), all containing different numbers of nodes and links. All sensitivity analysis experiments were simulated using these networks in order to determine the consistency of our results; no weakening or side effects were observed when node and link numbers were changed. Except for node and link numbers (resulting in different average degrees of separation), all parameter settings for the 8 scale-free networks were identical (Table 5). Those networks can be classified in terms of four categories based on node number (1,000/2,000/10,000/20,000) or two categories based on average vertex degree (4 or 8 outgoing links per node). Scale-free simulation network number 3 was designated as our default; unless otherwise indicated, it was used to generate all results reported and discussed in this paper. According to those results, our conclusions are not limited to our proposed agent-based simulation models based on the 8 scale-free networks.
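The text does not state how the scale-free networks were generated. As one possible way to build comparable test networks, the sketch below uses Barabási-Albert preferential attachment. The function name `barabasi_albert` and the choice of `m = 2` links per new node (giving an average vertex degree close to 4) are assumptions for illustration.

```python
import random

def barabasi_albert(n, m, rng):
    """Return adjacency lists for an n-node preferential-attachment graph.

    Each new node attaches to m distinct existing nodes, chosen with
    probability proportional to their current degree. Requires n > m >= 1.
    """
    neighbors = {i: set() for i in range(n)}
    # Start from a fully connected core of m + 1 nodes (each has degree m).
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            neighbors[i].add(j)
            neighbors[j].add(i)
    # Each node appears in `stubs` once per incident link.
    stubs = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))  # degree-proportional choice
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            stubs.extend((new, t))
    return {i: sorted(adjacent) for i, adjacent in neighbors.items()}

graph = barabasi_albert(1000, 2, random.Random(1))
average_degree = sum(len(adj) for adj in graph.values()) / len(graph)
print(average_degree)
```

Using `m = 4` in the same sketch would give an average vertex degree close to 8, the other category mentioned above.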

We used the first simulation experiment to show that a computer virus spreading in a scale-free network has a nonzero, positive, and significant epidemic tipping point if resources and interaction costs are taken into consideration—a conclusion that conflicts with those reported by past researchers. To evaluate how node and link numbers in scale-free networks affect epidemic tipping points, all experiments were simulated using scale-free (Table 5) or small-world (Table 6) networks with different numbers of nodes and links. The value of usable resources per agent was reset to 16 units at the beginning of each time step. Daily interaction and communication process costs were designated as one unit, accounting for 6.25% of agent’s total usable resources.

We used three types of complex networks to analyze relationships between effective spreading rate and steady density for our proposed model: small-world, scale-free without interaction costs, and scale-free with limited resources and interaction costs. As shown in Figure 3, the 8 simulation suites generated consistent results that did not become contradictory when node and link numbers were adjusted, suggesting that our results can be applied to different scale-free networks used to simulate computer virus diffusion scenarios. The curves marked with triangles indicate that the scale-free network version of our proposed model reached a 0 level of steady density in a continuous and smooth manner when the effective spreading rate was decreased, indicating the absence of an epidemic tipping point without interaction costs. The curves marked with squares indicate that computer viruses do have epidemic tipping points in small-world homogeneous networks. In a similar manner, the curves marked with circles also indicate that computer viruses have significant epidemic tipping points in scale-free networks when resources and interaction costs are considered (approximately 0.14 in Figures 3(a), 3(c), 3(e), and 3(g) and 0.10 in Figures 3(b), 3(d), 3(f), and 3(h)). According to these results, resources, interaction costs, and average vertex degree impact epidemic dynamics and tipping points in scale-free networks to a much greater degree than node and link numbers.

Our second simulation focused on relationships among epidemic tipping point, steady density curve, and the ratio of interaction costs to an agent’s usable resources (hereafter referred to as “the ratio”). To analyze the influences of the ratio on the other two factors, we employed 10 usable resource values (4, 8, 12, 16, 20, 24, 28, 32, 36, and 40 units) and assigned daily interaction and communication process costs as single units accounting for 25%, 12.5%, 8.33%, 6.25%, 5%, 4.17%, 3.57%, 3.13%, 2.78%, and 2.5% of the agent’s usable resources, respectively.
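The percentages listed above follow directly from dividing the single-unit cost by each resource level. A short check, using hypothetical variable names and half-up rounding to two decimal places, is:

```python
from decimal import Decimal, ROUND_HALF_UP

resource_levels = [4, 8, 12, 16, 20, 24, 28, 32, 36, 40]  # usable units
cost = 1  # one unit per interaction

# Ratio of interaction cost to usable resources, as a percentage.
ratios_percent = [
    (Decimal(100 * cost) / Decimal(r)).quantize(Decimal("0.01"), ROUND_HALF_UP)
    for r in resource_levels
]
for r, pct in zip(resource_levels, ratios_percent):
    print(r, pct)
```

Half-up rounding is used deliberately: 1/32 is exactly 3.125%, which the text reports as 3.13%, whereas Python's built-in `round` would give 3.12 because it rounds exact ties to the even digit.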

As shown in Figure 4(a), the epidemic tipping point significantly increased as the ratio grew. For instance, when the value of agent’s usable resources was set at 8 units at the beginning of each time step, the epidemic tipping point was approximately 0.22, significantly larger than for a small-world network with the same number of nodes and links (Figure 3, curve marked with squares) and same average vertex degree (Figures 3(a), 3(c), 3(e), and 3(g)). The opposite was also true: when the value of agent’s usable resources was set at 40 units at the beginning of each time step, the shape of the density curve was very close to that of the scale-free network without interaction costs (Figure 4(a), solid line); in addition, the epidemic tipping point decreased to 0.09. As shown in Figure 4(b), we observed (a) a linear correlation between the epidemic tipping point and the ratio, and (b) that the density curve grew at a slower rate as the ratio increased (Figure 4(a)); that is, the ratio and density exhibited a negative linear correlation when the effective spreading rate exceeded the epidemic tipping point. According to these results, when interaction costs increased or agent resources decreased, the epidemic tipping point of a computer virus spread via the Internet grew linearly, and the density curve shrank linearly.

A comparison of results from our mathematical model and second simulation is presented in Figure 5. We used several probability degrees for and found that, at an of 2.7 or 2.65, the values for both curves exceeded those derived from the simulation experiment. The two curves matched at an of 2.4.

The motivation for the third simulation was to investigate the effects of the statistical distribution of an agent’s usable resources on the epidemic dynamics and tipping points of computer viruses spread via the Internet. Our specific goal was to determine how different statistical distribution types (delta, uniform, normal, or power law) and distribution parameters (average value and standard deviation in a normal distribution, or number of values and range in a uniform distribution) affect the steady density curves of viruses in contexts of limited agent resources and interaction costs (Figures 6(a)–6(c) and 7(a)–7(c)).

The density curves marked with diamonds, crosses, and circles in Figures 6(a) and 7(a), respectively, represent delta (fixed value = 16), uniform, and normal resource distributions; parameters are shown in Figures 6(b) and 7(b). The results indicate nearly identical epidemic tipping points and overlapping density curves (indicating no statistically significant differences) when the average values of usable resources were equal. However, as shown in Figures 6(c) and 7(c), when those same resources represented a power-law distribution (i.e., the majority of agents had very limited resources while a small number had large amounts) and no correlation existed between the total amount of agent’s usable resources and vertex degree (number of neighboring nodes), the resulting dashed density curve grew more slowly compared to those for the other three distribution types, even when they all shared the same epidemic tipping point.

As shown in Figures 6 and 7, the same results emerged as long as the average usable resource values were identical. Note that density curves and epidemic tipping points were very similar across the distribution types, regardless of whether the resources had a uniform distribution with a range of 2 or 3 or a normal distribution with a standard deviation of 2 or 3 (Figures 6(b) and 7(b)). According to the density curves shown in Figures 6(a) and 7(a), as long as researchers ensure that usable resources do not reflect a power-law distribution, at the beginning of each time step they can assign usable resources for each agent as the fixed average value of the statistical distribution derived from the real-world scenario being studied.
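To make the comparison of distribution types concrete, the sketch below draws per-agent resources from delta, uniform, and normal distributions that share the same average. The function name `sample_resources`, the reading of "range" as integer values within mean ± spread, and the rounding and clamping of normal samples are assumptions for illustration, not the settings used in the experiments.

```python
import random

def sample_resources(kind, n, mean, spread, rng):
    """Draw n per-agent resource values with the given average.

    kind:   "delta" (fixed value), "uniform", or "normal".
    spread: half-range for "uniform", standard deviation for "normal".
    """
    if kind == "delta":
        return [mean] * n
    if kind == "uniform":
        # Integers from mean - spread to mean + spread, equally likely.
        return [rng.randint(mean - spread, mean + spread) for _ in range(n)]
    if kind == "normal":
        # Rounded to whole units and clamped so resources are never negative.
        return [max(0, round(rng.gauss(mean, spread))) for _ in range(n)]
    raise ValueError(f"unknown distribution kind: {kind}")

rng = random.Random(42)
for kind in ("delta", "uniform", "normal"):
    values = sample_resources(kind, 20000, 16, 3, rng)
    print(kind, sum(values) / len(values))
```

All three samples have an average close to 16, which is the condition under which the text reports nearly identical tipping points and density curves.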

5. Conclusion

Ever since Watts and Strogatz [1] proposed their small-world network model and Barabási and Albert [2] introduced their scale-free network model, computer scientists and antivirus experts have used network models and agent-based epidemic simulations to analyze computer viruses in detail. To simplify their experiments, researchers have tended to overlook resource limitations and interaction costs, both of which exert significant impacts on epidemic dynamics and tipping points. In this paper we described five characteristics of network resources and proposed an agent-based epidemic simulation model for investigating how resources and interaction costs influence the epidemic dynamics and tipping points of computer viruses in scale-free networks. According to results from our first set of experiments, resources, interaction costs, and average vertex degree are among those factors exerting significant impacts on epidemic tipping points, but node and link numbers were found to have little impact. Results from our second experimental set provide insight into how the ratio of single infection event costs to total amount of agent’s resources affects density curves and epidemic tipping points. We found that, when interaction costs increased or when the total amount of agent’s resources decreased, the epidemic tipping point of an infection event in a scale-free network grew, and density decreased at certain transmission rates. Results from our third set of experiments indicate that agent resources following delta, uniform, or normal distributions yield nearly identical density curves and epidemic tipping points as long as average resource values remain the same across different networks.

We believe these conclusions can be used to simplify the task of constructing both basic and abstract computer models and can support the efforts of computer scientists and antivirus experts to analyze core questions tied to epidemic dynamics and computer virus spreading scenarios and to design and enact effective virus control strategies at various intrusion levels.

Acknowledgment

This work was supported in part by the Republic of China National Science Council (Grant no. NSC-101-2119-M-182-001).