Abstract
A novel deterministic SEIS model for the transmission of email viruses in growing communication networks is formulated. Interestingly, the model is different from classical SEIS models not only in the form, but also in the dynamical features. We study the equilibria and their stability and analyse the bifurcation dynamics of the model. In particular, we find that the virus-free equilibrium is locally asymptotically stable for any parameter values, which may attribute to the absence of the basic reproduction number. It is shown that the model undergoes a saddle-node bifurcation and admits the bistable phenomenon. Moreover, on the basis of the Lyapunov function, the domains of attraction of equilibria are estimated by solving an LMI optimization problem. Based on the above theoretical results, some effective strategies are also provided to control the propagation of the email viruses. Additionally, our results are confirmed by numerical simulations.
1. Introduction
Currently, email has become one of the most basic applications in the Internet with the development of networked computer. As more and more people rely on email communications for business and everyday life, email viruses constitute one of the Internet major security threats. When an email virus infects a machine, it sends an infected email to the addresses in the computer's email address book. This self-broadcast mechanism allows for the virus's rapid reproduction and spread. Due to this facility, and the advantages of sending viruses through email [1], hackers mostly tend to choose the email as the vector of spreading their viruses [2]. Indeed, according to the Virus Bulletin [3] and National Computer Virus Emergency Response Center [4], the email viruses still account for a large share of the virus prevalence today.
In order to eradicate viruses, as well as to control and limit the impact of an outbreak, a lot of efforts have been devoted to develop various mathematical models to simulate the real case of viruses’ spreading [5–9]. Unlike scanning worms such as Code Red or Slammer, email viruses spread over a logical network defined by email address relationship, making traditional epidemic models invalid for modelling the propagation of email virus.
The spread of email virus is a complex process. An adequate modeling of this process requires both a correct description of the logical network along which viruses spread [10–12] and a quantitative formulation of various behavioral mechanisms of the users who motivate the spread of the virus [1, 2].
Due to the specific characteristics of email viruses propagation, some researchers focus attention on the topological structures of email networks [13], while some researchers deal with the development of new techniques for the detection and elimination of viruses [14]. Immunization strategies in email networks are presented by experiments and simulations in [15] and [16]. In 2003, Zou et al. [1] presented an email viruses’ model that accounts for the behaviors of email users, such as email checking frequency and the probability of opening an email attachment. But they rely primarily on simulation rather than mathematical analysis. In [12], the simulations for a stochastic susceptible-hidden-infectious-recovered model and mathematical analysis for homogeneous and heterogeneous SIR model are provided. When considering immunization in the heterogeneous SIR model, the saddle-node bifurcation occurs. Here we must emphasize the hidden state which is a key characteristic that the email virus different from other viruses. So based on the spreading mechanism of email viruses, we proposed a SEIS model in growing homogeneous networks by stochastic approach. Relying primarily on mathematical analysis, we give some effective control strategies.
The rest of the paper is organized as follows. In Section 2, a discrete-time model of the email viruses’ spreading is proposed whose detailed process is also given. Then we deal with the model to derive the continuous-time equations for the dynamics of email viruses spreading. In Section 3, we present a qualitative analysis of the model. We show that the virus-free equilibrium is locally asymptotically stable for any parameter values and the system admits a saddle-node bifurcation. Section 4 estimates the domain of attraction of the equilibrium on the basis of the determined Lyapunov function. The subsequent sections are the numerical examples and the conclusion and discussion.
2. Mechanism Analysis and Model Establishment
In this paper, we consider email viruses that are only transferred through users' email address books. Thus email address relationship forms a logical network for email viruses. Email address book of a user usually contains the user's friends' or business partners' email addresses. Thus if user A has user B's address, user B probably also has user A's address in his own address book. Due to this, we model the email network as an undirected graph, although this may not always be true. Without loss of generality, we can safely assume that every distinct email address is associated with one host computer which is associated with a single user. Then the logical network can be seen as host computer network, in which the vertices are the computers and the links are the address relationships of the corresponding users. Here we consider the network as a homogeneous network that assumes each computer has same neighbors which are supposed as ; that is, an email user has addresses in his own address book and his address is in users' address books.
According to the already studied propagation characteristics, the spreading mechanism of email viruses can be described as follows: a healthy computer, which links with infectious one, would receive virus emails. But it may not be infectious immediately. The email viruses are hiding in the email box. At this moment, the computer is infected but not yet infectious. Only if the user opens the virus attachments, the computer becomes infectious. Then if the computer has been found infectious, the user would take antivirus measures to clean the viruses. After that, the computer recovers from infectious state and returns to healthy state again.
Obviously, the viruses’ spreading is just the process of computers switching between different states essentially. We call computers that have not received virus emails and could be infected easily after receiving virus emails as the susceptible computers. The computer that has received virus emails but the virus attachments have not been opened is called the exposed computer. And further, a computer that has been infected by the viruses and can transmit them to susceptible computers is called infectious computer. The numbers of susceptible, exposed, and infectious computers at time are denoted as , , and , respectively. Then the total number of computers at time is . Here we assume the total number of computers is varying dynamically as time ; that is, the new computers can be connected to the network and some old computers can be removed from the network (including the obsolescence computers). Suppose is the rate at which new computers are connected and is the rate at which the computers are removed per unit time. The flow diagram of the state transition is depicted in Figure 1.
By the above analysis, we have already known the mechanism of the virus propagation. In the following, we will elaborate the probabilities of transforming from to , to , and to , by assuming the number of infected contacts as a random variable that follows the binomial distribution.
2.1. The Transition from Susceptible State to Exposed State
Consider now a computer is in the susceptible state at time . By the usual homogeneous mixing approximation, denotes the rate that a neighbor of the computer is in the infectious state at time . Suppose that neighbors of computer are independent with each other, we have independent events occurring with the same probability . Then the number of the infected neighbors of computer is a random variable that follows the binomial distribution . And the probability of the number of the infected neighbors of computer equal to at time can be written as follows: When a computer is infected, it will send a virus email to all email addresses in the user's email address book. So as long as there are infected computers in the neighbors, the susceptible computers would have chance to turn into exposed. Let denote the probability that computer still stays in the susceptible state in the time interval , then is the probability that it enters into the exposed state. is the number of the infected neighbors of computer , and if denotes the probability that the computer received virus emails from each infected neighbor per unit time, then we have
Because the infected number of the computer's neighbors can change from to and the corresponding changes from to , using (2.1) and (2.2), we obtain the transition probability averaged over all possible values of as Then in the interval , the probability of a susceptible computer transforming to the exposed state is .
2.2. The Transition from Exposed State to Infectious State
An exposed computer being infected or not depends on whether the user opens the virus emails. As long as the email user opens a virus attachment (one virus email only has one virus attachment) arbitrarily, the computer will be infected, so only if the user does not open all virus emails that he has received, the computer will stay in the exposed state. Let be the probability that the email user opens a virus attachment (); thus is the probability that the email user does not open a virus attachment. Let the number of the virus emails that the computer received during the time interval be ; then the probability that all virus attachments would not be opened is . Therefore, the probability of the computer infected is .
In addition, from (2.1), we can obtain that the average infected number of the computer's neighbors is Suppose the computer received virus emails from each infected neighbor per unit time, then in the time interval , the computer received virus emails on average; that is, .
Substituting into the above formula, the probability of an exposed computer becomes infected is in the time interval .
2.3. The Transition from Infectious State to Susceptible State
An infected computer recovering from the infection and reentering into the susceptible state is dependent on the computer users. If the computers have been found virus-infected, the users would take antivirus measures; as a result the viruses could be removed and the computers will enter into susceptible state. But if the users do not do this, the computers will stay in the infected state. So, we suppose the computers return to the susceptible state at the probability per unit time.
Based on the above analyses and Figure 1, we can indeed write the spread mechanism of email viruses as the following deterministic model: In the limit , the above model can be simplified by keeping only the two first terms in the development of the binomial. This leads to the following continuous-time model: where , .
Summing up the three equations in (2.6), we obtain when , . It is easy to see that system (2.6) can be shown to be mathematically well posed in the positive invariant region and solution in exists for all positive time.
Hence, (2.6) has the following limit system: Rescaling (2.8) by , , and still writing as , we obtain where , and the corresponding positive invariant set is .
3. Dynamic Analysis of System (2.9)
The objective of this section is to perform a qualitative analysis of system (2.9). We first consider the nonexistence of limit cycle. Then we study the equilibria and their stability. Finally, we prove the occurrence of saddle-node bifurcation.
Theorem 3.1. For system (2.9), there is no limit cycle.
Proof. Take the Dulac function
where , .
Set
we have
Therefore, there is not a closed orbit inside the first quadrant.
Next, we will study the equilibrium situation of system (2.9). The system always has the virus-free equilibrium , and the Jacobian matrix is . Obviously, is locally asymptotically stable for any parameter values.
To find the positive equilibrium, set which yields the equation of : We can obtain that(1)there is no positive equilibrium if ;(2)there is one positive equilibrium if ;(3)there are two positive equilibria if .In the following, we analyse the stability of the positive equilibria in different situations, respectively, and prove the existence of saddle-node bifurcation, to elaborate from three cases.
Case 1 (only one positive equilibrium in system (2.9)). Suppose ; system (2.9) has one positive equilibrium
the Jacobian matrix at is
Clearly, the eigenvalues of matrix are and ; is a nonhyperbolic point. Therefore, we investigate the dynamics near by the center manifold theorem [17, 18].
Firstly, shift to the origin via and ; system (2.9) can be transformed into
Secondly, define the transformation
which transformed system (3.8) into the following standard form:
where and .
By the existence theorem [19], there exists a center manifold for system (3.10), which can be expressed locally as follows:
with sufficiently small and is the derivative of with respect to .
To compute the center manifold , we suppose has the form
By the local center manifold theorem, the center manifold (3.12) satisfies
where .
Rewrite and as and , respectively, where
Substituting (3.10) into (3.13) and then equating coefficients on each power of to zero yields
Then, we get the approximation for :
Substituting (3.16) to the first equation of system (3.10), we achieve
So from (3.17) (see [20, p. 338–340]), we get the following theorem.
Theorem 3.2. If , the system (2.9) has one positive equilibrium and is a saddle node.
Case 2 (two positive equilibria in system (2.9)). Suppose , there are two positive equilibria
We first determine the stability of . The Jacobian matrix at is
Obviously, , so is a saddle.
Now we analyze the stability of the second positive equilibrium . The Jacobian matrix at is
By a similar argument as above, we obtain that
After some algebra calculations, we obtain ; is a stable node. So we have the following theorem.
Theorem 3.3. If , the system (2.9) has two positive equilibria and ; is a saddle and is a stable node.
Case 3 (saddle-node bifurcation). From Cases 1 and 2, we can see system (2.9) experiences a saddle-node bifurcation. In this part, we give the proof.
Let us consider as a bifurcation parameter, define and . Rewrite system (2.9) as
Then we have
has a simple eigenvalue with eigenvector , and that has an eigenvector corresponding to . Furthermore, the following conditions are satisfied:
According to the theorem (Sotomayor; see [21, p.148]), we get the following result.
Theorem 3.4. System (2.9) experiences a saddle-node bifurcation at the equilibrium as the parameter passes through the bifurcation value .
4. Estimation of the Domain of Attraction
From the previous study, we know, when , is a stable node, and at the same time, the virus-free equilibrium is also locally asymptotically stable, that is, the bistable state. In this case, the eventual behavior of the system is sensitive to the initial positions. So in this section, we study the domains of attraction about the two equilibria using the method in [22]. For this task, we make some preparations.
Definition 4.1 (domain of attraction [23]). The domain of attraction of the origin is given by
Consider the following system:
where is an analytical function with the following properties:(1); that is, is an equilibrium point of (4.2);(2)every eigenvalue of , the Jacobian matrix of at the origin, has negative real part; that is, is an asymptotically stable equilibrium point.
Let be the domain of attraction of the zero solution of (4.2). The following result provides a tool to determine .
Lemma 4.2 (see [24]). coincides with the natural domain of analyticity of the unique of the problem The function is positive on and for any , where is the boundary of .
Based on Lemma 4.2, the problem of determining can be reduced to finding the natural domain of analyticity of the solution of (4.3). The function is called the optimal Lyapunov function for (4.2). Generally, it is not easy to construct the optimal Lyapunov function and determine its domain of analyticity. But, in the diagonalizable case, we can determine the coefficients of the expansion of at , where reduces to the diagonal form. By the Cauchy-Hadamard-type theorem [25], we can obtain the domain of convergence of the series , and is a part of the domain of attraction.
For system (4.2), the following lemma holds.
Lemma 4.3 (see [26]). For each isomorphism and , the problem has a unique analytical solution; namely, , where is the optimal Lyapunov function for (4.2).
Let be one isomorphism which reduces to the diagonal form . Denote and , where is the optimal Lyapunov function for (4.2).
For given , we consider the expansion of at origin: and the expansions of the scalar components of at origin: then the coefficients of the development (4.5) are given by where is the th eigenvalue of , , represents the entry of which lies in the th row and the th column.
Let , be the Taylor polynomials of degree associated to at the origin. For , we have the following results [27].
Lemma 4.4. For any , there exists such that for any one has where and is its closure.
Hence, for any positive integer , is a Lyapunov function for system (4.2) in .
Now, we construct the Lyapunov function at the virus-free equilibrium. Define the following transformation:
Then system (2.9) is transformed into the following form: where For the isomorphism and the function , it follows from Lemma 4.3 that there exists a unique analytical function such that and , where is the optimal Lyapunov function for the reduced system (4.10).
Let the expansion of at be and denote the components of as The coefficients of expansion for are given by where .
Let , ; we obtain Let be the sum of the square terms of the expansion (4.12). Clearly, Correspondingly, by the reversibility of , we obtain an analytical function where By using Lemma 4.4, it is easy to see that there exists a domain of the origin and in which is a positive definite function and further, it provides a Lyapunov function to guarantee that the trivial equilibrium of system (2.9) is asymptotically stable. Obviously, this domain belongs to .
Next step, we will use the Lyapunov function to estimate the domain of attraction for the virus-free equilibrium. For this, we first introduce the following results.
Lemma 4.5 (see [28]). Let be a Lyapunov function for system (4.2) in the domain Assume, moreover, that is bounded and contains the origin. If is negative definite in , then the origin is asymptotically stable and every solution in tends to the origin as .
To determinate the whole domain of attraction of a general nonlinear system is an extremely difficult problem. However, by Lemma 4.5 it is possible to compute subsets of the domain of attraction which are defined by (4.19). Our objective is to find the maximum value of [29]. This is defined by the following optimization problem.
Geometrically, this means that we seek the global minimum of the Lyapunov function along the hypersurface .
Correspondingly, for system (2.9) and the determined Lyapunov function , the optimization problem (4.20) should be reformulated as where is defined by (4.17).
Clearly, (4.21) is a linear matrix inequality (LMI) relaxations of global optimization problem with multivariable real-valued polynomial criterion and constraints (see [30, 31] and the references therein for details).
Then is the estimation of domain of attraction for .
Let and , where (defined by (3.22)), analogously, for the positive equilibrium , after tedious computation, we can obtain the Lyapunov function: where From the discussion in Section 3, we can obtain hence , , and are all positive.
Correspondingly, we can obtain the estimation of domain of attraction for .
5. Numerical Examples
In this section, we will perform numerical examples to verify the results that obtained in previous sections, hence, provide an intuitive impression about saddle-node bifurcation, bistability, and the domain of attraction.
Firstly, we present the saddle-node bifurcation. Here we emphasize that the virus-free equilibrium is locally asymptotically stable for any parameter values, and it can be observed in Figures 2, 3, and 4 simultaneously (the green asterisk in Figures 2–4). In Figure 2, , , , , , and ; we have , , , which satisfy the condition . In this case, system (2.9) has only virus-free equilibrium, and it is asymptotically stable.
Setting , , and keeping other parameter values unchanged, we have , , , then . According to Theorem 3.2, system (2.9) has a positive equilibrium, and it is a saddle node. If , we have , , , then . According to Theorem 3.3, there have been two positive equilibria; one of them is a saddle and the other is a node.
Above all, system (2.9) experiences a saddle-node bifurcation which can be seen obviously from Figures 2–4. After crossing the critical value, the saddle node (the red asterisk in Figure 3) becomes one saddle and one node (the red pentagrams in Figure 4). Consequently, the bistable case occurs.
Secondly, we show the domain of attraction for the equilibria. From Figure 4, we can see that system (2.9) has two locally asymptotically stable equilibria under the condition . Letting the parameters be the same as in Figure 4, we give the estimated results of domains of attraction (Figures 5 and 6). From (4.17), we get the Lyapunov function corresponding to as follows: and its total derivative along the trajectory of system (2.9) has the form Then solving (4.21) by using the MATLAB tool, we obtain for . The corresponding region is which is shown in Figure 5.
In analogy with the virus-free equilibrium , we obtain the Lyapunov function of as follows: . The corresponding region is which is shown in Figure 6.
6. Conclusion and Discussion
In this paper, we have established a model to study email virus propagation and analyzed its dynamical behaviors. It is worthwhile to note that we use instead of to depict the transition from to , which is different from most infectious disease models. For example, Fan et al. [32] proposed an model as It is shown that the global dynamics is completely determined by the basic reproduction number . If , the disease-free equilibrium is globally stable. If , a unique endemic equilibrium is globally stable in the interior of the feasible region.
The reason of using is that we have considered the number of virus emails that the user received in our model, which is proportional to the relative number of infected computers, that is, . Due to this difference in the form, our model shows many different and complex dynamics. In our model, however, the basic reproduction number does not appear; hence the virus-free equilibrium is locally asymptotically stable for any parameter values. And it is proved that the model experiences a saddle-node bifurcation as parameter varies through the bifurcation value, which also occurs in vector-borne disease model [33]. Furthermore, the bistability is obtained. In short, the spreading mechanism and dynamical features of email virus are indeed different from most biological virus. This can offer a valuable perspective to the future dynamics modeling. Especially, the amount of virus is considerable in the transition from to when modeling the infectious disease. By doing so, the models would be beneficial for better understanding the infectious disease propagation.
The main aim of the model is to investigate how email viruses transmit and to identify effective strategies for their prevention and control. So based on the arguments above and the simulation results, it is straightforward to provide some control strategies.
From the theoretical analyses and Figures 2–4, we can take as a threshold parameter. If , the virus will die out; if , virus may die out or outbreak.
In the first case, to make below than zero, the most direct way is to change the parameter values which correspond to the related measures. Obviously, increases with , , and . The relationships between and , are shown in Figure 7; increases with , while decreases with . According to the sensitivity analysis, some measures can be taken to control virus propagation for instance, enhancing the people's understanding of email viruses to minimize the probability of opening a virus email () and appealing people to install the antivirus software and update in time to increase the ability of finding and removing the viruses ().
In the second case, from the definition of the domain of attraction, we know that starting from any initial points in the region of an equilibrium, the solution of a dynamical system will converge to the equilibrium. As is well known, the solution tending to the virus-free equilibrium indicates that the virus will be extinct without taking any measures, whereas tending to the positive equilibrium indicates the outbreak of the virus; that is, a large number of computers will be infected. As a result, in order to control the propagation of the virus, we should take measures to restrict the initial value of the computers within the domain of attraction for virus-free equilibrium (the white region in Figure 5), such as improving the computers' detection ability and enhancing the people's awareness of email virus defense.
It should be noted that the epidemiological model described here for email viruses could also be used to study other disease similar to email viruses from an epidemiological point of view. Moreover, in our model, we only consider some fundamental factors. In the future work; more factors will be incorporated into new and improved models for prediction and containment of email viruses.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (11171314, 70871072, 61171179, and 11147015), the Program for Basic Research of Shan'xi province (2010011007 and 2010011002-1), research project supported by the Shanxi Scholarship Council of China (2010-074), the Foundation of the National Key Laboratory for Electronic Measurement Technology (9104c12040051010).