Research Article  Open Access
The Dynamical Modeling Analysis of the Spreading of Passive Worms in P2P Networks
Abstract
Passive worms readily spread through Peer-to-Peer (P2P) networks and pose a serious threat to network security. In this paper, considering network heterogeneity and the number of hops a search can reach, we propose a novel mathematical model to study the dynamics of passive worm propagation. For the proposed model, the basic reproduction number $R_0$ is derived from the existence condition of the positive equilibrium, and the stabilities of the worm-free equilibrium and the positive equilibrium are analyzed. Moreover, we verify the rationality of the model by comparing stochastic simulations with numerical simulations. Finally, we examine the effect of the number of hops on the spread of passive worms and discuss various immunization strategies. We find that if $R_0 > 1$, the propagation speed of passive worms increases with the hop count; if $R_0 < 1$, the number of infected peers decreases rapidly as the hop count increases and eventually drops to zero. The results show that both the network topology and the number of hops affect the spread of passive worms.
1. Introduction
Peer-to-Peer (P2P) networks are composed of connected computers that can function as both clients and servers. Each computer in a P2P network can request information from, and offer information to, other computers. P2P networks provide a very convenient way for people to share information and have gained widespread popularity. Their rapid development has attracted the attention of the creators of viruses, worms, and other security threats, and a number of viruses and worms readily spread through P2P networks. A worm in a P2P network is a program that replicates itself and propagates from one computer to another to infect healthy computers. According to their spreading approach [1, 2], P2P worms can be divided into three categories: passive worms, reactive worms, and proactive worms. Passive worms hide themselves in files and spread as the files are downloaded and executed on healthy computers. Reactive worms propagate by exploiting security vulnerabilities, and this propagation can occur through legitimate network behaviors. Proactive worms automatically connect to and infect neighboring computers by exploiting topological information acquired from infected ones. In this paper, we focus only on passive worms.
The propagation behavior of passive worms is similar to that of biological viruses, and many studies in recent years have modeled it. In 2006, Thommes and Coates [3] proposed epidemiological models for P2P networks that characterized P2P virus propagation and pollution dissemination, respectively. In 2007, Zhou et al. [4] proposed a mathematical model for the propagation of passive worms based on the two-factor model [5] and analyzed the dynamics of passive worms. In 2008, Feng et al. [6] proposed three models of passive worm propagation: the SI model, the SIS model, and the SIR model. The key difference among the three models is the state transitions of peers. In 2009, Wang et al. [7] proposed a passive worm propagation model in unstructured P2P networks and also studied a defense method based on healthy file dissemination. In 2010, Fan et al. [8] presented a logic matrix approach for modeling the spreading of P2P worms; the proposed model was essentially a discrete-time deterministic spreading model. In 2012, Rasheed [9] defined an SEI model based on the study of epidemiology and used a P2P simulator to implement this model in P2P networks. In 2014, Chen et al. [10] proposed a four-factor propagation model for passive worms that considered address hiding, configuration diversity, online/offline behaviors, and download duration. In the same year, Yang et al. [11] proposed two propagation models of passive worms: the static model and the dynamic model. In 2015, Feng et al. [12] proposed an analytical model for the propagation of passive worms by adopting epidemiological approaches, taking into account the dynamic characteristics of the P2P network. In 2018, Rguibi et al. [13] presented a propagation model of passive worms in P2P networks that considered the hesitation to open a newly downloaded file.
However, the above models ignore the influence of network heterogeneity [14, 15] on the propagation of passive worms. In a P2P network, each node represents a peer (i.e., a computer), and each edge between two nodes stands for a connection. Different peers have different numbers of connections, so the network exhibits obvious heterogeneity. Additionally, the node degree distribution of unstructured P2P networks follows a power-law distribution, so it is of practical significance to study the propagation of passive worms in heterogeneous networks. On the other hand, a healthy peer can be infected by infected peers within a distance of TTL (Time to Live) hops from it [7, 16]. Here TTL is the threshold representing the number of hops a search can reach. For convenience, we refer to this threshold simply as the number of hops (or hop count) in the following sections. In this paper, we take the above two aspects into account and propose a novel dynamical model to study the dynamics of passive worm propagation in P2P networks. For the proposed model, we first derive the basic reproduction number $R_0$. Then, by using epidemiological approaches, we prove that if $R_0 < 1$, the worm-free equilibrium is globally asymptotically stable; when $R_0 > 1$, there exists a positive equilibrium, which is globally asymptotically stable. We then verify the rationality of the proposed model by comparing stochastic simulations with numerical simulations and investigate the effects of network topology and hop count on passive worm propagation. Finally, some immunization strategies are discussed.
The rest of this paper is organized as follows. In Section 2, we propose a novel network-based SIS passive worm propagation model. In Section 3, the basic reproduction number is derived from the existence condition of the positive equilibrium. In Section 4, we investigate the global stabilities of the worm-free equilibrium and the positive equilibrium. In Section 5, numerical simulations are given to illustrate the theoretical results, and stochastic simulations are carried out to verify the rationality of the model. In Section 6, we discuss some immunization strategies and draw conclusions.
2. The Model Formulation
2.1. Modeling Background
To further study the propagation of passive worms, we first need to know the search mechanisms employed by P2P networks. In unstructured P2P networks, such as Gnutella, the most common mechanism is flooding. Under this mechanism, a peer searching for a file sends a query to each of its neighbors. Each neighbor then checks whether it holds a file matching the query and, if so, responds to the request. It then checks the number of hops the query can still travel: each time the query is forwarded, the hop count is reduced by 1. While the hop count is greater than zero, the neighbor forwards the query to its own neighbors; otherwise, the query stops. In particular, infected peers always respond to every query they receive. Figure 1 illustrates this process for a given hop count. From Figure 1, it can be seen that there are infected peers among the responding ones. Thus worms can spread easily from infected peers to uninfected ones. In order to model the spreading of passive worms, we need to quantify the search neighborhood.
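The flooding search described above can be sketched as a breadth-first traversal limited by the hop count. The following is a minimal illustration only, not the Gnutella protocol itself: the function name `flood_query`, the adjacency-dictionary representation, and the sets `has_file` and `infected` are all assumptions introduced for this example.

```python
from collections import deque


def flood_query(adjacency, source, ttl, has_file, infected):
    """Return the set of peers that respond to a query flooded from `source`.

    adjacency: dict mapping each peer to a list of its neighbors (hypothetical layout)
    ttl:       number of hops the query may travel
    has_file:  set of peers holding a file that matches the query
    infected:  set of infected peers, which respond to every query they receive
    """
    responders = set()
    visited = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # hop count exhausted: the query is not forwarded further
        for neighbor in adjacency[peer]:
            if neighbor in visited:
                continue  # each peer handles a given query only once
            visited.add(neighbor)
            if neighbor in has_file or neighbor in infected:
                responders.add(neighbor)
            # forwarding the query consumes one hop
            frontier.append((neighbor, hops_left - 1))
    return responders


# A path 0 - 1 - 2 - 3 where peer 2 is infected and peer 3 holds the file.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(sorted(flood_query(path, 0, 2, has_file={3}, infected={2})))  # [2]
```

With a hop count of 2 only the infected peer responds in this toy network, which illustrates why the size of the search neighborhood matters for worm propagation.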
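Now, we adopt the generating function method to calculate the number of peers available. The generating function for the degree distribution is denoted by $G_0(x)$ and defined as
$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k, \tag{1}$$
where $p_k$ is the probability that a randomly chosen node in the network has degree $k$. When studying the propagation problem on the network, we are also concerned with the excess degree distribution. Since the probability that we reach a node of degree $k$ by following an edge is $k p_k / \langle k \rangle$, the probability of having excess degree $k$ is $q_k = (k+1) p_{k+1} / \langle k \rangle$ [17]. Therefore, the generating function for the excess degree distribution is defined by
$$G_1(x) = \sum_{k=0}^{\infty} q_k x^k = \frac{G_0'(x)}{G_0'(1)}. \tag{2}$$
In this paper, we are particularly concerned with the number of $d$-hop neighbors, where $d$ denotes the number of hops. Firstly, the probability that a node has $k$ second neighbors (i.e., 2-hop neighbors) is given by
$$p_k^{(2)} = \sum_{m=0}^{\infty} p_m P^{(2)}(k \mid m), \tag{3}$$
where $P^{(2)}(k \mid m)$ is the probability of having $k$ second neighbors given that there are $m$ first neighbors (i.e., 1-hop neighbors) and $p_m$ is the degree distribution of the network. From Figure 1, we can see that the number of 2-hop neighbors of a node is equal to the sum of the excess degrees of each of the 1-hop neighbors. Suppose the excess degrees of the $m$ first neighbors are $k_1, k_2, \ldots, k_m$, respectively. They are independent random numbers drawn from the distribution $q_k$. Then the probability distribution of the sum of the $m$ integers has generating function $[G_1(x)]^m$. That is,
$$\sum_{k=0}^{\infty} P^{(2)}(k \mid m) x^k = [G_1(x)]^m. \tag{4}$$
Therefore, substituting the formulas (3) and (4) into the generating function for $p_k^{(2)}$, we get
$$G^{(2)}(x) = \sum_{k=0}^{\infty} p_k^{(2)} x^k = \sum_{m=0}^{\infty} p_m [G_1(x)]^m = G_0(G_1(x)). \tag{5}$$
Similarly, if there are $m$ neighbors at distance $d-1$, then the generating function for the number of $d$-hop neighbors can be represented by
$$G^{(d)}(x) = G^{(d-1)}(G_1(x)). \tag{6}$$
Differentiating (5) and letting $x = 1$, we get the average number of 2-hop neighbors of a node, that is, $c_2 = G_0'(1) G_1'(1)$, where $G_0'(1) = \langle k \rangle$ and $G_1'(1) = (\langle k^2 \rangle - \langle k \rangle) / \langle k \rangle$. Hence the average number of 2-hop neighbors is rewritten as $c_2 = \langle k^2 \rangle - \langle k \rangle$. Similarly, the average number of $d$-hop neighbors is given by
$$c_d = \left( \frac{c_2}{c_1} \right)^{d-1} c_1, \tag{7}$$
where $c_1 = \langle k \rangle$.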
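As a quick numerical check of the neighbor-count formula, the sketch below computes the average number of neighbors at each hop from the first two moments of a degree distribution, using the standard generating-function result that the mean at hop $d$ equals $\langle k \rangle \left( (\langle k^2 \rangle - \langle k \rangle) / \langle k \rangle \right)^{d-1}$. The function name `mean_hop_neighbors` and the dictionary representation of the degree distribution are assumptions made for this example.

```python
def mean_hop_neighbors(degree_dist, max_hops):
    """Average number of neighbors at hops 1..max_hops.

    degree_dist: dict mapping degree k to probability p_k (assumed to sum to 1)
    """
    mean_k = sum(k * p for k, p in degree_dist.items())       # <k>
    mean_k2 = sum(k * k * p for k, p in degree_dist.items())  # <k^2>
    c1 = mean_k             # average number of 1-hop neighbors
    c2 = mean_k2 - mean_k   # average number of 2-hop neighbors
    ratio = c2 / c1         # branching factor from one hop to the next
    return [c1 * ratio ** (d - 1) for d in range(1, max_hops + 1)]


# Every node has degree 3, so the neighborhood grows like a 3-regular tree.
print(mean_hop_neighbors({3: 1.0}, 3))  # [3.0, 6.0, 12.0]
```

The regular case reproduces the familiar tree counts (3, then 3·2, then 3·2·2), which is a useful sanity check before applying the formula to a heterogeneous distribution. Note that the tree-like approximation ignores loops, so on a finite network the values are upper estimates for larger hop counts.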
2.2. The Model
Based on the modeling approach for epidemic diseases on networks, we propose a novel network-based SIS model to study the dynamics of passive worm propagation. We divide all peers in the network into groups of sizes $N_k$, and each peer in group $k$ has exactly degree $k$, which is the number of edges emanating from the peer. If the total number of peers is $N$, i.e., $N = \sum_k N_k$, the probability that a randomly chosen peer has degree $k$ is $p_k = N_k / N$, which is referred to as the degree distribution of the network. Additionally, according to the SIS transmission process, each peer can be in one of only two states: susceptible or infected. Let $S_k(t)$ and $I_k(t)$ denote the number of susceptible and infected peers of degree $k$ at time $t$, respectively. To reflect the spreading behavior of passive worms and simplify the analysis, we make the following assumptions: (1) the total number of peers is constant; (2) the total number of files is the same on every computer; (3) the passive worm always generates an equal number of infected files on each peer. Under these assumptions, the network-based SIS passive worm propagation model is described by a system of differential equations, referred to as system (8), together with an auxiliary definition. The interpretations of the variables and parameters are shown in Table 1.

Remark 1. For uncorrelated networks, the conditional probability $P(k' \mid k)$ that a node with degree $k$ is connected to a node with degree $k'$ is independent of $k$, and it is proportional to $k' p_{k'}$ [18], i.e., $P(k' \mid k) = k' p_{k'} / \langle k \rangle$.
3. Basic Reproduction Number
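Let $\rho_k(t) = I_k(t) / N_k$ be the relative density of infected nodes with degree $k$ at time $t$, and let $\Theta(t)$ be the probability that any given connection points to an infected node. Then (8) becomes system (10), in which $\Theta(t)$ is given by (11). The equality (11) can in turn be reduced to (12).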
The equilibrium of the system (10) satisfies Substituting (13) into , we get Obviously, is a zero solution of (14), and . Let By computing, we get Therefore, is a concave function for and , So the necessary and sufficient condition for the existence of a unique positive solution of for is obtained as follows: Then, we derive the basic reproduction number
Remark 2. If , then . In this scenario, is exactly the basic reproduction number of the Pastor-Satorras and Vespignani model in [19].
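For orientation, the threshold quantity of the classical degree-based SIS model of Pastor-Satorras and Vespignani, with the recovery rate normalized to one, is $\lambda \langle k^2 \rangle / \langle k \rangle$. The sketch below evaluates it for a truncated power-law degree distribution. It does not compute the basic reproduction number of system (10); the degree range, exponent, and infection rate are hypothetical values chosen only for illustration.

```python
def power_law_distribution(k_min, k_max, exponent):
    """Truncated power-law degree distribution p_k proportional to k^(-exponent)."""
    weights = {k: k ** (-exponent) for k in range(k_min, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def classical_sis_r0(degree_dist, infection_rate):
    """Threshold quantity lambda * <k^2> / <k> of the classical degree-based
    SIS model (recovery rate normalized to one)."""
    mean_k = sum(k * p for k, p in degree_dist.items())
    mean_k2 = sum(k * k * p for k, p in degree_dist.items())
    return infection_rate * mean_k2 / mean_k


# Hypothetical parameters: degrees 1..100, exponent 3, infection rate 0.2.
dist = power_law_distribution(1, 100, 3.0)
print(classical_sis_r0(dist, 0.2))
```

Because $\langle k^2 \rangle / \langle k \rangle \geq \langle k \rangle$, degree heterogeneity can only raise this quantity relative to a homogeneous network with the same mean degree.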
4. Global Stability of Equilibria
4.1. Global Stability of Worm-Free Equilibrium
It is easy to check that the set is an invariant set of the system (10). Multiplying the equation of the system (10) by and then summing over , we obtain Let us consider the Lyapunov function If , then for all . And if and only if . Therefore, the worm-free equilibrium is globally asymptotically stable.
4.2. Global Stability of Positive Equilibrium
Lemma 3. Assume that satisfies . Then the solution of the system (10) satisfies and for all .
Proof. Multiplying the equation of the system (10) by and then summing over , satisfies Since , according to the existence and uniqueness of solutions of differential equations, we know that , for all .
The equation of the system (10) can be rewritten as Since , we get From the inequality (25), we have for all .
Additionally, the function satisfies Through the similar proof, we have . Therefore, for all .
Lemma 4. Assume that satisfies . If , then the solution of the system (10) satisfies and for any .
Proof. By Lemma 3, we know that for . According to (23), we get From (28), we have We will prove that there exists . If , then Arguing by contradiction, we assume that the following holds for all : By integrating, for all , we have This contradicts . Therefore, there exists , and the inequality (30) holds when . Next, we further prove that the inequality (30) holds when . Supposing that this is not true, then there exists and it follows that Therefore, the inequality (30) holds for . If , then By (29) and (34), we get According to the definition of derivatives, we know that and the following holds: This contradicts the definition of . Thus, the inequality (30) holds when , i.e., the following inequality holds for : According to Lemma 3 and continuity of , we know that Let , and the following holds for : For the inequality (39), we get By the inequality (40), we easily know that for any .
Lemma 5. If the solution of the system (10) satisfies and , then where and .
Proof. The way to prove it comes from [20]. Since , we know that for all , there exists and for .
By Lemma 3, we have . For , we obtain From the inequality (42), we have where . If , then and . Therefore, we get Let , and the first inequality in (41) holds.
Additionally, since , for all , there exists such that as , .
If , then By the inequality (45), we obtain where . If , then and . Therefore, the following inequality holds: Let , and we get the second inequality in (41).
Theorem 6. Assume that satisfies . If , then the solution of the system (10) satisfies , where is the coordinate component of positive equilibrium of the system (10).
Proof. It is first proven that the following relation holds: We let and define the sequence By Lemma 3, we know that for . According to Lemma 5, we have It can be easily proved by induction that the sequence is decreasing. Therefore, its limit exists and is expressed as . Setting in (49) and (50), we have Additionally, the following function is considered: By computing, we get If and is sufficiently small, then according to the definition of derivatives. On the basis of Lemma 5, let satisfy the following relation. And the sequence is defined as By (54) and Lemma 5, we obtain On the basis of (52), (54) and (55), the following inequality holds: According to (55) and (57), we get If for , then Therefore, the sequence is increasing for and its limit exists. Let . Setting in (55) and (56), we have By (51) and (60), we know that and are the coordinate components of the positive equilibrium of the system (10). According to the uniqueness of the positive equilibrium of the system (10), we deduce that and Therefore, . Through the above proofs, we know that the positive equilibrium is globally asymptotically stable.
5. Simulation Results
In this section, we run a series of simulations to validate the proposed model. Our simulations are based on a network with a power-law degree distribution , where and . And let .
In Figure 2, we examine the validity of the model through numerical simulations and stochastic simulations in the case of different . In Figure 2(a), the parameters are chosen as and . We give the results of the numerical simulation and the stochastic simulation of the model for . In Figure 2(b), when and , the numerical simulation of the model for is carried out to show how the proportion of infected peers changes with time. At the same time, the stochastic simulation for is carried out to verify the rationality of the model for . Similarly, when and , Figure 2(c) shows the results of the numerical simulation and the stochastic simulation of the model for . In summary, Figure 2 shows that the stochastic simulations and the numerical simulations agree closely, which indicates that the model established in this paper is reasonable.
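Figure 2: Comparison of the numerical simulations and the stochastic simulations of the model; panels (a), (b), and (c).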
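To give a concrete sense of what such a numerical simulation involves, the sketch below integrates the classical degree-based SIS equations $d\rho_k/dt = -\rho_k + \lambda k (1 - \rho_k) \Theta$, with $\Theta = \sum_k k p_k \rho_k / \langle k \rangle$, using a forward Euler scheme. This is the classical Pastor-Satorras and Vespignani model, used here as a stand-in: it omits the hop count and file-related terms of system (10). The function name, step size, and initial density are assumptions made for this example.

```python
def simulate_classical_sis(degree_dist, infection_rate,
                           rho0=0.01, dt=0.01, steps=5000):
    """Forward-Euler integration of the classical degree-based SIS model.

    Returns the final infection prevalence sum_k p_k * rho_k.
    The step size dt must be small relative to 1 / (infection_rate * k_max).
    """
    mean_k = sum(k * p for k, p in degree_dist.items())
    rho = {k: rho0 for k in degree_dist}  # same initial density for all degrees
    for _ in range(steps):
        # probability that a randomly followed edge points to an infected node
        theta = sum(k * degree_dist[k] * rho[k] for k in rho) / mean_k
        rho = {
            k: r + dt * (-r + infection_rate * k * (1.0 - r) * theta)
            for k, r in rho.items()
        }
    return sum(degree_dist[k] * rho[k] for k in rho)


# Homogeneous network of degree 4: the threshold quantity is 4 * infection_rate.
print(simulate_classical_sis({4: 1.0}, 0.5))  # above threshold: settles at a positive level
print(simulate_classical_sis({4: 1.0}, 0.1))  # below threshold: decays toward zero
```

In the homogeneous example with infection rate 0.5 the equation reduces to $d\rho/dt = -\rho + 2\rho(1 - \rho)$, whose positive equilibrium is $\rho = 0.5$, so the two regimes of the threshold behavior can be checked by hand.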
In Figure 3, when the basic reproduction number $R_0 > 1$, we examine the impact of the hop count on the evolution of infection prevalence, which is defined as the proportion of infected peers with respect to the total number of peers. The parameters are chosen as and . Clearly, the number of hops has a significant effect on the spreading of passive worms. From Figure 3, it can be seen that the propagation speed of passive worms increases with the hop count. This is easy to understand: the larger the hop count, the larger the search neighborhood. Thus, the probability that the neighborhood contains infected peers becomes larger, which accelerates the propagation of passive worms, and the infection becomes more prevalent. From Figure 3, we can also see that the proportion of infected peers converges to a positive value, which shows that passive worms persist in the network and eventually reach a stable state. This is consistent with the theoretical analysis.
In Figure 4, we show the tendency of passive worm propagation when the basic reproduction number $R_0 < 1$. From Figure 4, it can be seen that the spread of passive worms declines and eventually reaches the worm-free equilibrium state, which is consistent with the analysis in previous sections. Additionally, we find that the larger the hop count, the faster the decline.
In Figure 5, when the basic reproduction number , we examine the change of the relative densities of infected peers with different degrees in the case of and , respectively. Here, the relative density of infected peers with degree is defined as the proportion of infected peers with degree with respect to the total number of peers with degree , i.e., the solutions of the system (10). In particular, Figure 5 gives the relative densities of the infected nodes with degree and when and , respectively. From Figure 5, we find that the nodes with large degrees are easily infected, and the relative densities of the infected nodes with small degrees increase markedly with the increase of . This is easy to understand: with the increase of , the number of the nodes with small degrees increases more significantly than that of the nodes with large degrees. Therefore, the probability that the nodes with small degrees are infected becomes larger.
Figure 5: Relative densities of infected peers with different degrees; panels (a), (b), and (c).
6. Discussions and Conclusions
In this paper, after examining the mechanism of passive worm propagation, we propose a novel network-based SIS model to study the dynamics of its propagation in P2P networks and predict the trend of the propagation. For the presented model, the basic reproduction number $R_0$ is derived from the existence condition of the positive equilibrium. We then prove that if $R_0 < 1$, the worm-free equilibrium is globally asymptotically stable, and that if $R_0 > 1$, there exists a positive equilibrium, which is globally asymptotically stable. To illustrate the theoretical results, we carry out numerical simulations and find that the simulation results are consistent with the theoretical analysis. We also verify the rationality of the model by comparing stochastic simulations with numerical simulations and find that the two agree closely. Moreover, we examine the effect of the hop count on the spread of passive worms. From the simulation results, we can see that if $R_0 > 1$, the propagation speed of passive worms increases with the hop count, which leads to an increase in the number of infected peers; if $R_0 < 1$, the number of infected peers decreases rapidly as the hop count increases and eventually drops to zero. To control the spread of passive worms, we need to restrict the number of hops a search can reach in addition to changing the network topology.
The appropriate immunization strategy is also very important for controlling the spread of the virus. Usually, there are two kinds of immunization strategies: uniform immunization and targeted immunization [21, 22]. Now, let us discuss their effects on the system (10).
Uniform immunization is a very simple immunization strategy: a fraction of the nodes in the network is immunized at random. Let the immunization rate be a fraction between 0 and 1. After the corresponding substitution, the system (10) becomes system (62), whose basic reproduction number is given by formula (63). In formula (63), if the immunization rate is zero, that is, no immunization is performed, the basic reproduction number coincides with that of the system (10). For a positive immunization rate, it is smaller than that of the system (10), which means that the immunization strategy helps control the spread of the virus.
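The effect of uniform immunization on the threshold can be illustrated with a small calculation. The sketch below assumes, as in the classical analysis of uniform immunization, that immunizing a fraction `delta` of nodes scales the infection rate by `1 - delta` and that the basic reproduction number is proportional to the infection rate. Neither assumption is confirmed by the formulas available here, and both function names are hypothetical.

```python
def immunized_r0(r0, delta):
    """Basic reproduction number after uniform immunization of a fraction delta,
    assuming it scales linearly with the infection rate (an assumption)."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return (1.0 - delta) * r0


def critical_immunization_rate(r0):
    """Smallest fraction delta that brings the immunized value down to 1
    under the same linear-scaling assumption; 0 if r0 is already at most 1."""
    return max(0.0, 1.0 - 1.0 / r0)


print(immunized_r0(2.0, 0.0))           # 2.0: no immunization leaves the value unchanged
print(critical_immunization_rate(2.0))  # 0.5: half of the nodes must be immunized
```

Under these assumptions, any immunization rate above `critical_immunization_rate(r0)` pushes the system below the threshold, where the worm-free equilibrium is stable.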
Targeted immunization is another effective strategy to control the spread of the virus; it immunizes the nodes with large degrees. To this end, lower and upper thresholds and are introduced. Therefore, the immunization rate is a piecewise function of and is defined as where and is the average immunization rate. Then the system (10) becomes For system (65), we get its basic reproduction number Note that