Abstract
Because users (nodes) differ in status and are subject to external forces during information propagation in a social network, infected nodes hold different levels of propagation capacity. For this reason, the infected nodes are classified into two categories, the high influential infected nodes and the ordinary influential infected nodes, which account for 20% and 80% of the infected nodes, respectively, following Pareto’s principle. Borrowing from the SEIR epidemic model, this paper proposes an SE2IR information propagation model. The global asymptotic stability of the spread-free equilibrium point and of the local spread equilibrium point is proved for this model. This paper also puts forward a series of information control strategies based on the perceived values of users, social reinforcement intensity, and information timeliness in the social network. Simulation experiments with and without control strategies on a real company email network dataset verify the stability and correctness of the model and the feasibility and effectiveness of the control strategies, indicating that the model is closer to the real process of information propagation in the social network.
1. Introduction
With the rapid development of the Internet, more and more people are active online: sharing their hobbies, lives, and feelings; forwarding news; spreading or reviewing ideas; advertising their products; and so forth. These propagation behaviors can give rise to new topics and even to public opinion in online social networks. Once an online public opinion event breaks out, public security may be threatened. Therefore, studies of the evolution mechanisms and control strategies of information propagation in social networks have important theoretical and practical significance.
In social networks, the emergence, propagation, and extinction of information, say a message or an event, resembles the birth, spread, and death of an epidemic. In recent years, based on the SIS, SIR, SIRS, and SEIR epidemic models [1–3], many scholars have studied information propagation models [4–7] in social networks. Soriano-Paños et al. [8] pointed out that vector-borne epidemics are the result of a combination of multiple factors. They derived an expression for the epidemic threshold to capture the conditions that lead to the onset of epidemics. Nodes may not have lifelong immunity to some information, and they may change their interest with the environment and with the stage of development of the information in the social network. Since the traditional SIS and SIR models do not fully explain the information propagation process, some researchers have carried out new studies that take the structure and propagation characteristics of the networks into account. Yuan et al. [9, 10] regarded interpersonal contact as a scale-free social network and proposed SIR models with two kinds of vulnerable groups and with differential infectivity on the network, respectively. Zhang et al. [11] put forward a cross-network information propagation model, in which the nodes of the social network are divided into source nodes, external nodes, cross-network nodes, and immune nodes. With this model, they studied how information spreads between different networks and how cross-network nodes influence information propagation in different networks. Wang et al. [12] proposed the 2SI2R rumor propagation model, which describes two types of rumors spreading among the crowd at the same time, and analyzed its dynamic properties with the same approach used for the SIR epidemic model. Huo et al. [13] considered two kinds of spreaders: spreaders with a high rate of active state and spreaders with a low rate of active state. Based on this consideration, they proposed an I2SR rumor spreading model and analyzed the local asymptotic stability of its equilibrium and the global stability of its internal equilibrium in a homogeneous network. Yang et al. [14] presented a controlled heterogeneous node-based SIRS model and established an optimal dynamic immunization control problem. From the numerical solution of the optimal system, they obtained the optimal immunization strategy and found that the network topology has a complex impact on this strategy. Zhang and Huang [15] proposed an optimally controlled SLBOS epidemic model that integrates the impact of system reinstallation and network topology on the propagation of computer viruses. Taking into account the latent stage, vaccination, and graded cure rates, Gan et al. [16–18] proposed different computer virus epidemic models and effective strategies to restrain virus epidemics.
In the SEIR model of a biological virus, exposed individuals are those who have had contact with an infected individual but are not yet infectious. Similarly, in social networks there are latent nodes that know some information but do not transmit it and remain in a holding pattern. Combining the SEIR model with individual cognitive psychology, Wang and Li [19] obtained an information spreading model that confirms the influence of individual cognition on information transmission in social networks. Xiao et al. [20] designed an information spreading model based on users’ multidimensional features and a dynamic evolutionary game mechanism. They further analyzed the laws of public opinion evolution by tracking how the sizes of the different groups change in social networks. Considering that exposed nodes may become immune nodes with a certain probability, Liu et al. [21] presented an SEIR rumor spreading model in a heterogeneous network. Based on their model, they discussed the global dynamic behavior of the no-rumor equilibrium set and two immunization strategies against rumor diffusion.
In addition to the studies of information propagation models based on epidemic models, a series of rumor (false information) control strategies have been proposed to control information propagation and guide public opinion in a positive direction. Arquam et al. [22] proposed a seasonal susceptible-infection-recovery epidemiological model that combines the effects of temperature changes and of the heterogeneous structure of the human interaction network on the spread of vector-borne diseases. Their model can be used to predict disease outbreaks and then to adopt appropriate control measures to contain the epidemics. Pan et al. [23] used the truth, under a set of reasonable assumptions, to establish a rumor-truth mixed spreading model that minimizes the influence of rumors. Huo et al. [24] considered a rumor propagation model with a latent period and a varying population. They showed that the most effective forms of rumor control are science education and official media coverage. Zhao et al. [25] established an isolation-conversion strategy to minimize the impact of rumors by isolating rumor spreaders. He et al. [26] considered both blocking rumors at influential users and spreading truth to clarify rumors. They proposed a heterogeneous network-based epidemic model to describe rumor spreading in mobile social networks, together with a real-time optimization strategy and a strategy of pulsed truth spreading with continuous rumor blocking to restrain rumor spreading.
Another line of modeling focuses on the interaction between users and on the effect of network topology on the information propagation process, including the Ising model and the Deffuant model [27, 28]. Considering that a user’s view may be affected and changed by others and that the user’s attitude may change as events unfold because of human intervention, Wang [29] found that human factors affect information propagation. Yu [30] built the entire network of the venture capital industry from data on China’s venture capital industry. By calculating the group tie density based on network interaction, state heterogeneity, and network centrality-based faults, the impact of group network characteristics on the complementarity effect was analyzed. Pastor-Satorras et al. [31–34] analyzed in depth the effect of network topology on the information dissemination process. They found that, as the heterogeneity of the network structure became more pronounced, the speed of information propagation gradually accelerated while its scale gradually decreased. Lü et al. [35] assumed that information propagation follows three rules: the social reinforcement effect, the memory effect, and nonredundant contact. They further put forward a probability function with a saturation effect and verified that the small-world effect is the most conducive to information propagation. Liu et al. [36] showed that the edge weight distribution and the clustering coefficient influence rumor spreading on a weighted SMS network.
However, to some extent, these studies ignored the inequality of users’ status and the influence of external forces on users in the process of information propagation on social networks. Nodes such as celebrities, big-V accounts, and authoritative experts in some field have a network status significantly higher than that of other nodes. These nodes, called “opinion leaders,” often guide the opinions of other nodes in the information propagation process. Bunimovich et al. [37] discovered hidden structures, hierarchical structures, and network cores in different networks by isospectral reduction theory. Bauer and Lizier [38, 39] used network structure characteristics to depict users’ status in the networks. They found that the more neighbors a user has, the greater the user’s network influence. Meanwhile, they discovered that the asymmetry of network status among users can be perceived by most users. Ren and Zhang [40] found that node degrees make a critical difference for strategic decision-making. In the late nineteenth century, the Italian economist Pareto observed what is now known as Pareto’s principle, i.e., the 80/20 principle, which was later popularized by Koch [41]: “in any group of things, the most important only accounts for a small part, about 20%, while the remaining 80%, though the majority, is secondary.” The principle is widely used in sociology and business management. In 2020, our conference paper [42] briefly reported an SE2IR model based on Pareto’s principle, its disease-free and local disease equilibrium points, and their global asymptotic stability. Simulation experiments on a company’s mailbox network verified the stability and correctness of the model in the process of information propagation. However, that paper gave no detailed theoretical proofs of the global asymptotic stability and no control strategies for the model.
As far as we know, few papers have applied Pareto’s principle, users’ status, and the influence of external forces on users to information propagation and control in social networks. This paper puts forward an SE2IR information propagation model (SE2IR model) to depict the evolution laws of information propagation. In the proposed model, the infected nodes are classified into the high influential infected nodes and the ordinary influential infected nodes, which account for 20% and 80% of the infected nodes, respectively, based on Pareto’s principle. On the basis of the proposed model, this paper analyzes the global asymptotic stability of the spread-free equilibrium point and of the local spread equilibrium point. This paper also advances a new type of information control strategy that includes the perceived values of users, social reinforcement intensity, and information timeliness in the social network. Finally, simulation experiments on real data from a company email network, with and without control strategies, confirm the stability and correctness of the model and the effectiveness of the model with the control strategies.
The rest of this paper is organized as follows. In Section 2, we propose the novel SE2IR information propagation model and analyze the stability of its equilibrium points. In Section 3, we present the influence factors and the parameter analysis based on the proposed model. In Section 4, we verify in detail the feasibility and effectiveness of the proposed model and of the control strategies by experiments on a company email network dataset. Finally, we conclude our work in Section 5.
2. The Analysis of the SE2IR Information Propagation Model
2.1. The Description of the SE2IR Information Propagation Model
A social network can be described as a graph , where stands for the set of users and represents the set of relations of sending and receiving information among users in the network. Based on their different interaction behaviors, the nodes of the social network are divided into susceptible nodes, exposed (latent) nodes, infected nodes, and removed nodes. In detail, the susceptible nodes are the nodes that do not know the information in the social network; the exposed nodes are the nodes that know the information but do not spread it and are still in a waiting state; the infected nodes are the nodes that know the information and spread it. Considering the nodes’ status and the influence of external forces in the process of information propagation in the social network, the infected nodes are further categorized as high influential infected nodes and ordinary influential infected nodes. Specifically, the high influential infected nodes are nodes of high influence or relative social status, which play an important role in the information transmission process; the ordinary influential infected nodes are nodes of ordinary influence or relative social status, which make up a large share of the network nodes in the information dissemination process. Based on Pareto’s principle, the high and ordinary influential infected nodes account for 20% and 80%, respectively. The removed nodes are the nodes that lose interest in the information and no longer pay attention to it. Let , , , , and denote the percentages of susceptible nodes, exposed nodes, high influential infected nodes, ordinary influential infected nodes, and removed nodes at time t in the social network, respectively. Then, . Unless otherwise noted, let , , , , and represent , , , , and , respectively.
By carefully considering the features of information propagation, the following assumptions are made: when the event erupts, new online nodes enter into the social network about the event, at a positive constant rate . The susceptible nodes convert to the high influential infected nodes, ordinary influential infected nodes, and exposed nodes by probabilities , and , respectively. The exposed nodes change into high influential infected nodes and ordinary influential infected nodes by probabilities and , respectively. The high influential infected nodes, ordinary influential infected nodes, exposed nodes, and susceptible nodes with probabilities , , , and transform into removed nodes, respectively.
Based on the above hypothesis, the following information propagation model, as shown in Figure 1, can be given:
Since the first four equations of system (1) do not include , we can get the following system (2):
The feasible region for system (2) is .
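As a rough illustration of the compartment flows described in Section 2.1, the following Python sketch integrates a five-compartment system with a forward Euler step. The specific right-hand side (a mass-action contact term with both infected classes), the parameter names, and all numerical values are assumptions made for this example and are not taken from system (1); only the set of transitions follows the text.

```python
def se2ir_derivatives(state, p):
    """Right-hand side of an assumed SE2IR-style system (hypothetical form)."""
    s, e, i1, i2, r = state
    contact = s * (i1 + i2)  # assumed mass-action contact with infected nodes
    out_s = (p["s_to_e"] + p["s_to_i1"] + p["s_to_i2"]) * contact
    out_e = (p["e_to_i1"] + p["e_to_i2"]) * e
    ds = p["inflow"] - out_s - p["s_to_r"] * s
    de = p["s_to_e"] * contact - out_e - p["e_to_r"] * e
    di1 = p["s_to_i1"] * contact + p["e_to_i1"] * e - p["i1_to_r"] * i1
    di2 = p["s_to_i2"] * contact + p["e_to_i2"] * e - p["i2_to_r"] * i2
    dr = (p["s_to_r"] * s + p["e_to_r"] * e
          + p["i1_to_r"] * i1 + p["i2_to_r"] * i2)
    return (ds, de, di1, di2, dr)


def simulate(state, p, dt=0.01, steps=5000):
    """Forward Euler integration; returns the list of visited states."""
    history = [state]
    for _ in range(steps):
        deriv = se2ir_derivatives(state, p)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
        history.append(state)
    return history


# Illustrative values only (not the settings used in the paper).
params = {
    "inflow": 0.0,
    "s_to_e": 0.3, "s_to_i1": 0.1, "s_to_i2": 0.4,
    "e_to_i1": 0.05, "e_to_i2": 0.2,
    "s_to_r": 0.01, "e_to_r": 0.05, "i1_to_r": 0.1, "i2_to_r": 0.2,
}
initial = (0.99, 0.0, 0.002, 0.008, 0.0)  # (S, E, I_high, I_ordinary, R)
history = simulate(initial, params)
final = history[-1]
```

With the inflow set to zero, the five proportions in this sketch always sum to one, which gives a quick sanity check on the bookkeeping of the transitions. A production simulation would more likely use an adaptive ODE solver than a fixed Euler step.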
2.2. Threshold of Information Propagation
The information propagation threshold, denoted as , expresses the number of users who are infected during an average period of information spread. Let . Then, system (2) can be expressed as

where
For and , we take the partial derivatives at :
Then, we further have
Let . Then, the spread-free equilibrium point is ; and the information propagation threshold is
2.3. The Stabilities of Two Equilibrium Points
System (2) always has a spread-free equilibrium point , representing that no information is spreading after a period of time in the social network. Now, let us check the stability of this equilibrium point with respect to the feasible region .
Theorem 1. If , then the spread-free equilibrium point of system (2) is globally asymptotically stable.
Proof. Let the Lyapunov function be . Then, the total differential equation for on is

When , we have . Therefore, we have . According to the invariant set principle, the existence of the function implies that the spread-free equilibrium point is globally asymptotically stable.
System (2) has a local spread equilibrium point :

and . It implies that the information keeps spreading and the number of the infected nodes increases continuously at the rate .
Next, we prove the stability of the local spread equilibrium point.
Theorem 2. If , then the local spread equilibrium point is globally asymptotically stable.
Proof. Let the Lyapunov function be , where is an undetermined constant and . By system (2), we have

So, it implies that . Then, the system can be expressed as

Next, we take the derivative of the above formula at :

According to the Cauchy–Schwarz inequality [43], we have

Let . When , we have

Note that . Now, we consider the following function:

where . Then, its derivative at is the following:

For , we have . Then, we have . Therefore . For , when , that is, , we have . It further follows that . Thus, we have if and only if . According to LaSalle’s invariance principle, when , is globally asymptotically stable.
In the same way, we can also prove that when , the system is unstable according to Lyapunov stability theory.
3. Influence Factors and Parameter Analysis of the SE2IR Information Propagation Model
On the basis of the proposed SE2IR model, this section will propose some factors influencing information propagation. Then, the influence factors are integrated into the parameters of the SE2IR model, preparing for the simulation experiment with control strategies.
3.1. Quantification of Influence Factors
For the social network , let the cardinalities of and be denoted by and . Let . For nodes , the relative influence weight of to will be introduced.
(i) The influence weight of node to node , denoted by , is expressed as where represents the number of neighbors for node and stands for the set of neighbors of the node . Obviously, the greater the degree of is, the greater its influence weight for is.
(ii) The relative influence weight of to , denoted by , is defined as Obviously, when , ; otherwise, .
(iii) Social reinforcement intensity: let be the social reinforcement intensity. If we denote the probability of an information propagation by , then the relationship between the social reinforcement intensity and the probability of information propagation can be defined as where is the time of information propagation and is the time at which the social reinforcement is executed. It follows that the larger the social reinforcement intensity is, the lower the probability of the information propagation is, and the better the inhibition effect of the social reinforcement is. When , the information propagation is not affected by any intervention; when , the information propagation is subjected to the maximum social reinforcement, such as shutting down the platform and banning web users. The earlier the execution time of the social reinforcement is, the smaller the social reinforcement intensity is.
(iv) Perceived values of users: because users differ in their cognition of information in the social network, such as their demand for, interest in, or valuation of the information, users have different perceived values, denoted by . A normal distribution is often used to represent an unknown random variable in the natural and social sciences, such as the position of a propagating particle, measurement error, and residual error in regression. Therefore, the perceived values of users, denoted by , can be assumed to follow the normal distribution: where ; is the mean value of the perceived values of users and represents the standard deviation of the perceived values of users. Obviously, the larger the value of is, the higher the perceived value of a user for the information is.
(v) Subjective preference coefficient of users: let be the subjective preference coefficient of users for a known piece of information. The higher the value of is, the more users rely on their perceived values of the information to choose their propagation behavior; the lower the value of is, the more inclined they are to choose their propagation behavior according to the social reinforcement intensity.
(vi) Information timeliness: information on various themes, such as politics, economy, military, religion, sports, and elections, is propagated among users with different information timeliness. Let the information timeliness of the user in be denoted by . Inspired by the exponential distribution, the probability that the user quits the process of an information propagation at time , denoted by , is configured for .
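To make two of these factors more concrete, the following Python sketch computes a degree-based influence weight and an exponential-type quitting probability. Both functional forms are assumptions chosen only to match the qualitative statements above (a higher-degree neighbor carries more weight, and quitting follows an exponential-distribution shape); the function names and the toy graph are hypothetical and are not the paper’s definitions.

```python
import math


def influence_weight(adjacency, i, j):
    """Assumed form: degree of i divided by the total degree of j's neighbors."""
    neighbors_of_j = adjacency[j]
    if i not in neighbors_of_j:
        return 0.0  # only neighbors of j can influence j in this sketch
    total = sum(len(adjacency[k]) for k in neighbors_of_j)
    return len(adjacency[i]) / total


def quit_probability(t, timeliness):
    """Assumed exponential-CDF form: 1 - exp(-t / timeliness)."""
    return 1.0 - math.exp(-t / timeliness)


# Toy undirected graph as an adjacency dictionary (hypothetical data).
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
weights_on_a = {i: influence_weight(adjacency, i, "a") for i in adjacency["a"]}
```

Under these assumptions, the weights that the neighbors of a node exert on it sum to one, and a larger timeliness value lowers the chance that a user has quit by a given time, which agrees with the later observation that stronger timeliness widens the propagation range.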
3.2. Parameter Analysis of the SE2IR Information Propagation Model
In social networks, the state transition probabilities between different node states play a key role in the trends of information propagation. To reflect these trends faithfully, the state transition probabilities of the proposed SE2IR model are interpreted in detail as follows, based on the proposed influence factors:
: in a social network, the process in which a susceptible node propagates information to its neighbors is mainly driven by the perceived value of the node for the information and by the information timeliness; when the information is good for a node, the node actively spreads the information to its neighbors; meanwhile, the newer the information is, the more attractive it is to the user. For a susceptible node, its perceived value is very large, so it could be set as . Therefore, for a susceptible , the probability that the susceptible node transfers into an infected node can be expressed as
: for a susceptible node in , when is affected by a node , the probability that it transfers into an exposed (latent) node is defined as Obviously, the equation takes advantage of the relative influence weight of one node to another node and of the perceived value of a user.
: for an exposed node in , the probability that it transfers to an infected node in at time can be represented as For , it implies that the social network is not affected by any interventions from the social platform and government. Otherwise, the social network is affected by external interventions.
: for an infected node in or , the probability that it transfers to a removed node in at time can be expressed as
: for an exposed node in , the probability that it transfers to a removed node in at time can be expressed as
: for a susceptible node in , the probability that it transfers to a removed node of at time can be expressed as
Since the above state transition probabilities integrate multiple influence factors, we can regard the parameters of the different influence factors as different control strategies and adopt them to control the evolution trends of the SE2IR model. Combining the above parameters with the SE2IR model, an information propagation control algorithm of the SE2IR model (IPC SE2IR Algorithm) is given as follows:
4. Numerical Simulation
To intuitively demonstrate the authenticity and accuracy of the proposed SE2IR control strategies, according to Algorithm 1, some numerical simulation experiments without/with control strategies are adopted on a real mailbox network dataset, as shown in Table 1. In the following, we employ the user activity (i.e., the proportion of infected nodes that reach a peak level), user percent (i.e., the proportion of some type of users’ state), and information propagation range (i.e., the proportion of removed nodes at the end time) to measure the trends of the information propagation.
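The two summary measures defined above can be read off a simulated trajectory as in the following minimal Python sketch. The function names and the sample series are hypothetical; the infected series is assumed to already combine the high and ordinary influential infected nodes.

```python
def user_activity(infected_series):
    """Peak proportion of infected nodes and the step at which it occurs."""
    peak = max(infected_series)
    return peak, infected_series.index(peak)


def propagation_range(removed_series):
    """Proportion of removed nodes at the end of the simulation."""
    return removed_series[-1]


# Made-up proportions per time step, for illustration only.
infected = [0.01, 0.08, 0.21, 0.17, 0.06, 0.01]
removed = [0.00, 0.02, 0.10, 0.31, 0.52, 0.60]
peak, peak_step = user_activity(infected)
final_range = propagation_range(removed)
```

Reporting the step at which the peak occurs alongside the peak value makes it easier to compare how quickly different settings reach their maximum activity.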

4.1. Numerical Simulation of the SE2IR Model without Control Strategies
System (2) has a globally asymptotically stable spread-free equilibrium point for and a globally asymptotically stable local spread equilibrium point for . Both cases are simulated numerically without control strategies under the settings shown in Table 2, and the results are shown in Figures 2 and 3, respectively.
In Figure 2, drops sharply at the beginning and then keeps an almost stable status after . The reason is that users do not initially know the information in the network. quickly reaches a peak at the beginning and then decreases at the turn point, keeping an almost stable status after . Finally, and tend to zero. It is found that reaches a peak and slowly decreases to zero, and reaches a peak and quickly tends to zero. The former peak value is obviously lower than the latter, and its arrival time is later than the latter. This is because the number of the ordinary influential infected nodes is obviously larger than the high influential infected nodes while the high influential infected nodes are less likely to be infected. While time increases, the information timeliness continually reduces; that is, the information is gradually losing its heat, and then the infected nodes and latent nodes are transformed into removed nodes in the network. continues to increase and finally becomes constant once the propagation steps into the stable status. When , the numbers of the high influential infected nodes and ordinary influential infected nodes who propagate the information will be constants after a period of time. Eventually, the information cannot be spread for .
In Figure 3, drops sharply at the beginning and then stands at an almost stable status after . and rapidly rise to peaks and then fall down to the equilibrium values. slowly rises to the highest value at and then keeps the equilibrium value at all the time. quickly increases to and then keeps the equilibrium value. Finally, slowly decreases at all time. After , system (2) enters the stable status, in which the high influential infected nodes and the ordinary influential infected nodes still exist in the network. In the end, the information propagation range will become wider and wider as time goes by when .
Considering that the high influential nodes, which account for 20% of all nodes, play an important role in the process of information propagation, it is necessary to discuss the impact of the state transition probability from to on the system. As shown in Figure 4, as the value of increases, the value of continues to increase, its peak arrives earlier, and it decays from the peak to the stable state more slowly. The reason is that the high influential nodes have many followers and therefore strongly affect others. Since the number of high influential nodes is a constant in a social network, the value of tends to a constant. If more and more high influential nodes join the propagation process, the range of the information propagation becomes wider and wider.
Next, we further explore how the degree of the initial infected node influences the process of information propagation. We randomly choose a maximum-degree node, an average-degree node, and a minimum-degree node as the initial infected node and examine their respective impacts on the trends of the information propagation, as shown in Figure 5. For every choice of initial infected node, the proportion of infected nodes increases rapidly and then decreases slowly to near zero over the whole propagation process. At every moment, the proportion of infected nodes is largest with the maximum-degree (Max-degree) node as the initial infected node, smaller with the average-degree (Mean-degree) node, and smallest with the minimum-degree (Min-degree) node. The reason is that the maximum-degree node is usually an authority user or opinion leader who has many fans in the social network and so has more opportunities to propagate information to others through these fans, which results in a rapid propagation speed.
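The choice of initial infected nodes by degree can be sketched in Python as follows. The helper name, the adjacency-dictionary representation, and the toy graph are assumptions for illustration; ties are broken deterministically by node name here, whereas the experiment described above picks randomly among candidates.

```python
def seed_nodes_by_degree(adjacency):
    """Pick one maximum-, one mean-, and one minimum-degree node as seeds."""
    degrees = {node: len(neighbors) for node, neighbors in adjacency.items()}
    mean_degree = sum(degrees.values()) / len(degrees)
    nodes = sorted(degrees)  # sorted names give deterministic tie-breaking
    max_node = max(nodes, key=lambda n: degrees[n])
    min_node = min(nodes, key=lambda n: degrees[n])
    # The "mean-degree" seed is the node whose degree is closest to the mean.
    mean_node = min(nodes, key=lambda n: abs(degrees[n] - mean_degree))
    return {"max": max_node, "mean": mean_node, "min": min_node}


# Toy undirected graph (hypothetical data): one hub and four peripheral nodes.
adjacency = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub"},
}
seeds = seed_nodes_by_degree(adjacency)
```

Each returned node could then be used in turn as the single initially infected node of a simulation run, so that the resulting infection curves can be compared as in Figure 5.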
Figures 4 and 5 show that the high influential nodes and the maximum-degree nodes greatly affect the information propagation range in the network.
4.2. Numerical Simulation of the SE2IR Model with Control Strategies
In social networks, negative information may spread on a large scale, affecting social stability and solidarity to an extent. So it is important for the networks to analyze the impact of different influence factors on the SE2IR model. For this purpose, we further propose corresponding control strategies for reducing the scale of information propagation. In the following experiments, the nodes of maximum degrees are regarded as the initial propagation nodes. The effects of perceived values of users, social reinforcement intensity, and information timeliness will be analyzed on the SE2IR model as shown in Figures 6–8, respectively.
In Figure 6, the simulation parameters are , . It shows that the user activity and the information propagation range increase with the perceived values of users. Practically speaking, the higher the value of is, the more infectious nodes there are in the network. There is a phase transition point : when , the growth rate of the information propagation range increases slowly; otherwise, the growth rate of the information propagation range increases rapidly. Therefore, for negative information, we should pay attention to changes in the perceived values of users and control them in time to curb the information spread effectively in the network.
In Figure 7, the simulation parameters are , , , , and . It shows that, under the same social reinforcement intensity, the earlier the intervention is carried out, the fewer the infected nodes are. As the intervention is delayed, the number of infected nodes increases gradually until a phase transition point of the intervention time. It also shows that, under the same intervention time, the stronger the reinforcement intensity is, the better the suppression of the information propagation is. As the reinforcement intensity increases from 0 to 1, the user activity decreases accordingly. Therefore, for negative information, we can adjust the social reinforcement intensity and its intervention time to suppress the negative information propagation.
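The qualitative effect of the social reinforcement discussed for Figure 7 can be illustrated with a small Python sketch. The linear scaling by one minus the intensity after the intervention time is an assumption made for this example, chosen only so that an intensity of 0 leaves the probability unchanged and an intensity of 1 stops propagation; it is not the paper’s formula, and the function and parameter names are hypothetical.

```python
def reinforced_probability(base_probability, intensity, t, intervention_time):
    """Propagation probability at time t under an assumed linear damping."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    if t < intervention_time:
        return base_probability  # the intervention has not started yet
    return base_probability * (1.0 - intensity)


# Illustrative values only.
before = reinforced_probability(0.4, 0.5, t=2, intervention_time=5)
after = reinforced_probability(0.4, 0.5, t=8, intervention_time=5)
blocked = reinforced_probability(0.4, 1.0, t=8, intervention_time=5)
```

In this sketch an earlier intervention time shortens the period during which the undamped probability applies, which is one simple way to reproduce the trend that earlier and stronger interventions leave fewer infected nodes.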
In Figure 8, the simulation parameters are , , , and . It shows that, with the increase of information timeliness, the proportion of infected nodes increases as time goes on. That is, the higher the information timeliness is, the wider the propagation range of the information is. The main reason is that users usually like to see the information that has strong information timeliness and spread it to others. Therefore, for negative information, we can appropriately control the information timeliness to reduce the speed of the negative information propagation or timely release new positive information to repress the negative information propagation.
Based on the above experiments, we conclude that the control strategies, namely the perceived values of users, social reinforcement intensity, and information timeliness, can be chosen appropriately to guide the evolution trends of the SE2IR information propagation model.
5. Conclusion
In this paper, the SE2IR information propagation model is proposed for social networks. In the model, according to the node status in social networks, the infected nodes are divided into high influential infected nodes and ordinary influential infected nodes, which account for 20% and 80%, respectively, based on Pareto’s principle. By Lyapunov stability theory, we proved that the two equilibrium points, i.e., the spread-free equilibrium point and the local spread equilibrium point, are globally asymptotically stable. We further proposed some influence factors that affect information propagation, including the perceived values of users, social reinforcement intensity, and information timeliness, as control strategies and integrated them into the parameters of the SE2IR model. The numerical simulation on a real dataset of a company’s mailbox network showed that, without control strategies, the information spread widely if but died out if , where is the basic reproduction value. We also found that the network status plays an important role in the process of information propagation. The more high influential nodes take part in the information propagation process, the wider the range of the information propagation is. Using the maximum-degree node as the initial infected node always led to a rapid propagation speed and a wide propagation range. With control strategies, we discussed the effects of the perceived values of users, social reinforcement intensity, and information timeliness on the SE2IR model. We found that these control strategies greatly influence the speed and range of information propagation. Therefore, the perceived values of users, social reinforcement intensity, and information timeliness can be controlled to suppress the propagation of negative information.
In future research, we may identify more features of different social networks and integrate them into the SE2IR model. We will also establish other novel information propagation models with control strategies to guide information propagation trends.
Data Availability
No data were used to support the findings of the study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was partially supported by the National Natural Science Foundation (nos. 61802316, 61872298, and 61902324), Chunhui Plan Cooperation and Research Project, Ministry of Education of China (nos. Z2015109 and Z2015100), “Young Scholars Reserve Talents” program of Xihua University, Scientific Research Fund of Sichuan Provincial Education Department (no. 15ZA0130), Science and Technology Department of Sichuan Province (nos. 2021YFQ0008 and 2019GFW115), and Key Scientific Research Fund of Xihua University (no. z1422615).