Abstract

The complicated interaction patterns among heterogeneous individuals have a profound impact on contagion processes on networks. In recent years, there has been increasing evidence for the emergence of many-body interactions among two or more nodes in a wide range of biological and social networks. To encode these multinode interactions explicitly, the simplicial complex has become a popular alternative to the simple network. Meanwhile, the time-varying nature of networks has been acknowledged as a key ingredient of the contagion process. In this paper, we consider networks whose connectivity pattern is affected by the homophily effect associated with individual attributes and investigate the impact of homophily-driven group interactions on the contagion process in temporal networks. The simplicial complex modeling framework is adopted to capture stochastic interactions between passively selected nodes in the paradigm of activity-driven networks. We study the evolution of infection and the epidemic threshold of the contagion process by both analytical and numerical methods. Our results on the statistical topological properties of the instantaneous network may shed light on how to characterize the evolution curve of infection accurately. Furthermore, we show the impact of the homophily-driven interaction pattern on the epidemic threshold, which generalizes the existing results on both the paradigmatic activity-driven network and the simplicial activity-driven network.

1. Introduction

Network modeling plays a critical role in identifying structural properties and analyzing contagion processes on networks, such as the spreading of epidemics and malware, as well as the diffusion of news and ideas. Meanwhile, it has been acknowledged that the intrinsic differences among individuals and the complicated connectivity patterns are the two main factors determining the properties of the contagion process [1–3]. For instance, in a social network, individuals are distinguished by their attributes such as gender and age. These, together with biased interaction patterns, induce heterogeneous rates of adoption of an idea [4]. Consequently, a substantial body of work is dedicated to understanding typical behavioral characteristics and distinctive connectivity features, such as homophily [4, 5], the reinforcement effect [6, 7], heterogeneous degree distributions [8–10], and community structures [11, 12].

Generally, there is a tacit hypothesis in network modeling approaches that the structure of complex systems is reducible to the pairwise interactions (links) of their entities (nodes) [13]. In recent years, the research community has accumulated overwhelming evidence for the emergence of many-body interaction patterns in a vast body of biological and social systems. For instance, the rich datasets available have revealed that interactions occur between more than two nodes in many systems [14]. These complicated interactions cannot be encoded in the simple network and thus prompt the need for characterizing generalized network structures. Simplicial complexes are a generalization of networks that describe interactions between more than two nodes. They can involve any number of nodes. For instance, simplices of dimension 0, 1, 2, and 3 are nodes, links, triangles, and tetrahedra, respectively, while d-dimensional simplices are their d-dimensional generalizations. A simplicial complex is a finite collection of simplices of different dimensions properly glued together. It has recently been adopted in modeling many complex interacting systems, such as the brain [15], biological protein-interaction systems [16], and social systems [17].

In many cases, the interactions among system entities change rapidly and evolve over time. The understanding of time-varying networks and their impact on dynamical processes has been a long-standing area of research. Recently, an activity-driven (AD) modeling framework has been put forward to address the time-scale separation between the evolving network structure and the dynamical process [18]. In this framework, each node is characterized by a specific activity rate. At each time step, a node may become active and generate a finite number of connections with other nodes randomly selected from the network. The simple formulation of AD networks and their extensions is amenable to analytical treatment and thus facilitates the study of contagion processes on time-varying networks. Examples include epidemic spreading processes with different network structures [19, 20], information diffusion processes with complex adoption models [21, 22], and the interplay between epidemic spreading and awareness diffusion with different risk perception mechanisms [23–25].

However, the above studies focused on the pairwise interactions between individuals, while evolving simplicial interactions are rarely considered. This gap is bridged by the simplicial activity-driven (SAD) model proposed in the recent work [26]. Within this framework, a multiagent interaction is depicted by a simplex of nodes, which implies an interaction between any two nodes in a group created by an active node. Although this simplex description is accurate for some systems such as scientific collaboration networks, a more general group interaction may contain different interaction patterns among individuals. Take the microblogging network as an example. When a user actively posts a text and interacts with its followers, two or more of its followers may also be connected because they share common interests. Thus, the potential role of possible interactions among nodes connected to a common active node in the epidemic spreading process deserves further discussion.

As mentioned above, the intrinsic characteristic of an individual is critical for shaping the individual interaction pattern. Inspired by the existence of temporal homophily revealed in the temporal motifs of communication networks [27], we consider the case in which the interaction patterns are affected by the individual attribute. We assume that there are two types of nodes in the network depending on their attributes. At each step, an active node randomly chooses its neighbors, and these neighbor nodes would interact at different probabilities determined by their types. In particular, nodes tend to interact with others of the same type. That is to say, the formation of group interactions is homophily-driven. This model extends the paradigms of typical AD models and the SAD model, which either ignore any interaction among neighbor nodes or assume complete interactions among neighbor nodes. We refer to this model as the complex activity-driven (CAD) model since the group interactions facilitated by active nodes are now described by a simplicial complex.

In this paper, the effects of homophily-driven group interactions on contagion processes in time-varying networks are investigated. We consider the susceptible-infected-susceptible (SIS) model for the epidemic spreading process and provide an analytical treatment of both the evolution process and the epidemic threshold. In the majority of existing studies, the characterization of both the evolution and the steady state of infection relies on extensive Monte Carlo simulations. Once the epidemic breaks out, the spreading models based on the classical heterogeneous mean-field (HMF) method can only provide an upper bound for the infection level. Here, we propose a new model that extends the classical HMF method to predict the evolution of the epidemic curve more accurately. To this end, we explore the topological properties of the instantaneous network generated at each step and obtain the degree distribution and the connection correlations among nodes. It is shown that introducing this statistical information about the instantaneous network helps to characterize the evolution of the epidemic accurately. Then, we derive the epidemic threshold with the analytical statistical model. The threshold is related to the connectivity pattern determined by the node's activity potential and the homophily effect, as well as the proportion of the two types of nodes. Our analytical results cover and generalize the existing results in both classical AD and SAD networks.

The rest of this paper is organized as follows. In Section 2, the basic CAD model is established and the topological properties of the instantaneous network are analyzed. In Section 3, the statistical SIS epidemic model is formulated and the epidemic threshold is discussed. In Section 4, numerical simulations for the epidemic spreading process are presented. Conclusions are drawn in Section 5.

2. Model

This section introduces the CAD model and investigates the topological properties of the instantaneous CAD network.

2.1. Complex Activity-Driven Network

The complex activity-driven network is an extension of the paradigmatic activity-driven network that incorporates group interactions at each step. In the framework of AD models, each node (individual) i in a population of N nodes is endowed with an activity rate a_i, which is drawn from a predefined activity probability distribution F(a). The activity rate a_i is defined as the probability per unit time that node i creates interactions with other nodes. In addition to the heterogeneous activity rates, individuals in the population are distinguished by their attributes. We assume that there are two types of nodes, A and B. The proportions of type A nodes and type B nodes in the network are p and 1 − p, respectively. For simplicity, we assume that the activity rates of nodes are independent of their types, and thus the activity probability distribution for both type A and type B nodes is F(a). Similar to the approach for static networks, we group nodes into different classes according to their types and activities. Nodes with the same type and the same activity belong to the same class and are considered to be statistically equivalent. For example, one class is the collection of all nodes of type A with the activity rate a. Other classes are defined similarly.

The instantaneous network of the CAD model is generated as follows: At each time step t, starts with N disconnected nodes. With probability , a node i becomes active and generates connection with m other nodes chosen randomly (as shown in Figure 1). All the selected nodes are neighbors of i, which together with the node i, form the group i. There are two different roles in this group: and , which are played by the active node i and all other nodes connected by i, respectively. The nodes may interact with each other, and the probabilities for an interaction between two nodes with the same type and different types are and , respectively. Since the interactions between nodes are stochastic, there may be different interaction patterns in one group, such as pairwise interactions and interactions involving more than two nodes. With one node and m nodes in each group, the dimension of group interactions may vary from 1 to m. This is different from the SAD model that considers an m-simplex in each group. The instantaneous network is then modeled as a simplicial complex composed of lower dimension simplicial complexes in each group. All these simplicial complexes are deleted at the next time step . Without loss of generality, we set in the following.
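The generating rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the simulations: the function name and the parameter names `q_same` and `q_diff` (interaction probabilities for neighbor pairs of the same type and of different types) are hypothetical, the time step is taken to be 1, and the network is stored as a set of undirected edges.

```python
import random
from itertools import combinations


def instantaneous_cad_network(activities, types, m, q_same, q_diff, rng=None):
    """Return the undirected edge set of one instantaneous network.

    activities[i] is the activity rate of node i (time step assumed to be 1),
    types[i] is its type label, m is the number of neighbors chosen by an
    active node, and q_same / q_diff are hypothetical names for the
    interaction probabilities between two passively selected neighbors.
    """
    rng = rng or random.Random()
    n = len(activities)
    edges = set()
    for i in range(n):
        # Node i becomes active with probability a_i.
        if rng.random() >= activities[i]:
            continue
        # Choose m distinct nodes other than i, uniformly at random.
        picks = rng.sample(range(n - 1), m)
        neighbors = [j if j < i else j + 1 for j in picks]
        # Edges emitted by the active node.
        for j in neighbors:
            edges.add((min(i, j), max(i, j)))
        # Homophily-driven interactions among the selected neighbors.
        for u, v in combinations(neighbors, 2):
            q = q_same if types[u] == types[v] else q_diff
            if rng.random() < q:
                edges.add((min(u, v), max(u, v)))
    return edges
```

With `q_same = q_diff = 0` the sketch reduces to the AD rule (only the m edges of each active node), and with `q_same = q_diff = 1` every group becomes a complete clique, as in the SAD model.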

2.2. Topological Properties of the Instantaneous Network

To quantitatively characterize the epidemic spreading process, we analyze the degree distribution and the connection correlation for nodes belonging to different classes in the instantaneous network. Since evolves with time, the topological properties of are analyzed in a statistical way. Based on the definition of m, , and , it is expected that there are active nodes with groups in , where . In each group, there are at most edges among m nodes. Since these m nodes are chosen randomly by the node, an interaction between any two nodes occurs with the average probability . Thus, it is expected that there are edges in each group, with

From the perspective of directed networks, the number of edges existing in is expected to be .
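The edge counts discussed above can be checked with a small calculation. The sketch below rests on stated assumptions: `q_same` and `q_diff` are hypothetical names for the same-type and different-type interaction probabilities, the population is large enough that two randomly chosen nodes have the same type with probability p² + (1 − p)², each group contributes the m edges of its active node plus the expected edges among its m neighbors, and overlaps between different groups are ignored.

```python
from math import comb


def mean_interaction_probability(p, q_same, q_diff):
    """Average probability that two randomly chosen neighbors interact."""
    same_type = p ** 2 + (1 - p) ** 2   # both A or both B
    diff_type = 2 * p * (1 - p)         # one A and one B
    return same_type * q_same + diff_type * q_diff


def expected_edges_per_group(m, p, q_same, q_diff):
    """m edges from the active node plus expected edges among m neighbors."""
    q_bar = mean_interaction_probability(p, q_same, q_diff)
    return m + comb(m, 2) * q_bar


def expected_total_edges(n_nodes, mean_activity, m, p, q_same, q_diff):
    """Expected edge count, treating the N<a> groups as non-overlapping."""
    n_groups = n_nodes * mean_activity
    return n_groups * expected_edges_per_group(m, p, q_same, q_diff)
```

Under these assumptions, a group has between m edges (no neighbor interactions) and m(m + 1)/2 edges (a complete clique on m + 1 nodes).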

Next, we discuss the degree distribution of nodes belonging to class in the instantaneous network. At each step, the inactive nodes may be passively connected by the active nodes. The probability that an inactive node has degree k iswhere denotes the minimum integer that is no less than x, denotes the probability that the node i is connected by active nodes in , and denotes the probability of interactions between the node i and other nodes. The expression of is the same as that for AD networks [24], which is given by

In the thermodynamic limit, i.e., , can be rewritten as

In addition, the expression of iswith denoting the average probability that the node i of type A interacts with a node in the group.

In equation (2), implies that the inactive node i with degree k must be connected by at least active nodes. This is due to the fact that a node can have at most m connections in one group. Once the node i is connected by active nodes, there are edges between the node i and the nodes in the network. Accordingly, the rest of edges of the node i are created by group interactions with nodes. Since the proportions of the nodes of type A and B are, respectively, p and , the average probability for an interaction between the node i and a node in the group is expected to be . Thus, the interactions between node i and all other nodes included in groups occur with the probability .

For each active node with degree k, there must be m edges emitted from it. Hence, other edges are generated by other active nodes or through group interactions. Similar to the inactive case, the probability that an active node has degree k is

Combining equations (2) and (6), the instantaneous degree distribution of nodes in class is given by

The analysis for the nodes of type B is analogous to that of type A. For the nodes of type B with activity rate a in class , we can easily derive the instantaneous degree distributions aswith in and replaced by in and .

In Figure 2, we compare the statistical results (7) and (8) with those obtained by Monte Carlo simulations. Due to the heterogeneous activity rates in the real world, we consider different activity rates among individuals in the network. The activity probability distribution is chosen as . For simplicity, we assume that there are n different activity rates in the network and they are uniformly sampled in [24]. The sampling interval is , and the minimum activity rate in the network is denoted as . For the example of , the activity rate is selected from with . In this way, the heterogeneity of node activity rates is adjustable by the value of n. The degree distribution of the nodes with activity rate is taken as an illustration example and presented in Figure 2. It is shown that the theoretical results of both and are consistent with the simulation results. To further measure the error between the theoretical and simulation results, we define an evaluation parameter, i.e.,with and from equations (7) and (8) and their counterparts and obtained by simulations. According to the value of in Figure 2, the absolute error between theoretical and simulation results is very small. This error is reasonable due to the finite network size and independent realizations.

In addition to the degree distribution, we need to analyze the connection correlation among different node classes in the instantaneous network . The connection correlation describes a preference for nodes to interact with others in terms of their class. It is similar to the degree-degree correlation in the static network [8] and the activity-activity correlation in AD networks [24]. Here, we refer to it as the class-class correlation since both the type and the activity rate of a node are considered in each class.

To this end, we define as the probability of finding a node in class and another node in class at two ends of an edge randomly chosen in . Also, is defined as the probability that there is a node belonging to the class at the end of a randomly selected edge in . Obviously, represents the ratio of the number of edges created by one node in class and the other node in , to the total number of edges E in the instantaneous network . According to the generating rules of , we can obtain

The first (second) term corresponds to an active node in class that becomes active and connects to a node in class . The third term describes the case where both a node in class a and a node in class are chosen as nodes of an active node. Similarly, the number of edges related to classes and can be written aswith equations (1), (10), and (11), and the explicit expressions of and can be easily obtained. The analysis of and is carried out in a similar way. Obviously, , , . In addition, the sum rule of is satisfied.

According to the definition of , equals the ratio of the number of edges connected to the nodes in class to the total number of edges E in the instantaneous network . That is,

The first term corresponds to an active node in class that generates m edges connected to m other randomly selected nodes. The second term stems from the fact that an inactive node i in class is connected by an active node in , and as the role of in the group, node i interacts with other nodes with certain probabilities. Moreover, the explicit expression of can be obtained with equations (1) and (12). Based on equations (10)–(12), it is easily to prove that the quantities , , and are satisfied with .

We define as the conditional probability that a randomly selected edge of the nodes in class points to a node in the class . Then, combining equations (10)–(12), can be written aswhich indicates a connection correlation between the class and in the instantaneous network . Following the analysis steps of equation (13), we can derive the following conditional probabilities for other three cases:

The combination of any two activity rates a and can be analyzed by taking similar steps (10)–(14):

In Figure 3, the conditional probabilities of nodes with the minimum activity rate are presented. Here, denotes the conditional probability with which an edge of nodes in the class () is connected to a node in the class or . It can be found that the simulation results match well with the theoretical results (13) and (14). Also, it is observed that the homophily-driven group interactions play a significant role in shaping the connection pattern of nodes. Particularly, as shown in Figure 3(a), nodes in class tend to interact with nodes in class rather than nodes in class , even though the latter is dominant in quantity. Similarly, we introduce an evaluation parameter to further measure the error between theoretical and simulation results. That is,with and from equations (13) and (14) and its counterpart obtained by simulations. According to Figure 3, is small under the three candidates of n.

3. Analysis

In this section, we characterize the paradigmatic SIS model with the statistical properties of the instantaneous network obtained in Section 2. Based on the statistical SIS model, we discuss the epidemic threshold in the CAD networks.

3.1. Statistical SIS Model

In the SIS model, each node can be either susceptible (S) or infectious (I). A susceptible node in contact with an infectious node becomes infected with an infection rate β, while each infectious node recovers from infection with a recovery rate μ.
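As an illustration of these rules, one synchronous SIS update on a given instantaneous network could be written as follows. This is a sketch under assumptions: β and μ are treated as per-step probabilities, every susceptible-infectious contact transmits independently, and all transitions are evaluated on the state at the start of the step. The function name and the edge-list representation are hypothetical.

```python
import random


def sis_step(infected, edges, beta, mu, rng=None):
    """Apply one synchronous SIS update and return the new infected set.

    infected is a set of node ids, edges is an iterable of undirected
    pairs (u, v), beta is the per-contact infection probability and mu
    is the per-step recovery probability.
    """
    rng = rng or random.Random()
    newly_infected = set()
    for u, v in edges:
        # Only contacts between a susceptible and an infectious node matter.
        if (u in infected) != (v in infected):
            target = v if u in infected else u
            if rng.random() < beta:
                newly_infected.add(target)
    # Recovery is drawn for nodes that were infectious at the start of the step.
    recovered = {i for i in infected if rng.random() < mu}
    return (infected - recovered) | newly_infected
```

Iterating this update over freshly generated instantaneous networks and averaging over independent runs gives a Monte Carlo estimate of the infection curve.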

At the time step t, we denote the fractions of susceptible and infectious nodes in the class as and , respectively. Then, it gives . To analyze the contagion process quantitatively, we define as the probability that an edge of nodes in class points to an infectious node in . Obviously, is dependent on both the class-class connection correlation property of and the infection level in each node class. Based on the analysis in Section 2, we havewith and given by equations (13) and (14), respectively.

For a node i in the class , the probability of node i with degree k is given by the degree distribution . Based on equation (7), the fractions of susceptible and infectious nodes in the class at next time step are given by

It is noted that the statistical properties of the instantaneous network , and , are included in (17). Since and are related to the parameters p, and , the potential impacts of homophily-driven group interactions on the contagion process are reflected in the statistical properties of . We refer to the SIS model described by equation (17) as our statistical model. This model is different from models derived by the classical HMF theory in most of existing studies within the framework of AD models. Leveraging the HMF theory, the SIS model on the CAD networks is derived aswhere and , respectively, denote the number of susceptible and infectious nodes in class at the time step t. With and , equation (18) can be rewritten aswith , , , and . To distinguish with the statistical model, the model (19) is referred to as the classical MF model. This model is introduced as a comparison model to measure the performance of the proposed statistical model that contains more topological properties of the instantaneous network.

3.2. Epidemic Threshold

To obtain the epidemic threshold, we analyze the stability around . In this case, we have the approximation . Then, it gives

By introducing equations (13) and (14) into , the fraction of infectious nodes in equation (17) can be rewritten aswhich is exactly equation (19) derived by the classical HMF theory. This fact then implies that the classical MF model (19) is an approximation of the statistical model (17). Meanwhile, the result of approximation is accurate only when the infection level is very low. This will be further discussed in the following numerical simulations.

By integrating equation (21) over all the node classes and ignoring the second-terms in , we can obtain the expression of :

Moreover, the closed expression for is derived by multiplying both sides of equation (21) by a and integrating over all classes spectrum. That is,

Similarly, the expressions of and that correspond to the nodes of type B can be written as

Thus, we obtain the following closed system of the master equations for , , , and :whose Jacobian matrix is given by

The critical condition for the epidemic threshold is determined by the largest eigenvalue of J. That is, ifthe epidemics will outbreak in the network. Here, we cannot get the explicit expression of equation (27) for general systems. In the following, we will discuss two special cases of the system.

If , there is no homophily effect on interactions between nodes in any groups. Thus, all nodes in the network can be considered to be of the same type. In this case, the master equations of the system can be rewritten aswhich can be further reduced to bewith and . The corresponding Jacobian matrix J is given byand the two eigenvalues of J are

Thus, the critical condition (27) for the presence of an endemic state is

Particularly, with , the results for the cases of and are also included in (32).

Furthermore, if , no interaction between nodes will be allowed and equation (32) will recover the result for AD networks. In addition, if , an interaction between any two nodes can be expected in each group. Then, equation (32) gives , which is degenerated to the results of the SAD model in [26].
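For the AD limit mentioned above, the threshold can be evaluated numerically. The sketch assumes the form commonly quoted in the activity-driven literature, in which the epidemic spreads when the ratio of the per-contact infection probability to the recovery rate exceeds 1 / (m(⟨a⟩ + √⟨a²⟩)); it does not reproduce equation (32) itself, and the function name and arguments are hypothetical.

```python
from math import sqrt


def ad_epidemic_threshold(rates, weights, m):
    """Critical beta/mu for a paradigmatic AD network (assumed standard form).

    rates are the possible activity rates, weights their (possibly
    unnormalized) probabilities, and m the number of edges created by
    an active node.
    """
    total = sum(weights)
    mean_a = sum(a * w for a, w in zip(rates, weights)) / total
    mean_a2 = sum(a * a * w for a, w in zip(rates, weights)) / total
    return 1.0 / (m * (mean_a + sqrt(mean_a2)))
```

Because √⟨a²⟩ is never smaller than ⟨a⟩, heterogeneous activity rates lower this threshold relative to a homogeneous population with the same mean activity.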

Next, we discuss the case where the number of nodes of both types is the same. With , the master equation (25) can be reduced as

Following the analysis steps of equation (29), we can derive the eigenvalues of J determined by system (33). That is,

Thus, the critical condition (27) is given by

Comparing equations (32) and (35), we can see that the case of a uniform mixing of the two types of nodes is equivalent to the case without the homophily effect.

4. Numerical Simulations

In this section, we perform numerical simulations to investigate the epidemic spreading process on CAD networks. For simplicity, we consider five activity rates of the nodes, i.e., [24], and the activity rate distribution obeys . Accordingly, the proportions of nodes with the activity rate 0.5, 0.6, 0.7, 0.8, and 0.9 are 0.344, 0.239, 0.176, 0.135, and 0.106, respectively. At each time step, the number of edges generated by an active node is chosen to be . In the following simulations, the size of network, the recovery rate, and the initial fraction of randomly chosen infectious nodes in the population are set to be , , and , respectively. If not specified, we assume the proportion of type A nodes, the interaction probabilities for two nodes of the same type, and different types to be , respectively. All the Monte Carlo simulation results are averaged over 40 independent realizations.
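The proportions quoted above can be reproduced with a short calculation. The sketch assumes a discrete power-law weighting of the five activity rates; the exponent 2 is an assumption, chosen because it reproduces the quoted proportions to three decimal places, and the function name is hypothetical.

```python
def activity_proportions(rates, exponent):
    """Normalized weights proportional to a ** (-exponent) for each rate a."""
    weights = [a ** (-exponent) for a in rates]
    total = sum(weights)
    return [w / total for w in weights]


rates = [0.5, 0.6, 0.7, 0.8, 0.9]
# Assumed exponent: 2 (not stated explicitly in the text above).
proportions = activity_proportions(rates, exponent=2.0)
print([round(x, 3) for x in proportions])
# -> [0.344, 0.239, 0.176, 0.135, 0.106]
```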

Firstly, we compare the performance of the statistical model and the classical MF model in characterizing the SIS epidemic spreading process. To this end, we depict the time evolution of the fraction of infectious nodes in the network. For different infection rates β, the evolution curves based on the statistical model (17), the classical MF model (19), and the Monte Carlo simulations are recorded. The results are presented in Figure 4. It is observed that the statistical model captures the evolution curve for all four values of β, while the classical MF model approximates the simulation results in only one of these cases. This is consistent with the analysis in Section 3, as the approximation adopted in the classical MF model is accurate only when the effective infection rate is below the epidemic threshold. Furthermore, as shown in Figures 4(b)–4(d), the classical MF model provides an upper bound for the infection level in simulations. The difference between this upper bound and the actual infection level increases as the infection rate β increases. According to the results of Figure 4, we can conclude that analyzing the statistical properties of the instantaneous network helps to characterize the contagion process in time-varying networks accurately.

Next, we investigate the impact of homophily-driven group interactions on the final steady state of the epidemic spreading. The growth of the final fraction of infectious nodes versus the infection rate β is presented in Figure 5. It is shown that decreases with the increasing p from 0 to 0.5. Since p describes the proportion of type A nodes in the network, an increasing indicates more uniform mixing of both type nodes. This implies that any two nodes in any groups are more likely to be of different types. Due to the homophily effect, it is expected that there are less group interactions among nodes. Consequently, as shown in Figure 5, the contagion process in the network is impeded and the epidemic threshold is enhanced with an increasing p. Furthermore, noting that the definition of type A and type B is exchangeable, the effect of a varying p from 0.5 to 1 is the same as a varying from 0 to 0.5. This fact, together with the observation in Figure 5, demonstrates that the impact of homophily effect on the contagion process is the greatest when . In fact, this result can be obtained by analyzing the average probability of an interaction between any two nodes. That is, for the given and , the value of reaches its minimum with . In addition, as illustrated in Figure 5, the epidemic thresholds suggested by the Monte Carlo simulations and the statistical model are close to the theoretical values given by equation (27). This observation, in turn, validates our analytical results presented in Section 3.
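The dependence of the average interaction probability on p can be made explicit with a short calculation. The notation is assumed here for illustration: q_s and q_d denote the interaction probabilities for two nodes of the same type and of different types, and \bar{q}(p) denotes their average over a randomly chosen pair in a large population.

```latex
\begin{align}
\bar{q}(p) &= q_s\left[p^2 + (1-p)^2\right] + 2\,q_d\,p(1-p) \\
           &= q_s - 2\,(q_s - q_d)\,p(1-p).
\end{align}
```

Since p(1 − p) is largest at p = 1/2, the average probability is smallest there whenever q_s > q_d, with \bar{q}(1/2) = (q_s + q_d)/2, and it is symmetric under exchanging p and 1 − p.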

Furthermore, we consider the contagion processes on the AD networks, the SAD networks, and the proposed CAD networks. According to the analysis in Section 3, the AD model and the SAD model are indeed two special cases of the CAD model and they are obtained by letting and , respectively.

In Figure 6, we depict the growth of the final fraction of infectious nodes versus the infection rate β in the three networks. Obviously, the group interactions between two nodes promote the contagion process in the network. Compared with the AD network, the epidemic threshold decreases in both CAD and SAD networks. Specifically, the critical infection rates corresponding to the epidemic threshold are satisfied with . In addition, the theoretical results of the epidemic threshold given by equations (32) and (35) are approximate to the simulation results.

Finally, we investigate the impacts of the infection rate β and the group interaction probabilities on the epidemic threshold. Figure 7 presents a comparison between the theoretical and numerical values of the epidemic threshold for different parameter combinations, where the numerically obtained steady-state infection density is color coded and shown as the background. Here, we assume is proportional to , i.e, , and the proportionality factors τ are chosen as 0.1, 0.5, and 1 in Figures 7(a)7(c), respectively. For a fixed parameter set, the dashed curve represents the theoretical value of as a function of the infection rate β based on equation (27). The markers are the corresponding numerical results of . From Figure 7, the theoretical results agree with the simulation results for all three cases of τ. For each τ, there are two critical values of β, i.e., and . When or , the epidemic threshold is not related to . Specifically, the epidemics will vanish from the network with or it breaks out with . When , it is observed that decreases with an increase of β. Here, is obtained under and is determined by the value of . Given the three candidates of τ in Figure 7, we can see that decreases with an increase of τ due to more group interactions between nodes with different types. Moreover, since the CAD network turns into the AD network with , is exactly the epidemic threshold in the AD network.

5. Conclusions

In this paper, we investigate the effect of group interactions involving more than two individuals on the contagion process in time-varying networks. We consider interactions that are affected by individual attributes and study the homophily effect on interaction patterns. A new network model that extends the paradigmatic activity-driven model to the framework of simplicial complex networks is presented. The statistical properties of the instantaneous network based on this model are explored and incorporated in characterizing the epidemic spreading process. Results show that these properties provide a more accurate description of the evolution of the epidemic. Furthermore, it is demonstrated that the homophily-driven group interactions have a profound impact on the epidemic threshold. Our results generalize the existing results for the epidemic threshold on two paradigms of activity-driven network models. However, the proposed model also comes with some limitations. Here, we consider that the activity rate of a node is independent of its type. Future work may explore a general activity probability distribution that incorporates the effect of different node types. Moreover, we assume that the infection rates through pairwise interactions and multinode interactions are the same. This model can be further developed by exploring different infection rates for different interaction patterns.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant no. 61873194.