Abstract

Many temporal networks exhibit multiple system states, such as weekday and weekend patterns in social contact networks. The detection of such distinct states in temporal network data has recently been studied as it helps reveal underlying dynamical processes. A commonly used method is network aggregation over a time window, which aggregates a subsequence of multiple network snapshots into one static network. This method, however, necessarily discards temporal dynamics within the time window. Here we propose a new method for detecting dynamic states in temporal networks using connection series (i.e., time series of connection status) between nodes. Our method consists of the construction of connection series tensors over nonoverlapping time windows, similarity measurement between these tensors, and community detection in the similarity network of those time windows. Experiments with empirical temporal network data demonstrated that our method outperformed the conventional approach using simple network aggregation in revealing interpretable system states. In addition, our method allows users to analyze hierarchical temporal structures and to uncover dynamic states at different spatial/temporal resolutions.

1. Introduction

Temporal networks are a useful framework to represent and analyze time-dependent changes and underlying dynamics of complex systems [1–3]. Many phenomena, ranging from disease spread [4–6] and human communication [7–9] to financial transactions [10, 11] and human brains [12, 13], can generate large-scale temporal network data. In many cases, the temporal network data can be broken down into a sequence of discrete system states, some of which may reoccur many times. For example, air traffic networks can show seasonal variations [14, 15] and peak/off-peak weekly patterns [15], which can be modeled and studied as a temporal sequence of distinct system states. System state detection captures the temporal state change of the whole system at a collective level, in contrast to the more commonly studied node-level clustering on dynamic networks [16–18]. System state detection is useful for investigating the dynamics of time-varying complex systems and for better interpreting large-scale temporal network data sets.

To detect the system states in temporal networks, Masuda and Holme recently proposed an approach that used network aggregation and graph similarity [19]. Their method first partitioned a given temporal network into subsequences and aggregated each subsequence into a static network. Then a graph similarity was measured among the aggregated static networks to generate a distance matrix, to which hierarchical clustering was applied, and the number of system states was determined using Dunn’s index [20]. In their method, the timelines of interactions between nodes within a time window were aggregated as static edge weights. Yet these timelines of interactions may carry information that is critical for exploring the system dynamics of temporal networks. For example, in brain networks that abstract interactions between distinct brain regions, the fluctuation patterns of connections can indicate various brain activities or states [21–23]. Additionally, the method in [19] focused on the optimal division of system states based on mathematical optimization, which may hinder the discovery of suboptimal yet informative system states in temporal networks.

In this study, we propose a method to detect dynamic states of temporal networks using connection series between nodes, i.e., the sequence of connection status between two nodes represented as a binary-valued vector (0: disconnected, 1: connected). Figure 1(a) gives an example of the connection series between two nodes. Figure 1(b) provides a comparison of connection series and network aggregation between two illustrative cases. In Figure 1(b), though the timeline of interactions is different between windows 1 and 2, the aggregated networks (i.e., aggregation 1 and aggregation 2) are the same since the number of interactions between nodes is identical in each time window. Meanwhile, the connection series incorporates information regarding both amounts and temporal fluctuation of interactions between a pair of nodes, which may be more useful when detecting dynamic states of temporal networks.
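The contrast drawn in Figure 1(b) can be illustrated with a minimal sketch. The two series below are made-up values (not taken from the figure), and `aggregate` is a hypothetical helper that simply counts contacts in a window.

```python
def aggregate(series):
    """Collapse a binary connection series into a static edge weight (contact count)."""
    return sum(series)

# Hypothetical connection series for one node pair in two time windows.
window_1 = [1, 1, 1, 0, 0, 0]  # contacts concentrated at the start of the window
window_2 = [1, 0, 1, 0, 1, 0]  # same number of contacts, spread over the window

# Aggregation cannot tell the two windows apart, the connection series can.
same_weight = aggregate(window_1) == aggregate(window_2)
same_series = window_1 == window_2
print(same_weight, same_series)  # True False
```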

Figure 2 shows two real-world examples of connection series: face-to-face contacts between two students in a primary school [24] and contacts between two attendees at an academic conference [25]. We can observe distinct fluctuation patterns between each pair of individuals over time. Both data sets were downloaded from SocioPatterns.org.

As in [19], our method divides given temporal network data into subsequences using nonoverlapping time windows. Our method then transforms each subsequence into a connection series tensor. While tensors have been widely used in machine learning and pattern recognition research [26–28], we use tensors specifically as an extended representation of adjacency matrices that involves temporal connection patterns. Namely, every element in the adjacency matrix is replaced by a connection series between the corresponding pair of nodes. The connection series tensors generated from multiple subsequences are then connected to each other into a metalevel network whose edge weights are similarities between these tensors. Nonoverlapping communities are detected on this metalevel network to classify each connection series tensor into one of several distinct dynamic states (represented as communities in the metalevel network). Experiments using two empirical temporal network data sets demonstrated that our method was capable of detecting interpretable and practical dynamic states in temporal networks. Additionally, when the detected states were compared with the known sequence of events that took place in each data set, our method outperformed the previous approach.

The rest of this paper is organized as follows. Section 2 describes our proposed method. Section 3 describes the empirical temporal network data sets used in the experiments. Section 4 presents the results. In Section 5, we conclude and discuss limitations.

2. Method

A schematic overview of our method is presented in Figure 3. We are given a temporal network $\mathcal{G} = \{G_1, G_2, \ldots, G_T\}$ with $T$ network snapshots, where $G_t = (V_t, E_t)$ is the network snapshot at time point $t$, in which $V_t$ and $E_t$ denote the set of nodes and the set of edges, respectively. In this representation, consecutive time points are separated by $\Delta t$, where $\Delta t$ is the sampling interval of the original temporal network data set. First, we split the whole temporal network data into $M$ subsequences using nonoverlapping time windows of length $\tau$. The length of each subsequence is $\tau$, except for the last one, which can be shorter than $\tau$ if $T$ is not divisible by $\tau$. We denote these subsequences as $S_1, S_2, \ldots, S_M$. Second, we transform each subsequence into a connection series tensor by setting each element in an adjacency matrix to a connection series between the corresponding pair of nodes. We denote the obtained connection series tensors as $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_M$, where $\mathcal{X}_k$ is the $k$th connection series tensor. Each connection series tensor is given by
$$\mathcal{X}_k = \left[\mathbf{c}^{(k)}_{ij}\right]_{i, j \in V^{(k)}},$$
where $\mathbf{c}^{(k)}_{ij}$ denotes the connection series between node $i$ and node $j$ during subsequence $S_k$ and $V^{(k)}$ is the set of nodes that appears in subsequence $S_k$. Third, we quantify the similarity between every pair of connection series tensors by a measure of similarity that will be described later. Fourth, we construct a fully connected, weighted metalevel network whose nodes and edges represent these tensors and the similarities between them, respectively. Fifth, we run the Louvain method [29] on the metalevel network to classify these connection series tensors (nodes in this metalevel network) into multiple communities that are interpreted as dynamic states. We can also adjust the community resolution, a tunable parameter of the Louvain method, to study dynamic states at different spatial/temporal resolutions in a given temporal network.
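The first two steps (windowing and tensor construction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: contacts are assumed to arrive as `(t, i, j)` records with `t` an integer snapshot index, the network is treated as undirected, and each tensor is stored sparsely as a dictionary in which node pairs without any contact are implicitly all-zero series. The function name and its parameters are hypothetical.

```python
def connection_series_tensors(events, n_snapshots, window_len):
    """Split contact events into nonoverlapping windows of connection series.

    events      : iterable of (t, i, j), t an integer snapshot index in [0, n_snapshots)
    n_snapshots : total number of snapshots in the data
    window_len  : number of snapshots per window (the last window may be shorter)

    Returns one dict per window, mapping a sorted node pair (i, j) to a binary list.
    """
    n_windows = -(-n_snapshots // window_len)  # ceiling division
    tensors = [{} for _ in range(n_windows)]
    for t, i, j in events:
        w, offset = divmod(t, window_len)
        # The last window is shorter when n_snapshots is not divisible by window_len.
        length = min(window_len, n_snapshots - w * window_len)
        pair = tuple(sorted((i, j)))  # undirected: (a, b) and (b, a) share one series
        series = tensors[w].setdefault(pair, [0] * length)
        series[offset] = 1
    return tensors

# Made-up example: 7 snapshots, windows of 3 snapshots.
events = [(0, "a", "b"), (1, "a", "b"), (4, "b", "a"), (6, "a", "c")]
tensors = connection_series_tensors(events, n_snapshots=7, window_len=3)
```

In this example the third window contains a single snapshot, so its series have length 1.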

2.1. Similarity between Connection Series Tensors

Let the two connection series tensors we compare be $\mathcal{X}_a$ and $\mathcal{X}_b$, whose node sets may be different: say $V^{(a)}$ and $V^{(b)}$. To make the format of $\mathcal{X}_a$ and $\mathcal{X}_b$ consistent, we transform them into $\mathcal{X}'_a$ and $\mathcal{X}'_b$, respectively, both of whose node sets are redefined as $V^{(a)} \cup V^{(b)}$. The steps of our proposed similarity measure are as follows.

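Step 1. We compute the similarity between every pair of corresponding connection series in $\mathcal{X}'_a$ and $\mathcal{X}'_b$, $s\left(\mathbf{c}^{(a)}_{ij}, \mathbf{c}^{(b)}_{ij}\right)$.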

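Step 2. We average all the similarities obtained in Step 1 as the similarity between $\mathcal{X}'_a$ and $\mathcal{X}'_b$,
$$S\left(\mathcal{X}'_a, \mathcal{X}'_b\right) = \frac{1}{n(n-1)} \sum_{i \neq j} s\left(\mathbf{c}^{(a)}_{ij}, \mathbf{c}^{(b)}_{ij}\right),$$
where $n$ denotes the number of nodes in $\mathcal{X}'_a$ and $\mathcal{X}'_b$. Note that self-connection series $\mathbf{c}_{ii}$ are not included.

To compute the similarity between two connection series, $s\left(\mathbf{c}^{(a)}_{ij}, \mathbf{c}^{(b)}_{ij}\right)$, we developed a simple method informed by the well-developed similarity measures of time series [30–32]. Our similarity measure is based on the principle of maximizing the number of matched items. A schematic illustration of our proposed method is shown in Figure 4. The formula of similarity between two connection series is as follows:
$$s\left(\mathbf{c}^{(a)}_{ij}, \mathbf{c}^{(b)}_{ij}\right) = \frac{\max_{k} m_k}{\tau},$$
where $m_k$ represents the number of matched elements in case $k$, while $\tau$ is the length of the time window. Note that the maximum length of the connection series equals the length of the time window $\tau$.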
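Steps 1 and 2 can be sketched in a few lines. The matching rule used here is an assumption made for illustration: two series are compared position by position, and a position counts as matched when both series hold the same value. The matching cases of Figure 4 are not reproduced, so this may differ from the measure actually used. Tensors are assumed to be dictionaries keyed by sorted node pairs, with missing pairs treated as all-zero series; both function names are hypothetical.

```python
from itertools import combinations

def series_similarity(a, b):
    """Fraction of time steps at which two equal-length binary series agree (assumed rule)."""
    if len(a) != len(b):
        raise ValueError("connection series must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def tensor_similarity(tensor_a, tensor_b, length):
    """Average series similarity over all node pairs in the union of both node sets."""
    nodes = sorted({n for pair in list(tensor_a) + list(tensor_b) for n in pair})
    pairs = list(combinations(nodes, 2))  # unordered pairs (i, j) with i < j, no self-pairs
    if not pairs:
        return 0.0  # arbitrary convention for tensors without any node pair
    zeros = [0] * length  # series of a pair that never connects in the window
    total = sum(
        series_similarity(tensor_a.get(p, zeros), tensor_b.get(p, zeros)) for p in pairs
    )
    return total / len(pairs)

# Made-up tensors over windows of length 4.
tensor_a = {("a", "b"): [1, 1, 0, 0]}
tensor_b = {("a", "b"): [1, 0, 0, 0], ("b", "c"): [0, 0, 1, 1]}
similarity = tensor_similarity(tensor_a, tensor_b, length=4)  # (0.75 + 1.0 + 0.5) / 3
```

Averaging over unordered pairs gives the same value as averaging over ordered pairs when the connection series are symmetric. The sketch also assumes both windows have the same length, which does not hold for a shorter final window.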

2.2. Community Detection on Metalevel Network

Figure 5 gives a simple example of the metalevel network. We apply community detection to the metalevel network to assign each node (= a connection series tensor, or a subsequence of the original temporal network) a dynamic state label. Many community detection algorithms have been developed and employed with varying levels of success [29, 33–35]. Here we use the Louvain method [29], one of the most popular modularity maximization algorithms. In the case of a metalevel network, the modularity is defined as
$$Q = \frac{1}{2m} \sum_{p, q} \left[ w_{pq} - \frac{k_p k_q}{2m} \right] \delta\left(g_p, g_q\right),$$
where $w_{pq}$ is the edge weight between tensors $\mathcal{X}_p$ and $\mathcal{X}_q$, $k_p$ is the sum of the weights of the edges connected to node $p$, $g_p$ is the community label node $p$ is assigned to, $\delta$ is Kronecker’s delta function, and $m$ is the sum of all of the edge weights in the network. The Louvain method is beneficial to our work mainly for two reasons. First, it can take edge weights into account. Two nodes connected by an edge with greater edge weight (i.e., higher similarity between connection series tensors) in the metalevel network are more likely to be assigned to the same dynamic state. Second, it provides a tunable parameter of community resolution that allows for exploration of dynamic states at different spatial/temporal resolutions of interest, which is especially helpful for unknown temporal networks.
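To make the objective concrete, the following sketch evaluates the standard weighted modularity of a given partition of a small metalevel network. It only scores a partition; it does not implement the Louvain search itself. The function name, the edge-dictionary format, and the similarity values are illustrative assumptions.

```python
def weighted_modularity(weights, labels):
    """Weighted modularity Q of a partition.

    weights : dict mapping an undirected edge (p, q), p != q, to its weight (one entry per edge)
    labels  : dict mapping each node to its community (dynamic state) label
    """
    strength = {node: 0.0 for node in labels}  # k_p: total weight of edges at node p
    m = 0.0                                    # sum of all edge weights
    for (p, q), w in weights.items():
        strength[p] += w
        strength[q] += w
        m += w
    # Observed term: each within-community edge appears twice in the sum over ordered pairs.
    within = sum(2 * w for (p, q), w in weights.items() if labels[p] == labels[q])
    # Null-model term: sum over ordered pairs of k_p * k_q / (2m) inside each community.
    totals = {}
    for node, community in labels.items():
        totals[community] = totals.get(community, 0.0) + strength[node]
    expected = sum(s * s for s in totals.values()) / (2 * m)
    return (within - expected) / (2 * m)

# Made-up metalevel network: windows 0 and 1 are similar, as are windows 2 and 3.
weights = {(0, 1): 0.9, (2, 3): 0.9, (0, 2): 0.1, (0, 3): 0.1, (1, 2): 0.1, (1, 3): 0.1}
two_states = weighted_modularity(weights, {0: "A", 1: "A", 2: "B", 3: "B"})
one_state = weighted_modularity(weights, {0: "A", 1: "A", 2: "A", 3: "A"})
```

Here the two-state partition scores higher than lumping all windows into a single state, which is the kind of division a modularity-maximizing method would prefer.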

3. Data

We used the primary school and conference data sets downloaded from SocioPatterns.org to run experiments. We chose them because they have known “ground truth” states against which the performance of our method can be evaluated. Both data sets represent the physical proximity between people. The basic properties of the two data sets are listed in Table 1.

3.1. Primary School Data

The primary school data were collected in a primary school in Lyon, France. In the school, each of the five grades was divided into two classes [24]. The schedule of a school day is shown in Table 2. Note that different classes took turns to take breaks in a playground and to have lunch in a canteen because neither the playground nor the canteen could accommodate all the students at the same time [24]. The face-to-face contacts between 232 children and 10 teachers in the school were measured and recorded by body-mounted RFID devices. Two individuals were recorded as being in contact when they faced each other at close range (about 1 m to 1.5 m). The data were collected from 8:45 to 17:20 on Thursday, October 1st, 2009, and from 8:30 to 17:05 on Friday, October 2nd, 2009. We used only the first day’s data in this paper.

3.2. Conference Data

This data set is named “Hypertext 2009 dynamic contact network” on the SocioPatterns.org website; we call it “conference data” for short. The data set represents the temporal network of face-to-face contacts of about 110 attendees at an academic conference. It was collected during the ACM Hypertext 2009 conference (http://www.ht2009.org/) hosted by the Institute for Scientific Interchange Foundation in Turin, Italy, from June 29th to July 1st, 2009 [25]. The data collection method was the same as that used for the primary school data. We used only the first day’s (Monday, June 29, 2009) data in this paper, whose program is given in Table 3.

4. Experiments

We applied our proposed method to the two real-world temporal networks to demonstrate how this approach can be used to obtain meaningful insights into complex interactions among elements in time-varying complex systems. We used the event information shown in Tables 2 and 3 as the ground truth for our results. In the experiments, we varied the community resolution parameter in the Louvain method from 1.0 to smaller values (in steps of 0.01) to scan the hierarchical temporal structure and uncover dynamic states at different resolutions. Note that a smaller community resolution parameter in the Louvain method indicates a higher resolution of system states. For comparison, we also implemented the approach using network aggregation and graph similarity proposed in [19]. Here we chose DeltaCon [36] from the multiple graph similarity measures used in [19], because it takes node identity into account, which is compatible with our proposed method, and also because it is a relatively new, computationally scalable method.

4.1. Results for Primary School Data

We partitioned the primary school data into subsequences using time windows of length 20 minutes, the same length as that used for the same data set in [19]. Figure 6 presents the results for the primary school data obtained by our method. Figure 6(b) exhibits the two detected dynamic states, state 0 (the approximate periods are 8:40∼11:50 and 14:10∼17:20) and state 1 (the approximate period is 11:50∼14:10), at the community resolution of 1.0. Compared with the primary school’s schedule, state 1 corresponds to lunchtime, while state 0 can be simply assumed to be class time. However, the morning and afternoon breaks were not revealed with such a large community resolution parameter value (low resolution of system states).

Figure 7 provides more representative results obtained at distinct community resolutions. The results at the community resolution of 0.92 are shown in Figure 7(b). There are three detected dynamic states, state 0, state 1, and state 2, which are consistent with class time, break time (morning and afternoon), and lunchtime, respectively. The two detected breaks are longer than the break time in the schedule of a school day, which may be because students in different classes took turns to take breaks due to the limited capacity of the playground [24]. The results in Figure 7(c), obtained with an even smaller community resolution (0.91), show that the morning break can be broken into two dynamic states, state 1 and state 2, which may suggest a distinction in the collective behaviors of different classes. If we continue to decrease the community resolution to 0.89 (Figure 7(d)), we can even find subtle differences between morning class time (state 0 in Figure 7(d)) and afternoon class time (state 4 in Figure 7(d)). Additionally, the results in Figure 7 indicate that smaller community resolution parameter values (higher resolution of system states) help discover more subtle dynamic states.

As a comparison, Figure 8 gives the results for the primary school data obtained by the method using network aggregation and graph similarity in [19]. This method discovered two optimal system states shown in Figure 8(b) through hierarchical clustering and Dunn’s index [20]. Though it detected the lunchtime and class time, it failed to recognize the two breaks between classes.

4.2. Results for Conference Data

The conference temporal network was divided into subsequences by nonoverlapping time windows of length 5 minutes, which was chosen according to the shortest event (the short break, 18:05∼18:10, in Table 3). Figure 9 shows the results obtained by our method, which are mainly aligned with the first day’s program of the ACM Hypertext 2009 conference. For example, the results in Figure 9(b) suggest that coffee break 1, coffee break 2, the wine and cheese welcome reception, and part of the lunch break (approximately 13:30∼15:15) are recognized as the same dynamic state 2. Workshops 1, 2, 3, and 4 correspond to state 0, state 3, state 1, and state 4, respectively. Figure 9(c) provides more subtle insights at a slightly smaller community resolution (0.99), in which a new period (state 5) emerges at the end of the whole day’s program. This new period may be interpreted as the “ending time” of a conference day.

In contrast, the results displayed in Figure 10 were obtained by using the method in [19]. As shown in Figure 10(b), the only two detected system states, state 0 and state 1, failed to match the first day’s conference program. That method therefore performed poorly in revealing meaningful system states in the conference data.

5. Discussion

In this paper, we developed a new method for detecting dynamic states in temporal networks. We transformed a given temporal network into a sequence of tensors that consisted of connection series between each pair of nodes. These connection series can help capture the collective dynamics regarding temporal and spatial interactions between elements in time-varying complex systems. We also proposed a simple method to evaluate the similarity between two connection series tensors, which can potentially be extended to a similarity measure of two temporal networks. The results with empirical temporal network data demonstrated the effectiveness of our method in detecting dynamic states. Our method also outperformed the previous approach in [19] in revealing actual events in real-world temporal network data, which suggests that incorporating timelines of interactions between pairs of nodes within time windows helps detect dynamic system states.

As demonstrated in Figure 7, the tunable parameter of community resolution in the Louvain method is a useful tool to detect the system states at various spatial/temporal resolutions. Users can choose the appropriate community resolution parameter according to their research interests. For temporal network data for which no ground truth or underlying processes are known a priori, one possible way to find the “right” level of system state detection is to scan the parameter space gradually from high to low values and then choose the most robust, persistent division of system states as the final result. Users may also benefit from using several community validation metrics [37–39] for this purpose.
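One way to operationalize "most robust, persistent division" is to pick the partition that survives over the longest run of consecutive parameter values. The sketch below assumes the scan results are already available as `(parameter, labels)` pairs in scan order; the function names and the example values are hypothetical, and other persistence criteria are equally plausible.

```python
def canonical(labels):
    """Relabel states by order of first appearance so that permuted labels compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(label, len(mapping)) for label in labels)

def most_persistent(scan):
    """Return the partition with the longest run of consecutive identical results.

    scan : list of (parameter, labels) in scan order, labels being one state per time window
    Returns (canonical partition, list of parameter values in its longest run).
    Ties are resolved in favor of the run encountered first.
    """
    best, best_params = None, []
    current, current_params = None, []
    for param, labels in scan:
        partition = canonical(labels)
        if partition == current:
            current_params.append(param)
        else:
            current, current_params = partition, [param]
        if len(current_params) > len(best_params):
            best, best_params = current, list(current_params)
    return best, best_params

# Made-up scan over four time windows, from high to low parameter values.
scan = [
    (1.00, [0, 0, 1, 1]),
    (0.99, [1, 1, 0, 0]),  # same division as above, labels swapped
    (0.98, [0, 0, 1, 1]),
    (0.97, [0, 1, 2, 2]),
    (0.96, [0, 1, 2, 3]),
]
partition, params = most_persistent(scan)
```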

We note that the choice of the length of nonoverlapping time windows influences the states detected by our method. In general, a shorter time window helps discover more subtle system states, but it may also make the results less robust, since the temporal sparseness of interactions between nodes in many temporal networks [1] could lead to insufficient topological information in each time window. Using a large time window avoids this problem, but the results may be too coarse-grained because a large time window may include multiple system states. Figure 11 gives an example of how the time window length may influence the results of state detection, showing results of our method applied to the primary school data with varying lengths of time windows. The shorter the time windows were, the more subtle the uncovered system states were. We consider it important to choose the length of the time windows according to the underlying dynamics of the time-varying complex system (e.g., the duration of the shortest possible events). If no such information is available a priori, one should systematically vary the length of time windows until a robust result of system states is identified.

Finally, we note several limitations of our work. First, the proposed similarity measure for the connection series tensors is limited to temporal networks with given node labels. Further methodological exploration and development would be needed to handle unlabeled temporal network data. Second, consideration of all the connection series in temporal networks may become computationally very expensive when the size of the temporal network data is very large. Third, we have used only the Louvain method and have not explored different community detection methods (although we assume the main results would not be affected much as long as community detection was done to maximize modularity). Finally, our validation of the results remained only a qualitative comparison with the presumed “ground truth”, while more objective, quantitative validation requires further study using other empirical temporal network data sets whose underlying system states (ground truth) are rigorously established and available.

Data Availability

All the data used in this study are publicly available at http://www.sociopatterns.org/. The code for our proposed method is available from https://github.com/shun-cao/Detecting-Dynamic-States-of-Temporal-Networks-Using-Connection-Series-Tensors.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors greatly thank Naoki Masuda for his valuable suggestions and comments on this work. The authors also thank SocioPatterns.org for sharing the temporal network data sets for public use. This work was supported by the National Science Foundation, under Grant no. 1734147.