Abstract

Many real-world systems of various origins are capable of self-organizing to the edge of a phase transition, a state characterized by avalanche-like behavior. It is therefore important to monitor early warning measures computed for the dynamical series that such systems generate, so that early warning signals (precursors) of this self-organization are detected in time and, if necessary, preventive measures are taken. To date, convincing evidence of self-organization to the edge of a phase transition has been obtained, but no effective precursors of this self-organization have been found. This research explores precursors of Twitter self-organization by analyzing two groups of measures: those directly related to the critical slowing down of the network and those of the phase space reconstructed by the Takens method. The measures are computed for the series of the number of network users who initiate avalanches of retweets related to the three debates of the 2016 United States presidential election. To form this series, we hydrated the relevant Tweet IDs, which were obtained from the Harvard Dataverse using the Social Feed Manager. As a preliminary step, we explore the potential of the measures for early detection of self-organization in sandpile cellular automata, which are systems with self-organization mechanisms equivalent to those of Twitter. The equivalence is justified by the proposed discrete-time model of Twitter self-organization to the edge of a phase transition. We find that Twitter self-organizes at more moments than those at which the debates started and that Twitter stays at the edge of a phase transition longer than the debates last. Among all studied measures and for all studied systems, the effective measures, that is, those with the lowest number of false early warning signals, are dispersion and correlation dimension.
The obtained results are of practical importance for the design and implementation of early warning systems for systems whose mechanisms of self-organization to the edge of a phase transition are similar to those of sandpile cellular automata.

1. Introduction

More than 35 years ago, it was found that self-organization is capable of bringing complex systems to the edge of a second-order phase transition without tuning the control parameter to a critical value (e.g., see the paper [1]). The theory explaining such self-organization has been called the theory of Self-Organized Criticality (SOC). It has been established relatively recently (e.g., see papers [2, 3]) that complex systems are capable of self-organization not only to the edge of a second-order phase transition but also to the edge of a first-order phase transition characterized by two stable configurations.

Self-Organized Bistability (SOB) demonstrates the coexistence of two stable configurations in the hysteresis loop corresponding to zero order parameter and nonzero order parameter. Quantitative criteria of the system being on the edge of a phase transition are, for example, the value of autocorrelation at lag-1 of the order parameter equal to 1 and the power-law scaling exponent of the Power Spectral Density (PSD) of the order parameter belonging to the interval from 1 to 2 (e.g., see papers [2, 4]).

Recently, ample evidence of the self-organization of real-world systems of various origins to the edge of a phase transition has been obtained (e.g., see papers [5, 6]). Thus, SOC is a characteristic of social interaction networks (e.g., see papers [7–9]) and online social networks (e.g., see papers [10–15]) such as Twitter. SOB is a characteristic of the brain (e.g., see the paper [16]).

A growing number of research studies are devoted to analyzing and searching for early warning signals (precursors) of a phase transition in complex systems (e.g., see papers [17–20]). First of all, this is the search for precursors based on the analysis of the observed sequence of order parameter values generated by the system in real time. One result of this research is a set of measures whose characteristic change indicates that the system is approaching the edge of a phase transition. In what follows, such measures are called Early Warning Measures (EWMs). The number of EWMs, and of systems for which EWMs give good results in predicting critical points, is growing steadily.

Despite the variety of available early warning signals for a phase transition and ample evidence that Twitter operates on the edge of a phase transition, the EWMs obtained so far (e.g., see papers [21–24]) are not suitable as reliable measures for early warning systems of Twitter self-organization to the edge of a phase transition. First of all, this is because the behavior of EWMs, both those directly related and those unrelated to the critical slowing down of the system (e.g., see paper [25]), has not been studied on model systems isomorphic to Twitter in the context of systems theory, either as such systems approach the edge of a phase transition or in its neighborhood. In addition, because measures in early warning systems are computed for series observed in real time, obtaining reliable measures requires determining and investigating the efficiency of EWMs for Twitter self-organization to the edge of a phase transition. Although the problem of measure effectiveness for systems of very different origins has been repeatedly discussed (e.g., see papers [18, 26, 27]), the question of the effectiveness of EWMs in the context of early warning systems for Twitter self-organization to the edge of a phase transition remains open.

Previously (see the paper [28]), we introduced the concept of the effectiveness of EWM and investigated the effectiveness of the measures in early warning in Sandpile Cellular Automata (SCA) self-organization to the edge of a phase transition. SCA is isomorphic to Twitter with a suitable choice of local rules and topological structure of the graph (lattice). Therefore, the present study is based on the results we obtained earlier.

Finding and analyzing effective EWMs is crucial when designing and building early warning systems for Twitter self-organization to the edge of a phase transition. For example, timely detection that the Twitter segment uniting users who discuss a candidate during a pre-election debate is approaching the edge of a phase transition, characterized by an avalanche-like spread of retweets, will allow the election headquarters to take the required preventive measures. Also, the presence of a Twitter segment on the edge of a phase transition is one of the indicators that bots are involved in the avalanche-like spreading of microposts in the network (e.g., see papers [29–31]).

To close the gaps mentioned, we investigated the behavior (see Subsection 3.2) and effectiveness (see Subsection 3.3) of various EWMs for the series of the number of Twitter users who initiate avalanches of retweets related to the debates of the 2016 United States presidential election (see Subsection 2.2). To ensure the representativeness of the obtained results, we investigated the performance of not only the most studied EWMs, such as some sample moments, autocorrelation, and the power-law exponent of the spectral density, but also the understudied EWMs related to phase space reconstruction from time series data (see Subsection 2.3). As test series demonstrating the self-organization of a system to the edge of a phase transition, we used the series of the number of unstable nodes of directed SCA on the Chung-Lu Graph (CLG) with Manna rules, which serve as stochastic discrete-time compartmental models describing the self-organization of Twitter segments to the edge (see Subsections 2.1 and 3.1). By the term “segment,” we denote the set of online users connected by the discussion of the debates of the 2016 United States presidential election.

2. Data Set and Methods

We investigate the behavior and performance of EWMs for the self-organization of SCA and of a Twitter segment into a critical state on the basis of discrete dynamical series. For automata, this is the series of the number of unstable nodes of the automaton at each iteration; for Twitter, it is the time series of the number of users of the online social network who initiate chains of retweets at each time. We use the most common type of measure, a window measure m, which is computed in a window of fixed width. The window width used is the minimum value that still yields correct estimates of the measure.

By sliding the window along each series and computing m for every window, shifted by one iteration step and starting from the initial window, we obtain a dynamic series of measure values. The characteristic behavior of this measure series, or of the zero-mean series of its increments, as the window approaches the moment of critical transition is a precursor of the self-organization of the system (automaton or Twitter) into a critical state. The mean is computed over the values of the series in each window.
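The sliding-window procedure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the window width of 100, and the synthetic Poisson series are hypothetical, and centring the increments on their overall mean is an assumption about how the zero-mean increment series is formed.

```python
import numpy as np

def sliding_measure(x, width, measure):
    """Apply `measure` to every window of fixed `width`, shifted by one step."""
    x = np.asarray(x, dtype=float)
    if width > len(x):
        raise ValueError("window is wider than the series")
    return np.array([measure(x[i:i + width]) for i in range(len(x) - width + 1)])

def zero_mean_increments(m):
    """First differences of a measure series, centred on their own mean (assumed)."""
    d = np.diff(m)
    return d - d.mean()

rng = np.random.default_rng(0)
x = rng.poisson(5, size=500)          # synthetic stand-in for an observed series
m = sliding_measure(x, width=100, measure=np.var)
dm = zero_mean_increments(m)
```

Any window statistic with the signature `measure(window) -> float` can be passed in place of `np.var`, so the same loop serves every EWM considered in this section.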

In the following, we present a brief description of the methods for obtaining the automaton series (see Subsection 2.1) and the Twitter series (see Subsection 2.2), as well as the EWM calculation methods used in the present study (see Subsection 2.3).

2.1. Model Time Series Generated by Sandpile Cellular Automata

In this subsection, we introduce a discrete-time model that captures the self-organization dynamics of a Twitter segment on the edge of a phase transition and is based on the Manna spread model (see the paper [32]). This model draws on the functioning of elements and their interactions within a directed SCA built on the directed modification of the Chung-Lu random graph (e.g., see the paper [33]) and on their parallels with the Twitter segment. These analogies are explored within the context of systems theory (e.g., see the paper [34]). The purpose of the model is to explain the avalanche-like behavior observed in the segment when users, treated as nodes, pass from the stable to the unstable state, which is characteristic of the segment being at the edge of a phase transition. We analyze this specific type of transition because of its key role in the spread of information. Consequently, the model helps us understand the similarities in the behavior of the automaton and Twitter time series. Establishing this analogy is important because it enables us to use the automaton series as a test series for identifying the features of EWMs. These features can then be used to detect reliable early warning signals of Twitter self-organization to the edge of a phase transition from the analysis of the Twitter series.

Manna automata on the directed CLG are capable of self-organization into a bistable state (SOB state) corresponding to a first-order phase transition, in which periods of low and high activity in the system follow each other. The frequency of the switches can be regulated by the volume of pumping and by switching between the base and facilitated models: the more we pump and the more random the model becomes, the higher the frequency of switching between states.
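For orientation, a directed Chung-Lu graph can be sketched as below. This is a generic textbook construction, not necessarily the exact modification used in the study: each edge i -> j is drawn independently with probability min(1, w_out[i] * w_in[j] / S), where S is the sum of the expected out-degrees. The function name, the Pareto-distributed weights, the graph size of 200, and the exclusion of self-loops are all assumptions made for illustration.

```python
import numpy as np

def directed_chung_lu(w_out, w_in, rng):
    """Return out-neighbour lists of a directed Chung-Lu random graph.

    Edge i -> j appears independently with probability
    min(1, w_out[i] * w_in[j] / S), where S = sum(w_out).
    """
    w_out = np.asarray(w_out, dtype=float)
    w_in = np.asarray(w_in, dtype=float)
    prob = np.minimum(1.0, np.outer(w_out, w_in) / w_out.sum())
    np.fill_diagonal(prob, 0.0)              # no self-loops (assumption)
    adjacency = rng.random(prob.shape) < prob
    return [np.flatnonzero(row).tolist() for row in adjacency]

rng = np.random.default_rng(1)
# Hypothetical heavy-tailed expected degrees for 200 users.
weights = 1.0 + rng.pareto(2.5, size=200)
out_nbrs = directed_chung_lu(weights, weights, rng)
```

Here `out_nbrs[i]` lists the nodes that can receive grains (retweets) from node `i`.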

In addition to the local rule of the automaton and the graph, we tested three different rules for pumping information into the model:
(i) Discrete Uniform Distribution (DUD): on each iteration, we drop 1 grain into a random node of the model;
(ii) Exponential Distribution (EXD): on each iteration, we drop a random number of grains determined by the exponential distribution into the nodes of the model;
(iii) Pareto Distribution (PAD): on each iteration, we drop a random number of grains determined by the Pareto distribution with into the nodes of the model.
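The three pumping rules can be sketched as follows. Several details are assumptions, since the text does not fix them: the distribution parameters (`scale`, `alpha`), rounding the continuous draws up to an integer number of grains, and sending each grain to an independently chosen, uniformly random node.

```python
import numpy as np

def pump(z, rule, rng, scale=1.0, alpha=2.0):
    """Add grains to the height vector `z` in place; return how many were added."""
    if rule == "DUD":
        n = 1                                        # one grain per iteration
    elif rule == "EXD":
        n = int(np.ceil(rng.exponential(scale)))     # rounding up is an assumption
    elif rule == "PAD":
        # rng.pareto draws from the Lomax form; adding 1 gives a classical
        # Pareto variable with minimum 1. The shape `alpha` is hypothetical.
        n = int(np.ceil(rng.pareto(alpha) + 1.0))
    else:
        raise ValueError(f"unknown pumping rule: {rule}")
    targets = rng.integers(0, len(z), size=n)        # uniformly random nodes (assumed)
    np.add.at(z, targets, 1)                         # handles repeated targets correctly
    return n

rng = np.random.default_rng(2)
z = np.zeros(50, dtype=int)
added = [pump(z, rule, rng) for rule in ("DUD", "EXD", "PAD")]
```

`np.add.at` is used instead of `z[targets] += 1` so that a node drawn several times in one iteration receives all of its grains.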

For the toppling of the grains in the nodes, we use two rules:
(i) Standard Manna model, whose mathematical logic is described in formula (1). Its main idea is that the grains toppling from a node fall randomly onto the nodes connected to the toppling node. A node of the standard automaton is unstable when : where denotes the nearest neighboring site to the site .
(ii) Facilitated Manna model, whose mathematical logic is described in formula (2). Its main idea is to add one more layer of instability to the model: a node of the facilitated automaton is unstable when and when ( is the number of topples to node at the previous iteration):
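A minimal sketch of the standard rule on a directed graph is given below. It is not a transcription of formulas (1) and (2): the instability threshold of 2, the convention that an unstable node sheds all of its grains, the loss of grains at nodes without out-neighbours, and the `max_sweeps` guard are all assumptions, and the facilitated variant is not reproduced.

```python
import random

def relax(z, out_nbrs, rng, threshold=2, max_sweeps=10_000):
    """Topple in parallel sweeps until stable; return unstable-node counts per sweep."""
    counts = []
    for _ in range(max_sweeps):
        unstable = [i for i, height in enumerate(z) if height >= threshold]
        if not unstable:
            break
        counts.append(len(unstable))
        moved = []
        for i in unstable:
            grains, z[i] = z[i], 0
            # Grains at a node without out-neighbours leave the system
            # (assumed dissipation mechanism).
            if out_nbrs[i]:
                moved.extend(rng.choice(out_nbrs[i]) for _ in range(grains))
        for j in moved:                      # deliver grains after the whole sweep
            z[j] += 1
    return counts

z = [2, 0, 0]
chain = [[1], [2], []]                       # 0 -> 1 -> 2, node 2 dissipates
print(relax(z, chain, random.Random(0)))     # [1, 1, 1]
```

On the three-node chain the two grains move one node per sweep and leave the system at the last node, so each sweep has exactly one unstable node.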

In the context of the proposed model, the series of the number of unstable nodes in the automaton, indexed by the iteration number t, resembles the series of the number of users within a segment who initiate retweet chains. Consequently, the automaton series can serve as a test dataset for identifying the characteristics of EWM series behavior. This, in turn, allows us to analyze how the EWM series behaves when segments approach the edge of a phase transition.

2.2. Twitter Time Series

Access to the data, specifically Tweet IDs, is facilitated through Harvard Dataverse and can be accessed at [35]. Our analysis focused on tweets reflecting the responses of Twitter users to the first, second, and third debates among the candidates in the 2016 US presidential election. Adhering to policy restrictions that prevent the storage of any data beyond tweet IDs outside of Twitter, we utilized Twitter APIs and developer tokens to retrieve the data. By submitting requests to Twitter’s servers, we obtained additional information if the queried tweet still existed, and the associated account had not been deleted or blocked by the platform’s administration.

Subsequently, the acquired dataset underwent a cleaning process to eliminate extraneous information. Among other things, the dataset included numerous system notifications related to blocked accounts, which were deemed irrelevant for this study. After purging these notifications, we reorganized the data by specifically gathering tweets at the initiation of multiple retweet chains. This approach was chosen as each chain inherently contains information about the number of replies it received. Visualizing the comment chain as a tree, the initial level, or root, already provides information about the length of the entire branch.

We analyzed the following object fields:
(i) tweet[“id”]: ID of the tweet;
(ii) tweet[“user”][“screen_name”]: the name of the user on whose behalf the tweet was published;
(iii) tweet[“retweet_count”]: field that stores the number of retweets;
(iv) tweet[“created_at”]: creation time of the tweet;
(v) tweet[“text”]: text of the tweet.

Through the filtering of technical and unoriginal tweets, a total of 540,975 tweets related to the first debate, 713,341 tweets related to the second debate, and 448,064 tweets related to the third debate were gathered. For enhanced visualization of the graphs, each 10-second interval was amalgamated into a single point on the time series. This reduction in granularity resulted in a decrease in the number of data points from 60 to 6 per minute, simplifying the overall presentation.

We calculated the number of retweet nucleation sources by summing the “zero” tweets that share the same timestamp. In other words, the value of the series at a given time is the number of sources, or unique users, of the chains of retweets that originated at that time. The modeling mechanisms for the origin of a chain (sometimes an avalanche) of retweets are discussed in Subsection 3.1.
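The aggregation into 10-second points can be sketched with the object fields listed above. The function name and the three sample tweets are hypothetical, the input is assumed to contain only chain-originating (“zero”) tweets, and the `created_at` layout is assumed to follow the classic Twitter API form such as `Tue Sep 27 01:36:41 +0000 2016`.

```python
from collections import defaultdict
from datetime import datetime

TWITTER_TIME = "%a %b %d %H:%M:%S %z %Y"     # assumed created_at layout

def source_counts(tweets, bin_seconds=10):
    """Count unique chain-initiating users in each time bin."""
    users_per_bin = defaultdict(set)
    for tw in tweets:
        ts = datetime.strptime(tw["created_at"], TWITTER_TIME).timestamp()
        bin_start = int(ts // bin_seconds) * bin_seconds
        users_per_bin[bin_start].add(tw["user"]["screen_name"])
    return {t: len(users) for t, users in sorted(users_per_bin.items())}

sample = [
    {"created_at": "Tue Sep 27 01:36:41 +0000 2016", "user": {"screen_name": "a"}},
    {"created_at": "Tue Sep 27 01:36:45 +0000 2016", "user": {"screen_name": "b"}},
    {"created_at": "Tue Sep 27 01:36:52 +0000 2016", "user": {"screen_name": "c"}},
]
counts = source_counts(sample)
print(list(counts.values()))                 # [2, 1]
```

Using a set per bin means that a user who starts several chains within the same 10-second interval is counted once.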

2.3. Calculation Methods for Early Warning Measures and Their Effectiveness

We study the most effective EWMs for SCA (see the paper [28]), which are both directly related to their critical slowing down and not related to this phenomenon. Moreover, we study Welch’s PSD estimate performance for both a discrete-time Twitter segment model and real Twitter segments. The following is a fairly brief description of the computational methods. A detailed description of the methods is presented in the paper [28].

Variance, kurtosis, skewness, autocorrelation at lag-1, and the power-law scaling exponent of the PSD are window EWMs whose characteristic changes as the system approaches the critical point are explained by its critical slowing down (e.g., see the paper [36]). These changes are associated with a decrease in the recovery rate of the system as the critical point is approached. Therefore, a precursor signaling the self-organization of the system to the edge of a phase transition is a sharp increase in the values of the variance and the lag-1 autocorrelation, as well as a sharp increase followed by a sharp decline in the values of the kurtosis and skewness. Moreover, if the system is at the edge of a phase transition, then the lag-1 autocorrelation is close to 1 (e.g., see papers [1, 28]).
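The moment-based measures and the lag-1 autocorrelation for a single window can be sketched as below. The estimators are assumptions: population (biased) central moments, non-excess kurtosis, and the standard sample autocorrelation. The estimators used in the study may differ in normalization.

```python
import numpy as np

def slowing_down_measures(window):
    """Variance, skewness, kurtosis and lag-1 autocorrelation of one window."""
    w = np.asarray(window, dtype=float)
    d = w - w.mean()
    m2 = np.mean(d ** 2)
    if m2 == 0.0:                            # constant window: ratios are undefined
        return {"variance": 0.0, "skewness": np.nan,
                "kurtosis": np.nan, "ac1": np.nan}
    return {
        "variance": m2,
        "skewness": np.mean(d ** 3) / m2 ** 1.5,
        "kurtosis": np.mean(d ** 4) / m2 ** 2,   # non-excess kurtosis (assumed)
        "ac1": np.sum(d[:-1] * d[1:]) / np.sum(d ** 2),
    }

result = slowing_down_measures([1, 2, 3, 4, 5])
```

For the window `[1, 2, 3, 4, 5]` this gives a variance of 2, zero skewness, a kurtosis of 1.7, and a lag-1 autocorrelation of 0.4.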

The precursor of the system self-organization to the edge of a phase transition is also an increase in the values of the power-law scaling exponent series as the system approaches the critical point, which corresponds to an increase in spectral power at low frequencies. Moreover, if the system is on the edge of a phase transition, then the exponent lies in the interval from 1 to 2 (“flicker” effect) (e.g., see the papers [28, 36]). The exponent estimate is a statistical estimate of the slope of the power-law fit to the PSD on a double logarithmic scale. We used Welch’s PSD estimate (e.g., see the paper [37]) as an estimate of the distribution of power over frequency.
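The exponent estimate can be sketched as a straight-line fit to a Welch-type PSD on a double logarithmic scale. The block below is a compact NumPy version of Welch's method (Hann window, 50% overlap, per-segment mean removal); library routines such as `scipy.signal.welch` provide the same estimate. The segment length of 256 and the restriction of the fit to frequencies up to `f_max = 0.1` cycles per sample are assumptions made for illustration.

```python
import numpy as np

def welch_psd(x, nperseg=256):
    """Averaged periodogram over Hann-windowed segments with 50% overlap."""
    x = np.asarray(x, dtype=float)
    if len(x) < nperseg:
        raise ValueError("series is shorter than one segment")
    window = np.hanning(nperseg)
    norm = np.sum(window ** 2)
    spectra = []
    for start in range(0, len(x) - nperseg + 1, nperseg // 2):
        seg = x[start:start + nperseg]
        seg = (seg - seg.mean()) * window
        spectra.append(np.abs(np.fft.rfft(seg)) ** 2 / norm)
    # A constant one-sided scaling factor is omitted; it does not affect the slope.
    return np.fft.rfftfreq(nperseg, d=1.0), np.mean(spectra, axis=0)

def psd_exponent(x, nperseg=256, f_max=0.1):
    """Estimate beta in S(f) ~ f**(-beta) from the low-frequency part of the PSD."""
    f, s = welch_psd(x, nperseg)
    keep = (f > 0) & (f <= f_max) & (s > 0)
    slope, _ = np.polyfit(np.log(f[keep]), np.log(s[keep]), 1)
    return -slope

rng = np.random.default_rng(0)
noise = rng.standard_normal(2 ** 14)
beta_white = psd_exponent(noise)             # white noise: exponent near 0
beta_walk = psd_exponent(np.cumsum(noise))   # random walk: exponent near 2
```

The two synthetic series bracket the range of interest: a flat spectrum for white noise and a steep one for a random walk.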

In addition to EWMs for the critical slowing down, we investigated the behavior and effectiveness of measures based on the reconstruction of the phase space, that is, the attractor, from one-dimensional realizations of the automaton and Twitter series by the time delay method (e.g., see the paper [38]). Unlike critical slowing down EWMs, which have already been investigated for early detection of critical transitions in Twitter (e.g., see the paper [23]), reconstruction measures are investigated here for the first time. Another motivation for investigating reconstruction-based EWMs is that they are the most efficient EWMs for SCA (see the paper [28]).

We computed the time delay using the average mutual information algorithm (see the paper [38]) and then, given this delay, the embedding dimension using the false nearest neighbor algorithm (see the paper [39]). We then used the resulting embedding dimension to estimate the correlation dimension and the approximation entropy (AppEn) as EWMs unrelated to critical slowing down. The correlation dimension is a quantitative attribute of the attractor that encapsulates information about the degree of complexity of the behavior of the dynamical system (e.g., see the paper [40]). The reconstructed attractor has a fractal geometry if the correlation dimension takes fractional positive values. The approximation entropy is a measure of the regularity of the series (e.g., see the paper [38]). The algorithms used for estimating the correlation dimension and AppEn are described in [39].
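Time-delay reconstruction and a correlation-dimension estimate can be sketched as below. The estimate follows the Grassberger-Procaccia correlation sum, which is a common choice but is an assumption here. The delay and the embedding dimension are passed in directly rather than computed by average mutual information and false nearest neighbors, and the radii and the sine test signal are hypothetical.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Rows are delay vectors (x[t], x[t + tau], ..., x[t + (dim - 1) * tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series is too short for this embedding")
    return np.column_stack([x[i * tau:i * tau + n] for i in range(dim)])

def correlation_dimension(points, radii):
    """Slope of log C(r) versus log r, where C(r) is the fraction of pairs closer than r."""
    radii = np.asarray(radii, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    pair_dist = dist[np.triu_indices(len(points), k=1)]
    c = np.array([np.mean(pair_dist < r) for r in radii])
    ok = c > 0
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(c[ok]), 1)
    return slope

t = np.arange(800)
signal = np.sin(0.1 * t)                     # periodic test signal
cloud = delay_embed(signal, dim=2, tau=16)   # traces a closed curve
d2 = correlation_dimension(cloud, np.logspace(-1.0, -0.3, 8))
```

For the sine wave the reconstructed attractor is a closed curve, so the estimate should be close to 1; fractional values would indicate a fractal geometry.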

We determine the efficiency of EWMs by the number of false precursors (), following our proposed approach, a detailed description of which is presented in the paper [28]. In this approach, the efficiency of the measure is determined by the number of members of the constant sign sequence , consisting of the values of the series and belonging to the segment . Here, is the normalizing multiplier, is the mean square deviation of the values of the series , belonging to the initial window (for series generated by Twitter compartmental model) and (for a series generated by Twitter).

For example, if, starting from some , ], then the precursor is true. Otherwise, e.g., when , the precursor is false. A measure is more efficient than a measure , if .

3. Results and Their Discussion

In Section 3, we present and analyze the results obtained from computing EWMs for the self-organization of cellular automata and Twitter to a critical state. We examine the behavior of the calculated EWMs as the systems approach the critical point, covering both the measures associated with critical slowing down and those unrelated to this phenomenon (see Subsection 3.2). This consideration is significant because, given an appropriate selection of local rules, topological graph structures, and pumping conditions, automata serve as apt models for real-world systems, particularly Twitter segments (see the discrete-time Twitter segment model presented in Subsection 3.1).

Therefore, the dynamic series generated by such automata can be used as series for testing various EWMs before using them for early detection of critical transitions in Twitter segments. We then introduce the notion of an effective EWM, which we use to investigate the effectiveness of the investigated EWMs (see Subsection 3.3).

3.1. Discrete-Time Twitter Segment Model

This subsection presents the discrete-time model of self-organization of the Twitter segment to the edge of a phase transition. This model is based on the analogy between the functioning of elements, and of the links between elements, in the SCA and in the Twitter segment, that is, the analogy of the structures of these systems in the context of systems theory [34]. The model allows us to explain the avalanche-like dynamics of the segment, characteristic of its being on the edge of a phase transition, and, accordingly, to explain the analogy in the behavior of the automaton and Twitter time series.

The importance of establishing such an analogy is directly related to the possibility of using the automaton series as a test series to establish the features of EWM behavior and of using these features to determine reliable early warning signals for Twitter self-organization to the edge of a phase transition from the analysis of the Twitter series.

We use the term “segment” to refer to a set of network users connected by a discussion of some topic or event, such as the debates of the 2016 United States presidential election. Figure 1 demonstrates the formation of retweet chains in a segment, starting with the pumping of tweets to a segment of the network (the tweet is shown by a red lightning bolt in Figure 1(a)) and ending with the complete relaxation of the segment (see Figure 1(d)).

The described procedure aligns with a particular iteration of the self-organization process of the automaton. In the graph, nodes represent users within the segment and the edges signify interactions between these users. An edge connecting two nodes (segment users) indicates that one of the segment users is a subscriber to the other, allowing for the potential transmission or acceptance of retweets along that edge. The local propagation of retweets is visually represented by the red dashed arrow in Figure 1. The topological structure of the graph of interactions between segment users, as shown by the authors of the paper [32], corresponds to the structure of the CLG with acceptable accuracy.

Every network user can exist in either an active state, represented by the red nodes in Figure 1, indicating their readiness to send retweets to subscribers, or a passive state, denoted by the green nodes in Figure 1, indicating that they are currently unwilling or unable to send retweets. Active users are akin to unstable nodes, while passive users are akin to stable nodes within the automaton. The act of sending retweets to subscribers corresponds to the destabilization or crumbling of an unstable node in the automaton, as detailed in Subsection 2.1.

Let us examine a potential scenario concerning the origin and propagation of retweets in a segment. Assume that a user in a passive state receives a tweet, leading to a switch to an active state (see Figure 1(a)). This event, termed as “pumping,” initiates a series of retweets within the segment, triggered by this user. Subsequently, this user shares retweets with their followers, and let us posit that this retweet propagation results in their transition to an active state (see Figure 1(b)). The process of transitioning between passive and active states persists (see Figure 1(c)) until the entire segment, initially comprising only passive users, achieves a state of complete relaxation (see Figure 1(d)).

It should be pointed out that not every user of a segment that receives a retweet can switch to an active state (see Figures 1(c) and 1(d)), e.g., due to lack of interest in the retweet. In addition, we do not account for other ways of distributing microposts, such as through recommendations, in the model. The model only has incoming tweets and the reaction of segment users to them in the form of retweets sent.

The subsequent iteration begins with the dissemination of tweets to specific users within the segment, which sets off chains of retweets within that segment. Starting at some iteration, the segment self-organizes to the edge of a phase transition characterized by an avalanche-like spread of retweets.

Thus, within the framework of the proposed model, the series of the number of unstable nodes of the automaton, indexed by the iteration number, is analogous to the series of the number of users of a segment who initiate retweet chains in it. Consequently, the automaton series can be used as a test series to establish the features of EWM series behavior and, accordingly, to study the behavior of EWM series for segments as they approach the edge of a phase transition.

3.2. Behavior of Early Warning Measures

From the computed EWMs, we found that the qualitative behavior of a number of EWM series as the automaton approaches the critical point does not depend on the type of self-organization (SOB/SOC) or on the pumping conditions. The observed differences are purely quantitative and matter only for ranking EWMs by their efficacy. Consequently, without loss of generality, we confine our discussion to the series of measures obtained for a sandpile cellular automaton self-organizing into the SOB state with pumping drawn from a discrete uniform distribution.

3.2.1. Sandpile Cellular Automata

Figure 2 illustrates the series of EWMs whose behavior aligns with a rigorous theoretical justification within the framework of critical slowing down.

The series of the number of unstable nodes, for which the EWM series were calculated, is depicted in Figure 2(a). The critical slowing down of the automaton corresponds to an extension of its relaxation time, which increases the number of unstable nodes. Consequently, this leads to an increase in the variance and a sharp increase in the asymmetry and kurtosis of the distribution of the values of the series as the right boundary of the sliding window approaches the critical point (see Figures 2(b)–2(d)). The rationale behind this behavior is explored in [17, 18].

Critical slowing down is also marked by an increase in “memory,” manifested as a growth in the autocorrelation at lag-1. As the critical state is approached, the autocorrelation converges to a value near 1 and maintains this level in the SOB state (see Figure 2(e)). This implies that the stochastic dynamics of unstable nodes in the previous, (t − 1)th, iteration significantly influences the number of unstable nodes in the current, tth, iteration. The autocorrelation pattern is theoretically described and substantiated (see papers [17, 18]) and serves as an early warning signal for the self-organization of the automaton toward the edge of a phase transition.

Critical slowing down of the automata is also accompanied by an increase in the power-law scaling exponent of the PSD (see Figure 2(f)). The well-known “flicker” effect is a precursor of the critical transition (see the paper [40]) as well as a sign of the automaton being on the edge of the phase transition (see papers [3, 4]). Hence, this exponent is an EWM for critical transitions in the automaton.

Let us now consider the behavior of EWMs that have not yet been theoretically justified in the context of critical slowing down, namely the EWMs of the phase space reconstructed (see Sequence (2)) from the dynamical series by the time delay method. The correlation dimension of the reconstructed attractor increases sharply as the right boundary of the sliding window approaches the critical point, taking fractional values greater than zero in the neighborhood of this point (see Figure 2(g)). This behavior indicates the increasing complexity of the structure of the reconstructed attractor and the increasing degree of chaotic complexity of the series. The geometry of the reconstructed attractor is fractal if the correlation dimension takes fractional positive values.

The approximation entropy, AppEn, decreases sharply as the right boundary of the window approaches the critical point (see Figure 2(h)). Consequently, in the neighborhood of this point, there is a sharp decrease in the uncertainty (irregularity and unpredictability) of the behavior of the series, i.e., the number of repeated patterns in such a series increases sharply. Thus, a sharp change in the behavior of the correlation dimension and AppEn in the neighborhood of the critical point is an early warning signal for the automaton self-organization to the edge of a phase transition.
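The link between repeated patterns and low entropy can be illustrated with a short sketch of the approximate entropy statistic in its commonly used form (template length m, tolerance r, Chebyshev distance, self-matches included). The defaults `m=2` and `r = 0.2 * std` are conventional choices and are assumptions here, not the settings of the study.

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """Approximate entropy: phi(m) - phi(m + 1) with tolerance r."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()                    # conventional tolerance (assumed)

    def phi(k):
        templates = np.array([x[i:i + k] for i in range(len(x) - k + 1)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Self-matches are included, so every fraction is strictly positive.
        return np.mean(np.log(np.mean(dist <= r, axis=1)))

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(3)
apen_regular = approximate_entropy(np.tile([0.0, 1.0], 100))
apen_random = approximate_entropy(rng.standard_normal(300))
```

The strictly alternating series is built entirely from repeated patterns and yields a value near zero, whereas the random series yields a clearly larger value.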

Let us turn to the behavior of EWMs for Twitter segments, which is presented in Figures 3–5. In these figures, the vertical red dashed line marks the critical point, the time interval in which the debate took place is shown in the gray-filled region and the time interval of being at the edge of the phase transition is shown in the gray-filled region.

3.2.2. First Debate

Figure 3 shows the series of EWMs calculated for the number of users of the Twitter segment who initiate chains of retweets meaningfully related to the first debate (see Figure 3(a)).

The Twitter segment self-organizes to the edge of the phase transition and stays in this critical state from 01:36:40 to 05:21:30 on September 27. Indeed, in this time interval the autocorrelation at lag-1 is close to 1 (see Figure 3(e)) and the power-law scaling exponent of the PSD lies between 1 and 2 (see Figure 3(f)), which is characteristic of systems at the edge of a phase transition. As the critical point, corresponding to 01:36:40 on September 27, is approached, the Twitter segment critically slows down: the variance, skewness, and kurtosis of the distribution of the series (see Figures 3(b)–3(d)) increase sharply in the left neighborhood of this point.

The approach to the critical point is also indicated by the behavior of EWMs that are not associated with critical slowing down. Thus, in the left neighborhood of the critical point, the correlation dimension increases sharply, taking fractional values between 0 and 1 (see Figure 3(g)). At the same time, AppEn decreases sharply in this neighborhood (see Figure 3(h)). Thus, the behavior of EWMs presented in Figure 3 is an early warning signal for the Twitter segment self-organization to the edge of a phase transition.

Another result that is, in our opinion, interesting is that the Twitter segment enters the critical state 36 minutes and 40 seconds after the beginning of the first debate and relaxes from the critical state 2 hours 51 minutes and 30 seconds after the end of the debate. The debate took place from 01:00:00 to 02:30:00 on September 27.
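The offsets implied by the timestamps quoted for the first debate can be checked with a few lines of date arithmetic. The variable names are hypothetical, and the year 2016 is taken from the election year; since all four timestamps are on the same clock, the time zone does not affect the differences.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
debate_start = datetime.strptime("2016-09-27 01:00:00", FMT)
debate_end = datetime.strptime("2016-09-27 02:30:00", FMT)
critical_start = datetime.strptime("2016-09-27 01:36:40", FMT)
critical_end = datetime.strptime("2016-09-27 05:21:30", FMT)

entry_lag = critical_start - debate_start    # delay before the critical state begins
exit_lag = critical_end - debate_end         # persistence after the debate ends
print(entry_lag, exit_lag)                   # 0:36:40 2:51:30
```

The same check applies to the second and third debates once their critical intervals are substituted.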

3.2.3. Second Debate

The Twitter segment consisting of users involved in second debate communications also self-organizes to the edge of a phase transition, at 01:31:20 on October 10 (this point in time corresponds to the critical point). This is confirmed by the behavior of the EWMs (presented in Figure 4), which is characteristic of the critical slowing down of the Twitter segment as it approaches the critical point. In the left neighborhood of this point, the variance, skewness, and kurtosis of the distribution of the series (see Figures 4(b)–4(d)) increase sharply.

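Moreover, although this behavior is observed near the critical point, only in the interval that starts at the critical point and ends at 05:04:10 on October 10 is the autocorrelation at lag-1 close to 1 (see Figure 4(e)) and the inequality for the power-law scaling exponent satisfied (see Figure 4(f)).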

Also, in the left neighborhood of the critical point, the correlation dimension increases sharply, taking fractional values from 0 to 1 (see Figure 4(g)), and AppEn decreases sharply in this neighborhood (see Figure 4(h)). The Twitter segment stays at the edge of the phase transition from 01:31:20 to 05:04:10 on October 10. This time interval is longer than the time interval of the second debate, which started at 01:00:00 and ended at 02:30:00 on October 10 (see Figure 4(a)).

Unlike the first debate, the Twitter segment corresponding to the second debate possibly self-organizes to the edge of a phase transition not only at 01:31:20 on October 10. Figure 4 also shows four additional time intervals that possibly correspond to the Twitter segment staying at the edge of a phase transition. At least, this is indicated by the behavior of the EWMs in the left neighborhoods of the four critical points and by the fact that the inequality for the power-law scaling exponent is satisfied. Nevertheless, the autocorrelation at lag-1 takes values only from 0.7 to 0.8 in all four intervals. We do not exclude that interval statistical estimates of the measures would yield reliable statements about whether the segment stays at the edge in these four intervals, but obtaining such estimates is beyond the scope of the present study.

3.2.4. Third Debate

Figure 5 shows the series of the EWMs corresponding to the retweet activity associated with the third debate (see the series in Figure 5(a)). There is a sharp increase in the measures shown in Figure 5(b), Figure 5(c), Figure 5(d), and Figure 5(g), as well as a sharp decrease in AppEn (see Figure 5(h)), as the segment approaches the critical point (01:32:20 on October 20).

In addition, the conditions shown in Figure 5(e) and Figure 5(f) hold in the time interval from 01:32:20 on October 20 to 05:15:50 on October 20. Hence, the Twitter segment self-organizes to the edge of the phase transition at the critical point and stays in the critical state throughout this time interval. Moreover, as in the previous cases, the third debate (from 01:00:00 on October 20 to 02:30:00 on October 20) is shorter than the Twitter segment's stay in the critical state. Apparently, as in the previous case, the EWMs are not efficient enough (see Subsection 3.3).

Consequently, it is not always the case that avalanche-like behavior of online social media is only observed during the discussion period of high-profile events, such as the debate period.

3.3. Effectiveness of Early Warning Measures

The efficiency parameters of all investigated EWMs for output automata and Twitter segments at the edge of the phase transition are presented in Figure 6. To identify an automaton, we use the abbreviations for the pumping equations introduced in Subsection 2.1. For example, SOB-PAD stands for a sandpile cellular automaton with the Manna rule on the Chung-Lu random graph, with information pumping via the Pareto distribution that allows it to switch to the SOB state.

Let us consider the effects of the type of self-organization (SOB/SOC) and the pumping distribution (DUD/EXD/PAD) on the efficiency of the EWMs. For two automata with the same pumping, the efficiency of an EWM does not depend on the type of self-organization of the automata, since the efficiency parameter takes the same values for the same pumping distribution.

An increase in the degree of pumping stochasticity, determined by an increase in the mean and variance of the distribution for pumping, leads to a decrease in the efficiency of the EWMs of all automata regardless of the type of self-organization of the automata. Measures and have the smallest , and hence, they are the most efficient EWMs regardless of the type of self-organization and distribution law for pumping automata.

Also, measures and are the most effective EWMs () for the Twitter segments (first and third debates) self-organization to the edge of a phase transition. Measures and are the most effective EWMs () for the early detection of the Twitter segment, corresponding to the second debates.

Thus, only the behavior of the EWMs based on variance and correlation dimension estimates in the left neighborhood of the critical point is an effective precursor of the Twitter segments' self-organization to the edge of a phase transition, provided that the autocorrelation at lag-1 and the power-law scaling exponent of the PSD approach a value equal to 1.

4. Conclusions

The discrete-time Twitter segment model, based on sandpile cellular automata with Manna rule on the Chung-Lu graph, is a model system that generates series of the number of unstable nodes of the automaton. Such series are analogous to the series of the number of users of a segment initiating chains of retweets in it and hence are test series for establishing the features of EWM series behavior as the system (automaton and Twitter segment) approaches the edge of a phase transition.
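The way such a model system produces a series of the number of unstable nodes can be illustrated with a minimal sketch. It is not the model of this study: the function names, the threshold of 2 grains, the one-grain-per-step pumping (instead of the DUD/EXD/PAD distributions), and the `loss` probability that dissipates grains are all illustrative assumptions.

```python
import random

def chung_lu_graph(expected_degrees, rng):
    """Adjacency lists of a Chung-Lu random graph without self-loops.

    Nodes i and j are linked with probability min(1, w_i * w_j / sum(w)).
    """
    n = len(expected_degrees)
    total = sum(expected_degrees)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, expected_degrees[i] * expected_degrees[j] / total)
            if rng.random() < p:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

def unstable_node_series(neighbors, steps, rng, threshold=2, loss=0.05):
    """Number of unstable nodes at each discrete time step.

    Assumed dynamics (a Manna-type stochastic rule): each step adds one
    grain to a random node, then every node holding at least `threshold`
    grains sends `threshold` grains to independently chosen random
    neighbors. Each moved grain is lost with probability `loss`.
    """
    grains = [0] * len(neighbors)
    series = []
    for _ in range(steps):
        # Pumping: one grain per step (an assumption of this sketch).
        grains[rng.randrange(len(grains))] += 1
        unstable = [v for v, g in enumerate(grains) if g >= threshold]
        series.append(len(unstable))
        # Parallel update: collect all moved grains, then deposit them.
        moved = []
        for v in unstable:
            grains[v] -= threshold
            for _ in range(threshold):
                # Grains of isolated nodes are dissipated.
                if neighbors[v] and rng.random() >= loss:
                    moved.append(rng.choice(neighbors[v]))
        for u in moved:
            grains[u] += 1
    return series
```

For example, `unstable_node_series(chung_lu_graph([3.0] * 30, random.Random(1)), 200, random.Random(2))` returns a series of 200 counts that can be passed to any EWM estimator.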

Such early warning signals can be used in a real-time early warning system for self-organization to the edge of a phase transition in a real-world system if its structure is similar to that of sandpile cellular automata, i.e., if the systems are isomorphic in the sense of systems theory. Examples of such systems are segments of a stock exchange (e.g., see the paper [41]), epidemiological networks (e.g., see the papers [5, 42]), continuous media systems (e.g., see the paper [43]), and complex networks (e.g., see the paper [44]). The known models of information dynamics in Twitter (e.g., see the papers [45, 46]) do not explain the self-organization of a network to the edge of a phase transition and therefore cannot be used as models for testing EWMs.

The sharp increase in the variance and correlation dimension, as well as the proximity of the autocorrelation at lag-1 to 1 and the power-law scaling exponent of the power spectral density (PSD) lying in the interval from 1 to 2, can be used as effective (characterized by the smallest number of false signals) early warning signals for Twitter self-organization to the edge of a phase transition. Such behavior of the autocorrelation at lag-1 and of the power-law scaling exponent of the PSD in the neighborhood of the edge of a phase transition is reported in the papers [4, 47, 48], but the effectiveness of EWMs, both those directly related to critical slowing down and those unrelated to this phenomenon, had not been determined or investigated.
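A minimal sketch of how two of these measures, the variance and the autocorrelation at lag-1, could be tracked in a sliding window is given below. The function names, the use of the population variance, and the window handling are illustrative assumptions rather than the estimators used in this study.

```python
def variance(window):
    """Population variance of the values in one window."""
    n = len(window)
    mean = sum(window) / n
    return sum((x - mean) ** 2 for x in window) / n

def lag1_autocorrelation(window):
    """Autocorrelation at lag-1 of the values in one window."""
    n = len(window)
    mean = sum(window) / n
    dev = [x - mean for x in window]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant window: autocorrelation is undefined, report 0
    return sum(dev[i] * dev[i + 1] for i in range(n - 1)) / denom

def rolling(series, width, measure):
    """Apply `measure` to every window of `width` consecutive values.

    The first estimate becomes available only after `width` observations.
    """
    return [measure(series[i - width:i]) for i in range(width, len(series) + 1)]
```

A sharp rise in `rolling(series, width, variance)` together with `rolling(series, width, lag1_autocorrelation)` approaching 1 would then be read as a candidate early warning signal.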

The sharp increases of the EWMs based on estimates of variance and correlation dimension are effective early warning signals for the Twitter segments' self-organization to the edge of a phase transition. At the same time, the effectiveness of such measures does not depend significantly on the pumping features of sandpile cellular automata as Twitter segment models.
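For orientation, a correlation dimension estimate on a phase space reconstructed by the Takens method can be sketched as follows. This is a simplified Grassberger-Procaccia-style estimate from two radii only. The function names, the Euclidean norm, and the choice of radii are assumptions of this sketch, and `math.dist` requires Python 3.8 or later.

```python
import math

def delay_embed(series, dim, lag):
    """Takens delay embedding: points (x_i, x_{i+lag}, ..., x_{i+(dim-1)*lag})."""
    n_points = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n_points)]

def correlation_sum(points, radius):
    """Fraction of distinct point pairs closer than `radius`."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < radius
    )
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) against log r between two radii.

    Both radii must be large enough for the correlation sum to be nonzero.
    """
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )
```

The pairwise distance count grows quadratically with the number of points, which is one reason the estimate is sensitive to the sample size.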

The Twitter segments self-organize to the edge of a phase transition some time after the start of a debate and stay at the edge longer than the debate lasts. The self-organization of segments to the edge of a phase transition was established by the authors of the paper [49]. During the second debate, the Twitter segment self-organizes to the edge five times. The stay of the segments at the edge is characterized by an avalanche-like propagation of retweets. Given that self-organization to the edge of a phase transition occurs later than the start of the debate, the start time of the debate is not an early warning time.

To conclude this section, we will point out the main limitations of the proposed approach in the context of early warning systems.

Limitations arise from the chosen rule, topological graph structure, and pumping distribution of the sandpile cellular automata. In the context of early warning systems, the main limitation of our study is the use of an initial window of a fixed width. If the critical point occurs before the initial window is filled, a window of that width does not allow the critical point to be identified, and a narrower window is needed. Reducing the width of the window is not always acceptable because the quality of estimation of some measures, such as the correlation dimension, is very sensitive to the sample size.

Reducing the initial window width without reducing the efficiency of EWMs is possible if some EWM can be independently estimated by several methods. For example, independent estimates for the power-law scaling exponent of the PSD are Welch’s estimate, wavelet leader estimate, and wavelet transform modulus maxima estimate. Welch’s estimate is less sensitive to changes in the width of the initial window but is sensitive to changes in the degree of inhomogeneity. This is another limitation of our approach, which is related to the choice of the estimation method of EWMs for their use in early warning systems.
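As a point of reference, the power-law scaling exponent can be estimated from a log-log fit to a spectrum. The sketch below uses a plain periodogram computed by a direct discrete Fourier transform, not Welch's estimate or either wavelet-based estimate. The function names and the `max_bin` cutoff are assumptions of this sketch.

```python
import math

def periodogram(series, max_bin):
    """Periodogram at Fourier bins 1..max_bin (requires max_bin < len(series) / 2)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    powers = []
    for k in range(1, max_bin + 1):
        re = sum(c * math.cos(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * t / n) for t, c in enumerate(centered))
        powers.append((re * re + im * im) / n)
    return powers

def psd_exponent(series, max_bin):
    """Exponent beta in S(f) ~ f**(-beta), from a least-squares log-log fit.

    Every bin used in the fit must carry nonzero power.
    """
    powers = periodogram(series, max_bin)
    xs = [math.log(k) for k in range(1, max_bin + 1)]
    ys = [math.log(p) for p in powers]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return -slope
```

The direct transform costs on the order of `len(series) * max_bin` operations, so it suits short windows only. An averaged estimate such as Welch's reduces the scatter of the fitted slope.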

The proposed approach to determining the effectiveness of EWMs is also limited to the search for early warning signals for Twitter self-organization to the edge of a phase transition that are based on abrupt changes (sharp increases or decreases) of EWMs as the network approaches the edge. For example, the approach is of little use for identifying early warning signals in the behavior of point statistical estimates of the autocorrelation and the power-law scaling exponent, for which convergence to values close to unity is what primarily matters. In these cases, the use of interval statistical estimates would eliminate this limitation.

Finally, a limitation is the need to use high-frequency series for the estimation of EWMs, i.e., series whose time step is much smaller than the time the network spends at the edge of a phase transition. The use of such series for the early warning of self-organization of systems to the edge of a phase transition is necessary if the systems are characterized by a relatively short time of being in a subcritical phase. For example, for the Twitter segments, this time is less than one hour, and we were able to find the precursors of self-organization because the EWM estimates were obtained for the series of retweet activity with a step of 10 seconds.

In conclusion, we observe that the chosen rule, topological graph structure, and distribution for pumping sandpile cellular automata have enabled the investigation of the efficiency of EWMs in the context of bifurcation-induced tipping. However, it is noteworthy that this form of tipping does not confine the exploration of measure effectiveness. Through a thoughtful selection of local rules and pumping configurations, it is possible to observe noise-induced and rate-induced tipping in sandpile cellular automata (e.g., see the paper [19]). Therefore, the prospect of our further research is to use directed sandpile cellular automata with Pastor-Satorras-Vespignani rules with stochastic node fluctuations as the most adequate hierarchical model of retweet propagation in the network.

Data Availability

Previously reported Tweet IDs data used to support this study are available at https://doi.org/10.7910/DVN/PDI7IN. This prior study (and data set) is cited at relevant places within the text as [35]. The raw model data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Disclosure

The work is an output of a research project implemented as part of the Basic Research Program at the National Research University Higher School of Economics (HSE University).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.