Analysis and Applications of Complex Social Networks 2018 (Special Issue)
Research Article | Open Access
Complexity of a Microblogging Social Network in the Framework of Modern Nonlinear Science
Recent developments in nonlinear science have caused the formation of a new paradigm called the paradigm of complexity. The self-organized criticality theory constitutes the foundation of this paradigm. To estimate the complexity of a microblogging social network, we used one of the conceptual schemes of the paradigm, namely, the system of key signs of complexity of the external manifestations of the system irrespective of its internal structure. Our research revealed all the key signs of complexity of the time series of a number of microposts. We offer a new model of a microblogging social network as a nonlinear random dynamical system with additive noise in three-dimensional phase space. Implementations of this model in the adiabatic approximation possess all the key signs of complexity, making the model a reasonable evolutionary model for a microblogging social network. The use of adiabatic approximation allows us to model a microblogging social network as a nonlinear random dynamical system with multiplicative noise with the power-law in one-dimensional phase space.
Social networks have been studied longer than any other type of network. It is remarkable that one of the signs of network complexity, a power law in the distribution of node degrees, was first empirically formulated by D. Price in 1965 for social networks. In 1999, A. L. Barabasi, a physicist from the University of Notre Dame (USA), and his graduate student R. Albert determined [2, 3] that, for many networks, the distribution of node degree (i.e., the number of connections a node has to other nodes) approximately follows a power law, as all critical states do, rather than the expected Poisson distribution. In many real networks, a small number of nodes have a large number of connections, whereas a large number of nodes have just a few connections. Such networks are called scale-free networks. This name was not invented specifically for this type of network. It came from the theory of critical phenomena, where fluctuations in critical states also follow a power law. The theory of scale-free networks is considered to be one of the scenarios that complex systems follow when they come into a critical state. Lately, such networks are more often called complex networks.
An extensive body of research on the modeling of the structure and functioning of social networks is available today. This research has two directions. The first direction relates to the analysis of the social networks data (see one of the latest reviews ), while the second concerns the development of models of the structure, dynamics, and evolution of social networks. The distinction between these two directions is somewhat arbitrary, since in most cases these directions overlap (see, e.g., [10, 11]).
Starting from the second half of the 20th century, the ideas and methods of physics have tended to infiltrate the natural sciences and the traditional humanities. Methods of physical modeling are often used in such areas of science as demographics, sociology, and linguistics. As a result, sociophysical models of social networks were developed, such as the Ising model [12–15], the Bose-Einstein condensate model [3, 16], the quantum walk model, and ground state and community detection [18, 19], among others.
Despite the variety of sociophysical models, the results and theories of nonlinear science are, with some exceptions (see, e.g., [20, 21]), not used to model the evolution of social networks. This applies first of all to complexity and to the theory of self-organized criticality, which describes the mechanism of complexity [22–24]. Mechanisms of self-organized criticality in the social knowledge creation process are presented in the paper . It is noteworthy that the key sign of complexity of a system regardless of its internal structure, i.e., one based solely on its external characteristics, was formulated in the framework of this theory. According to this theory, a system is considered to be complex if it is able to generate unexpected and/or extraordinary events (for instance, bursts of values in time series). This motivated our research. The purpose of the research is a nonlinear dynamical interpretation of the complexity of a microblogging network and the development of an appropriate network model that could explain its complexity using the third paradigm of nonlinear science, called the complexity paradigm. Another motivation for the research was the results presented in [26–31], where the time series of the number of microposts are characterized by the majority of the key signs of system complexity (a detailed description of these key signs is presented in Section 2).
This paper is organized as follows. Section 2 deals with the key signs of the system complexity according to the complexity paradigm. Section 3 presents the results of the analysis of an empiric time series of a number of microposts, including the results of the calculation of the key signs of the complexity. Section 4 presents a model of a microblogging social network as a nonlinear deterministic dynamical system including its capabilities and restrictions. Section 5 presents a generalized model of a microblogging social network, modified by the consideration of stochastic sources and a decrease in the order parameter, as well as the results of an analysis in the adiabatic approximation. Section 6 contains the main results of the research and a discussion.
2. Nonlinear Dynamical Interpretation of Complexity
The development of any branch of science leads to the formulation of paradigms, namely, initial conceptual schemes, models of problem statements, and solutions of the problems. At this time, three paradigms have been developed in nonlinear science. The first paradigm is that of self-organization. The second is the paradigm of deterministic chaos. The most recent development of nonlinear science is closely linked to the third paradigm, which could be defined as a paradigm of complexity that has the theory of self-organized criticality as its foundation. The paradigm of complexity lies at the junction of the first two paradigms. If the first two paradigms deal with order and chaos, respectively, the third is usually described as “life on the edge of chaos” .
Since it is impossible to rigorously define complexity, our research is limited to consideration of the key signs of system complexity defined in the publications by Per Bak and co-authors [22–24], and their application to the interpretation of the complexity of microblogging social networks. As stated in the introduction, first of all, we consider the complexity of external system manifestations regardless of internal structure. For the purposes of this research, we define “external system manifestations” as signals (the time series of a number of microposts) of a microblogging social network generated as a result of nontrivial interactions within a very large pool of users.
One of the key signs of complexity is a system's propensity to produce catastrophic events that are unexpected (i.e., nonpredictable), extraordinary (i.e., prominent among similar events), or both. Importantly, in either case we can conclude that the system that has generated such an event is complex, whereas from simple systems we would expect predictable and similar behavior. For the signals of a microblogging system, such events qualitatively correspond to considerable bursts seen on a plot of the increments of the time series of the number of microposts. One quantitative criterion for the existence of catastrophic events is a power law in the probability density function (PDF) of the values of the time series. It is worth mentioning that, in the majority of cases, the occurrence of such events at the level of the network signal corresponds to a qualitative restructuring of the system, i.e., a transition from a polycentric state to a monocentric state and vice versa (such transitions are thoroughly described in ).
Another key sign of complexity is scale invariance, meaning that events or objects lack their own characteristic dimensions, durations, energies, etc. At the level of external manifestations of a microblogging network, scale invariance means that the time series of a number of microposts are fractal or multifractal time series (such time series are described in detail in ).
In the general case, a power law for the PDF is a statistical expression of the scale invariance of the time series:

$$p(x) \propto x^{-\alpha}, \tag{1}$$

where $\alpha$ is a positive exponent. PDF (1) belongs to the class of fat-tailed PDFs. In the statistical description of catastrophic events, PDF (1) is the rule, with almost no exceptions. PDF (1) differs from compact distributions (for example, the Gaussian distribution) because the events corresponding to the tail of the distribution are not rare enough to be neglected. PDF (1) reflects a strong interdependence of the events. For example, such a distribution may be caused by an avalanche-like increase in the number of microposts in the network as a result of a “chain reaction” caused by reposting.
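As an illustration of how the exponent of a power-law tail can be estimated from data, the sketch below uses the maximum-likelihood (Hill) estimator on synthetic values. This is not the procedure used in this paper; the function names, the cutoff `x_min`, and the synthetic sample are assumptions introduced only for the example.

```python
import math
import random


def hill_tail_exponent(values, x_min):
    """Maximum-likelihood estimate of alpha for p(x) ~ x**(-alpha), x >= x_min."""
    tail = [v for v in values if v >= x_min]
    if len(tail) < 2:
        raise ValueError("need at least two observations above x_min")
    log_sum = sum(math.log(v / x_min) for v in tail)
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, seed=0):
    """Draw n synthetic values from a pure power law (inverse-transform sampling)."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so the power below is always finite.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical data: 20000 values with a known tail exponent of 2.5.
    data = sample_power_law(alpha=2.5, x_min=1.0, n=20000)
    print("estimated alpha:", hill_tail_exponent(data, x_min=1.0))
```

For real data the estimate depends on the choice of `x_min`, so it is usually worth repeating the fit for several cutoffs.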
Another manifestation of the scale invariance of the time series is a power spectral density (PSD) of the form specific to flicker noise:

$$S(f) \propto \frac{1}{f^{\beta}}, \tag{2}$$

where $\beta \approx 1$. The existence of PSD (2) means that a considerable part of the energy is linked to very slow processes. For a microblogging network, the existence of PSD (2) means that it is impossible to predict the behavior of the time series of the number of microposts without considering global information exchange processes.
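A spectral exponent of this kind can be estimated by fitting a straight line to the periodogram on a log-log scale. The following is a minimal sketch assuming NumPy is available; the function name `psd_exponent`, the optional cutoff `f_max`, and the synthetic test signals are assumptions made for the example, not part of the original analysis.

```python
import numpy as np


def psd_exponent(series, dt=1.0, f_max=None):
    """Estimate beta in S(f) ~ 1 / f**beta from a log-log fit to the periodogram."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2        # raw periodogram (unnormalized)
    freqs = np.fft.rfftfreq(x.size, d=dt)
    keep = freqs > 0                           # the zero frequency has no logarithm
    if f_max is not None:
        keep &= freqs <= f_max                 # optionally fit low frequencies only
    slope, _intercept = np.polyfit(np.log(freqs[keep]), np.log(power[keep]), 1)
    return -slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.normal(size=2 ** 14)
    # White noise should give beta near 0, its running sum (Brownian noise) near 2.
    print("white noise:", psd_exponent(white))
    print("random walk:", psd_exponent(np.cumsum(white), f_max=0.05))
```

Flicker noise lies between these two reference cases, with an exponent close to 1.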
The aforementioned features of PDF and PSD are not the only criteria of scale invariance. Besides PDF and PSD, we used a fractal dimension and a Hurst exponent along with other quantitative measures and criteria. It is important to stress that the scale invariance and an inclination to catastrophes are typical only for systems that are far from equilibrium. Therefore, a nonequilibrium state of the system and, therefore, a nonlinearity are the necessary conditions for the complexity of the system.
Lastly, the third key sign that characterizes complex systems is their integrity. The integral properties of a system are usually described statistically by power-law space and time correlations. These correlations are known as distant space and time correlations. The existence of distant time correlations, or long memory, in a time series is characterized by an autocorrelation function (ACF) of the following form:

$$C(\tau) \propto \tau^{-\gamma}, \tag{3}$$

where $0 < \gamma < 1$. The existence of relationship (3) implies the absence of characteristic times at which the information about previous events could be lost. Catastrophic behavior and integrity are connected in the following way: for catastrophic behavior to occur, part of the system should be able to function in coordination. For a microblogging network, an avalanche-like increase in the number of microposts is possible when a user and his followers, followers of these followers, etc., are working in coordination. Integrity is possible in complex systems only due to the processes of self-organization. Here we talk about coarse-scale properties of the system, since minor changes in system parameters do not affect its integrity.
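To make the long-memory criterion concrete, the sketch below computes a sample ACF and fits a power-law decay exponent on a log-log scale. It assumes NumPy; the function names and the choice to fit only positive ACF values are assumptions of this illustration.

```python
import numpy as np


def sample_acf(series, max_lag):
    """Sample autocorrelation at lags 1..max_lag (normalized by the lag-0 value)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: x.size - k], x[k:]) / denom for k in range(1, max_lag + 1)]
    )


def acf_decay_exponent(acf_values):
    """Fit C(tau) ~ tau**(-gamma) on a log-log scale and return gamma.

    Only positive ACF values are used, because the logarithm is undefined otherwise.
    """
    acf_values = np.asarray(acf_values, dtype=float)
    lags = np.arange(1, acf_values.size + 1)
    positive = acf_values > 0
    slope, _intercept = np.polyfit(np.log(lags[positive]), np.log(acf_values[positive]), 1)
    return -slope


if __name__ == "__main__":
    # An exact power-law ACF with gamma = 0.4 is recovered by the fit.
    lags = np.arange(1, 51, dtype=float)
    print(acf_decay_exponent(lags ** -0.4))
```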
Therefore, a microblogging network is a complex system when all the key signs of complexity listed above are satisfied. This statement forms the foundation of our research and is key to the construction of a model of microblogging network evolution.
3. Analysis of Empirical Data from Twitter
Empirical data used for our research is a sample of more than 3 million microposts (tweets, retweets, and links) about the first US presidential debates of 2016. The sample includes microposts posted by more than 1 million users from 13:45 on September 26, 2016, to 11:00 on September 27, 2016, with 1-second increments.
Figure 1 shows the total number of microposts vs. time (Twitter time series, ). It is easy to see that has extraordinary events and unexpected events (bursts).
To estimate the correlation dimension () and embedding dimension (), we used the Grassberger–Procaccia algorithm . We obtained for .
Hence, the process leading to the series is not random; it depends on a limited number of key parameters . The series is not stochastic; it is chaotic. For instance, for a stochastic series corresponding to Gaussian noise, for , and if the series corresponds to generalized Brownian noise, then for .
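The idea behind the Grassberger–Procaccia estimate can be sketched as follows: the series is embedded in a delay space, the fraction of point pairs closer than a radius r (the correlation sum) is computed, and the correlation dimension is read off as the slope of log C(r) against log r. This is a simplified illustration assuming NumPy; the function names, the delay, and the range of radii are assumptions, and a practical analysis would repeat the fit for increasing embedding dimensions until the slope saturates.

```python
import numpy as np


def delay_embed(series, dim, delay=1):
    """Build delay vectors (x[t], x[t + delay], ..., x[t + (dim - 1) * delay])."""
    x = np.asarray(series, dtype=float)
    n_points = x.size - (dim - 1) * delay
    if n_points < 2:
        raise ValueError("series too short for this embedding")
    return np.column_stack([x[i * delay: i * delay + n_points] for i in range(dim)])


def correlation_dimension(points, radii):
    """Slope of log C(r) versus log r, where C(r) is the correlation sum."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_dists = dists[np.triu_indices(pts.shape[0], k=1)]   # each pair counted once
    radii = np.asarray(radii, dtype=float)
    corr_sums = np.array([(pair_dists < r).mean() for r in radii])
    slope, _intercept = np.polyfit(np.log(radii), np.log(corr_sums), 1)
    return slope


if __name__ == "__main__":
    # Points scattered along a straight line form a one-dimensional set.
    rng = np.random.default_rng(0)
    t = rng.random(600)
    line = np.column_stack([t, t])
    print(correlation_dimension(line, np.geomspace(0.01, 0.1, 8)))
```

The pairwise distance matrix grows quadratically with the number of points, so this form is only suitable for short series or subsamples.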
Using the R/S analysis, we obtained the Hurst exponent (). To calculate the fractal dimension of a time series () we used the algorithm presented in . We obtained the following results: , . Therefore, is a fractal time series (the fractal dimension is not an integer and exceeds the topological dimension of the time series). Moreover, is persistent; i.e. the time series is trend-resistant (). Such a time series has a long memory and is inclined to follow trends .
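A minimal sketch of rescaled-range (R/S) analysis is given below, assuming NumPy. The window sizes and function names are assumptions of the example; the estimate is the slope of log(R/S) against log(window size).

```python
import numpy as np


def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over the std."""
    deviations = np.cumsum(chunk - chunk.mean())
    spread = deviations.max() - deviations.min()
    scale = chunk.std()
    return spread / scale if scale > 0 else np.nan


def hurst_rs(series, window_sizes):
    """Estimate the Hurst exponent as the slope of log(R/S) versus log(n)."""
    x = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        # Non-overlapping windows of length n; an incomplete tail is dropped.
        chunks = [x[i:i + n] for i in range(0, x.size - n + 1, n)]
        mean_rs = np.nanmean([rescaled_range(c) for c in chunks])
        log_n.append(np.log(n))
        log_rs.append(np.log(mean_rs))
    slope, _intercept = np.polyfit(log_n, log_rs, 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=8192)
    # Uncorrelated noise should give a value close to 0.5
    # (R/S is known to be biased slightly upward for short windows).
    print(hurst_rs(noise, [16, 32, 64, 128, 256, 512, 1024]))
```

Values above 0.5 indicate a persistent series and values below 0.5 an antipersistent one.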
Figure 2 shows PDF for the increments (returns) of time series and PDF for a normal distribution.
Empirical probabilities lie outside the normal PDF in the intervals and . This means that heavy tails exist. D’Agostino’s K-squared test also allows the null hypothesis of normality to be rejected at the significance level of 0.01 when the statistic . Further evidence of heavy tails is that the distribution follows a power law. Figure 3 shows the PDF and the complementary cumulative distribution function (CCDF). Both functions are well approximated by linear functions.
Let us determine the type of noise (parameter in PSD ) for . To calculate , we used the detrended fluctuation analysis method (DFA) . After the calculations, we obtained the scaling exponent and the PSD parameter . The value obtained corresponds rather to a flicker noise () than to any other type of noise. The value obtained by the DFA method coincides with the value obtained via the approximation of the time series PSD by a linear function. The PSD obtained by applying fast Fourier transform to is shown in Figure 4 on a log-log scale. A linear fit yields .
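The DFA procedure can be sketched as follows: the series is integrated into a profile, the profile is split into windows, a linear trend is removed in each window, and the scaling exponent is the slope of the log of the root-mean-square fluctuation against the log of the window size. The sketch assumes NumPy and first-order (linear) detrending; the function names and window sizes are assumptions. The conversion to a PSD exponent uses the commonly quoted relation beta = 2*alpha - 1, which is stated here as an assumption rather than as the exact conversion used in this paper.

```python
import numpy as np


def dfa_exponent(series, window_sizes):
    """First-order detrended fluctuation analysis; returns the scaling exponent."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())
    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = profile.size // n
        if n_windows < 1:
            continue
        # Columns are windows of length n; an incomplete tail is dropped.
        segments = profile[: n_windows * n].reshape(n_windows, n).T
        t = np.arange(n)
        slope, intercept = np.polyfit(t, segments, 1)   # one line per window
        trend = np.outer(t, slope) + intercept
        fluctuation = np.sqrt(np.mean((segments - trend) ** 2))
        log_n.append(np.log(n))
        log_f.append(np.log(fluctuation))
    alpha, _intercept = np.polyfit(log_n, log_f, 1)
    return alpha


def psd_exponent_from_dfa(alpha):
    """Commonly used relation between the DFA exponent and the PSD exponent."""
    return 2.0 * alpha - 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.normal(size=8192)
    sizes = [16, 32, 64, 128, 256, 512]
    print("white noise:", dfa_exponent(white, sizes))             # near 0.5
    print("random walk:", dfa_exponent(np.cumsum(white), sizes))  # near 1.5
```

Under the stated relation, a DFA exponent close to 1 corresponds to a PSD exponent close to 1, i.e., flicker noise.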
The autocorrelation function () for an time series is described by a decreasing power function (3) with the exponent . Hence, this function has long memory.
Figure 5 presents and its linear approximation on a log-log scale. A linear fit gives .
4. Microblogging Social Network as a Nonlinear Deterministic Dynamical System
4.1. Main Assumptions for the Model
A social network is a macroscopic system. The number of users in such a system is large, $N \gg 1$. This assumption is justified for Twitter, since, according to existing estimates, $N \sim 10^8$. In the proposed model, out of all possible degrees of freedom, we choose and consider just a few macroscopic degrees of freedom (phase or dynamic variables corresponding to hydrodynamic modes in physics). Such a reduction can be justified by the synergetic subordination principle. This principle states that, during the evolution, the hydrodynamic modes suppress the behavior of microscopic degrees of freedom and fully determine the system’s self-organization. As a result, the cooperative behavior of a system is determined by several hydrodynamic variables that represent the amplitudes of hydrodynamic modes. This way we do not need an infinite number of microscopic degrees of freedom, and there is no need to thoroughly study the microscopic interactions between the users of a social network.
A social network is modeled as a point autonomous dynamical system. This model was chosen because it is possible to compare the results with empirical data provided by the Twitter time series. Each user of a social network can be in one of the two possible states: either passive (-state) or active (-state). A Twitter user in -state can send microposts to other network users. In this state, a network user has enough information to send microposts. If a user is in -state (the user does not have enough information), he or she cannot send microposts.
A microblogging social network is an open nonequilibrium system. A social network is capable of information exchange with the environment. The incoming flow of external (for the system) information comes into the system from different sources, for example, from other mass media. This flow feeds the network with information and creates an inverse population of states of network users: , where is the number of network users in -state, and is the number of network users in -state.
The distribution of Twitter users can be represented with good accuracy by a Boltzmann distribution: where is the amount of information the users in -state possess, is the amount of information the users in -state possess, and is a parameter that describes the average intensity of stochastic interactions between the network users. A simple analysis (4) allows us to define two macroscopic network states: a steady state and a nonequilibrium state. If , then . In this case the network is in a steady state. If , then . In this case the network is in a nonequilibrium state. Since a social network is constantly fed with information, it is constantly in a nonequilibrium state, creating an avalanche of microposts. Because of the constant feed of information, a steady state can almost never be reached. It is very important that the existence of chaotic states is a fundamental property of open nonequilibrium systems.
4.2. Phase Variables and Relationships between Them
Let us define the phase variables of a dynamical system. These variables will be used to model Twitter as an open nonequilibrium system. These variables are as follows: is the deviation of the number of microposts () from the corresponding equilibrium value ( is the number of microposts in the steady state); is the deviation of aggregated intrasystem information () the network users possess from the corresponding equilibrium value ( is the aggregated intrasystem information the network users possess when Twitter is in steady state); is the instantaneous difference at the moment of time between the numbers of strategically oriented social network users (users following a particular strategy) in -and - states. If is the difference between the total number of users in -and - states, then users act randomly (randomly oriented users).
According to , business users and spam users can be considered as strategically oriented users.
Business users follow marketing and business agendas on Twitter. The profile description strongly depicts their motive, and a similar behavior can be observed in their tweeting behavior. Spammers mostly post malicious tweets at high rates. Automated computer programs (bots) mostly run behind a spam profile and randomly follow users, expecting a few users to follow back.
Personal users and professional users can be considered randomly oriented users. Personal users are casual home users who create their Twitter profile for fun, learning, to get news, etc. These users neither strongly advocate any type of business or product, nor have profiles affiliated with any organization. Generally, they have a personal profile and show a low to mild behavior in their social interaction. Professional users are home users with professional intent on Twitter. They share useful information about specific topics and involve in healthy discussion related to their area of interest and expertise.
Let us determine relationships between the dynamic variables and their rates of change.
The rate of change of the deviation of the number of microposts, $\eta$, is determined by the relaxation of the social network toward the steady state ($-\eta/\tau_\eta$) and by the deviation of the aggregated intrasystem information from its equilibrium value, $h$ (the term $a_\eta h$):

$$\frac{d\eta}{dt} = -\frac{\eta}{\tau_\eta} + a_\eta h. \tag{5}$$

The term $-\eta/\tau_\eta$ in Eq. (5) is due to the relaxation of the social network as a nonequilibrium system. According to Le Chatelier's principle, when a system deviates from the steady state, this generates “forces” that try to restore the system back to the steady state. As follows from Eq. (5), without the term $a_\eta h$ the equation takes the following form:

$$\frac{d\eta}{dt} = -\frac{\eta}{\tau_\eta}. \tag{6}$$

The solution to Eq. (6) is given by the function $\eta(t) = \eta(0)\exp(-t/\tau_\eta)$. Hence, $\eta \to 0$ when $t \to \infty$ (the social network tends to its steady state). In Eq. (6), $\tau_\eta$ is the time of relaxation to the steady state.
The term in Eq. (5) can be easily explained: as the deviation of aggregated intrasystem information increases, the rate of the deviation of the number of microposts increases as well.
The rate of change of the deviation of the aggregated intrasystem information from the equilibrium value, $h$, is determined by the relaxation of the social network toward the steady state ($-h/\tau_h$) and by the product $\eta S$, where $\eta$ is the deviation of the number of microposts and $S$ is the difference between the numbers of strategically oriented users in the two states:

$$\frac{dh}{dt} = -\frac{h}{\tau_h} + a_h \eta S. \tag{7}$$

The term $-h/\tau_h$ in Eq. (7) is also explained by Le Chatelier’s principle, as in Eq. (6). The term $a_h \eta S$ appears because the amount of information each user of a social network acquires from a stream of microposts is proportional to the deviation of the number of microposts and depends on the state of the user in the social network. In other words, the average contribution to the deviation of aggregated intrasystem information is proportional to the product of the deviation of the number of microposts and the difference between the numbers of users in the two states.
Finally, the third equation describes the change in the population inversion of strategically oriented users, $S$, and can be written as follows:

$$\frac{dS}{dt} = \frac{S_0 - S}{\tau_S} - a_S \eta h, \tag{8}$$

where $\tau_S$ is the corresponding relaxation time, and $S_0$ is the initial number of strategically oriented users (this value reflects the intensity of information feeding into the social network). In other words, $S_0$ is the difference between the numbers of strategically oriented users of the social network in the two states at the time $t = 0$. The term $-a_S \eta h$, where $\eta$ is the deviation of the number of microposts and $h$ is the deviation of the aggregated intrasystem information, reflects the effective power that a stream of microposts applies to create aggregated intrasystem information in a social network. This power can be positive or negative.
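Thus, the evolution of a microblogging social network can be described by the well-known Lorenz system of equations:

$$\begin{aligned}
\frac{d\eta}{dt} &= -\frac{\eta}{\tau_\eta} + a_\eta h, \\
\frac{dh}{dt} &= -\frac{h}{\tau_h} + a_h \eta S, \\
\frac{dS}{dt} &= \frac{S_0 - S}{\tau_S} - a_S \eta h,
\end{aligned} \tag{9}$$

where $\eta$ is the deviation of the number of microposts, $h$ is the deviation of the aggregated intrasystem information, and $S$ is the difference between the numbers of strategically oriented users in the two states.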
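To illustrate how a system of this type behaves, the sketch below integrates a three-variable Lorenz-type system with a fixed-step fourth-order Runge-Kutta scheme, using only the standard library. The variable names (`eta`, `h`, `s`), the parameter names, and all numerical values are assumptions chosen for the example: the "chaotic" values are simply those that map onto the classic Lorenz parameters (10, 28, 8/3), and they are not fitted to any microblogging data.

```python
def lorenz_rhs(state, p):
    """Right-hand side: relaxation of each variable plus the nonlinear coupling."""
    eta, h, s = state
    return (
        -eta / p["tau_eta"] + p["a_eta"] * h,
        -h / p["tau_h"] + p["a_h"] * eta * s,
        (p["s0"] - s) / p["tau_s"] - p["a_s"] * eta * h,
    )


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, k, scale):
        return tuple(b + scale * ki for b, ki in zip(base, k))

    k1 = lorenz_rhs(state, p)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2), p)
    k3 = lorenz_rhs(shifted(state, k2, dt / 2), p)
    k4 = lorenz_rhs(shifted(state, k3, dt), p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, p, dt, n_steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [tuple(state)]
    for _ in range(n_steps):
        state = rk4_step(state, p, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters equivalent to the classic chaotic Lorenz regime.
    chaotic = {"tau_eta": 0.1, "a_eta": 10.0, "tau_h": 1.0, "a_h": 1.0,
               "tau_s": 3.0 / 8.0, "a_s": 1.0, "s0": 28.0}
    path = integrate((1.0, 1.0, 28.0), chaotic, dt=0.005, n_steps=4000)
    print("last state:", path[-1])
```

With a small value of `s0` the same code relaxes to the stationary point with zero `eta` and `h`, which corresponds to the steady state discussed in the text.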
4.3. Synergetic Interpretation of a Nonlinear Dynamical System
The system of self-consistent equations (9) is a well-known method to describe a self-organizing system. The Lorenz synergetic model was first developed as a simplification of hydrodynamic equations describing the Rayleigh-Bénard heat convection in the atmosphere; it is now a classical model of chaotic dynamics. Further research on the Lorenz system presented in a series of publications proved that the system provides an appropriate kinetic picture of the cooperative behavior of particles in any macroscopic dynamical system where the actualization of potential order is possible. Processes in such self-organizing complex systems in nonequilibrium state lead to the selection of a small number of parameters from the complete set of variables that describe the system; all other degrees of freedom adjust to correspond to these selected parameters. Following the terminology used in the synergy theory, these parameters are the order parameter (), conjugated field (), and control parameter (). According to the Ruelle-Takens theorem, we can observe a nontrivial self-organization with strange attractors if the number of selected degrees of freedom is three or more.
In the system of equations (9) is a coefficient, and positive constants , are measures of feedback in a social network. Functions , , describe the autonomous relaxation of the deviation of the number of microposts, deviations of aggregated information, and inversion of population of strategically oriented users of a social network to the stationary values , , with relaxation times , , .
Eq. (10) takes into account that, in the autonomous regime, the change in the aforementioned parameters of a social network is dissipative. In addition, Le Chatelier’s principle is very important: since the growth of the control parameter is the reason for self-organization, the values and must vary so as to prevent the growth of . Formally, this fact could be explained as the existence of a feedback between the order parameter and the conjugated field . Lastly, a positive feedback between the order parameter and the control parameter leading to the growth of the conjugated field is very important, since this feedback is the reason for self-organization.
4.4. Capabilities and Restrictions of a Deterministic Model for the Interpretation of a Social Network’s Complexity
First of all, we have to note that Eq. (9) was first obtained by Edward Lorenz in 1963 as a result of some simplifications of the problem of a liquid layer heated from below. In this problem Eq. (9) is obtained when the flow velocity and temperature of the initial hydrodynamic system are presented as two-dimensional truncated Fourier series and the Boussinesq approximation is used. For the problem of convection in a layer, the Lorenz equations serve as a rough, not very accurate approximation. It is only adequate in the region of regular modes where uniformly rotating convection cells are observed. The chaotic regime typical of Eq. (9) does not describe the turbulent convection. However, the Lorenz equations became a suitable model for describing systems and processes of various natures: convection in a closed loop, single-mode laser, water wheel rotation, financial markets, transportation flows, dissipative oscillator with nonlinear excitation, and some others.
How reliable is the model (9) for the description of the evolution of a microblogging social network? We will consider the model “reliable” if there is a good correlation between theoretically predicted and empirically observed key signs of complexity of the system. The results of the comparison of key signs of complexity for the theory-based deviations of the number of microposts and the corresponding empirical data are presented below.
As shown earlier (see Eq. (4)), a steady state of the network is almost impossible to achieve due to a constant information feed. Theoretically, a dynamical system (9) has an asymptotically stable zero stationary point as a node for . In this case, , and as . However, in practice, a microblogging network as an open nonequilibrium system always has a non-zero difference between the numbers of strategically oriented users in the two states at the time . Therefore, despite the theoretical feasibility of the steady state for a social network, this state cannot be achieved in practice. When the difference between the numbers of strategically oriented users in the two states at the time reaches some critical value , Eq. (9) enters a chaotic regime, and a strange attractor appears. A transitional state that corresponds to cannot be realized in practice.
We will consider as constant for a long enough period of time and compare different measures for theoretical (, the solution of system (9) in the chaotic regime) and empirical data. The model of a social network presented in the form (9) explains the fractal and chaotic nature of the observed : and . However, the model (9) cannot explain the observed key signs of complexity of a social network. Theoretical constitutes a time series without memory ( exponentially decreases); PSD is constant (white noise, ); PDF is multi-modal with “truncated tails” (see Figure 6).
Compactness and multi-modality of the distribution are determined by the existence of three stationary points of the dynamical system (9).
Thus, the Lorenz system (9) is not a reliable model for the description of the evolution of a microblogging social network as a complex system.
5. Microblogging Social Network as a Nonlinear Random Dynamical System
As shown earlier, the nonlinear dynamic model (9) explains the fractality and chaotic nature of empirical as well as the dissipative nature of the system. On the other hand, Eq. (9) cannot explain some other phenomena found in empirical data, and first of all, the key signs of complexity of a social network: a power law of PDF, -noise, and long memory. Let us consider different ways of improving (generalizing) Eq. (9) in order to adequately describe a microblogging social network.
Since the correlation dimension and embedding dimension of the empirical time series (, ) exceed the corresponding theoretical values (, ), one of the ways to improve Eq. (9) is to increase the number of phase variables of the dynamic system. Another approach to improving Eq. (9) is to consider the self-consistent behavior of the order parameter, conjugated field, and control parameter taking into account the noise for each of those parameters. Different generalizations of Eq. (9) have been proposed and studied by Alexander Olemsky and collaborators [42–44], in particular, in the context of its applications to the study of self-organization of continuum, evolution of financial markets and economical structure of society, cooperative behavior of active particles, and self-organized criticality.
Taking into account stochastic terms and the fractionality of the order parameter, Eq. (9) takes the following form:In Eq. (10), are noise intensities for each phase variable; is white noise, where , ; . The random dynamic system (RDS) (10) is a generalization of the deterministic dynamic system (9) where stochastic sources are added, the feedback is weakened, and the order parameter is relaxed. The replacement of the order parameter by a smaller value () means that the process of ordering influences the self-consistent behavior of the system to a lesser extent than it does in the ideal case of .
For the convenience of the analysis of Eq. (10) we will transform it into a dimensionless form. Then time , deviation of the number of microposts (), deviation of the aggregated intrasystem information (), the difference between the numbers of strategically oriented users in different states (), and corresponding noise intensities () will be scaled as follows: Now Eq. (10) can be written down as follows:Let us analyze RDS (12) in adiabatic approximation when the characteristic relaxation time of the number of microposts in a network considerably exceeds the corresponding relaxation times of aggregated intrasystem information and the number of strategically oriented users: . This means that aggregated intrasystem information and the number of strategically oriented users follow the variation in the deviation of the number of microposts (). When , the subordination principle allows us to set in Eq. (12), i.e., to disregard the fluctuations in and .
For a microblogging social network functioning as an open nonequilibrium system, the adiabatic approximation means that when the external information feed tends to zero, the stream of microposts slowly decreases and at the same time the aggregated intrasystem information and the number of strategically oriented users in active state decrease as well.
An adiabatic approximation is a necessary condition for the transformation of the three-dimensional RDS with additive noise (12) into a one-dimensional RDS with multiplicative noise of the following form: The terms of Eq. (13) corresponding to the drift and diffusion (intensity of the chaotic source) have the following form:The Langevin equation (13) has an infinite set of random solutions . Their probability distribution () is given by the Fokker-Planck equation:In the stationary case () the distribution is given by the following relationship:As a result, the stationary probability distribution density of the deviation of the number of microposts from the corresponding equilibrium value has the following form: where is a normalization constant.
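Realizations of a one-dimensional Langevin equation with state-dependent (multiplicative) noise can be generated numerically with the Euler-Maruyama scheme, as in the minimal sketch below (standard library only, Itô interpretation). The drift and noise amplitude used in the example are hypothetical placeholders chosen for illustration; they are not the drift and diffusion terms of the model in this paper.

```python
import math
import random


def euler_maruyama(drift, noise_amplitude, x0, dt, n_steps, seed=0):
    """Simulate dx = drift(x) dt + noise_amplitude(x) dW with a fixed time step."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, 1.0) * sqrt_dt      # Wiener increment over one step
        x = x + drift(x) * dt + noise_amplitude(x) * dw
        path.append(x)
    return path


if __name__ == "__main__":
    # Hypothetical example: linear relaxation with a power-law noise amplitude.
    sigma, a = 0.5, 0.75

    def drift(x):
        return -x

    def amplitude(x):
        return sigma * abs(x) ** a

    realization = euler_maruyama(drift, amplitude, x0=1.0, dt=0.001, n_steps=10000)
    print("final value:", realization[-1])
```

A histogram of a long realization gives a numerical estimate of the stationary distribution, which can then be compared with the analytical stationary solution of the corresponding Fokker-Planck equation.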
Before we draw any conclusion about distribution (18), let us direct our attention to one significant fact that distinguishes theory from practice. Distributions of real systems and processes regardless of their nature cannot have an infinite expected value or variance. Therefore, power-law PDFs like ( is chosen for the purposes of analysis of expression (18)) are approximate and not valid for large . The exponential decrease of PDF corresponds to the intermediate asymptotics, and in practice instead of heavy tails we should have semi-heavy tails (see distribution in Figure 2): where the scaling function is approximately constant at and quickly decreases when . Here the “heaviness of the tail” is shifted toward the intermediate range of values. Thus, the dimensionless deviation of the number of microposts scaled for serves as a scaling variable in (19). Since the integral in Eq. (16) is regular at , the PDF obtained has a power-law form.
The power law for PDF of the deviation of the number of microposts , which is equivalent to for large times, was obtained and justified analytically. However, we could not obtain analytical expressions for PSD, , or the correlation and fractal dimensions. Therefore, we present below the results of numerical calculations for a family of realizations of RDS (13) for based on algorithms studied and used earlier.
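Realizations of a one-dimensional Langevin equation with multiplicative noise can be generated numerically, for example with the Euler–Maruyama scheme. The sketch below is not the algorithm used in the paper. The functions `example_drift` and `example_diffusion` are hypothetical placeholders (linear relaxation and a power-law noise amplitude), and the Itô interpretation is assumed.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng=None):
    """Integrate dx = drift(x) dt + diffusion(x) dW (Ito sense) with a fixed step dt."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n_steps + 1)
    x[0] = x0
    # Wiener increments: independent normals with variance dt
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    for k in range(n_steps):
        x[k + 1] = x[k] + drift(x[k]) * dt + diffusion(x[k]) * dw[k]
    return x

# Hypothetical placeholders, NOT the drift and diffusion terms of Eq. (13)
def example_drift(x):
    return -x                        # linear relaxation toward zero

def example_diffusion(x, sigma=0.5, a=0.5):
    return sigma * abs(x) ** a       # noise amplitude grows as a power of |x|

# One realization of 10,000 steps starting from x0 = 1
realization = euler_maruyama(example_drift, example_diffusion,
                             x0=1.0, dt=1e-3, n_steps=10_000,
                             rng=np.random.default_rng(0))
```

A family of realizations is obtained by repeating the call with different seeds. The step `dt` has to be small enough for the scheme to remain stable for the chosen drift.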
Let us determine the type of noise typical for . We used the DFA method to calculate We obtained the scaling exponent and . The value obtained for corresponds rather to flicker noise () than to any other type of noise. The value obtained by the DFA method is close to the value obtained through fitting PSD time series by a linear function. PSD obtained by applying fast Fourier transform to is presented in Figure 7 in log-log scale. A linear fit gives .
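The two estimates described above can be sketched as follows. This is a minimal first-order DFA and a plain periodogram fit, not the authors' implementation. The function names, window sizes, and sampling rate are assumptions made for illustration. For a power spectrum of the form 1/f^β, the DFA exponent α is commonly related to β by β = 2α − 1, so flicker noise (β ≈ 1) corresponds to α ≈ 1.

```python
import numpy as np

def dfa_exponent(x, scales):
    """First-order DFA: slope of log F(n) versus log n.

    `scales` are integer window lengths, each no longer than len(x).
    """
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-centred series
    fluct = []
    for n in scales:
        n_win = len(profile) // n
        segs = profile[: n_win * n].reshape(n_win, n)
        t = np.arange(n)
        # Least-squares linear trend in every window at once
        coeffs = np.polyfit(t, segs.T, 1)         # shape (2, n_win)
        trend = np.outer(t, coeffs[0]) + coeffs[1]
        resid = segs.T - trend
        fluct.append(np.sqrt(np.mean(resid ** 2)))
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope

def psd_slope(x, fs=1.0):
    """Slope of the periodogram in log-log coordinates (equals -beta for 1/f^beta noise)."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = freqs > 0                              # drop the zero-frequency bin
    slope, _ = np.polyfit(np.log(freqs[mask]), np.log(spec[mask]), 1)
    return slope
```

For uncorrelated white noise both functions should return values near α ≈ 0.5 and a slope near 0, which gives a quick sanity check before applying them to a series of micropost counts.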
To estimate the correlation dimension () and embedding dimension (), we used the Grassberger–Procaccia algorithm. We obtained for . Hence, the process that leads to the series is not random; it is controlled by a limited number of key parameters. The series is chaotic rather than stochastic.
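The core of the Grassberger–Procaccia procedure is the correlation sum C(r) computed on a delay embedding of the series. The correlation dimension is the slope of log C(r) versus log r in the scaling region. The sketch below assumes a maximum-norm distance, a fixed lag, and a hand-picked set of radii. These choices and all function names are illustrative and are not taken from the paper.

```python
import numpy as np

def delay_embed(x, dim, lag=1):
    """Rows are delay vectors (x[t], x[t+lag], ..., x[t+(dim-1)*lag])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])

def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r in the maximum norm."""
    # Pairwise distances; memory grows as N^2, so keep N to a few thousand
    dist = np.abs(points[:, None, :] - points[None, :, :]).max(axis=2)
    iu = np.triu_indices(len(points), k=1)
    return np.mean(dist[iu] < r)

def correlation_dimension(x, dim, radii, lag=1):
    """Slope of log C(r) versus log r for the given embedding dimension."""
    pts = delay_embed(x, dim, lag)
    c = np.array([correlation_sum(pts, r) for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope
```

In practice the estimate is repeated for increasing `dim`. A slope that saturates as `dim` grows indicates a low-dimensional process, whereas for random noise the slope keeps growing with `dim`.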
Using the results of R/S analysis we determined the Hurst exponent (). To calculate the fractal dimension of the time series () we used the algorithm described in . We obtained , . Hence, is a persistent fractal time series. Such a time series has long-term memory and tends to follow trends.
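A minimal sketch of rescaled-range (R/S) analysis is given below. The window sizes and function names are assumptions, and the paper's fractal-dimension algorithm is not reproduced. For a self-affine series the relation D = 2 − H is often used as a rough cross-check between the Hurst exponent H and the fractal dimension D.

```python
import numpy as np

def rescaled_range(segment):
    """R/S statistic of one window: range of cumulative deviations over the standard deviation."""
    dev = np.cumsum(segment - segment.mean())
    # Assumes the window is not constant, otherwise the standard deviation is zero
    return (dev.max() - dev.min()) / segment.std()

def hurst_rs(x, window_sizes):
    """Hurst exponent as the slope of log(mean R/S) versus log(window size)."""
    x = np.asarray(x, dtype=float)
    mean_rs = []
    for n in window_sizes:
        n_win = len(x) // n
        rs = [rescaled_range(x[i * n : (i + 1) * n]) for i in range(n_win)]
        mean_rs.append(np.mean(rs))
    slope, _ = np.polyfit(np.log(window_sizes), np.log(mean_rs), 1)
    return slope
```

Values of H above 0.5 indicate persistence and values below 0.5 indicate antipersistence. For short windows the plain R/S estimate is biased upward, so white noise typically yields a value somewhat above 0.5.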
Therefore, the generalized Lorenz system (12) adequately models the evolution of a microblogging social network as a complex system. The characteristics of -realizations of RDS (12) are quantitatively close to the corresponding characteristics of empirical time series.
6. Results and Discussion
For the convenience of further discussion, Table 1 presents the results of calculations of key characteristics and properties of complex systems (i.e., systems that tend to have unexpected and/or extraordinary events).
The empirical time series of microposts has all the key properties of complexity: a power-law PDF, noise that is close to flicker noise, time correlations with long memory, and scale invariance in a time series of microposts. The existence of bursts in time series of microposts (see Figure 1) allows us to conclude that a microblogging network is a complex system, and it is far from equilibrium. The time series of microposts is characterized by scale invariance; i.e., it is a fractal time series. Such time series, in particular, are characterized by power-law PDFs caused by an avalanche-like increase of the number of microposts (see bursts in Figure 1) after a “chain reaction” of reposting. An avalanche-like increase of the number of microposts is possible if a user coordinates his actions with his followers, followers of those followers, and so on. This defines a connection between the catastrophic behavior and integrity of a microblogging network.
The nonlinear dynamical system model (9) is only a rough approximation of the evolution of a microblogging network. The model predicts neither the occurrence of catastrophic values in the time series of the number of microposts, which would signify the complexity of a microblogging network, nor the existence of long memory, nor the tendency of the time series to follow trends. Despite these deficiencies, Eq. (9) allows one to study social networks far from equilibrium (see distribution (4) and comments thereon), and it also explains the existence of dynamical chaos in the time series as well as its fractality.
The nonlinear random dynamical system (10) is a generalization of the model (9) that accounts for external stochastic sources and the fractionality of the order parameter (weakening of feedback and relaxation of the order parameter). This model adequately describes the evolution of a microblogging system.
Quantitative characteristics of the model (10) in the adiabatic approximation are close to the corresponding characteristics of the empirical time series of microposts (see Table 1). The adiabatic approximation allows us to reduce a three-dimensional random dynamical system with additive noise (10) to a one-dimensional random dynamical system with exponential multiplicative noise (13).
The main results of this research were obtained by analyzing a single time series of microposts whose values, however, constitute a representative sample. Similar results of analysis of an empirical time series of a microblogging network are presented in [24–29]. We cannot claim that the time series samples studied by us or by other researchers are representative, which would be essential for generalizing the results to the entire population. In the framework of this approach, it would be necessary to analyze all the available data on microposts and users collected since the beginning of the microblogging network. However, this step can be avoided if we take into account the scale invariance of social networks, which allows us to extrapolate and interpolate the results of the network analysis to any larger or smaller scale. Hence, the fractality of a sample predetermines the fractality of the entire network. A justification of the scale invariance for Twitter is presented in . Therefore, the conclusion about the complexity of microblogging networks in the framework of the paradigm of complexity is justified.
What follows from the fact that a microblogging network is complex? We can give two answers to this question. The first is connected with the possibility of second-order phase transitions in a microblogging network; the second concerns the analysis and prediction of a time series of microposts. Let us elaborate on each answer.
It has been established that time series of microposts are characterized by long-range time correlations. This is true both for empirical time series and for realizations of the random dynamical system (13). Long-range correlations and the other characteristics of time series discussed above are typical of critical phenomena such as second-order phase transitions.
For simplicity, let us consider the kinetics of a phase transition in a microblogging network in the framework of the model (9) taking into account the stochasticity of the feed (the difference in the initial number of strategically oriented users in active and passive states ). In this case, it can be shown (a detailed proof lies outside the scope of this paper) that as increases and exceeds a certain critical value, a microblogging network evolves according to the strategy chosen by a relatively small number of strategically oriented users. The aforementioned avalanche-like increase of the number of microposts takes place. The critical value of is determined by the geometric mean of the total and critical values of the number of strategically oriented users. A formalism that leads to the above result is presented in .
The results obtained in this paper are valuable from both theoretical and practical points of view. Firstly, they show that the systems under consideration (in this case the number of microposts) are deterministic despite having noise components (i.e., they are not stochastic). This allows us to analyze the time series of microposts in a different way, using dimension theory and the theory of dynamical systems. Secondly, the values of the invariants obtained can help solve the problem of prediction. For example, the embedding dimension shows how many terms of a time series determine the next term, whereas the correlation entropy and the largest Lyapunov exponent allow us to estimate the time of predictability of the system.
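As a rough illustration of the last point, a standard order-of-magnitude estimate links the predictability time to the largest Lyapunov exponent. The symbols below are assumptions introduced for this sketch: $\lambda_{\max}$ is the largest Lyapunov exponent, $\delta_0$ is the initial uncertainty in the state, and $\Delta$ is the largest error that is still acceptable.

```latex
\begin{equation}
\delta(t) \approx \delta_0\, e^{\lambda_{\max} t}
\quad\Longrightarrow\quad
T_{\mathrm{pred}} \approx \frac{1}{\lambda_{\max}}\,\ln\frac{\Delta}{\delta_0}.
\end{equation}
```

Because the dependence on $\delta_0$ is only logarithmic, improving the accuracy of the initial data extends the prediction horizon slowly, while a larger $\lambda_{\max}$ shortens it in inverse proportion.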
In conclusion, we would like to mention that many interesting problems have not yet been studied, such as critical phenomena of self-organization in microblogging networks based on the analysis of the nonlinear random dynamical system (10). This will be the subject of our future research.
The data were obtained by hydrating a list containing 3,183,202 tweet identifiers from the set of 12 lists of identifiers provided by Harvard University. This list covers the 2016 US presidential election: “2016 United States Presidential Election Tweet Ids” (2016). The list was created by Justin Littman, Laura Wrubel, and Daniel Kerchner. The authors of the list used the SocialFeed service to gather data after the first debates. Tweets on the subsequent debates were not included in the sample. The sample obtained has about 1 million empty entries, because some users whose identifiers were in the initial list later removed their tweets or made them private. The resulting sample has the following characteristics: a micropost can correspond to one hashtag or several hashtags (#debate, #debates, #debatenight, #debate2016, #debates2016); the micropost’s author is in the list of followers of one or several users (CPD (@debates), Hillary Clinton (@HillaryClinton), and Donald J. Trump (@realDonaldTrump)); 2,290,855 microposts; 934,656 users; 76,458 time intervals; one-second increments. After the list of tweets was received, the information was extracted in id:original_id format. Here id is a unique identifier of the user who made the retweet, and original_id is a unique identifier of the user who made the initial tweet. If a tweet is not a retweet, id and original_id coincide.
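Records in the id:original_id format described above could be processed along the following lines. This is a sketch under assumptions: one record per line with a single colon separator is assumed, and the function names are hypothetical rather than part of the authors' tooling.

```python
from collections import Counter

def parse_pairs(lines):
    """Parse 'id:original_id' records into (user_id, original_user_id) tuples."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip empty entries
        user_id, original_id = line.split(":", 1)
        pairs.append((user_id, original_id))
    return pairs

def retweet_stats(pairs):
    """Count retweets and how often each original author was retweeted."""
    retweets = [(uid, oid) for uid, oid in pairs if uid != oid]
    sources = Counter(oid for _, oid in retweets)
    return len(retweets), sources
```

A record whose two identifiers coincide is treated as an original tweet, following the convention stated in the text.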
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The work was supported by the Russian Foundation for Basic Research (grant 16-07-01027).
- D. J. De Solla Price, “Networks of scientific papers,” Science, vol. 149, no. 3683, pp. 510–515, 1965.
- A. Barabasi and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
- R. Albert and A. Barabási, “Statistical mechanics of complex networks,” Reviews of Modern Physics, vol. 74, no. 1, pp. 47–97, 2002.
- C. T. Butts, “The complexity of social networks: Theoretical and empirical findings,” Social Networks, vol. 23, no. 1, pp. 31–71, 2001.
- J. Skvoretz, “Complexity theory and models for social networks,” Complexity, vol. 8, no. 1, pp. 47–55 (2003), 2002.
- M. G. Everett, “Role similarity and complexity in social networks,” Social Networks. An International Journal of Structural Analysis, vol. 7, no. 4, pp. 353–359, 1985.
- H. Ebel, J. Davidsen, and S. Bornholdt, “Dynamics of social networks,” Complexity, vol. 8, no. 2, pp. 24–27 (2003), 2002.
- S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, “Complex networks: Structure and dynamics,” Physics Reports, vol. 424, no. 4-5, pp. 175–308, 2006.
- S. Tabassum, F. S. F. Pereira, S. Fernandes, and J. Gama, “Social Networks Analysis: An Overview,” WIREs Data Mining and Knowledge Discovery, pp. 1–21, 2018.
- S. Saganowski, B. Gliwa, P. Bródka, A. Zygmunt, P. Kazienko, and J. Kozlak, “Predicting community evolution in social networks,” Entropy, vol. 17, no. 5, pp. 3053–3096, 2015.
- P. De Meo, F. Messina, D. Rosaci, and G. M. L. Sarné, “Forming time-stable homogeneous groups into Online Social Networks,” Information Sciences, vol. 414, pp. 117–132, 2017.
- A. Grabowski and R. A. Kosiński, “Ising-based model of opinion formation in a complex network of interpersonal interactions,” Physica A: Statistical Mechanics and its Applications, vol. 361, no. 2, pp. 651–664, 2006.
- S. Dasgupta, R. K. Pan, and S. Sinha, “Phase of Ising spins on modular networks analogous to social polarization,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 80, no. 2, Article ID 025101, 2009.
- G. Bianconi, “Mean Field Solution of the Ising Model on a Barabási-Albert Network,” Physics Letters A, vol. 303, no. 2-3, pp. 166–168, 2002.
- C. Li, F. Liu, and P. Li, “Ising model of user behavior decision in network rumor propagation,” Discrete Dynamics in Nature and Society, vol. 2018, Article ID 5207475, 2018.
- J.-L. Guo, Q. Suo, A.-Z. Shen, and J. Forrest, “The evolution of hyperedge cardinalities and bose-Einstein condensation in hypernetworks,” Scientific Reports, vol. 6, Article ID 33651, 2016.
- M. Faccin, T. Johnson, J. Biamonte, S. Kais, and P. Migdał, “Degree Distribution in Quantum Walks on Complex Networks,” Physical Review X, vol. 3, no. 4, Article ID 041007, 2013.
- J. Reichardt and S. Bornholdt, “Statistical mechanics of community detection,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 74, no. 1, Article ID 016110, 2006.
- B. P. Chamberlain, J. Levy-Kramer, C. Humby, and M. P. Deisenroth, “Real-time community detection in full social networks on a laptop,” PLoS ONE, vol. 13, no. 1, Article ID e0188702, 2018.
- Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos, “Nonlinear dynamics of information diffusion in social networks,” ACM Transactions on the Web (TWEB), vol. 11, Article 11, no. 2, 2017.
- N. Hegde, L. Massoulie, and L. Viennot, “Self-organizing flows in social networks,” Theoretical Computer Science, vol. 584, no. 13, pp. 3–18, 2015.
- P. Bak, C. Tang, and K. Wiesenfeld, “Self-organized Criticality: An Explanation of 1/f-noise,” Physical Review Letters, vol. 59, no. 4, pp. 381–384, 1987.
- P. Bak, C. Tang, and K. Wiesenfeld, “Self-organized Criticality,” Physical Review A, vol. 38, no. 1, pp. 364–374, 1988.
- P. Bak, How Nature Works: The Science of Self-organized Criticality, Springer-Verlag, 1996.
- B. Tadić, M. M. Dankulov, and R. Melnik, “Mechanisms of self-organized criticality in social processes of knowledge creation,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 96, no. 3, Article ID 032307, 2017.
- M. Aguilera, I. Morer, X. Barandiaran, and M. Bedia, “Quantifying Political Self-Organization in Social Media. Fractal patterns in the Spanish 15M movement on Twitter,” in Proceedings of the 12th European Conference on Artificial Life, pp. 395–402, Michigan, USA, 2013.
- K. Lyudmyla, B. Vitalii, and R. Tamara, “Fractal time series analysis of social network activities,” in Proceedings of the 2017 4th International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PIC S&T), pp. 456–459, IEEE, Kharkov, Ukraine, October 2017.
- T. De Bie, J. Lijffijt, C. Mesnage, and R. Santos-Rodriguez, “Detecting trends in twitter time series,” in Proceedings of the 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, Vietri sul Mare, Salerno, Italy, September 2016.
- A. Mollgaard and J. Mathiesen, “Emergent user behavior on twitter modelled by a stochastic differential equation,” PLoS ONE, vol. 10, no. 5, pp. 1–12, 2015.
- A. Dmitriev, V. Dmitriev, O. Tsukanova, and S. Maltseva, “A nonlinear dynamical approach to the interpretation of microblogging network complexity,” Studies in Computational Intelligence, vol. 689, pp. 390–400, 2017.
- B. Tadić, V. Gligorijević, M. Mitrović, and M. Šuvakov, “Co-evolutionary mechanisms of emotional bursts in online social dynamics and networks,” Entropy, vol. 15, no. 12, pp. 5084–5120, 2013.
- M. M. Waldrop, Complexity: The Emerging Science at the Edge of Order and Chaos, Touchstone, New York, USA, 1993.
- O. A. Tsukanova, E. P. Vishnyakova, and S. V. Maltseva, “Model-based monitoring and analysis of the network community dynamics in a textured state space,” in Proceedings of the 16th IEEE Conference on Business Informatics, CBI 2014, pp. 44–49, Switzerland, July 2014.
- Ming Li, “Fractal Time Series—A Tutorial Review,” Mathematical Problems in Engineering, vol. 2010, Article ID 157264, 26 pages, 2010.
- P. Grassberger and I. Procaccia, “Measuring the strangeness of strange attractors,” Physica D: Nonlinear Phenomena, vol. 9, no. 1-2, pp. 189–208, 1983.
- M. Z. Ding, C. Grebogi, E. Ott, T. Sauer, and J. A. Yorke, “Estimating correlation dimension from a chaotic time series: when does plateau onset occur?” Physica D: Nonlinear Phenomena, vol. 69, no. 3-4, pp. 404–424, 1993.
- M. M. Dubovikov, N. V. Starchenko, and M. S. Dubovikov, “Dimension of the minimal cover and fractal analysis of time series,” Physica A: Statistical Mechanics and its Applications, vol. 339, no. 3-4, pp. 591–608, 2004.
- B. B. Mandelbrot and J. W. Van Ness, “Fractional Brownian motions, fractional noises and applications,” SIAM Review, vol. 10, no. 4, pp. 422–437, 1968.
- R. B. D’Agostino, A. Belanger, and R. B. D’Agostino, “A suggestion for using powerful and informative tests of normality,” The American Statistician, vol. 44, no. 4, pp. 316–321, 1990.
- R. B. Govindan, J. D. Wilson, H. Preil, H. Eswaran, J. Q. Campbell, and C. L. Lowery, “Detrended fluctuation analysis of short datasets: an application to fetal cardiac data,” Physica D: Nonlinear Phenomena, vol. 226, no. 1, pp. 23–31, 2007.
- M. M. Uddin, M. Imran, and H. Sajjad, “Understanding Types of Users on Twitter,” in Proceedings of 6th ASE International Conference in Social Computing, Stanford, USA, 2014.
- A. I. Olemskoi, A. V. Khomenko, and D. O. Kharchenko, “Self-organized criticality within fractional Lorenz scheme,” Physica A: Statistical Mechanics and its Applications, vol. 323, no. 1-4, pp. 263–293, 2003.
- A. I. Olemskoǐ, “Theory of stochastic systems with singular multiplicative noise,” Physics-Uspekhi, vol. 41, no. 3, pp. 269–301, 1998.
- A. I. Olemskoi and A. V. Khomenko, “Three-Parameter Kinetics of a Phase Transition,” Journal of Experimental and Theoretical Physics, vol. 81, no. 6, pp. 1180–1192, 1996.
- S. Aparicio, J. Villazón-Terrazas, and G. Álvarez, “A Model for Scale-Free Networks: Application to Twitter,” Entropy, vol. 17, no. 12, pp. 5848–5867, 2015.
Copyright © 2018 Andrey Dmitriev et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.