#### Abstract

Chaotic dynamical systems are ubiquitous in nature and most of them does not have an explicit dynamical equation and can be only understood through the available time series. We here briefly review the basic concepts of time series and its analytic tools, such as dimension, Lyapunov exponent, Hilbert transform, and attractor reconstruction. Then we discuss its applications in a few fields such as the construction of differential equations, identification of synchronization and coupling direction, coherence resonance, and traffic data analysis in Internet.

#### 1. Introduction

Chaotic dynamical systems are ubiquitous in nature such as the tornado, stock market, turbulence, and weather. Their functions are different in different situations. For example, in the case of tornado, the chaotic behavior is harmful to human beings and need to be avoided or controlled. But in the case of the activities in human brain, the chaotic behaviors are useful and necessary to sustain the normal functions of brain. Thus, it is an important task to understand chaos and let it serve human society better.

The chaos has been studied for a long time. It is usually believed that Poincaré is the first one who studied chaos. In the later 19th century, Poincaré studied the restricted three-body problem where one body is negligible small, compared to the other two. Poincaré found that the solution of this simple system is very complicated and cannot be given precisely. Then in 1963, Lorenz revealed the “butterfly effect” in studying the weather prediction and is thus recognized as the father of chaos. But the formal use of chaos is from the paper of “Period three implies chaos” by Li and Yorke in 1975. After that, chaos have been widely studied and a lot of important concepts has been introduced, such as the dimensions, Lyapunov exponents, Fourier transform and Hilbert transform, and attractor reconstruction.

The most striking feature of chaos is the unpredictability of its future. This feature is usually called as the “sensitivity dependence on initial conditions” or “butterfly effect.” For example, if two initial conditions have a small difference , their difference after time will be with , that is, exponential separation. Thus, a tiny difference or even a cutoff error will be blown up quickly and results in a big difference in the near future. “Similar causes have similar effects” is invalid in chaotic systems except for short periods. A system is called chaotic system, provided that its maximum Lyapunov exponent is positive.

Mathematically, chaos can be produced by both discrete and continuous equations [1–6]. The discrete systems can be expressed as such as the Logistic map, Henon map, standard map, tent map, circle map, and Ikeda map. The continuous systems can be expressed as differential equation with three or more degrees of freedom . Typical chaotic flows are the Lorenz equation, Rössler equation, Duffing's equation, Chua's circuit, and so forth. The discrete map and continuous flow are not unrelated but have a close relationship. The discrete map (1.1) can be considered as a projection of flow (1.2) on a specific Poincaré surface of section. Letting and taking limit, (1.1) will be transformed into (1.2). On the other hand, (1.2) can be transformed into (1.1) by integration, that is, with being the returning time to the Poincaré section. A common point between the discrete and continuous systems is that both are deterministic. That is, each initial condition uniquely determines the time evolution. Thus, the chaos is usually called “deterministic chaos” and can be considered as some kind of randomness produced by deterministic system. A necessary condition for (1.1) and (1.2) to show chaos is that the functions and must be nonlinear.

There are infinite unstable periodic orbits in chaos, which form invariant sets [1, 2]. An invariant set is the image of itself under time evolution. Once a trajectory is located in these invariant sets, it will stay there forever. Comparing with the dense chaotic orbits, the invariant sets are seldom and its measure is zero. In experimental situations or in numerical simulations, as there is always noise or cutoff error, an arbitrary trajectory will never stay at these invariant sets. The invariant sets are thus not observed in general. Given a long enough time, a trajectory will lose memory on its initial condition and eventually settle on a restricted geometry, called stable attractor. Once a trajectory enters the attractor, it will stay in it forever if there is no external perturbation. The trajectory will go through all the phase points there with its nature measure except the unstable periodic orbits. That is, the attractor is also an invariant set. This property is called ergodicity in chaotic attractor.

Mathematically, there are a lot of measures such as the Lebesgue measure, Hausdorff measure, counting measure, and natural measure. Suppose that the relative probability for a continuous variable to appear at some location is described by the probability density . The measure is the probability to find in an area , that is, . The average of a function is

When the measure is an invariant measure and the average of function on equals the average on time, becomes an ergodic measure with

This measure does not include those unstable periodic orbits in the attractor and is called as natural measure. To use this measure, one waits until all transients have been wiped off and looks for an invariant probability measure describing the distribution of typical orbits. Thus, in the natural measure, we say that the chaotic trajectories are ergodic. Based on this property, many quantities are more conveniently defined as averages over the natural measure in phase space. Because of the nonperiodicity of chaotic trajectory, the attractor is usually called as “strange attractor,” in contrast to the attractors of fixed point and periodic orbits.

Practically, we usually do not have the dynamical equations (1.1) and (1.2), like the earthquake, brain and stock market, and so forth, thus the details of the system equations in the phase space and the asymptotic invariant set that determines what can be observed through experimental probes are unknown. What we can obtain is some time series from one or at best a few of the dynamical variables of the system, which poses a big challenge to characterize the chaotic systems. An interesting question is how one go from one or a few time series to the multivariate state or phase space which is required for chaotic motions to occur?

Fortunately, this problem has been intensively studied in the past decades and has now formed a subject called * data mining or data analysis* [7–11]. The basic assumption is that the measured time series comes from the attractor of the unknown system with ergodicity, which contains the information of the attractor. Then one can use the measured time series to figure out the properties of attractor such as its dimension, its dynamical skeleton, and its degree of sensitivity on initial conditions. The delay-coordinate embedding technique established by Takens [12] provides a practical solution to this task.

Suppose that an experiment is conducted and one time series is measured, where is the sampling interval. Such a time series can be, for instance, a voltage signal from a physical or biological experiment, or the concentration of a substance in a chemical reaction, or the amount of instantaneous traffic at a point/node in the Internet, and so on. In principle, the measured time series comes from an underlying dynamical system with (1.1) and (1.2), with or without the influence of noise. Most commonly, the time series is a consequence of scalar measurements of some quantity which depends on the current state of the system, taken at a fixed sampling time: where, more often than not, the measurement function is unknown as , is the -dimensional variable and is the measurement noise. Let us neglect the effect of noise at this level of presentation. Equation (1.5) becomes

The total recording time is then . In general, should be sufficiently large so that the full dynamics are exhibited in the time series. This is what we have in hand. In the following we will show how to extract the information on the unknown attractor from (1.5) or (1.6).

#### 2. Basic Concepts and Analytic Tools of Chaotic Time Series

To characterize the attractor, some basic concepts have been introduced such as dimensions and Lyapunov exponents [1, 2]. These quantities are invariant under the evolution operator of the system and thus are independent of changes in the initial conditions of the orbit, and both are independent of the coordinate system in which the attractor is observed. Therefore, we can evaluate them from experimental data. The advantage of these quantities is that each one of them will turn a sequence of data into a single number, which is convenient for comparison between different time series.

As an attractor is generally located in a finite area, its trajectory shows some kinds of oscillatory behavior, that is, recurrence. To show the oscillatory behavior from a given scalar time series, a convenient way is to transform it into a -dimensional vector by the Hilbert transform. To obtain the attractor of the corresponding time series, we have to reconstruct an auxiliary phase space by an embedding procedure, that is, the delay embedding technique.

##### 2.1. Dimensions

Dimension quantifies the self-similarity of a geometrical object [7, 8]. For a homogeneous object, its dimension is a fixed number; while for a heterogeneous object, its different parts may have different dimensions and need to be characterized by the multifractal dimension. We here focus on the homogeneous objects. In this situation, a finite collection of points is zero dimensional, lines have dimension one, surfaces two and bulks three, and so forth. A chaotic attractor usually has a fractal dimension, that can be characterized by several ways. Three of them are convenient for numerical calculation, which are the box-counting dimension, information dimension, and correlation dimension.

For calculating the box-counting dimension , we divide the attractor into spheres/boxes of radius and ask how many boxes do we need to cover all the points in the data set. If we evaluate this number as a function of as it becomes small, then we define the box-counting dimension by Take Henon map as an example. Its attractor is shown in Figure 1(a), where the square lattice has length . We count the number of square lattice which contains at least one point of the trajectory and then obtain one point in Figure 1(b). Then we change the in Figure 1(a) into and do the same process to obtain another point in Figure 1(b). In this way, we get Figure 1(b) in which the slope of the straight line is .

**(a)**

**(b)**

The shortage of box-counting dimension is that it treats all the boxes with points of the trajectory as the same, no matter how many points in each box. Information dimension overcomes this shortage and is defined in terms of the relative frequency of visitation of a typical trajectory by where

is the relative frequency with which a typical trajectory enters the th box of the covering. Comparing with the definition of , we see that the box-counting dimension is weight-free while the information dimension is weighted on the visiting frequency in the specific boxes. When is a constant, will become .

When dimension is high, the calculation of both and will be time consuming and a convenient dimension for this case is the correlation dimension . Comparing with the box-counting dimension and the information dimension , the correlation dimension is much easier to be calculated and is defined by

The sum in the numerator of (2.4) can be expressed as the following correlation sum [9]:

where is the Heaviside function, for and otherwise, and stands for the distance between points and . Thus the sum just counts the pairs whose distance is smaller than . Then (2.4) becomes

As (2.5) is very easy to be calculated, the correlation dimension has been widely used in time series analysis.

In practice, the sum in the equation of also depends on the embedding dimension as depends on the embedding space. The correlation dimension can be calculated in two steps. First one has to determine the correlation sum , for the range of available and for several embedding dimensions . Then we inspect for the signatures of self-similarity. If these signatures are convincing enough, we can compute a value for the dimension. Grassberger and Procaccia demonstrated that if the embedding dimension and the number of data points are large enough and if the data are sufficiently noise-free, then the function versus has a linear region, called the scaling region [13, 14]. The slope of the function in that linear region is . Due to such reasons, the correlation dimension usually is estimated by examining the slope of the linear portion of the plot of versus for a series of increasing values of .

##### 2.2. Lyapunov Exponents

Lyapunov exponent is the most important quantity to chaotic systems as a positive maximal Lyapunov exponent is a strong signature of chaos. In the contrary, a zero maximal Lyapunov exponent denotes a limit cycle or a quasiperiodic orbit and a negative maximal Lyapunov exponent represents a fixed point. An -dimensional system has Lyapunov exponents with in descending order.

Consider a dynamical system described by (1.2). Taking variation of both sides of (1.2) yields the following equation governing the evolution of the infinitesimal vector in the tangent space at : Solving for (2.7) gives

where is a linear operator that evolves an infinitesimal vector at time to time . The mean exponential rate of divergence of the tangent vector is then given by

The Lyapunov exponents are given as where is an -dimensional basis vector. Obviously, each Lyapunov exponent is an average of the local divergence rates over the whole attractor. For chaotic system, values of do not depend on the choice of the initial condition because of the ergodicity.

To check whether a time series is chaotic or not, one needs to calculate the , that is, the maximal Lyapunov exponent . This task is much easier than the calculation of all the . The reason is that a chaotic trajectory will automatically go to its maximum expending direction. This property can be conveniently used to calculate the . Similarly, a chaotic trajectory will automatically go to its maximum contracting direction if we let it do backward evolution with , which can be used to calculate the smallest Lyapunov exponent . Numerically, one can calculate the as follows. Choose two very close initial points and let their distance be . After integrating the dynamical system for a small time interval , their distance will become . Considering that the attractor has a finite size, it is very easy for the trajectory to reach the boundary of attractor. Once it happens, the distance will not be exponentially growing. A convenient way to overcome this problem is to do renormalization, which makes the evolution begin at a small distance again. In detail, we choose a new point at the end of one trajectory and let their distance be . Doing the integration again, we can get another . In this way we have Figure 2 shows its schematic illustration.

The approach shown in Figure 2 can be also applied to the case of discrete systems. Consider an -dimensional map . Let be its state at time . We make a new state by adding a small perturbation . That is, . Let both and evolve for a time period to obtain their distance . Then the maximal Lyapunov exponent can be given by

where .

Practically, we cannot use the above approach to experimental data or time series as there may not always have a pair of points with distance at time . Thus we need to do some modification on it. The detailed process is as follows. Because of the recurrence of chaotic trajectory, one can always find two close points with distance at time . Then follow their trajectory to time and measure their distance . The evolution time is . After that, we do renormalization to find a new beginning point. Considering that the candidates for choosing are not sufficient, we choose one from them by the following two rules: () the distance must be small; () to keep the direction of the maximal expending, the angle between and must be small. Figure 3 shows its schematic illustration. In this way we have

Experimental data are generally contaminated by noise. Its influence can be minimized by using an averaging statistics when computing Lyapunov exponents. The concrete process can be taken as follows. Choose a point and select all its neighbors with distance smaller than . The size of the neighborhood should be as small as possible, but large enough such that on average each reference point has at least a few neighbors. Compute the average over the distances of all neighbors to the reference part of the trajectory as a function of the relative time. Substitute this average into (2.11) or (2.13) to get the maximal Lyapunov exponent . To calculate more Lyapunov exponents from data except the , see [15, 16] for details.

##### 2.3. Fourier Transform and Hilbert Transform

Fourier spectral analysis is traditionally the time series analysis method and very powerful in revealing the periodicity of time series. Its basic idea is that most signals, and all engineering signals, can be represented as a sum of sine waves. By Fourier transform, a time series is transformed into a frequency spectrum, that is, from real space to frequency domain or Fourier space. The Fourier transform establishes a one-to-one correspondence between the signal at certain times (time domain) and how certain frequencies contribute to the signal. Thus, instead of describing the statistical properties of a signal in real space one can ask about its properties in Fourier space.

The Fourier transform of a function is given by and that of a finite, discrete time series by

Here, the frequencies in physical units are , where and is the sampling interval.

A necessary condition for the Fourier analysis to be meaningful is that the time series should be piecewise stationary. However, a chaotic time series has a broad-band power spectra for which the Fourier spectrum gives no indication about the deterministic origin, let alone the fundamental invariants of the underlying dynamical system. For chaotic time series, a powerful approach is the Hilbert transform [10].

Consider a time series . One performs a mathematical transform to obtain the corresponding imaginary part , yielding the following complex signal :

is called * analytic signal* and the value of is obtained from through a transform called * Hilbert transform*:

where stands for the Cauchy principal value for the integral.

The Hilbert transform is closely related to the Fourier transform and their relationship can be seen clearly as follows. Observe that if the real signal has a Fourier transform , then the complex signal, , the spectrum of which is composed of positive frequencies of only, is given by the inverse transform of , where the integration goes only over the positive frequencies: The factor of is inserted so that the real part of the analytic signal is , not one half of that. The explicit form of can then be obtained in terms of the real signal . Because the complex signal can be written as The following mathematical identity [17]:

gives

which yields

which is the analytic signal corresponding to the real signal . The imaginary part of (2.23) is just the Hilbert transform (2.17).

The analytic signal corresponds geometrically to a rotation with amplitude and phase as follows:

By the phase variable we can obtain an instantaneous frequency

where and denote the derivatives of and to , respectively. Note that the instantaneous frequency is fundamentally different from the concept of frequency in the Fourier transform defined in the base of simple harmonic functions. Here measures the rate of rotation in the complex plane of analytic signal and is very useful in detecting the phase synchronization.

##### 2.4. Attractor Reconstruction

Reconstruction of phase space is so important that we can do nothing without doing it. For example, our just introduced dimensions and Lyapunov exponents are very important concepts to the description of chaotic attractor. However, their calculations depend on the trajectory in attractor. For a measured time series, it is just a scalar measurement from one or a few variables of the attractor but not a trajectory. Thus, we need firstly to figure out the trajectory from the given time series. How to do that is a big challenge. Fortunately, the delay-coordinate embedding technique laid by Takens [12] gives the mathematical foundation to solve this problem. He showed that if the dynamical system and the observed quantity are generic, then the delay-coordinate map from a d-dimensional smooth compact manifold to , , is a diffeomorphism on . That is, under fairly general conditions, the underlying dynamical system can be faithfully reconstructed from time series in the sense that a one-to-one correspondence can be established between the reconstructed and the true but unknown dynamical systems.

Takens' delay-coordinate method goes as follows. From a measured time series with being the sampling interval, the following vector quantity of components is constructed:

where , is the delay time which is an integer multiple of , and is the embedding dimension. Clearly, what time lag to use and what dimension to use are the central issues of this reconstruction based on delay-coordinate embedding. Let us discuss under what condition of and , the reconstructed vector can represent the true trajectory of the unknown attractor.

###### 2.4.1. Embedding Dimension

To have a faithful representation of the true dynamical system, the embedding dimension should be large enough. Takens' theorem provides a lower bound for . Let us figure out this bound. We note that in a space of dimension one subspace of dimension and another subspace of dimension generically intersect in subspaces of dimension If this is negative, then there is no intersection of the two subspaces. Figure 4 shows examples. In Figure 4(a) we have and , thus we obtain , which means that the intersection set consists of points, and the intersections are generic because small perturbations cannot remove them. In Figure 4(b) we have and , thus we obtain , which means that two one-dimensional curves do not intersect generally in a three-dimensional space. In Figure 4(c) we have , and , thus we obtain again, which means that the intersections are generic. To obtain a one-to-one correspondence between points on the invariant sets in the actual and reconstructed phase spaces, self-intersection must not occur. Otherwise, there will be two directions at the self-intersection, which will destroy the one-to-one correspondence.

**(a)**

**(b)**

**(c)**

When we ask about the intersection of two subspaces of the same dimension , namely, the orbit with itself, then intersections fail to occur when or The question of unfolding a set of dimension from projections to lower dimensions concerns self-intersections of the set with itself as the projections are made. So the relevant criterion for unfolding a single object is . We wish be an integer and let the lowest which satisfies the unfolding condition be the embedding dimension . Thus, we have , where means taking integer. For example, the box counting dimension of the strange attractor for the Lorenz model is which would lead us to anticipate to unfold the Lorenz attractor.

In some situations, the needed embedding dimension is very large such as the EEG (electroencephalogram) data. As large embedding dimension requires long time series ( points) and are thus computationally expensive, we hope to relax the condition , especially for calculating some statistical quantities such as Lyapunov exponents and fractal dimension. To reduce the cost of calculation, we may choose , provided that the self-intersections is neglectable. Schroer et al. show that, provided that the dimension of measurement space is large than the information dimension of the underlying dynamics, a prediction based on the reconstructed self-intersecting attractor is possible * most of the time* [18].

###### 2.4.2. Time Delay

The embedding theorem is silent on the choice of time delay to use in constructing -dimensional data vectors. Considering that the in (2.26) are serving as independent variables in the reconstruction space, they should be independent from each other and thus the delay time needs to be chosen carefully. In this sense, the chosen of should satisfy three conditions.

(1)It must be some multiple of the sampling time , since we only have data at those times.(2)If is too short, the coordinates and will not be independent enough and the reconstructed attractor will be accumulated around the diagonal line. That is, we will not have seen any of the dynamics unfold in that time.(3)Finally, since chaotic systems are intrinsically unstable, if is too large, any connection between the measurements and is numerically tantamount to being random with respect to each other.Therefore, the delay time should be large enough so that and are rather independent, but not so large that they are completely independent in a statistical sense. This is a difficult problem and can be solved by the mutual information whose first minimum is the good candidate of .

###### 2.4.3. Mutual Information

For a measured data , we create a histogram for the probability distribution of the data. Denote by the probability that the signal assumes a value inside the th bin of the histogram, and let be the probability that is in bin and is in bin . Then the mutual information for time delay reads In the special case of the joint probabilities and the expression yields the Shannon entropy of the data distribution. Apart from this degenerate case, the value of the mutual information is independent of the particular choice of histogram, as long as it is fine enough. In the limit of large , and have nothing to do with each other and thus factorizes to and the mutual information becomes zero. The mutual information between measurements and time lagged measurements is both easy to evaluate directly from the time series and easy to interpret. A very nice property of average mutual information is that it is invariant under smooth changes of coordinate system.

Fraser and Swinney suggest that one use the function as a kind of nonlinear autocorrelation function to determine the optimal [19]. The first minimum of marks the time lag value to use in time delay reconstruction of phase space where adds maximal information to the knowledge we have from , or, in other words, the redundancy is least. Let us take the Lorenz system as an example. Figure 5(a) shows how the mutual information changes with when the parameters are taken as . It is easy to see that the first minimum of occurs at . Figures 5(b) to 5(d) represent the reconstructed attractors by , respectively. Obviously, Figure 5(c) reconstructed by the first minimum of reflects the structure of double-scroll best.

**(a)**

**(b)**

**(c)**

**(d)**

The time delay can be also chosen by other methods such as the autocorrelation function [7]. It is pointed out the delay spanned window , rather than and separately, is a crucial issue for the attractor reconstruction. The reason is that the window determines the characteristics of the correlation integral whose linear part gives the correlation dimension . However, the first minimum of mutual information is not consistently successful in identifying the optimal window [20]. The optimal window length for successful embedding should satisfy with being the mean orbital period of the underlying chaotic system and is approximated from the oscillations of the time series [21].

#### 3. Modeling and Forecasting: Applications of Chaotic Time Series Analysis

The time series analysis has been widely applied in all the fields where the dynamical equations are unknown and only one or a few data are available. The most popular application is to calculate the dynamical parameters which partially reflect the properties of the attractor. For example, from a time series one can first use the delay-coordinate embedding technique to reconstruct the phase space and then calculate its dimensions and Lyapunov exponents and so forth. by the methods introduced in the previous section. As this part is relatively easy and straightforward, we will not focus on it in this section. We would like to introduce a few typical cases where the time series analysis is necessary and useful tool to tell more information on the system, not only limited to the attractor.

##### 3.1. Constructing Dynamical Equations

The most important question of time series analysis may be that: Is it possible to figure out the ordinary differential equations (ODEs) from a scalar time series, which govern the behavior of the system? This problem has been well studied and the answer is yes. A lot of methods have been developed to solve this problem. We here introduce two of them, that is, the standard approach and synchronization approach

###### 3.1.1. The Standard Approach

This approach assumes every ODE can be written in a standard form [22, 23]

where is an observable, is a polynomial of an order : For example, the Rössler system , , can be rewritten as , , . The Lorenz system , , can be rewritten as , , . Then the task is to determine the coefficient from the data.

Reference [24] points that the above method is inefficient in many situations. In particular, universal models often contain a large number of coefficients and demonstrate divergent solutions. For solving this problem, [24] gives a modified approach as follows. Its main idea is to let the function depend explicitly on time and takes the form

where the linear terms and is chosen for simple and effective approximation. In general, using higher powers of and are also possible. For (3.3), we still have one parameter to be determined, that is, the driving frequency or driving period . The idea to find is as follows [25]. First, a sufficiently big value of a polynomial order is selected. Second, an initial estimate of the period value is found (this estimate can be derived as a location of a peak in the power spectrum of the observed series). At this value of , one obtains the values of , , , and an approximation error by the linear least squares method. Then, the trial value of is varied through a certain range near the initial estimate and approximation is performed for each of the trial values. The graph of has a sharp and deep minimum which corresponds quite precisely to the “true” value of the driving period.

###### 3.1.2. Synchronization Approach

Recently, Sorrentino and Ott proposed an adaptive strategy that, by using synchronization, is able to obtain a set of ODE that describes the unknown real system [26]. They assume that the ODEs governing the system dynamics are expressible or approximately expressible in terms of polynomials of an assigned degree. Then they extract the whole set of parameters of the unknown system from knowledge about the dynamical evolution of its state vector and its first derivative. The detailed steps are as follows [26].

Consider a system with and , where is a degree two polynomial

For example, in the case of Rössler system, and , all the coefficients (, and ) are zero, except , , , and . Although the system function is unknown, it is appropriate to try to model the dynamics of the true system by with Then the task is how to make evolve to the true coefficient . To accomplish this goal, we envision coupling the true system to the model system (3.4). It is then hoped that when the synchrony is achieved, the model coefficients will be a good approximation to the corresponding coefficients of the real system.

The coupling from the true system to the model is designed as follows: The quantity is in general an vector of observable scalar quantities that are assumed to be known functions of the system state . is an constant coupling matrix. Here we assume and , where is a scalar quantity and is the identity matrix of dimension . To obtain the synchronized solution , we introduce a potential where denotes the sliding exponential average . Note that is a function of time and also of the coefficients , and . Also note that if are chaotic, then the quantities , vary chaotically as well. It is easy to see that the potential satisfies . Once , we have synchronization and the coefficients are obtained.

To make , we let the parameters , and adaptively evolve according to the following gradient descent relations:

. Our hope is that , and will converge under this evolution to the true parameter values, .

We consider the first equation of (3.8). Let denote evaluated at , we have . Substituting this into the right-hand side of the first equation of (3.8), we obtain Similarly letting denote evaluated at , the second equation of (3.8) gives Finally, we consider the third equation of (3.8) with denote evaluated at . Then

We here consider the case where are very large. For this situation the solutions , and rapidly converge to the minimum of the potentials, indicating we can set the averages on the right-hand side of (3.9)–(3.11) to zero. We further consider that the average is performed over a time scale , which is much larger than the time scale for variation in , in which case , and vary slowly compared to . Under these conditions, (3.8) and (3.9)–(3.11) then yield

Equation (3.12) constitutes a system of linear equations for the quantities , and . In practice, it is inconvenient to explicitly calculate the integrals for these quantities in terms of the form at every time step. Instead we will use the fact that satisfies the differential equation and obtain as a function of time by solving (3.14). Thus the adaptive system for finding estimates of the quantities , and is (3.6) for and (3.12) for , and , where the various terms in (3.12) are of the form obtained by integrating (3.14). Sorrentino and Ott have applied this method to both the Rössler and Lorenz systems and confirmed that it works well [26].

##### 3.2. Detecting the Phase Locking

Another important application of time series analysis is in biology and medical sciences. It is found that in living systems, synchronization is often essential in normal functioning, while abnormal synchronization can lead to severe disorders. For example, the EEG data from human brain shows that synchronization under normal conditions seems to be essential for the binding problem; whereas epilepsies are related to abnormally strong synchronization [27–29]. In principal, the synchronization relationship between two time sequences may be phase synchronization (PS), generalized synchronization (GS), Lag synchronization (LS), and complete synchronization (CS), and so forth, depending on the coupling strength [30].

For two coupled identical systems, one may observe CS where there is an invariant synchronization manifold [31]. And for two coupled nonidentical systems, it may show PS and then LS when coupling is increasing [32]. At PS, their frequencies are locked whereas their amplitudes remain chaotic and uncorrelated [33–35]. And at LS, the corresponding variables become identical after the transformation of time shift. For two coupled different systems, we may observe GS when coupling is large enough [36–45], where there is a functional relation between their states.

CS is very easy to be detected by directly checking the identity between the measured two time series. However, it is not so easy to detect the PS, LS and GS. To detect PS, we first need to calculate the phase from the two time series by the Hilbert transform. If they satisfy the condition where are integers, the two time series are phase-locking or in PS. To check LS, we calculate a similarity function :

which is a time-averaged difference between the variables and taken with the time shift . Then we search for its minimum . If there exists a with , we have LS, that is, . Figure 6 shows the similarity function for two coupled nonidentical Rössler systems where . From Figure 6 we see that the LS is possible for .

To check GS, there is a convenient method, that is, the auxiliary system approach [38]. The idea is as follows. Consider two coupled systems . Then we construct an auxiliary response system identical to , link it to the driving system in the same way as is linked to . That is, . Instead of checking the relationship between and , we check the relationship between and . If there is an identical synchronization between and , then we have a GS between and . Unfortunately, this method fails for the time series where the dynamical equations are unknown. An alternative approach to detect the GS is to estimate the maximal conditional Lyapunov exponent . There is GS if . The conditional Lyapunov exponent is determined by the variational equation of the response system at where denotes the Jacobian matrix with respect to the variable. For a dynamical system with no explicit driving and response parts, the GS can be determined by the transverse Lyapunov exponent which is from the variational equation on the invariant manifold . Its detail can be found in [41, 46].

For the case of nonstationary time series, we do not have a fixed relationship like the CS, PS, LS, and GS for all the time, that is, the synchronized relation may change with time such as in the EEG data with epileptic seizure. An approach based on the permutation entropy may be useful for this case [47, 48]. Let us briefly introduce the concept of entropy. In probability theory, entropy quantifies the uncertainty associated to a random process. Consider an experiment with outcome . Assume that the probability of is with . If has a probability very close to , then in most experiments the outcome would be thus the result is not very uncertain. One does not gain much information from performing the experiment. One can quantify the “surprise” of the outcome as . The entropy associated to the experiment is

which is simply the expectation value of the information produced by the experiment. Entropy quantifies the information content, namely, the amount of randomness of a signal.

For two measured scalar time series data we divide the long time series into segments with fixed finite length . Then in each segment denoted as (), we use the technique of sliding window analysis to partition the time series data into short sequences of a given length . Each shifting in the time series corresponds to a new short sequence. For example, and are different sequences. For a time series with length , the total short sequences in one segment will be . Inside a short sequence, we distinguish it as a pattern by the natural position order. Hence we have four different patterns for , such as , , , and . Using to denote the probability that one of the patterns appears in the time series of a segment, we can define the permutation entropy as Because the time series is always changing with time, changes with time period interval and will be determined by the local topological structure of the time series. For two time series with GS, we cannot expect them to have the same permutation entropy in the corresponding time period because they are not identity, but we would expect the corresponding have similar changing tendency. That is, we require the changing tendency of will be in step if there is GS between the two time series and . By this way we can figure out the degree of GS [47, 48].

##### 3.3. Inferring Coupling Direction

Sometimes, knowing the synchronized relationship between two time series is not enough, and we need to know more information about them such as which one is the driving system and which one is the response system. For example, in the prediction and control of epileptic seizure, we need to find out the focus. This problem has been well studied in the past decade and a number of approaches have been proposed, such as the cross-correlation functions, cross-spectral analysis, Granger causality, the nearest neighbors, and phase dynamical modeling [49–53]. We here introduce typical three of them: the Granger causality, the nearest neighbors, and the phase dynamical modeling.

###### 3.3.1. Granger Causality

Granger causality is based on the notion of predictability [49]. In general, in a coupled system that involves two interacting fields, Granger causality tests whether past values of one field () statistically help to predict the current values of the other field () better than using past values of alone. Should past values of contain information about current values of beyond that contained in the preceding sequence (or any other variables contained in the information set), variability in the field is said to “Granger causal” variability in the field. Similarly, we can test whether previous values of cause variability in the present values of .

Consider two time series and . If there is a causal relation between process and process , the prediction of signal can be improved by incorporation into the model the past of signal . As a result, we have a “joint”/bivariate autoregressive (AR) model where and are polynomials that can be determined from the current data. is the correlation length to the previous values, and describes “inertial” properties of the influence. If , then the influence is instantaneous, otherwise it is nonlocal in time. Different values of and need to be tested in order to select those values that provide the most faithful results. In the framework of linear AR, we have In (3.22) the coefficients and are selected in order to minimize the mean square error

The null hypothesis that does not Granger cause is supported when , which reduces (3.23) to

where the coefficients are selected in order to minimize the mean square error

When appears to be less than the , it is assumed that process influences process . This model leads to the well-known alternative test statistics, the Granger-Sargent test [54]

Thus, the influence of on is characterized by the value of the normalized prediction improvement and the reverse influence of on , is described by an equation similar to (3.26) in which and should be interchanged.

###### 3.3.2. The Nearest Neighbor Approach

This method is based on the existence of GS and is a nonlinear prediction approach [55–57]. Since many features of real data such as EEG signals cannot be generated by linear models, it is generally argued that nonlinear measures are likely to give more information than the one obtained with conventional linear approaches. The main idea is that for a driver/response system with , the response system will follow the driver system . Therefore, the nearest neighbors of will have corresponding nearest neighbors of but the inverse does not work. Based on this feature, the direction of coupling can be figured out.

Consider two time series and . Let us reconstruct delay vectors and , where , is the embedding dimension, and denotes the time lag. Let and , , denote the time indices of the nearest neighbors of and , respectively. Figure 7 shows the schematic illustration with . For each , the square mean Euclidean distance to its neighbors is defined as and the -conditional squared mean Euclidean distance is defined by replacing the nearest neighbors by the equal time partners of the closest neighbors of ,

If the point cloud has an average squared radius , then if the system is strongly correlated, while if they are independent. Accordingly, we can define an interdependent measure as [28, 56, 57]

Since by construction, we have

Low values of indicate independence between and , while high values indicate synchronization. This dependence becomes maximal when .

The opposite dependence is defined in complete analogy. It is in general not equal to . Both and may be of order . Thus, can depend on , and at the same time can depend on . If , that is, if depends more on than vice versa, we say that is more “active” than .

This approach is good at the case of only two coupled systems. For the case of multiple coupled systems, we need to pay attention to not only the values of or but also their difference. Take a ring of three unidirectionally coupled oscillators as an example. We will obtain that both the values of and between the oscillators and are greater than zero, but it does not means that their coupling is bidirectional. That is, the indirect coupling should be also considered.

###### 3.3.3. Phase Dynamical Modeling

This approach is presented by Rosenblum and Pikovsky [58] and widely applied in real situations [59, 60]. The main idea is to use a general property that a weak coupling affects the phases of interacting oscillators, whereas the amplitudes remain practically unchanged. In detail, consider a case of weak coupling where the dynamics can be reduced to those of two phases :

where the small parameters characterize the strength of coupling and the random terms describe noisy perturbations that are always present in real-world systems. Functions are -periodic in all arguments. If the coupling is bidirectional, both and will be greater than zero. In the case of an unidirectional driving, say from system 1 to system 2, we have and . System (3.31) describes the phase dynamics of weakly coupled noisy limit cycle oscillators, Josephson junctions, and phase locked loops, as well as phase dynamics of weakly coupled continuous-time chaotic systems [58].

The coupling direction is represented by the ratio between and , thus the goal is to estimate the ratio from the measured time series. To do it, we first extract the phase from data by Hilbert transform approach, where , is the sampling interval, . Then we compute for each time point the increments , is a time delay. These increments can be considered as generated by some unknown two-dimensional noisy map . Next, we fit (in the least mean square sense) the dependencies of on and using a finite Fourier series as the probe function: The sum in (3.32) is taken to the terms with for , for , and . The results of fitting are used to quantify the cross-dependencies of phase dynamics of two systems by means of the coefficients defined as

Finally, the * directionality index* is introduced as
In the case of unidirectional coupling , while for nearly symmetrical coupling is close to zero.

##### 3.4. Other Applications

###### 3.4.1. Coherence Resonance Induced by Noise

When two chaotic systems are coupled bidirectionally, there will be a synchronization when the coupling strength is strong enough. However, when there is noise in the system, the synchronization manifold will be destroyed from time to time, resulting in some kind of bursting. It is revealed that the time intervals of bursting will show some regularity, which can be enhanced by an optimal external noise and results in a phenomenon called coherence resonance [61, 62]. Coherence resonance is referred to the fact that noise can actually be utilized to improve the temporal regularity of the bursting time series, in the absence of an external periodic signal.

Consider two coupled Lorenz systems

where is the coupling strength, are independent Gaussian noise, and quantifies the noise strength. When , there is a synchronized manifold for [61]. But for , the difference between the two systems exhibits on-off intermittence. Take two time series from the variable of the two coupled systems and let denote their difference. Figure 8 shows the time series of for three typical . It is easy to see that the bursting interval, denoted by in Figure 8(b), shows different behaviors in the three panels of Figure 8.

(a) |

(b) |

(c) |

To show the regularity of time series , Figure 9 shows the corresponding power spectra. From this figure we see that there are no peak in the spectra for both the small and large , indicating a lack of temporal regularity in the bursting time series. A pronounced peak does exist at the intermediate noise level (see Figure 9(b)), indicating the existence of a strong time-periodic component in the time series. This apparent temporal regularity can be quantified by the characteristics of the peak at a nonzero frequency in the spectrum. In particular, we utilize the quantity , where is the height of the spectral peak, is the half width of the peak [63]. By its definition, a high value of indicates a strong temporal regularity in the bursting time series. This phenomenon can be also described by another quantity , where and are the average and variance of the interval [64]. Both quantities and show a bell shape on the noise strength , that is, a resonance on , which has been confirmed by experiment [62].

(a) |

(b) |

(c) |

###### 3.4.2. Correlation of Traffic Data in Internet

Undoubtedly, the internet has become a very important tool in our daily life [65–68]. The operations on the internet, such as browsing World Wide Web (WWW) pages, sending messages by email, transferring files by ftp, searching for information on a range of topics, and shopping, have benefited us a lot. Therefore, sustaining its normal and efficient functioning is a basic requirement. However, the communication in the internet does not always march/go freely. Similar to the traffic jam on the highway, the intermittent congestion in the internet has been observed [69]. Once it happens, the accumulated packets in Internet will increase linearly with time. Figure 10 shows three status from an Internet model [70], where the top, middle and bottom curves denote the phases of congestion, busy/buffer and free, respectively. From Figure 10 it is easy to see that there exist erratic fluctuation, heterogeneity, and nonstationarity in the data. These features make the correlation difficult to be quantified.

A conventional approach to measure the correlation in this situation is by the detrended fluctuation analysis (DFA), which can reliably quantify scaling features in the fluctuations by filtering out polynomial trends. The DFA method is based on the idea that a correlated time series can be mapped to a self-similar process by integration [71–74]. Therefore, measuring the self-similar feature can indirectly tell us information about the correlation properties. The DFA method has been successfully applied to detect long-range correlations in highly complex heart beat time series [71], stock index [72], physiological signals [73], and particle condensation [74].

The DFA method is a modified root-mean-square (rms) analysis of a random walk and its algorithm can be worked out as the following steps.

(1)Start with a time series , where , and is the length of the data, and integrate to obtain where . (2)Divide the integrated profile into boxes of equal length . In each box, we fit to get its local trend by using a least-square fit. (3)The integrated profile is detrended by subtracting the local trend in each box: (4)For a given box size , the rms fluctuation for the integrated and detrended signal is calculated (5)Repeat this procedure for different box size .For scale-invariant time series with power-law correlations, there is a power-law relationship between the rms fluctuation function and the box size :

The scaling represents the degree of the correlation in the signal; the signal is uncorrelated for and correlated for . Using the DFA method to the data in Figure 10, we find that the values of for the three curves from top to bottom are , and , respectively, indicating that the data of congestion phase are correlated while data of free phase are uncorrelated.

Except the above applications, there are many other fields of time series analysis such as chaos controlling, testing for nonlinearity with surrogate data, EEG data analysis and multifractals, and stock market analysis, which have gotten widely interesting from both physics and engineers [7–9].

#### 4. Conclusions

The chaos theory and its time series analysis have been well studied in the past decades. A lot of important results have been achieved. This paper is contributed to the brief summary of chaotic time series analysis. The concepts, such as dimension, Lyapunov exponents, Hilbert transform, and attractor reconstruction, have been discussed. Several applications of time series analysis have been explained in detail, such as constructing dynamical equations, detecting the phase locking, inferring coupling direction, and coherence resonance induced by noise and correlation of traffic data in Internet. These are only a few of the applications of time series analysis, and a lot of other applications are not included here, such as the analysis of transient chaotic time series. Even in these mentioned fields, new techniques and methods are continuously showing up.

#### Acknowledgments

The author thanks Z. Ruan and L. Liu for discussions. This work is supported by the NNSF of China under Grant nos. 10775052, 10975053, and 10635040.