Mathematical Problems in Engineering
Volume 2010 (2010), Article ID 720190, 31 pages
doi:10.1155/2010/720190
Review Article

Chaotic Time Series Analysis

Institute of Theoretical Physics and Department of Physics, East China Normal University, Shanghai 200062, China

Received 25 December 2009; Accepted 7 February 2010

Academic Editor: Ming Li

Copyright © 2010 Zonghua Liu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Chaotic dynamical systems are ubiquitous in nature and most of them does not have an explicit dynamical equation and can be only understood through the available time series. We here briefly review the basic concepts of time series and its analytic tools, such as dimension, Lyapunov exponent, Hilbert transform, and attractor reconstruction. Then we discuss its applications in a few fields such as the construction of differential equations, identification of synchronization and coupling direction, coherence resonance, and traffic data analysis in Internet.

1. Introduction

Chaotic dynamical systems are ubiquitous in nature such as the tornado, stock market, turbulence, and weather. Their functions are different in different situations. For example, in the case of tornado, the chaotic behavior is harmful to human beings and need to be avoided or controlled. But in the case of the activities in human brain, the chaotic behaviors are useful and necessary to sustain the normal functions of brain. Thus, it is an important task to understand chaos and let it serve human society better.

The chaos has been studied for a long time. It is usually believed that Poincaré is the first one who studied chaos. In the later 19th century, Poincaré studied the restricted three-body problem where one body is negligible small, compared to the other two. Poincaré found that the solution of this simple system is very complicated and cannot be given precisely. Then in 1963, Lorenz revealed the “butterfly effect” in studying the weather prediction and is thus recognized as the father of chaos. But the formal use of chaos is from the paper of “Period three implies chaos” by Li and Yorke in 1975. After that, chaos have been widely studied and a lot of important concepts has been introduced, such as the dimensions, Lyapunov exponents, Fourier transform and Hilbert transform, and attractor reconstruction.

The most striking feature of chaos is the unpredictability of its future. This feature is usually called as the “sensitivity dependence on initial conditions” or “butterfly effect.” For example, if two initial conditions have a small difference 𝛿 𝑥 , their difference after time 𝑡 will be 𝛿 𝑥 𝑒 𝜆 𝑡 with 𝜆 > 0 , that is, exponential separation. Thus, a tiny difference or even a cutoff error will be blown up quickly and results in a big difference in the near future. “Similar causes have similar effects” is invalid in chaotic systems except for short periods. A system is called chaotic system, provided that its maximum Lyapunov exponent is positive.

Mathematically, chaos can be produced by both discrete and continuous equations [16]. The discrete systems can be expressed as 𝐱 𝑛 + 1 𝐱 = 𝐟 𝑛 ( 1 . 1 ) such as the Logistic map, Henon map, standard map, tent map, circle map, and Ikeda map. The continuous systems can be expressed as differential equation 𝑑 𝐱 ( 𝑡 ) 𝑑 𝑡 = 𝐅 ( 𝐱 ( 𝑡 ) ) ( 1 . 2 ) with three or more degrees of freedom 𝐱 ( 𝑡 ) = [ 𝑥 1 ( 𝑡 ) , 𝑥 2 ( 𝑡 ) , 𝑥 3 ( 𝑡 ) , , 𝑥 𝑚 ( 𝑡 ) ] . Typical chaotic flows are the Lorenz equation, Rössler equation, Duffing's equation, Chua's circuit, and so forth. The discrete map and continuous flow are not unrelated but have a close relationship. The discrete map (1.1) can be considered as a projection of flow (1.2) on a specific Poincaré surface of section. Letting 𝛿 𝑡 = 1 and taking limit, (1.1) will be transformed into (1.2). On the other hand, (1.2) can be transformed into (1.1) by integration, that is, 𝐱 𝑛 + 1 = 𝐱 𝑛 + 𝑡 + 𝑇 𝑛 𝑡 𝐅 ( 𝐱 ( 𝑡 ) ) 𝑑 𝑡 with 𝑇 𝑛 being the returning time to the Poincaré section. A common point between the discrete and continuous systems is that both are deterministic. That is, each initial condition uniquely determines the time evolution. Thus, the chaos is usually called “deterministic chaos” and can be considered as some kind of randomness produced by deterministic system. A necessary condition for (1.1) and (1.2) to show chaos is that the functions 𝐟 and 𝐅 must be nonlinear.

There are infinite unstable periodic orbits in chaos, which form invariant sets [1, 2]. An invariant set is the image of itself under time evolution. Once a trajectory is located in these invariant sets, it will stay there forever. Comparing with the dense chaotic orbits, the invariant sets are seldom and its measure is zero. In experimental situations or in numerical simulations, as there is always noise or cutoff error, an arbitrary trajectory will never stay at these invariant sets. The invariant sets are thus not observed in general. Given a long enough time, a trajectory will lose memory on its initial condition and eventually settle on a restricted geometry, called stable attractor. Once a trajectory enters the attractor, it will stay in it forever if there is no external perturbation. The trajectory will go through all the phase points there with its nature measure except the unstable periodic orbits. That is, the attractor is also an invariant set. This property is called ergodicity in chaotic attractor.

Mathematically, there are a lot of measures such as the Lebesgue measure, Hausdorff measure, counting measure, and natural measure. Suppose that the relative probability for a continuous variable 𝑥 to appear at some location is described by the probability density 𝑃 ( 𝑥 ) . The measure 𝜇 ( 𝐷 ) is the probability to find 𝑥 in an area 𝐷 , that is, 𝜇 ( 𝐷 ) = 𝐷 𝑃 ( 𝑥 ) 𝑑 𝑥 . The average of a function 𝑓 ( 𝑥 ) is

𝑓 ( 𝑥 ) = 𝑓 ( 𝑥 ) 𝑃 ( 𝑥 ) 𝑑 𝑥 = 𝑓 ( 𝑥 ) 𝑑 𝜇 ( 𝑥 ) . ( 1 . 3 ) When the measure 𝜇 is an invariant measure and the average of function 𝑓 ( 𝑥 ) on 𝜇 equals the average on time, 𝜇 becomes an ergodic measure with

𝑓 ( 𝑥 ) = l i m 𝑇 1 𝑇 𝑇 0 𝑓 ( 𝑥 ) 𝑑 𝑡 = 𝑓 ( 𝑥 ) 𝑑 𝜇 ( 𝑥 ) . ( 1 . 4 ) This measure does not include those unstable periodic orbits in the attractor and is called as natural measure. To use this measure, one waits until all transients have been wiped off and looks for an invariant probability measure describing the distribution of typical orbits. Thus, in the natural measure, we say that the chaotic trajectories are ergodic. Based on this property, many quantities are more conveniently defined as averages over the natural measure in phase space. Because of the nonperiodicity of chaotic trajectory, the attractor is usually called as “strange attractor,” in contrast to the attractors of fixed point and periodic orbits.

Practically, we usually do not have the dynamical equations (1.1) and (1.2), like the earthquake, brain and stock market, and so forth, thus the details of the system equations in the phase space and the asymptotic invariant set that determines what can be observed through experimental probes are unknown. What we can obtain is some time series from one or at best a few of the dynamical variables of the system, which poses a big challenge to characterize the chaotic systems. An interesting question is how one go from one or a few time series to the multivariate state or phase space which is required for chaotic motions to occur?

Fortunately, this problem has been intensively studied in the past decades and has now formed a subject called data mining or data analysis [711]. The basic assumption is that the measured time series comes from the attractor of the unknown system with ergodicity, which contains the information of the attractor. Then one can use the measured time series to figure out the properties of attractor such as its dimension, its dynamical skeleton, and its degree of sensitivity on initial conditions. The delay-coordinate embedding technique established by Takens [12] provides a practical solution to this task.

Suppose that an experiment is conducted and one time series 𝑠 𝑛 = 𝑠 ( 𝑛 Δ 𝑡 ) is measured, where Δ 𝑡 is the sampling interval. Such a time series can be, for instance, a voltage signal from a physical or biological experiment, or the concentration of a substance in a chemical reaction, or the amount of instantaneous traffic at a point/node in the Internet, and so on. In principle, the measured time series comes from an underlying dynamical system with (1.1) and (1.2), with or without the influence of noise. Most commonly, the time series is a consequence of scalar measurements of some quantity which depends on the current state of the system, taken at a fixed sampling time: 𝑠 𝑛 = 𝑠 ( 𝐱 ( 𝑛 Δ 𝑡 ) ) + 𝜂 𝑛 , ( 1 . 5 ) where, more often than not, the measurement function 𝑠 is unknown as 𝐅 , 𝐱 is the 𝑚 -dimensional variable and 𝜂 𝑛 is the measurement noise. Let us neglect the effect of noise at this level of presentation. Equation (1.5) becomes

𝑠 𝑛 = 𝑠 ( 𝐱 ( 𝑛 Δ 𝑡 ) ) , 𝑛 = 1 , , 𝑁 . ( 1 . 6 ) The total recording time is then 𝑇 = 𝑁 Δ 𝑡 . In general, 𝑇 should be sufficiently large so that the full dynamics are exhibited in the time series. This is what we have in hand. In the following we will show how to extract the information on the unknown attractor from (1.5) or (1.6).

2. Basic Concepts and Analytic Tools of Chaotic Time Series

To characterize the attractor, some basic concepts have been introduced such as dimensions and Lyapunov exponents [1, 2]. These quantities are invariant under the evolution operator of the system and thus are independent of changes in the initial conditions of the orbit, and both are independent of the coordinate system in which the attractor is observed. Therefore, we can evaluate them from experimental data. The advantage of these quantities is that each one of them will turn a sequence of data into a single number, which is convenient for comparison between different time series.

As an attractor is generally located in a finite area, its trajectory shows some kinds of oscillatory behavior, that is, recurrence. To show the oscillatory behavior from a given scalar time series, a convenient way is to transform it into a 2 -dimensional vector by the Hilbert transform. To obtain the attractor of the corresponding time series, we have to reconstruct an auxiliary phase space by an embedding procedure, that is, the delay embedding technique.

2.1. Dimensions

Dimension quantifies the self-similarity of a geometrical object [7, 8]. For a homogeneous object, its dimension is a fixed number; while for a heterogeneous object, its different parts may have different dimensions and need to be characterized by the multifractal dimension. We here focus on the homogeneous objects. In this situation, a finite collection of points is zero dimensional, lines have dimension one, surfaces two and bulks three, and so forth. A chaotic attractor usually has a fractal dimension, that can be characterized by several ways. Three of them are convenient for numerical calculation, which are the box-counting dimension, information dimension, and correlation dimension.

For calculating the box-counting dimension 𝐷 0 , we divide the attractor into spheres/boxes of radius 𝜖 and ask how many boxes do we need to cover all the points in the data set. If we evaluate this number 𝑁 ( 𝜖 ) as a function of 𝜖 as it becomes small, then we define the box-counting dimension by 𝐷 0 = l i m 𝜖 0 l n 𝑁 ( 𝜖 ) . l n ( 1 / 𝜖 ) ( 2 . 1 ) Take Henon map 𝑥 𝑛 + 1 = 𝑦 𝑛 + 1 1 . 4 𝑥 2 𝑛 , 𝑦 𝑛 + 1 = 0 . 3 𝑥 𝑛 as an example. Its attractor is shown in Figure 1(a), where the square lattice has length 𝜖 . We count the number of square lattice which contains at least one point of the trajectory and then obtain one point in Figure 1(b). Then we change the 𝜖 in Figure 1(a) into 𝜖 / 2 and do the same process to obtain another point in Figure 1(b). In this way, we get Figure 1(b) in which the slope of the straight line is 𝐷 0 1 . 2 7 .

fig1
Figure 1: (a) The attractor of Henon map; (b) the slope of the straight line represents the box-counting dimension 𝐷 0 .

The shortage of box-counting dimension is that it treats all the boxes with points of the trajectory as the same, no matter how many points in each box. Information dimension overcomes this shortage and is defined in terms of the relative frequency of visitation of a typical trajectory by 𝐷 1 = l i m 𝜖 0 𝐻 ( 𝜖 ) , l n ( 1 / 𝜖 ) ( 2 . 2 ) where

𝐻 ( 𝜖 ) = 𝑁 ( 𝜖 ) 𝑖 = 1 𝑃 𝑖 l n 𝑃 𝑖 . ( 2 . 3 ) 𝑃 𝑖 is the relative frequency with which a typical trajectory enters the 𝑖 th box of the covering. Comparing with the definition of 𝐷 0 , we see that the box-counting dimension is weight-free while the information dimension is weighted on the visiting frequency in the specific boxes. When 𝑃 𝑖 is a constant, 𝐷 1 will become 𝐷 0 .

When dimension is high, the calculation of both 𝐷 0 and 𝐷 1 will be time consuming and a convenient dimension for this case is the correlation dimension 𝐷 2 . Comparing with the box-counting dimension 𝐷 0 and the information dimension 𝐷 1 , the correlation dimension 𝐷 2 is much easier to be calculated and is defined by

𝐷 2 = l i m 𝜖 0 l n 𝑁 ( 𝜖 ) 𝑖 = 1 𝑃 2 𝑖 . l n 𝜖 ( 2 . 4 ) The sum in the numerator of (2.4) can be expressed as the following correlation sum [9]:

2 𝐶 ( 𝜖 ) = 𝑁 ( 𝑁 1 ) 𝑁 𝑁 𝑗 = 1 𝑖 = 𝑗 + 1 Θ | | 𝐱 𝜖 𝑖 𝐱 𝑗 | | , ( 2 . 5 ) where Θ ( ) is the Heaviside function, Θ ( 𝑥 ) = 1 for 𝑥 0 and 0 otherwise, and | 𝐱 𝑖 𝐱 𝑗 | stands for the distance between points 𝐱 𝑖 and 𝐱 𝑗 . Thus the sum just counts the pairs ( 𝐱 𝑖 , 𝐱 𝑗 ) whose distance is smaller than 𝜖 . Then (2.4) becomes

𝐷 2 = l i m 𝜖 0 l n 𝐶 ( 𝜖 ) . l n 𝜖 ( 2 . 6 ) As (2.5) is very easy to be calculated, the correlation dimension 𝐷 2 has been widely used in time series analysis.

In practice, the sum in the equation of 𝐶 ( 𝜖 ) also depends on the embedding dimension 𝑚 as 𝑥 ( 𝑡 ) depends on the embedding space. The correlation dimension can be calculated in two steps. First one has to determine the correlation sum 𝐶 ( 𝜖 ) , for the range of 𝜖 available and for several embedding dimensions 𝑚 . Then we inspect 𝐶 ( 𝑚 , 𝜖 ) for the signatures of self-similarity. If these signatures are convincing enough, we can compute a value for the dimension. Grassberger and Procaccia demonstrated that if the embedding dimension and the number of data points are large enough and if the data are sufficiently noise-free, then the function l n 𝐶 ( 𝜖 ) versus l n 𝜖 has a linear region, called the scaling region [13, 14]. The slope of the function in that linear region is 𝐷 2 . Due to such reasons, the correlation dimension 𝐷 2 usually is estimated by examining the slope of the linear portion of the plot of l n 𝐶 ( 𝜖 ) versus l n 𝜖 for a series of increasing values of 𝑚 .

2.2. Lyapunov Exponents

Lyapunov exponent is the most important quantity to chaotic systems as a positive maximal Lyapunov exponent is a strong signature of chaos. In the contrary, a zero maximal Lyapunov exponent denotes a limit cycle or a quasiperiodic orbit and a negative maximal Lyapunov exponent represents a fixed point. An 𝑚 -dimensional system has 𝑚 Lyapunov exponents with 𝜆 1 , 𝜆 2 , , 𝜆 𝑚 in descending order.

Consider a dynamical system described by (1.2). Taking variation of both sides of (1.2) yields the following equation governing the evolution of the infinitesimal vector 𝛿 𝐱 in the tangent space at 𝐱 ( 𝑡 ) : 𝑑 𝛿 𝐱 = 𝑑 𝑡 𝜕 𝐅 𝜕 𝐱 𝛿 𝐱 . ( 2 . 7 ) Solving for (2.7) gives

𝛿 𝐱 ( 𝑡 ) = 𝐀 𝑡 𝛿 𝐱 ( 0 ) , ( 2 . 8 ) where 𝐀 𝑡 = e x p ( ( 𝜕 𝐅 / 𝜕 𝐱 ) 𝑑 𝑡 ) is a linear operator that evolves an infinitesimal vector at time 0 to time 𝑡 . The mean exponential rate of divergence of the tangent vector is then given by

𝜆 [ 𝐱 ] ( 0 ) , 𝛿 𝐱 ( 0 ) = l i m 𝑡 1 𝑡 | | | | l n 𝛿 𝐱 ( 𝑡 ) | | | | 𝛿 𝐱 ( 0 ) . ( 2 . 9 ) The Lyapunov exponents are given as 𝜆 𝑖 𝐱 𝜆 ( 0 ) , 𝐞 𝑖 , ( 2 . 1 0 ) where 𝑒 𝑖 is an 𝑚 -dimensional basis vector. Obviously, each Lyapunov exponent is an average of the local divergence rates over the whole attractor. For chaotic system, values of 𝜆 𝑖 do not depend on the choice of the initial condition 𝐱 ( 0 ) because of the ergodicity.

To check whether a time series is chaotic or not, one needs to calculate the 𝜆 1 , that is, the maximal Lyapunov exponent 𝜆 m a x . This task is much easier than the calculation of all the 𝜆 𝑖 . The reason is that a chaotic trajectory will automatically go to its maximum expending direction. This property can be conveniently used to calculate the 𝜆 m a x . Similarly, a chaotic trajectory will automatically go to its maximum contracting direction if we let it do backward evolution with 𝑡 𝑡 , which can be used to calculate the smallest Lyapunov exponent 𝜆 𝑚 . Numerically, one can calculate the 𝜆 m a x as follows. Choose two very close initial points and let their distance be 𝑑 0 1 . After integrating the dynamical system for a small time interval 𝜏 , their distance will become 𝑑 𝑖 . Considering that the attractor has a finite size, it is very easy for the trajectory to reach the boundary of attractor. Once it happens, the distance 𝑑 𝑖 will not be exponentially growing. A convenient way to overcome this problem is to do renormalization, which makes the evolution begin at a small distance again. In detail, we choose a new point at the end of one trajectory and let their distance be 𝑑 0 . Doing the integration again, we can get another 𝑑 𝑖 . In this way we have 𝜆 m a x = l i m 𝑛 1 𝑛 𝜏 𝑛 𝑖 = 1 𝑑 l n 𝑖 𝑑 0 . ( 2 . 1 1 ) Figure 2 shows its schematic illustration.

720190.fig.002
Figure 2: Schematic illustration of how to numerically calculate the maximal Lyapunov exponent.

The approach shown in Figure 2 can be also applied to the case of discrete systems. Consider an 𝑚 -dimensional map 𝑥 ( 𝑖 ) ( 𝑖 = 1 , 2 , , 𝑚 ) . Let { 𝑥 𝑛 ( 𝑖 ) } be its state at time 𝑛 . We make a new state { 𝑥 𝑛 ( 𝑖 ) } by adding a small perturbation { 𝑑 𝑥 ( 𝑖 ) } . That is, { 𝑥 𝑛 ( 𝑖 ) = 𝑥 𝑛 ( 𝑖 ) + 𝑑 𝑥 ( 𝑖 ) } . Let both { 𝑥 𝑛 ( 𝑖 ) } and { 𝑥 𝑛 ( 𝑖 ) } evolve for a time period to obtain their distance { 𝑑 𝑥 𝑛 ( 𝑖 ) } . Then the maximal Lyapunov exponent 𝜆 m a x can be given by

𝜆 m a x = l i m 𝑁 1 𝑁 𝑁 𝑛 = 1 l n 𝑑 𝑥 𝑛 𝑑 𝑥 𝑛 1 , ( 2 . 1 2 ) where 𝑑 𝑥 𝑛 = 𝑚 𝑖 = 1 [ 𝑑 𝑥 𝑛 ( 𝑖 ) ] 2 .

Practically, we cannot use the above approach to experimental data or time series as there may not always have a pair of points with distance 𝑑 0 at time 𝑡 = 𝑛 𝜏 . Thus we need to do some modification on it. The detailed process is as follows. Because of the recurrence of chaotic trajectory, one can always find two close points with distance 𝑑 0 ( 𝑡 0 ) at time 𝑡 0 . Then follow their trajectory to time 𝑡 1 and measure their distance 𝑑 1 ( 𝑡 1 ) . The evolution time is 𝑡 1 𝑡 0 . After that, we do renormalization to find a new beginning point. Considering that the candidates for choosing are not sufficient, we choose one from them by the following two rules: ( 1 ) the distance 𝑑 0 ( 𝑡 1 ) must be small; ( 2 ) to keep the direction of the maximal expending, the angle between 𝑑 0 ( 𝑡 1 ) and 𝑑 1 ( 𝑡 1 ) must be small. Figure 3 shows its schematic illustration. In this way we have

720190.fig.003
Figure 3: Schematic illustration of how to calculate the maximal Lyapunov exponent from time series or experimental data.

𝜆 m a x = l i m 𝑁 1 𝑡 𝑁 𝑡 0 𝑁 𝑘 = 1 𝑑 l n 1 𝑡 𝑘 𝑑 0 𝑡 𝑘 1 . ( 2 . 1 3 )

Experimental data are generally contaminated by noise. Its influence can be minimized by using an averaging statistics when computing Lyapunov exponents. The concrete process can be taken as follows. Choose a point 𝐱 ( 0 ) and select all its neighbors with distance smaller than 𝜖 . The size of the neighborhood should be as small as possible, but large enough such that on average each reference point has at least a few neighbors. Compute the average over the distances of all neighbors to the reference part of the trajectory as a function of the relative time. Substitute this average into (2.11) or (2.13) to get the maximal Lyapunov exponent 𝜆 m a x . To calculate more Lyapunov exponents from data except the 𝜆 m a x , see [15, 16] for details.

2.3. Fourier Transform and Hilbert Transform

Fourier spectral analysis is traditionally the time series analysis method and very powerful in revealing the periodicity of time series. Its basic idea is that most signals, and all engineering signals, can be represented as a sum of sine waves. By Fourier transform, a time series is transformed into a frequency spectrum, that is, from real space to frequency domain or Fourier space. The Fourier transform establishes a one-to-one correspondence between the signal at certain times (time domain) and how certain frequencies contribute to the signal. Thus, instead of describing the statistical properties of a signal in real space one can ask about its properties in Fourier space.

The Fourier transform of a function 𝑠 ( 𝑡 ) is given by 1 ̃ 𝑠 ( 𝜔 ) = 2 𝜋 𝑠 ( 𝑡 ) 𝑒 2 𝜋 𝑖 𝜔 𝑡 𝑑 𝑡 ( 2 . 1 4 ) and that of a finite, discrete time series by

̃ 𝑠 𝑘 = 1 𝑁 𝑁 𝑛 = 1 𝑠 𝑛 𝑒 2 𝜋 𝑖 𝑘 𝑛 / 𝑁 . ( 2 . 1 5 ) Here, the frequencies in physical units are 𝜔 𝑘 = 𝑘 / 𝑁 Δ 𝑡 , where 𝑘 = 𝑁 / 2 , , 𝑁 / 2 and Δ 𝑡 is the sampling interval.

A necessary condition for the Fourier analysis to be meaningful is that the time series should be piecewise stationary. However, a chaotic time series has a broad-band power spectra for which the Fourier spectrum gives no indication about the deterministic origin, let alone the fundamental invariants of the underlying dynamical system. For chaotic time series, a powerful approach is the Hilbert transform [10].

Consider a time series 𝑥 ( 𝑡 ) . One performs a mathematical transform to obtain the corresponding imaginary part ̃ 𝑥 ( 𝑡 ) , yielding the following complex signal 𝜓 ( 𝑡 ) :

𝜓 ( 𝑡 ) = 𝑥 ( 𝑡 ) + 𝑖 ̃ 𝑥 ( 𝑡 ) = 𝐴 ( 𝑡 ) 𝑒 𝑖 𝜙 ( 𝑡 ) . ( 2 . 1 6 ) 𝜓 ( 𝑡 ) is called analytic signal and the value of ̃ 𝑥 ( 𝑡 ) is obtained from 𝑥 ( 𝑡 ) through a transform called Hilbert transform:

1 ̃ 𝑥 ( 𝑡 ) = P . V . 𝜋 𝑥 𝑡 𝑡 𝑡 𝑑 𝑡 , ( 2 . 1 7 ) where P . V . stands for the Cauchy principal value for the integral.

The Hilbert transform is closely related to the Fourier transform and their relationship can be seen clearly as follows. Observe that if the real signal 𝑥 ( 𝑡 ) has a Fourier transform 𝑆 ( 𝜔 ) , then the complex signal, 𝜓 ( 𝑡 ) , the spectrum of which is composed of positive frequencies of 𝑆 ( 𝜔 ) only, is given by the inverse transform of 𝑆 ( 𝜔 ) , where the integration goes only over the positive frequencies: 2 𝜓 ( 𝑡 ) = 2 𝜋 0 𝑆 ( 𝜔 ) 𝑒 𝑖 𝜔 𝑡 𝑑 𝜔 . ( 2 . 1 8 ) The factor of 2 is inserted so that the real part of the analytic signal is 𝑥 ( 𝑡 ) , not one half of that. The explicit form of 𝜓 ( 𝑡 ) can then be obtained in terms of the real signal 𝑥 ( 𝑡 ) . Because 1 𝑆 ( 𝜔 ) = 2 𝜋 𝑥 ( 𝑡 ) 𝑒 𝑖 𝜔 𝑡 𝑑 𝑡 , ( 2 . 1 9 ) the complex signal can be written as 1 𝜓 ( 𝑡 ) = 𝜋 0 𝑥 𝑡 𝑒 𝑖 𝜔 ( 𝑡 𝑡 ) 𝑑 𝑡 𝑑 𝜔 . ( 2 . 2 0 ) The following mathematical identity [17]:

0 𝑒 𝑖 𝜔 𝜏 𝑖 𝑑 𝜔 = 𝜋 𝛿 ( 𝜏 ) + 𝜏 ( 2 . 2 1 ) gives

1 𝜓 ( 𝑡 ) = 𝜋 𝑥 𝑡 𝜋 𝛿 𝑡 𝑡 + 𝑖 𝑡 𝑡 𝑑 𝑡 , ( 2 . 2 2 ) which yields

1 𝜓 ( 𝑡 ) = 𝑥 ( 𝑡 ) + 𝑖 𝜋 𝑥 𝑡 𝑡 𝑡 𝑑 𝑡 , ( 2 . 2 3 ) which is the analytic signal corresponding to the real signal 𝑥 ( 𝑡 ) . The imaginary part of (2.23) is just the Hilbert transform (2.17).

The analytic signal 𝜓 ( 𝑡 ) corresponds geometrically to a rotation with amplitude and phase as follows:

𝐴 ( 𝑡 ) = 𝑥 ( 𝑡 ) 2 + ̃ 𝑥 ( 𝑡 ) 2 , 𝜙 ( 𝑡 ) = a r c t a n ̃ 𝑥 ( 𝑡 ) 𝑥 ( 𝑡 ) . ( 2 . 2 4 ) By the phase variable 𝜙 ( 𝑡 ) we can obtain an instantaneous frequency 𝜔 ( 𝑡 )

𝜔 ( 𝑡 ) = 𝑑 𝜙 ( 𝑡 ) = ̇ 𝑑 𝑡 𝑥 ( 𝑡 ) ̃ 𝑥 ( 𝑡 ) ̇ 𝑥 ( 𝑡 ) ̃ 𝑥 ( 𝑡 ) 𝐴 2 , ( 𝑡 ) ( 2 . 2 5 ) where ̇ ̃ 𝑥 ( 𝑡 ) and ̇ 𝑥 ( 𝑡 ) denote the derivatives of ̃ 𝑥 ( 𝑡 ) and 𝑥 ( 𝑡 ) to 𝑡 , respectively. Note that the instantaneous frequency 𝜔 ( 𝑡 ) is fundamentally different from the concept of frequency in the Fourier transform defined in the base of simple harmonic functions. Here 𝜔 ( 𝑡 ) measures the rate of rotation in the complex plane of analytic signal and is very useful in detecting the phase synchronization.

2.4. Attractor Reconstruction

Reconstruction of phase space is so important that we can do nothing without doing it. For example, our just introduced dimensions and Lyapunov exponents are very important concepts to the description of chaotic attractor. However, their calculations depend on the trajectory in attractor. For a measured time series, it is just a scalar measurement from one or a few variables of the attractor but not a trajectory. Thus, we need firstly to figure out the trajectory from the given time series. How to do that is a big challenge. Fortunately, the delay-coordinate embedding technique laid by Takens [12] gives the mathematical foundation to solve this problem. He showed that if the dynamical system and the observed quantity are generic, then the delay-coordinate map from a d-dimensional smooth compact manifold 𝑀 to 𝑅 𝑚 , 𝑚 > 2 𝑑 , is a diffeomorphism on 𝑀 . That is, under fairly general conditions, the underlying dynamical system can be faithfully reconstructed from time series in the sense that a one-to-one correspondence can be established between the reconstructed and the true but unknown dynamical systems.

Takens' delay-coordinate method goes as follows. From a measured time series 𝑠 ( 𝑘 ) = 𝑠 ( 𝑡 0 + 𝑘 Δ 𝑡 ) with Δ 𝑡 being the sampling interval, the following vector quantity of 𝑚 components is constructed:

𝐮 ( 𝑡 ) = { 𝑠 ( 𝑡 ) , 𝑠 ( 𝑡 + 𝜏 ) , , 𝑠 ( 𝑡 + ( 𝑚 1 ) 𝜏 ) } , ( 2 . 2 6 ) where 𝑡 = 𝑡 0 + 𝑘 Δ 𝑡 , 𝜏 is the delay time which is an integer multiple of Δ 𝑡 , and 𝑚 is the embedding dimension. Clearly, what time lag 𝜏 to use and what dimension 𝑚 to use are the central issues of this reconstruction based on delay-coordinate embedding. Let us discuss under what condition of 𝜏 and 𝑚 , the reconstructed vector 𝐮 ( 𝑡 ) can represent the true trajectory of the unknown attractor.

2.4.1. Embedding Dimension

To have a faithful representation of the true dynamical system, the embedding dimension should be large enough. Takens' theorem provides a lower bound for 𝑚 . Let us figure out this bound. We note that in a space of dimension 𝑚 one subspace of dimension 𝑑 1 and another subspace of dimension 𝑑 2 generically intersect in subspaces of dimension 𝑑 𝐼 = 𝑑 1 + 𝑑 2 𝑚 . ( 2 . 2 7 ) If this is negative, then there is no intersection of the two subspaces. Figure 4 shows examples. In Figure 4(a) we have 𝑑 1 = 𝑑 2 = 1 and 𝑚 = 2 , thus we obtain 𝑑 𝐼 = 0 , which means that the intersection set consists of points, and the intersections are generic because small perturbations cannot remove them. In Figure 4(b) we have 𝑑 1 = 𝑑 2 = 1 and 𝑚 = 3 , thus we obtain 𝑑 𝐼 = 1 < 0 , which means that two one-dimensional curves do not intersect generally in a three-dimensional space. In Figure 4(c) we have 𝑑 1 = 1 , 𝑑 2 = 2 , and 𝑚 = 3 , thus we obtain 𝑑 𝐼 = 0 again, which means that the intersections are generic. To obtain a one-to-one correspondence between points on the invariant sets in the actual and reconstructed phase spaces, self-intersection must not occur. Otherwise, there will be two directions at the self-intersection, which will destroy the one-to-one correspondence.

fig4
Figure 4: Schematic illustration of intersection of the two subspaces. (a) 𝑑 1 = 𝑑 2 = 1 and 𝑚 = 2 ; (b) 𝑑 1 = 𝑑 2 = 1 and 𝑚 = 3 ; (c) 𝑑 1 = 1 , 𝑑 2 = 2 and 𝑚 = 3 .

When we ask about the intersection of two subspaces of the same dimension 𝑑 𝐴 , namely, the orbit with itself, then intersections fail to occur when 2 𝑑 𝐴 𝑚 < 0 or 𝑚 > 2 𝑑 𝐴 . ( 2 . 2 8 ) The question of unfolding a set of dimension 𝑑 𝐴 from projections to lower dimensions concerns self-intersections of the set with itself as the projections are made. So the relevant criterion for unfolding a single object is 𝑚 > 2 𝑑 𝐴 . We wish 𝑚 be an integer and let the lowest 𝑚 which satisfies the unfolding condition 𝑚 > 2 𝑑 𝐴 be the embedding dimension 𝑚 𝐸 . Thus, we have 𝑚 𝐸 = i n t ( 2 𝑑 𝐴 + 1 ) , where i n t ( ) means taking integer. For example, the box counting dimension of the strange attractor for the Lorenz model is 𝑑 𝐴 2 . 0 6 which would lead us to anticipate 𝑚 𝐸 = 5 to unfold the Lorenz attractor.

In some situations, the needed embedding dimension is very large such as the EEG (electroencephalogram) data. As large embedding dimension requires long time series ( 1 0 5 1 0 6 points) and are thus computationally expensive, we hope to relax the condition 𝑚 > 2 𝑑 𝐴 , especially for calculating some statistical quantities such as Lyapunov exponents and fractal dimension. To reduce the cost of calculation, we may choose 𝑑 𝐴 < 𝑚 < 2 𝑑 𝐴 , provided that the self-intersections is neglectable. Schroer et al. show that, provided that the dimension 𝑚 of measurement space is large than the information dimension 𝐷 1 of the underlying dynamics, a prediction based on the reconstructed self-intersecting attractor is possible most of the time [18].

2.4.2. Time Delay

The embedding theorem is silent on the choice of time delay to use in constructing 𝑚 -dimensional data vectors. Considering that the 𝑠 ( 𝑡 ) , 𝑠 ( 𝑡 + 𝜏 ) , , 𝑠 ( 𝑡 + ( 𝑚 1 ) 𝜏 ) in (2.26) are serving as independent variables in the reconstruction space, they should be independent from each other and thus the delay time 𝜏 needs to be chosen carefully. In this sense, the chosen of 𝜏 should satisfy three conditions.

(1)It must be some multiple of the sampling time 𝜏 𝑠 , since we only have data at those times.(2)If 𝜏 is too short, the coordinates 𝑠 ( 𝑡 ) and 𝑠 ( 𝑡 + 𝜏 ) will not be independent enough and the reconstructed attractor will be accumulated around the diagonal line. That is, we will not have seen any of the dynamics unfold in that time.(3)Finally, since chaotic systems are intrinsically unstable, if 𝜏 is too large, any connection between the measurements 𝑠 ( 𝑡 ) and 𝑠 ( 𝑡 + 𝜏 ) is numerically tantamount to being random with respect to each other.

Therefore, the delay time 𝜏 should be large enough so that 𝑠 ( 𝑡 ) and 𝑠 ( 𝑡 + 𝜏 ) are rather independent, but not so large that they are completely independent in a statistical sense. This is a difficult problem and can be solved by the mutual information whose first minimum is the good candidate of 𝜏 .

2.4.3. Mutual Information 𝐼 ( 𝜏 )

For a measured data 𝑠 ( 𝑡 ) , we create a histogram for the probability distribution of the data. Denote by 𝑝 𝑖 the probability that the signal assumes a value inside the 𝑖 th bin of the histogram, and let 𝑝 𝑖 𝑗 ( 𝜏 ) be the probability that 𝑠 ( 𝑡 ) is in bin 𝑖 and 𝑠 ( 𝑡 + 𝜏 ) is in bin 𝑗 . Then the mutual information for time delay 𝜏 reads 𝐼 ( 𝜏 ) = 𝑖 , 𝑗 𝑝 𝑖 𝑗 ( 𝜏 ) l n 𝑝 𝑖 𝑗 ( 𝜏 ) 2 𝑖 𝑝 𝑖 l n 𝑝 𝑖 . ( 2 . 2 9 ) In the special case of 𝜏 = 0 the joint probabilities 𝑝 𝑖 𝑗 = 𝑝 𝑖 𝛿 𝑖 𝑗 and the expression yields the Shannon entropy of the data distribution. Apart from this degenerate case, the value of the mutual information is independent of the particular choice of histogram, as long as it is fine enough. In the limit of large 𝜏 , 𝑠 ( 𝑡 ) and 𝑠 ( 𝑡 + 𝜏 ) have nothing to do with each other and 𝑝 𝑖 𝑗 thus factorizes to 𝑝 𝑖 𝑝 𝑗 and the mutual information becomes zero. The mutual information 𝐼 ( 𝜏 ) between measurements 𝑠 ( 𝑛 ) and time lagged measurements 𝑠 ( 𝑛 + 𝑇 ) is both easy to evaluate directly from the time series 𝑠 ( 𝑛 ) and easy to interpret. A very nice property of average mutual information is that it is invariant under smooth changes of coordinate system.

Fraser and Swinney suggest that one use the function 𝐼 ( 𝜏 ) as a kind of nonlinear autocorrelation function to determine the optimal 𝜏 [19]. The first minimum of 𝐼 ( 𝜏 ) marks the time lag value to use in time delay reconstruction of phase space where 𝑠 ( 𝑡 + 𝜏 ) adds maximal information to the knowledge we have from 𝑠 ( 𝑡 ) , or, in other words, the redundancy is least. Let us take the Lorenz system ̇ 𝑥 = 𝜎 ( 𝑦 𝑥 ) , ̇ 𝑦 = 𝛾 𝑥 𝑦 𝑥 𝑧 , ̇ 𝑧 = 𝑏 𝑧 + 𝑥 𝑦 as an example. Figure 5(a) shows how the mutual information 𝐼 ( 𝜏 ) changes with 𝜏 when the parameters are taken as 𝜎 = 1 6 , 𝛾 = 4 5 . 9 2 , 𝑏 = 4 . It is easy to see that the first minimum of 𝐼 ( 𝜏 ) occurs at 𝜏 = 1 0 . Figures 5(b) to 5(d) represent the reconstructed attractors by 𝜏 = 1 , 1 0 , 2 0 , respectively. Obviously, Figure 5(c) reconstructed by the first minimum of 𝜏 = 1 0 reflects the structure of double-scroll best.

fig5
Figure 5: (a) The mutual information 𝐼 ( 𝜏 ) versus the time delay 𝜏 for Lorenz system; (b) to (d) represent the cases where the attractors are reconstructed by 𝜏 = 1 , 1 0 , 2 0 , respectively.

The time delay can be also chosen by other methods such as the autocorrelation function [7]. It is pointed out the delay spanned window ( 𝑚 1 ) 𝜏 , rather than 𝜏 and 𝑚 separately, is a crucial issue for the attractor reconstruction. The reason is that the window ( 𝑚 1 ) 𝜏 determines the characteristics of the correlation integral whose linear part gives the correlation dimension 𝐷 2 . However, the first minimum of mutual information is not consistently successful in identifying the optimal window [20]. The optimal window length for successful embedding 𝜏 𝑤 should satisfy 𝜏 𝑤 𝜏 𝑝 with 𝜏 𝑝 being the mean orbital period of the underlying chaotic system and is approximated from the oscillations of the time series [21].

3. Modeling and Forecasting: Applications of Chaotic Time Series Analysis

The time series analysis has been widely applied in all the fields where the dynamical equations are unknown and only one or a few data are available. The most popular application is to calculate the dynamical parameters which partially reflect the properties of the attractor. For example, from a time series one can first use the delay-coordinate embedding technique to reconstruct the phase space and then calculate its dimensions and Lyapunov exponents and so forth. by the methods introduced in the previous section. As this part is relatively easy and straightforward, we will not focus on it in this section. We would like to introduce a few typical cases where the time series analysis is necessary and useful tool to tell more information on the system, not only limited to the attractor.

3.1. Constructing Dynamical Equations

The most important question of time series analysis may be that: Is it possible to figure out the ordinary differential equations (ODEs) from a scalar time series, which govern the behavior of the system? This problem has been well studied and the answer is yes. A lot of methods have been developed to solve this problem. We here introduce two of them, that is, the standard approach and synchronization approach

3.1.1. The Standard Approach

This approach assumes every ODE can be written in a standard form [22, 23]

̇ 𝑦 1 = 𝑦 2 , ̇ 𝑦 2 = 𝑦 3 , ̇ 𝑦 𝐷 𝑦 = 𝑓 1 , 𝑦 2 , , 𝑦 𝐷 , ( 3 . 1 ) where 𝑦 1 is an observable, 𝑓 is a polynomial of an order 𝐾 : 𝑓 𝑦 1 , 𝑦 2 , , 𝑦 𝐷 = 𝐾 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 𝑐 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 𝐷 𝑗 = 1 𝑦 𝑙 𝑗 𝑗 , 𝐷 𝑗 = 1 𝑙 𝑗 𝐾 . ( 3 . 2 ) For example, the Rössler system ̇ 𝑥 = 𝑦 𝑧 , ̇ 𝑦 = 𝑥 + 𝑎 𝑦 , ̇ 𝑧 = 𝑏 + 𝑧 ( 𝑥 𝑐 ) can be rewritten as ̇ 𝑥 = 𝑦 , ̇ 𝑦 = 𝑧 , ̇ 𝑧 = 𝑎 𝑏 𝑐 𝑥 + 𝑥 2 𝑎 𝑥 𝑦 + 𝑥 𝑧 + ( 𝑎 𝑐 1 ) 𝑦 + ( 𝑎 𝑐 ) 𝑧 ( 𝑦 / ( 𝑎 + 𝑐 𝑥 ) ) ( 𝑥 + 𝑏 𝑎 𝑦 + 𝑧 ) . The Lorenz system ̇ 𝑥 = 𝜎 ( 𝑦 𝑥 ) , ̇ 𝑦 = 𝛾 𝑥 𝑦 𝑥 𝑧 , ̇ 𝑧 = 𝑏 𝑧 + 𝑥 𝑦 can be rewritten as ̇ 𝑥 = 𝑦 , ̇ 𝑦 = 𝑧 , ̇ 𝑧 = 𝑏 𝜎 ( 𝛾 1 ) 𝑥 𝑏 ( 𝜎 + 1 ) 𝑦 ( 𝑏 + 𝜎 + 1 ) 𝑧 𝑥 2 𝑦 𝑥 3 𝜎 + ( 𝑦 / 𝑥 ) [ ( 𝜎 + 1 ) 𝑦 + 𝑧 ] . Then the task is to determine the coefficient from the data.

Reference [24] points that the above method is inefficient in many situations. In particular, universal models often contain a large number of coefficients and demonstrate divergent solutions. For solving this problem, [24] gives a modified approach as follows. Its main idea is to let the function 𝑓 depend explicitly on time and takes the form

𝑓 𝑦 1 , 𝑦 2 , , 𝑦 𝐷 = , 𝑡 𝐾 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 𝑐 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 + 𝑎 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 c o s 𝜔 𝑡 + 𝑏 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 s i n 𝜔 𝑡 𝐷 𝑗 = 1 𝑦 𝑙 𝑗 𝑗 , 𝐷 𝑗 = 1 𝑙 𝑗 𝐾 , ( 3 . 3 ) where the linear terms c o s 𝜔 𝑡 and s i n 𝜔 𝑡 is chosen for simple and effective approximation. In general, using higher powers of c o s 𝜔 𝑡 and s i n 𝜔 𝑡 are also possible. For (3.3), we still have one parameter to be determined, that is, the driving frequency 𝜔 or driving period 𝑇 . The idea to find 𝑇 is as follows [25]. First, a sufficiently big value of a polynomial order 𝐾 is selected. Second, an initial estimate 𝑇 = 𝑇 of the period value is found (this estimate can be derived as a location of a peak in the power spectrum of the observed series). At this value of 𝑇 , one obtains the values of 𝑐 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 , 𝑎 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 , 𝑏 𝑙 1 , 𝑙 2 , , 𝑙 𝐷 , and an approximation error 𝜖 by the linear least squares method. Then, the trial value of 𝑇 is varied through a certain range near the initial estimate 𝑇 = 𝑇 and approximation is performed for each of the trial values. The graph of 𝜖 ( 𝑇 ) has a sharp and deep minimum which corresponds quite precisely to the “true” value of the driving period.

3.1.2. Synchronization Approach

Recently, Sorrentino and Ott proposed an adaptive strategy that, by using synchronization, is able to obtain a set of ODE that describes the unknown real system [26]. They assume that the ODEs governing the system dynamics are expressible or approximately expressible in terms of polynomials of an assigned degree. Then they extract the whole set of parameters of the unknown system from knowledge about the dynamical evolution of its state vector and its first derivative. The detailed steps are as follows [26].

Consider a system ̇ 𝐱 = 𝐹 ( 𝐱 ) with 𝐱 = ( 𝑥 1 , 𝑥 2 , , 𝑥 𝑚 ) 𝑇 and 𝐹 ( 𝐱 ) = [ 𝑓 1 ( 𝐱 ) , 𝑓 2 ( 𝐱 ) , , 𝑓 𝑚 ( 𝐱 ) ] 𝑇 , where 𝑓 𝑖 ( 𝐱 ) is a degree two polynomial

𝑓 𝑖 ( 𝐱 ) = 𝑚 𝑚 𝑗 = 1 𝑘 = 1 𝑎 𝑖 𝑗 𝑘 𝑥 𝑗 𝑥 𝑘 + 𝑚 𝑗 = 1 𝑏 𝑖 𝑗 𝑥 𝑗 + 𝑐 𝑖 . ( 3 . 4 ) For example, in the case of Rössler system, 𝑚 = 3 and 𝑓 1 = 𝑥 2 𝑥 3 , 𝑓 2 = 𝑥 1 + 0 . 1 6 5 𝑥 2 , 𝑓 3 = 0 . 2 + ( 𝑥 1 1 0 ) 𝑥 3 , all the coefficients ( 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , and 𝑐 𝑖 ) are zero, except 𝑎 3 1 3 = 1 , 𝑏 1 2 = 1 , 𝑏 1 3 = 1 , 𝑏 2 1 = 1 , 𝑏 2 2 = 0 . 1 6 5 , 𝑏 3 3 = 1 0 , and 𝑐 3 = 0 . 2 . Although the system function 𝐹 is unknown, it is appropriate to try to model the dynamics of the true system by ̇ 𝐱 = 𝐹 ( 𝐱 ) with 𝑓 𝑖 𝐱 = 𝑚 𝑚 𝑗 = 1 𝑘 = 1 𝑎 𝑖 𝑗 𝑘 𝑥 𝑗 𝑥 𝑘 + 𝑚 𝑗 = 1 𝑏 𝑖 𝑗 𝑥 𝑗 + 𝑐 𝑖 . ( 3 . 5 ) Then the task is how to make 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , 𝑐 𝑖 evolve to the true coefficient 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , 𝑐 𝑖 . To accomplish this goal, we envision coupling the true system to the model system (3.4). It is then hoped that when the synchrony is achieved, the model coefficients will be a good approximation to the corresponding coefficients of the real system.

The coupling from the true system to the model is designed as follows: ̇ 𝐱 = 𝐹 𝐱 𝐻 𝐱 + Γ ( 𝐱 ) 𝐻 . ( 3 . 6 ) The quantity 𝐻 is in general an 𝑛 𝑚 vector of 𝑛 observable scalar quantities that are assumed to be known functions of the system state 𝐱 ( 𝑡 ) . Γ is an 𝑛 × 𝑚 constant coupling matrix. Here we assume 𝐻 ( 𝐱 ) = 𝐱 and Γ = 𝛾 𝐼 𝑚 , where 𝛾 is a scalar quantity and 𝐼 𝑚 is the identity matrix of dimension 𝑚 . To obtain the synchronized solution 𝐱 ( 𝑡 ) = 𝐱 ( 𝑡 ) , we introduce a potential Ψ 𝑖 = ̇ 𝑥 𝑖 𝑓 𝑖 ( 𝑥 1 , 𝑥 2 , , 𝑥 𝑚 ) 2 𝑣 , 𝑖 = 1 , 2 , , 𝑚 , ( 3 . 7 ) where 𝐺 ( 𝑡 ) 𝑣 denotes the sliding exponential average 𝑣 𝑡 𝑒 𝑣 ( 𝑡 𝑡 ) 𝐺 ( 𝑡 ) 𝑑 𝑡 . Note that Ψ 𝑖 is a function of time and also of the coefficients 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , and 𝑐 𝑖 . Also note that if 𝑥 1 ( 𝑡 ) , 𝑥 2 ( 𝑡 ) , , 𝑥 𝑚 ( 𝑡 ) are chaotic, then the quantities 𝑓 𝑖 ( 𝑥 1 ( 𝑡 ) , 𝑥 2 ( 𝑡 ) , , 𝑥 𝑚 ( 𝑡 ) ) , 𝑖 = 1 , 2 , , 𝑚 , vary chaotically as well. It is easy to see that the potential satisfies Ψ 𝑖 0 . Once Ψ 𝑖 = 0 , we have synchronization and the coefficients are obtained.

To make Ψ 𝑖 = 0 , we let the parameters 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , and 𝑐 𝑖 adaptively evolve according to the following gradient descent relations:

𝑑 𝑎 𝑖 𝑗 𝑘 ( 𝑡 ) 𝑑 𝑡 = 𝛽 𝑎 𝜕 Ψ 𝑖 𝜕 𝑎 𝑖 𝑗 𝑘 , 𝑑 𝑏 𝑖 𝑗 ( 𝑡 ) 𝑑 𝑡 = 𝛽 𝑏 𝜕 Ψ 𝑖 𝜕 𝑏 𝑖 𝑗 , 𝑑 𝑐 𝑖 ( 𝑡 ) 𝑑 𝑡 = 𝛽 𝑐 𝜕 Ψ 𝑖 𝜕 𝑐 𝑖 ( 3 . 8 ) 𝛽 𝑎 , 𝛽 𝑏 , 𝛽 𝑐 > 0 . Our hope is that 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , and 𝑐 𝑖 will converge under this evolution to the true parameter values, 𝑎 𝑖 𝑗 𝑘 , 𝑏 𝑖 𝑗 , 𝑐 𝑖 .

We consider the first equation of (3.8). Let ( 𝑓 𝑖 ) 𝑗 𝑘 denote 𝑓 𝑖 ( 𝑥 1 , 𝑥 2 , , 𝑥 𝑚 ) evaluated at 𝑎 𝑖 𝑗 𝑘 = 0 , we have 𝑓 𝑖 ( 𝑥 1 , 𝑥 2 , , 𝑥 𝑚 ) = 𝑎 𝑖 𝑗 𝑘 𝑥 𝑗 𝑥 𝑘 + ( 𝑓 𝑖 ) 𝑗 𝑘 . Substituting this into the right-hand side of the first equation of (3.8), we obtain 𝑑 𝑎 𝑖 𝑗 𝑘 ( 𝑡 ) 𝑑 𝑡 = 2 𝛽 𝑎 𝑎 𝑖 𝑗 𝑘 𝑥 𝑗 2 𝑥 𝑘 2 + ( 𝑓 𝑖 ) 𝑗 𝑘 𝑥 𝑗 𝑥 𝑘 ̇ 𝑥 𝑖 𝑥 𝑗 𝑥 𝑘 𝑣 . ( 3 . 9 ) Similarly letting ( 𝑓 𝑖 ) 𝑗 denote 𝑓 𝑖 ( 𝑥 1 , 𝑥 2 , , 𝑥 𝑚 ) evaluated at 𝑏 𝑖 𝑗 = 0 , the second equation of (3.8) gives 𝑑 𝑏 𝑖 𝑗 ( 𝑡 ) 𝑑 𝑡 = 2 𝛽 𝑏 𝑏 𝑖 𝑗 𝑥 𝑗 2 + ( 𝑓 𝑖 ) 𝑗 𝑥