Abstract

The reasonability of artificial multi-point ground motions and the identification of abnormal records in seismic array observations are two important issues in the application and analysis of multi-point ground motion fields. Based on the dynamic time warping (DTW) distance method, this paper discusses the application of similarity measurement to the similarity analysis of simulated multi-point ground motions and actual seismic array records. The results show that the DTW distance method not only reflects quantitatively the similarity of a simulated ground motion field, but also offers advantages in the clustering analysis and singularity recognition of actual multi-point ground motion fields.

1. Introduction

Considerable research has shown that nonuniform ground motion input has a significant influence on the dynamic response of large-scale structures [1, 2]. The influence of multi-point ground motion must therefore be taken into account when analyzing the seismic response of such structures. Ground motion records from common stations reflect only the one- or multidimensional vibration of a single site, whereas multi-point ground motion fields are usually obtained from seismic arrays. Because actual seismic array records are scarce, the attenuation rules of actual multi-point ground motion fields are usually used to synthesize multi-point ground motions for specific site conditions and distances. In this process, the traveling wave effect and the coherence of multi-point ground motions recorded by seismic arrays at various distances are analyzed first, and the coherence between points is then obtained through statistical analysis. Finally, a method for synthesizing spatial ground motions that accounts for the coherence effect and the traveling wave effect of multi-point earthquake ground motions can be established [3-5]. With this method, the variation of ground motions with site conditions and wave propagation can be kept within a certain range at engineering scales, for both actual array records and artificially synthesized multi-point earthquake ground motions. A multi-point ground motion field at the engineering scale may therefore exhibit a certain degree of similarity. Moreover, if the site condition is homogeneous, the degree of similarity gradually decreases as the distance increases [6].

Given that multi-point ground motion fields are similar at the engineering scale, at least two issues can be addressed by determining the similarity of ground motions. First, the reliability of actual array records can be checked. Since all actual array records are obtained from ground motion recording instruments, the reliability of the records from every station cannot be guaranteed for a particular triggering earthquake. Owing to equipment failure or equipment differences, some station records may be seriously distorted. When seismic array records are used to analyze the propagation of multi-point ground motions, the distorted data should be removed to ensure the reasonability of the array records, and such data can be excluded by a similarity test of the actual multi-point ground motions. Second, the reasonability of artificially synthesized multi-point ground motions can be tested. By judging whether the artificial multi-point ground motion field is similar to an actual ground motion field, the efficiency and rationality of the synthesis method can be assessed.

For this reason, it is necessary to establish a standardized and efficient similarity test method for multi-point ground motions. Correlation coefficients are frequently used to judge the similarity of ground motions. However, a correlation coefficient mainly measures the linear relationship between two variables and is not suitable for judging the similarity of two sequences that are nonlinearly related. Spatial multi-point earthquake ground motions are often influenced by site conditions, the coherence effect and the traveling wave effect, so the correlation coefficient alone is insufficient to reflect their similarity. Figures 1 and 2 show two sets of multi-point ground motions synthesized using the trigonometric series method [7] with two different initial phase values. For condition 1, the initial phase is the actual ground motion phase; for condition 2, the initial input phase is random. In both cases, the distance interval of the synthesized ground motions is 200 m. Figure 3 shows the correlation coefficients between the artificial ground motions and the actual ground motion. For condition 2, because of the random initial phase, the nonlinear relationships among the ground motions at the various points are more prominent, which yields lower correlation among the points. It is clear that in condition 2 the similarity among the multi-point ground motions is not correctly represented by the correlation coefficient.
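The limitation of the correlation coefficient under a time shift can be illustrated with a toy signal. The following is a minimal sketch, not taken from the study: the sine wave, the sample count and the quarter-period delay are assumptions chosen only for illustration, and `pearson` is a hypothetical helper name.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Assumed toy signal: one full sine cycle sampled at 100 points.
n = 100
base = [math.sin(2 * math.pi * k / n) for k in range(n)]
# The same waveform delayed by a quarter period (25 samples).
shifted = [math.sin(2 * math.pi * (k - 25) / n) for k in range(n)]

r_same = pearson(base, base)        # close to 1
r_shifted = pearson(base, shifted)  # close to 0 despite the identical waveform
print(round(r_same, 6), round(abs(r_shifted), 6))
```

In this toy case the two waveforms have the same shape, yet the correlation coefficient is nearly zero, which is the kind of misjudgment described above for ground motions related by a time shift.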

Similarity measurement of time series is an important problem in data mining. The basic theory of similarity measurement can be used not only to reflect the intrinsic similarity of time series or eigenvectors but also to evaluate their similarity features quantitatively. Similarity measurement has been an important tool in series clustering, pattern matching, classification, rule identification and anomaly detection [8, 9]. Among the methods commonly used for similarity measurement, the dynamic time warping (DTW) distance method proposed by Berndt and Clifford [10] is the most widely used. It can effectively determine the similarity between two series in the presence of amplitude shift and stretching and of bending of the time axis [11]. In this study, the DTW method is applied to the similarity determination and singularity recognition of multi-point ground motions.

2. The Basic Principle of Similarity Measurement Method

In general, similarity measurement between series means that a function $Sim(X, Y)$ can be defined, where $X$ and $Y$ are two different time series from the same type of data set. The value of the function lies in $[0, 1]$: the larger the value, the higher the degree of similarity between the two series, and vice versa. In particular, two series are completely similar when $Sim(X, Y) = 1$. For simple time series, the similarity measurement function can be represented by the correlation coefficient or the cosine value between the two series, whereas for complex data it is difficult to represent the degree of similarity accurately with such functions. Therefore, the degree of similarity is generally represented by defining a specific distance between the two series, called the similarity distance. Several definitions of the similarity distance are available; the most frequently used is the Minkowski distance, defined for two series $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ of equal length $n$ as

$$ d_p(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}. \qquad (2.1) $$

When $p = 2$, the distance between the two series is called the Euclidean distance.
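As a small numerical illustration of the Minkowski distance, the sketch below evaluates it for two short series. The function name `minkowski` and the sample values are assumptions made for this example only.

```python
def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length series."""
    if len(x) != len(y):
        # The point-to-point definition requires series of the same length.
        raise ValueError("series must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# Assumed toy series, chosen so the distances are easy to verify by hand.
x = [0.0, 3.0, 0.0]
y = [0.0, 0.0, 4.0]

d1 = minkowski(x, y, p=1)  # |3| + |-4| = 7
d2 = minkowski(x, y, p=2)  # sqrt(9 + 16) = 5, the Euclidean distance
print(d1, d2)
```

The length check makes the point-to-point correspondence explicit: the distance is undefined when the two series differ in length.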

According to formula (2.1), if two series are equal, their distance is zero, which means that the two series are completely similar. A larger difference between two series yields a larger distance and less similarity. Calculating the Minkowski distance requires that the two series have the same length, that their values correspond point to point, and that every pair of differences carries the same weight. Because of this correspondence, the Minkowski distance cannot be applied to the similarity measurement of complex series with amplitude shift and stretching. To solve this problem, the dynamic time warping distance method is often used. In this method, the distance is designed to capture the greatest similarity between the series by calculating the minimum distance between them, and it is defined as follows.

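Let $X = (x_1, x_2, \ldots, x_m)$ and $Y = (y_1, y_2, \ldots, y_n)$ be two series of lengths $m$ and $n$, respectively. An $m \times n$ matrix can be defined to represent the point-to-point correspondence between $X$ and $Y$, where the element $d(x_i, y_j)$ indicates the distance between $x_i$ and $y_j$. The point-to-point alignment and matching relationship between $X$ and $Y$ can then be represented by a time warping path $W = (w_1, w_2, \ldots, w_K)$, where the element $w_k = (i, j)_k$ indicates the alignment and matching relationship between $x_i$ and $y_j$. If a path is the lowest-cost path between the two series, the corresponding dynamic time warping distance is required to satisfy

$$ D_{dtw}(X, Y) = \min_{W} \sum_{k=1}^{K} d(w_k), $$

where $d(w_k)$ indicates the distance $d(x_i, y_j)$ associated with the element $w_k = (i, j)_k$ on the path $W$.

The formal definition of the dynamic time warping distance between two series is then

$$ D_{dtw}(\langle\rangle, \langle\rangle) = 0, $$
$$ D_{dtw}(X, \langle\rangle) = D_{dtw}(\langle\rangle, Y) = \infty, $$
$$ D_{dtw}(X, Y) = d(x_1, y_1) + \min \{ D_{dtw}(X, Rest(Y)),\ D_{dtw}(Rest(X), Y),\ D_{dtw}(Rest(X), Rest(Y)) \}, $$

where $\langle\rangle$ indicates the empty series, $Rest(\cdot)$ indicates the subarray consisting of the second through the final element of a one-dimensional array, and $d(x_i, y_j)$ indicates the distance between the points $x_i$ and $y_j$, which can be represented by different distance measures, for example, the Euclidean distance. The distance between two time series can be calculated by the dynamic programming method based on the accumulated distance matrix [10], whose algorithm mainly consists of constructing the accumulated distance matrix

$$ r(i, j) = d(x_i, y_j) + \min \{ r(i-1, j-1),\ r(i-1, j),\ r(i, j-1) \}. $$

Any element $r(i, j)$ of the accumulated matrix indicates the dynamic time warping distance between the subseries $(x_1, \ldots, x_i)$ and $(y_1, \ldots, y_j)$. Because the dynamic time warping distance is defined through the best alignment and matching relationship between two series, similarity can be identified effectively even between series with complex structure.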
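The accumulated-matrix construction described above can be sketched in a few lines of dynamic programming. This is a minimal illustration rather than the implementation used in the study: the function name `dtw_distance`, the absolute difference as the default point distance, and the toy series are all assumptions.

```python
import math

def dtw_distance(x, y, dist=lambda a, b: abs(a - b)):
    """DTW distance via the accumulated distance matrix (O(m*n) time)."""
    m, n = len(x), len(y)
    # r[i][j] holds the DTW distance between x[:i] and y[:j].
    # Row and column 0 encode the empty-series cases: 0 for (empty, empty),
    # infinity when exactly one of the two prefixes is empty.
    r = [[math.inf] * (n + 1) for _ in range(m + 1)]
    r[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = dist(x[i - 1], y[j - 1])
            r[i][j] = cost + min(r[i - 1][j - 1], r[i - 1][j], r[i][j - 1])
    return r[m][n]

# Assumed toy series: b repeats the first sample of a, c doubles the amplitude of a.
a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1, 0]
c = [0, 2, 4, 2, 0]

d_ab = dtw_distance(a, b)  # 0.0: warping absorbs the repeated sample
d_ac = dtw_distance(a, c)  # 4.0: the amplitude difference remains
print(d_ab, d_ac)
```

Note that `a` and `b` have different lengths, so the Minkowski distance is not defined for them, while their DTW distance is zero because the warping path matches the repeated sample to the same point.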

3. Similarity Analysis of the Simulated Multi-Point Ground Motions

Artificial multi-point ground motion fields play a significant role in structural dynamic analysis. Current methods for generating artificial ground motions mainly include the trigonometric series method and the Green function method [3]. According to the general rule of multi-point ground motion fields at the engineering scale, the fields simulated using these methods should be highly similar. Owing to site effects, wave propagation and other factors, multi-point ground motions often show relatively complex nonlinear relationships such as time shift and stretching. The correlation coefficient method shown in Figure 3 therefore cannot accurately represent the similarity of simulated ground motions. For ground motions at points with different distances, the dynamic time warping distances between them and the input ground motion can be calculated with the method discussed above. Figure 4 shows the dynamic time warping distance between each point and the input ground motion under condition 2 (Figure 2). It can be seen that the dynamic time warping distances between the points and the input ground motion are all around 40, indicating that the ground motions at the points have nearly the same degree of similarity to the input.

However, because different multi-point ground motion fields have different dynamic time warping distances, the degree of similarity between the ground motions at various points cannot be read directly from the distance. If a maximum distance (that is, the lowest similarity) can be approximately defined between the input ground motion and an assumed time series, the degree of similarity of ground motions can be calculated numerically. In other words, if the dynamic time warping distance between a particular time series and the input ground motion is regarded as the reference standard of maximum distance, the degree of similarity between that particular time series and the input ground motion can be considered zero. In this paper, based on the basic understanding that the values of a ground motion do not remain at the peak value as time increases, the particular time series is approximately assumed to be a series whose value remains constant at the peak value of the input ground motion. If the dynamic time warping distance between this particular series and the input ground motion is defined as D (the maximum distance), then, from the dynamic time warping distances between the points and the input ground motion, the approximate degree of similarity between the points and the input ground motion can be calculated as follows:
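The normalization described above can be sketched as follows. The linear form `1 - d / D` is an assumption: it is one natural mapping that gives zero similarity at the maximum distance D and full similarity at zero distance, and it is not confirmed by the text. Using the largest absolute value as the peak, the clipping at zero, the names `similarity_degree` and `dtw_distance`, and the sample values are likewise illustrative.

```python
import math

def dtw_distance(x, y):
    """DTW distance with the absolute difference as the point distance."""
    m, n = len(x), len(y)
    r = [[math.inf] * (n + 1) for _ in range(m + 1)]
    r[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(x[i - 1] - y[j - 1])
            r[i][j] = cost + min(r[i - 1][j - 1], r[i - 1][j], r[i][j - 1])
    return r[m][n]

def similarity_degree(reference, candidate):
    """Assumed linear normalization: 1 - d / D, clipped at zero."""
    peak = max(abs(v) for v in reference)
    # Reference series of lowest similarity: constant at the peak value.
    constant_peak = [peak] * len(reference)
    d_max = dtw_distance(reference, constant_peak)  # D, the maximum distance
    if d_max == 0.0:
        raise ValueError("reference is constant at its peak; D is zero")
    d = dtw_distance(reference, candidate)
    return max(0.0, 1.0 - d / d_max)

# Assumed toy input motion (arbitrary units).
reference = [0.0, 0.5, 1.0, -0.8, 0.3, 0.0]
delayed = [0.0, 0.0, 0.5, 1.0, -0.8, 0.3, 0.0]

s_identical = similarity_degree(reference, reference)    # full similarity
s_flat = similarity_degree(reference, [1.0] * 6)          # lowest similarity
s_delayed = similarity_degree(reference, delayed)         # delay absorbed by warping
print(s_identical, s_flat, s_delayed)
```

Under this assumed normalization, a record that differs from the input only by a short delay still receives a similarity close to one, whereas the constant-peak series receives zero by construction.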

Figure 5 shows the degree of similarity between the points of the simulated multi-point ground motion field in Figure 2 and the input ground motion. Compared with the similarity results obtained using the correlation coefficient (Figure 3), the degree of similarity obtained from the dynamic time warping distance is more reasonable.

4. Clustering Analysis and Singularity Recognition of Multi-Point Ground Motions

More generally, if a certain ground motion in an unknown multi-point ground motion field is seriously distorted, it can also be identified effectively using the dynamic time warping distance. The DTW method thus provides a basis for the clustering analysis and singularity recognition of multi-point ground motion data. To validate the efficiency of this method, the real ground motion recorded at Imperial Valley, California, USA (El Centro 1940) is used; its acceleration time history is shown in Figure 6.

Firstly, an experiment is performed using the artificially simulated ground motion fields with multi-point similarity. Figures 7 and 8 show the time-history curves of the artificial ground motions at some of the points shown in Figures 1 and 2. A comparison of the propagation rule and waveform of El Centro NS with those of the artificial motions reveals obvious differences. If the El Centro wave is substituted for the original input seismic wave, a multi-point ground motion field that includes an artificial singular ground motion is obtained. In this artificially synthesized ground motion field, the El Centro wave should clearly be the singular seismic wave. Using the dynamic time warping distance method, the similarities within the multi-point ground motion fields of Figures 1 and 2 are calculated, and the results are shown in Tables 1 and 2, respectively. In both tables, the dynamic time warping distances between the El Centro wave and all other points (shown in bold) are consistently anomalous. It can be concluded that the El Centro wave is the singular seismic wave, while the remaining ground motions belong to the same type of ground motion field.
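The tabulation of pairwise distances and the identification of the anomalous record can be sketched as below. This is an illustrative sketch under several assumptions: the decision rule (flag the record with the largest mean DTW distance to all others) is not stated in the text, the toy sine waves and the alternating foreign record stand in for the synthesized motions and the El Centro wave, and all function names are hypothetical.

```python
import math

def dtw_distance(x, y):
    """DTW distance with the absolute difference as the point distance."""
    m, n = len(x), len(y)
    r = [[math.inf] * (n + 1) for _ in range(m + 1)]
    r[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(x[i - 1] - y[j - 1])
            r[i][j] = cost + min(r[i - 1][j - 1], r[i - 1][j], r[i][j - 1])
    return r[m][n]

def find_singular_record(records):
    """Return (index, table): the record with the largest mean DTW distance
    to the others (an assumed rule) and the full pairwise distance table."""
    n = len(records)
    table = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            table[i][j] = table[j][i] = dtw_distance(records[i], records[j])
    mean_dist = [sum(row) / (n - 1) for row in table]
    singular = max(range(n), key=lambda i: mean_dist[i])
    return singular, table

def toy_wave(delay, scale, length=60, period=30):
    """Assumed stand-in for one station: a delayed, attenuated sine wave."""
    return [scale * math.sin(2 * math.pi * (k - delay) / period)
            for k in range(length)]

# Four mutually similar records plus one unrelated record at index 2.
field = [toy_wave(0, 1.0), toy_wave(2, 0.95), toy_wave(4, 0.9), toy_wave(6, 0.85)]
foreign = [1.0 if k % 2 == 0 else -1.0 for k in range(60)]
field.insert(2, foreign)

singular_index, distance_table = find_singular_record(field)
print(singular_index)  # 2
```

Row 2 of `distance_table` plays the role of the bold entries in the tables: every distance involving the foreign record is much larger than the distances among the similar records.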

Secondly, the data recorded at the SMART1 Array located in Lotung, Taiwan during Event 40 are used (Figure 9). Similarly, the El Centro wave was introduced in the array records to calculate the dynamic time warping distance between various points. The values are shown in Table 3. It is seen that the singular earthquake ground motion can be accurately determined via dynamic time warping distance calculation.

5. Conclusion

Similarity exists among the ground motions of a multi-point ground motion field, and the degree of similarity can be accurately evaluated using the dynamic time warping distance method.

The dynamic time warping distance method is an efficient method for singularity recognition of actual array data, and it can be used in the preprocessing and clustering analysis of actual array data of multi-point ground motion field.

Acknowledgment

This work was supported by National Natural Science Foundation of China under Grant no. 90815011 and Program for New Century Excellent Talents in University of China under Grant no. 06-0765.