Abstract

The aim of this study is to evaluate filtering techniques that can remove noise from a time series. For this purpose, the Logistic series, which is a chaotic series, and a radar rainfall series are used to evaluate the low-pass filter (LF) and the Kalman filter (KF). Noise is added to the Logistic series at several noise levels, and the noise-added series is filtered by LF and KF for noise reduction. LF and KF are evaluated using the correlation coefficient, the standard error, the attractor, and the BDS statistic from chaos theory. The results for the Logistic series clearly show that KF is a better tool than LF for removing noise. We also used the radar rainfall series to evaluate the noise reduction capabilities of LF and KF. In this case, it was difficult to tell which filtering technique is better for noise reduction when typical statistics such as the correlation coefficient and standard error were used. However, when the attractor and the BDS statistic were used to evaluate LF and KF, we could clearly identify that KF is better than LF.

1. Introduction

Recently, advances in radar rainfall estimates with high spatial and temporal resolution have shown the prospect of improving the accuracy of rainfall inputs for real-time flood forecasting. However, the advantage of weather radar rainfall estimates has been limited by a variety of sources of uncertainty in the radar reflectivity process, including random and systematic errors. Radar rainfall estimation errors have been discussed extensively [1–8].

There are several ways of filtering a signal in one or two dimensions. One that is often applied is low-pass filtering, an operation that removes all components of the power spectrum whose frequency is higher than a chosen threshold. Taking this into account, several approaches have been proposed to reduce radar errors. Panofsky and Brier [9] introduced a low-pass filter, a term borrowed from electrical engineering, which removes the high-frequency variability of noise from the data and leaves only the low-frequency components. In recent years, filtering has been applied to reduce the noise of radar rainfall in some studies [10–12]. The main advantage of the Kalman filtering approach is that it provides a real-time scheme to calibrate radar rainfall estimates based on rain gauge measurements. Ahnert [13], Smith and Krajewski [14], Anagnostou and Krajewski [15], Seo et al. [16], Dinku et al. [17], Chumchean et al. [18], Krajewski et al. [19], and Wang et al. [20] used Kalman filtering to predict and update the mean field bias in real time. There is also a methodology for decomposing a multivariate signal into independent signals, namely, independent component analysis (ICA), which is efficient for decomposing the complexity of the dynamics in the seismological and atmospheric fields [20, 21].

These days, a large amount of radar rainfall data is being produced, processed, and used. Radar rainfall series are also widely applied in hydrologic applications such as flash flood forecasting. However, radar rainfall data include noise from many sources, and there is a lack of noise reduction studies on the radar rainfall data itself. Therefore, this study analyzes the noise of radar rainfall using filtering techniques and chaotic dynamics, which are nonlinear and aperiodic in nature, in order to investigate radar rainfall characteristics.

To study the nonlinear characteristics of natural phenomena, many statisticians and scientists have turned to chaos theory, which analyzes and forecasts the nonlinear phenomena of natural systems. Lorenz [22] suggested the strange attractor in a simple model of convection rolls in the atmosphere. Packard et al. [23] suggested the method of delays, and Takens [24] proved the method of delays using differential topology. Fraser and Swinney [25] suggested a method for estimating the time delay using the mutual information. Falanga and Petrosino [26] estimated the complexity of the system by the degrees of freedom necessary to describe the asymptotic dynamics in a reconstructed phase space.

All hydrological measurements are to some extent contaminated by noise, and the noise limits the performance of many techniques for the identification, modeling, prediction, and control of deterministic systems [27]. Some of the most characteristic effects of noise are as follows: the self-similarity of the attractor is broken; a phase space reconstruction appears high-dimensional on small length scales; nearby trajectories diverge diffusively rather than exponentially; and the prediction error is bounded from below no matter which prediction method is used and to how many digits the data are recorded [28].

This study evaluates the noise cancellation capabilities of two filtering techniques, the low-pass filter (LF) and the Kalman filter (KF). To do this, we generate a chaotic data series and add noise to it. We then apply the two filtering techniques to the noise-added chaotic series and investigate their noise cancellation capabilities using the attractor of the series and the BDS statistic [29–39]. The same analysis is also performed for the radar rainfall.

2. The BDS Statistic and Noise Reduction Filters

2.1. Phase Space Reconstruction

The first step in the metric analysis of a chaotic time series is the construction of an $m$-dimensional embedding space from the scalar time series. This is done using the method of delays introduced by Packard et al. [23] and Takens [24], which has the advantage of distributing the noise equally among the components. A scalar time series $x_i$, $i = 1, 2, \ldots, T$, is embedded into $m$-dimensional space by constructing the vectors

$$X_i = \left(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}\right), \tag{1}$$

where $\tau$ is the index lag and $m$ is the embedding dimension, both of which must be chosen appropriately. If the sampling time is $\Delta t$, the delay time is $\tau \Delta t$.

The reconstructed state variables need to be independent, and the quality of the reconstructed attractor depends on the choice of the index lag $\tau$. If the delay time is too small, the reconstructed attractor is compressed along the identity line; this is called redundance. If $\tau$ is too large, the attractor dynamics may become causally disconnected; this is called irrelevance, and it may cause the attractor to appear much more complex than it really is [40].
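The method of delays described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' code; the function name `delay_embed` and the toy series are hypothetical, and indices are zero-based as usual in Python.

```python
def delay_embed(x, m, tau):
    """Return delay vectors (x[i], x[i + tau], ..., x[i + (m - 1) * tau])."""
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    n_vectors = len(x) - (m - 1) * tau  # number of embedded points
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    return [tuple(x[i + k * tau] for k in range(m)) for i in range(n_vectors)]


# Toy series, chosen only to make the indexing visible.
series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
vectors = delay_embed(series, m=3, tau=2)
print(vectors)  # [(0.1, 0.3, 0.5), (0.2, 0.4, 0.6)]
```

Plotting the first component of each vector against the second gives the two-dimensional attractor view used for the figures in this kind of analysis.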

2.2. The BDS Statistic

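The BDS statistic is derived from the correlation integral and has its origins in work on deterministic nonlinear dynamics and chaos theory. Grassberger and Procaccia [41] introduced the correlation integral as a method of measuring the fractal dimension of deterministic data. It is a measure of the frequency with which temporal patterns are repeated in the data. The correlation integral at embedding dimension $m$ is given by

$$C_{m,T}(\epsilon) = \frac{2}{T_m (T_m - 1)} \sum_{i<j} I_\epsilon(X_i, X_j), \tag{2}$$

where $I_\epsilon(X_i, X_j) = 1$ if $\|X_i - X_j\| \le \epsilon$, and $I_\epsilon(X_i, X_j) = 0$ if $\|X_i - X_j\| > \epsilon$.

Here $T$ is the size of the data set, $T_m = T - (m - 1)$ is the number of embedded points in $m$-dimensional space (for index lag $\tau = 1$), and $\|\cdot\|$ denotes the sup-norm. $C_{m,T}(\epsilon)$ measures the fraction of the pairs of points $X_i$, $X_j$, $i \ne j$, whose sup-norm separation is not greater than $\epsilon$. If the limit of $C_{m,T}(\epsilon)$ as $T \to \infty$ exists for each $\epsilon$, we write the fraction of all state vector points that are within $\epsilon$ of each other as $C_m(\epsilon)$.

If the data are generated by a strictly stationary stochastic process which is absolutely regular, then this limit exists. In this case the limit is as follows:

$$C_m(\epsilon) = \iint I_\epsilon(X, Y)\, dF_m(X)\, dF_m(Y), \tag{3}$$

where $F_m$ is the distribution function of the embedded vectors.

When the process is IID, and since $I_\epsilon(X_i, X_j) = \prod_{k=0}^{m-1} I_\epsilon(x_{i+k}, x_{j+k})$, (3) implies that $C_m(\epsilon) = C_1(\epsilon)^m$. Also $\sqrt{T}\,[C_{m,T}(\epsilon) - C_{1,T}(\epsilon)^m]$ has an asymptotic normal distribution, with zero mean and variance as follows:

$$\sigma_m^2(\epsilon) = 4\left[K^m + 2\sum_{j=1}^{m-1} K^{m-j} C^{2j} + (m-1)^2 C^{2m} - m^2 K C^{2m-2}\right]. \tag{4}$$

We can consistently estimate the constants $C$ by $C_{1,T}(\epsilon)$ and $K$ by

$$K_T(\epsilon) = \frac{6}{T(T-1)(T-2)} \sum_{i<j<k} \frac{1}{3}\left[I_\epsilon(x_i, x_j) I_\epsilon(x_j, x_k) + I_\epsilon(x_i, x_k) I_\epsilon(x_k, x_j) + I_\epsilon(x_j, x_i) I_\epsilon(x_i, x_k)\right]. \tag{5}$$

Under the IID hypothesis, the BDS statistic for $m > 1$ is defined as

$$W_{m,T}(\epsilon) = \sqrt{T}\, \frac{C_{m,T}(\epsilon) - C_{1,T}(\epsilon)^m}{\sigma_{m,T}(\epsilon)}, \tag{6}$$

which has a limiting standard normal distribution under the null hypothesis of IID as $T \to \infty$ and obtains its critical values from the standard normal distribution.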

Before applying the BDS statistic, the first issue to address is which region of $\epsilon$ yields BDS statistics that are well approximated by the asymptotic distribution. As the sample size $T$ increases, the distribution of the BDS statistic becomes closer to normal, so a minimum number of data points must be provided. Next, a range for the embedding dimension $m$ should be chosen. If the sample size is fixed, we expect the finite-sample properties to worsen as $m$ increases. This study follows the recommendation of Brock et al. [29] for selecting the ranges of $T$, $m$, and $\epsilon$. A range of embedding dimensions $m$ is used, and the value of $\epsilon$ is set to half the standard deviation of the data set.
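As an illustration of how the BDS statistic can be computed, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes index lag 1, estimates $C$ and $K$ from per-point neighbour counts (an algebraically equivalent shortcut for the triple sum), and sets $\epsilon$ to half the standard deviation. The Logistic parameter $a = 4$, the start value 0.3, the series length 300, and the random seed are assumptions made only for this example.

```python
import math
import random


def correlation_integral(x, m, eps):
    """Fraction of pairs of m-histories whose sup-norm distance is <= eps."""
    n = len(x) - m + 1  # number of embedded points for index lag 1
    close = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if all(abs(x[i + k] - x[j + k]) <= eps for k in range(m)):
                close += 1
    return 2.0 * close / (n * (n - 1))


def bds_statistic(x, m, eps):
    """BDS statistic; approximately N(0, 1) when x is IID and T is large."""
    t = len(x)
    # neighbours[i] counts the points j != i with |x_i - x_j| <= eps.
    neighbours = [0] * t
    for i in range(t - 1):
        for j in range(i + 1, t):
            if abs(x[i] - x[j]) <= eps:
                neighbours[i] += 1
                neighbours[j] += 1
    c1 = sum(neighbours) / (t * (t - 1))  # estimate of C
    k = sum(d * (d - 1) for d in neighbours) / (t * (t - 1) * (t - 2))  # estimate of K
    variance = 4.0 * (
        k ** m
        + 2.0 * sum(k ** (m - j) * c1 ** (2 * j) for j in range(1, m))
        + (m - 1) ** 2 * c1 ** (2 * m)
        - m ** 2 * k * c1 ** (2 * m - 2)
    )
    c_m = correlation_integral(x, m, eps)
    return math.sqrt(t) * (c_m - c1 ** m) / math.sqrt(variance)


def sample_std(x):
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))


# Chaotic example: Logistic map with a = 4 (assumed parameter and start value).
chaotic = [0.3]
for _ in range(299):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))

# IID example: Gaussian white noise from a seeded generator.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

w_chaotic = bds_statistic(chaotic, m=2, eps=0.5 * sample_std(chaotic))
w_noise = bds_statistic(noise, m=2, eps=0.5 * sample_std(noise))
print(f"Logistic: {w_chaotic:.2f}, white noise: {w_noise:.2f}")
```

At the 5% significance level, the IID null hypothesis is rejected when the absolute value of the statistic exceeds 1.96. The double loops cost $O(T^2)$ operations, which is acceptable for a few hundred points but slow for long series.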

2.3. Kalman Filter and Low-Pass Filter

The Kalman filter (KF) was introduced in Kalman’s famous paper describing a recursive solution to the discrete-data linear filtering problem [42]. The Kalman filter algorithm can be applied as an estimator of the state of a dynamic system described by the linear difference equation

$$x_k = A x_{k-1} + B u_{k-1} + w_{k-1}, \tag{7}$$

where the matrix $A$ relates the state of the system, described by the column vector $x_{k-1}$ at time $k-1$, to the state $x_k$ at time $k$. The column form of a vector is considered to be the normal form, and a transpose into a row vector is denoted by $x^T$; this is also the notation used for the transpose of matrices. The matrix $B$ relates the optional control input vector $u$ to the state, and the vector $w$ is the process noise.

The system is then measured at discrete points in time, where the measurement vector $z_k$ is related to the true state of the system by the equation

$$z_k = H x_k + v_k, \tag{8}$$

where the vector $v_k$ represents the measurement noise and $H$ is the observation matrix.

The process noise and the measurement noise are assumed to be independent of each other, white, and normally distributed:

$$p(w) \sim N(0, Q), \tag{9}$$

$$p(v) \sim N(0, R). \tag{10}$$

In practice, the process noise covariance matrix $Q$ and the measurement noise covariance matrix $R$ might change with each time step or measurement; however, here we assume they are constant.

The error covariance matrix of the estimate $\hat{x}_k$ is

$$P_k = E\left[(x_k - \hat{x}_k)(x_k - \hat{x}_k)^T\right]. \tag{11}$$

The estimate vector $\hat{x}_k$ is referred to as the a posteriori estimate since it is an estimate of the system at time $k$.

The a posteriori state estimate $\hat{x}_k$ is computed as a linear combination of the a priori estimate $\hat{x}_k^-$ and a weighted difference between the actual measurement $z_k$ and the measurement prediction $H \hat{x}_k^-$ as follows:

$$\hat{x}_k = \hat{x}_k^- + K_k \left(z_k - H \hat{x}_k^-\right). \tag{12}$$

The matrix $K_k$ is chosen to be the gain that minimizes the a posteriori error covariance (11):

$$K_k = P_k^- H^T \left(H P_k^- H^T + R\right)^{-1}, \tag{13}$$

where $P_k^-$ is the error covariance of the a priori estimate $\hat{x}_k^-$.

Kim [43] shows how to track a varying signal and at the same time reduce the influence of measurement noise by using a first-order low-pass filter (LF), an exponentially weighted moving average filter, described as

$$\bar{x}_k = \alpha \bar{x}_{k-1} + (1 - \alpha) z_k, \tag{14}$$

where $z_k$ is the measurement and $\alpha$ is a constant in the range $0 < \alpha < 1$.

The expression for the a posteriori estimate in (12) is very similar to the first-order low-pass filter; the significant difference is that the Kalman filter gain $K_k$ varies in time, whereas $\alpha$ is constant.
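To make the comparison between the two filters concrete, here is a minimal scalar sketch in plain Python. It is not the authors' implementation: it assumes a one-dimensional state with no control input, defaults $A = H = 1$ (a random-walk state model), and uses the standard prediction step $P_k^- = A P_{k-1} A^T + Q$, which the text does not spell out. The function names, the initial values, and the sample measurements are hypothetical. With $H = 1$, the Kalman update can be rewritten as $\hat{x}_k = (1 - K_k)\hat{x}_k^- + K_k z_k$, so $1 - K_k$ plays the role of $\alpha$.

```python
def low_pass_filter(z, alpha):
    """First-order low-pass filter: est_k = alpha * est_{k-1} + (1 - alpha) * z_k."""
    estimates = []
    previous = z[0]  # assumed initial value: the first measurement
    for zk in z:
        previous = alpha * previous + (1.0 - alpha) * zk
        estimates.append(previous)
    return estimates


def kalman_filter_1d(z, q, r, a=1.0, h=1.0, x0=None, p0=1.0):
    """Scalar Kalman filter with constant process (q) and measurement (r) noise."""
    x_hat = z[0] if x0 is None else x0  # assumed initial state estimate
    p = p0                              # assumed initial error covariance
    estimates = []
    for zk in z:
        # Prediction: a priori estimate and its error covariance.
        x_prior = a * x_hat
        p_prior = a * p * a + q
        # Gain that minimizes the a posteriori error covariance.
        gain = p_prior * h / (h * p_prior * h + r)
        # Correction: a posteriori estimate and its error covariance.
        x_hat = x_prior + gain * (zk - h * x_prior)
        p = (1.0 - gain * h) * p_prior
        estimates.append(x_hat)
    return estimates


measurements = [1.0, 1.4, 0.7, 1.2, 0.9, 1.1]  # made-up values for illustration
print(low_pass_filter(measurements, alpha=0.7))
print(kalman_filter_1d(measurements, q=0.01, r=0.5))
```

A larger ratio of `q` to `r` makes the Kalman filter trust the measurements more, in the same way that a smaller `alpha` does for the low-pass filter.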

3. Noise Reduction Studies on Logistic and Radar Rainfall Series

3.1. Noise Influence on Logistic Series
3.1.1. Attractor Characteristics in Noise Level

To study the effects of noise in a time series, we added Gaussian noise to the time series. Specifically, to test the effect of noise on the original time series $x_i$, we define the noise-added time series $y_i$ as

$$y_i = x_i + \eta \sigma \xi_i, \tag{15}$$

where $\eta$ is the noise level, $\sigma$ is the standard deviation of $x_i$, and the Gaussian noise $\xi_i$ has the distribution $N(0, 1)$.

May [44] emphasized that a simple nonlinear map may have very complicated dynamics and illustrated this point with the Logistic map, which is a discrete-time analog for population growth. The Logistic map is defined as

$$x_{i+1} = a x_i (1 - x_i), \tag{16}$$

where $a$ is between 0 and 4. For small values of $a$, the system is stable and well behaved; however, as the value of $a$ approaches 4, it becomes chaotic. We simulate a Logistic map sequence and add noise to it at noise levels of 10%, 50%, and 100%. The Logistic series with noise are shown in Figure 1. Here, $\sigma$ is the standard deviation of the sample series in Figure 1.
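The construction of the noise-added Logistic series can be sketched as follows. This is an illustrative example only: the parameter $a = 4$, the start value 0.3, the series length, and the random seed are assumptions, as is the use of the sample standard deviation for $\sigma$.

```python
import random
import statistics


def logistic_series(n, a=4.0, x0=0.3):
    """Iterate the Logistic map n - 1 times starting from x0."""
    x = [x0]
    for _ in range(n - 1):
        x.append(a * x[-1] * (1.0 - x[-1]))
    return x


def add_noise(x, level, rng):
    """Add Gaussian noise scaled by the noise level and the std of x."""
    sigma = statistics.stdev(x)  # sample standard deviation (assumed choice)
    return [xi + level * sigma * rng.gauss(0.0, 1.0) for xi in x]


clean = logistic_series(500)
rng = random.Random(42)
# Noise levels of 10%, 50%, and 100%, as in the text.
noisy = {level: add_noise(clean, level, rng) for level in (0.1, 0.5, 1.0)}
for level, series in noisy.items():
    print(level, round(statistics.stdev(series), 3))
```

Because the noise is independent of the signal, the standard deviation of the noise-added series grows with the noise level, which is what blurs the attractor.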

The attractor of each Logistic series is reconstructed in phase space so that the characteristics of the series can be identified (Figure 2). For the reconstruction of the series using (1), a chosen embedding dimension $m$ and delay time $\tau$ are used (Figure 2). The autocorrelation function (ACF) is expected to provide a reasonable measure of the transition from redundance to irrelevance as a function of delay. The decorrelation time is taken to be the lag (delay time $\tau$) at which the ACF first attains the value zero or, if it occurs first, the lag of the first local minimum of the ACF [45, 46]. When the ACF decays exponentially, we select the lag at which the ACF drops to $1/e$ [47].
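The delay selection rule based on the ACF can be sketched as below. This is a hedged illustration rather than the procedure actually coded for the study: the function names are hypothetical, the biased ACF estimator is an assumed choice, and the threshold is a parameter (zero for the zero-crossing rule, or a value such as $1/e$ for an exponentially decaying ACF). A sine wave is used only as a test signal.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def select_delay(x, max_lag, threshold=0.0):
    """First lag where the ACF reaches the threshold or has a local minimum."""
    previous = 1.0  # the ACF at lag 0
    for lag in range(1, max_lag + 1):
        current = autocorrelation(x, lag)
        if current <= threshold:
            return lag
        if lag > 1 and current > previous:
            return lag - 1  # the ACF turned upward: previous lag was a local minimum
        previous = current
    return max_lag  # no crossing or minimum found within max_lag


# Test signal: a sine wave with a period of 40 samples.
wave = [math.sin(2.0 * math.pi * i / 40.0) for i in range(400)]
print(select_delay(wave, max_lag=50))                          # zero-crossing rule
print(select_delay(wave, max_lag=50, threshold=1.0 / math.e))  # 1/e rule
```

For a strongly persistent series, the threshold rule usually gives a shorter delay than the zero-crossing rule, since the ACF reaches the threshold before it reaches zero.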

The original Logistic series, which has one variable, shows an attractor with a simple quadratic form (Figure 2(a)). However, as the noise level increases, the attractor takes an increasingly complicated form, like that of a high-dimensional series (Figures 2(b)–2(d)). For the noise level of 100% in particular, the attractor looks like that of a random series.

3.1.2. Noise Reduction Studies of Logistic Series

This section studies the noise reduction of the noise-added Logistic series using LF and KF. Noise cannot be forecast but can be estimated statistically, so the parameters of LF and KF are calibrated by trial and error. The calibrated parameters are the constant $\alpha$ in (14) for LF and, for KF, the process noise covariance $Q$ in (9) and the measurement noise covariance $R$ in (10). The results of the noise reduction using LF and KF are shown in Figure 3. The smaller the noise level, the more effectively the noise is removed by LF and KF. Table 1 shows the statistical results of the noise reduction analysis by LF and KF. LF has a correlation coefficient of 0.50–0.70 and a standard error of 0.30–0.25, whereas KF has 0.93–0.99 and 0.04–0.1. Therefore, KF reduces noise more effectively than LF.
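The two summary statistics used in Table 1 can be computed as in the following sketch. The text does not define the standard error, so the root-mean-square difference between the original and the filtered series is assumed here; both function names and the sample values are hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standard_error(reference, estimate):
    """Root-mean-square difference (assumed definition of the standard error)."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)


original = [0.2, 0.64, 0.92, 0.29, 0.82]   # made-up reference values
filtered = [0.25, 0.60, 0.85, 0.35, 0.80]  # made-up filter output
print(correlation_coefficient(original, filtered))
print(standard_error(original, filtered))
```

Both statistics compare series point by point, so they do not reveal whether the geometric structure of the attractor is preserved.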

The attractors of the Logistic series after noise removal by LF and KF are reconstructed in phase space (Figure 4). The series filtered by LF show attractors that still have noisy shapes (Figures 4(a)–4(c)), but the series filtered by KF show clearer attractors that reflect the characteristics of the Logistic map (Figures 4(d)–4(f)). Even though KF is the more effective way of removing noise from the series, it is difficult to restore the series to its original state. The values of the original series generated from the Logistic equation lie in the range of 0 to 1. For the series with a noise level of 100%, the values after filtering by LF and KF lie in the ranges of −1 to 2 (Figure 4(c)) and 0 to 1 (Figure 4(f)), respectively. Therefore, the results show that KF is the more suitable tool for noise reduction of the series.

The BDS statistic was applied to test each data series for nonlinearity. Not only is it useful in detecting deterministic chaos, but it also serves as a residual diagnostic. If the model (null hypothesis) is correct, then the estimated residuals will pass the test for IID (independent and identically distributed). A failure to pass the test indicates that the selected model is misspecified. Here a confidence level (C.L.) of 95%, that is, a significance level of 5%, is used for the randomness test of a time series. The original series, noise-added series, and noise-removed series of the Logistic map are analyzed by the BDS statistic for their randomness and nonlinearity, and the results are shown in Table 2. The original series shows nonlinearity, whereas the series with a noise level of 100%, shown in bold type in Table 2, shows randomness. The noise-removed series by LF that is in bold type in Table 2 also shows randomness. Bold type indicates that the null hypothesis can be accepted; for all columns other than these two, the null hypothesis cannot be accepted. The BDS statistic values in Table 2 for KF are closer to those of the original series than the values for LF. Therefore, KF is a more suitable tool for noise cancellation than LF.

3.2. Noise Influence on Radar Rainfall Series
3.2.1. Radar Rainfall Series and Its Attractor

Radar rainfall is a representative type of hydrologic data that includes noise from many sources. This study uses radar rainfall obtained from the Biseul Mountain radar (BSL radar) in Gyeongbuk province, Korea. In particular, the radar rainfall series for the Gamcheon watershed produced by the BSL radar is used to analyze how the series characteristics change under noise cancellation by LF and KF. The BSL radar was constructed in 2009 and is a dual-polarization radar. It has temporal and spatial resolutions of 2.5 min and 250 m × 250 m. The BSL radar rainfall series at a 2.5 min time interval is therefore obtained for the period 6/24/2011 09:00–6/26/2011 11:00 (about 3000 min; average: 1.73 mm, standard deviation: 1.53 mm).

The ACF of the radar rainfall series decreased exponentially, so the delay time $\tau$ was selected as the lag at which the ACF drops to $1/e$ (Tsonis and Elsner [47]). The time series plot and the reconstructed attractor of the radar rainfall are shown in Figure 5. Even though the ACF shows the persistence of the radar rainfall series, the attractor is complicated, which indicates that the radar rainfall is greatly influenced by noise.

3.2.2. Noise Reduction Studies of Radar Rainfall Series

This section applies LF and KF to the noise reduction of the radar rainfall series. As before, the parameters are the constant $\alpha$ in (14) for LF and, for KF, the process noise covariance $Q$ in (9) and the measurement noise covariance $R$ in (10). The raw radar rainfall series and the results of the noise reduction using LF and KF are shown in Figure 6. The magnified red box in Figure 6 shows the heavy rainfall period. Table 3 shows the statistical results of the noise reduction analysis by LF and KF for the radar rainfall series. LF has a correlation coefficient of 0.994 and a standard error of 0.156, and KF has 0.989 and 0.231. In this case, there is not much difference between LF and KF, and both filtering techniques perform similarly in removing the noise in the radar rainfall.

The attractors of the radar rainfall series after noise removal by LF and KF are reconstructed in phase space (Figure 7). The attractor of the series filtered by LF has a simpler shape (Figure 7(a)) than the original attractor (Figure 5(b)). The attractor of the series filtered by KF (Figure 7(b)) has a clearer shape than the attractor obtained by LF (Figure 7(a)). Therefore, the attractor of the radar rainfall series can be identified more clearly after the noise is removed by LF and KF, and the attractor obtained by KF is clearer than that obtained by LF.

The original radar rainfall series and the noise-removed series by LF and KF are analyzed by the BDS statistic for their randomness and nonlinearity, and the results are shown in Table 4. The original radar rainfall series shows nonlinearity (radar rainfall column of Table 4), and the series after noise removal by LF and KF also show nonlinearity (low-pass filter and Kalman filter columns of Table 4). KF has the largest BDS statistic values, LF the next largest, and the original radar rainfall the smallest. This means that KF is better than LF for reducing the noise and for describing the nonlinearity of the radar rainfall.

4. Summary and Conclusions

This study investigated filtering techniques for removing the noise in the Logistic series and in radar rainfall. Chaotic dynamics and the BDS statistic were used to analyze the noisy time series. Logistic series with several noise levels were used to evaluate the LF and KF filtering techniques. The evaluation of LF and KF was performed by phase space reconstruction and the BDS statistic from chaos theory. As the noise level increased, the Logistic series became more random, and this was also seen in the attractors and in the BDS statistic analysis. The application of LF and KF to the noise-added Logistic series showed that KF reduced the noise in the Logistic series more clearly than LF.

The noise in the radar rainfall series was removed by LF and KF. The attractor and the BDS statistic were then used to evaluate the filtering techniques. It was difficult to tell which filtering technique is better when the correlation coefficient and standard error were used to evaluate LF and KF. However, the attractor and the BDS statistic gave clearer answers for determining the more suitable filtering technique. In this study, we have shown that KF is a better technique than LF and that chaos theory can be applied to investigate the characteristics of time series.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This study was conducted with financial support from the Korean Institute of Civil Engineering and Building Technology’s Strategic Research Project (Operation of Hydrological Radar and Development of Web and Mobile Warning Platform). Also, this work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (NRF-2009-220-D00104).