Abstract

A nonparametric, data-driven methodology for monitoring geotechnical structures subject to long-term environmental change is discussed. By avoiding physical assumptions and excessive simplification of the monitored structures, the nonparametric monitoring methodology presented in this paper provides reliable performance-related information, particularly when the collection of sensor data is limited. To validate the methodology, a field case study was performed on a full-scale retaining wall that had been monitored for three years using three tilt gauges. Using these very limited sensor data, it is demonstrated that important performance-related information, such as drainage performance and sensor damage, can be disentangled from significant daily, seasonal, and multiyear environmental variations. An extensive review of recent developments in parametric and nonparametric data processing techniques for geotechnical applications is also presented.

1. Introduction

Restoring and improving urban infrastructure is recognized by the National Academy of Engineering as one of the fourteen grand challenges for engineering (NAE, [1]), and according to the 2009 ASCE Report Cards for America's Civil Infrastructure, the current condition of U.S. infrastructure is rated “D” [2]. Aging civil infrastructure in the US, including bridges, levees, and dams, calls for urgent measures focusing on maintenance, repair, and renovation. Geotechnical structures, compared to other types of civil infrastructure, are more vulnerable to natural and human-induced hazards. For example, landslides in the Pacific Coast, the Rocky Mountains, the Appalachian Mountains, Hawaii, and Puerto Rico regions cause 25 to 50 fatalities per year and direct/indirect economic losses of up to $3 billion per year [3].

Structural health monitoring (SHM) is an emerging technique for the assessment of structural condition, hazards, and risks. It consists of three major components: sensing and instrumentation, data communication and archiving, and data analysis and interpretation. With the advent of today's powerful digital media and the Internet, the needs for the first two components have been readily met in many cases, but serious technical challenges remain for the third component: how can voluminous sensor data be processed to obtain critical information for decision making? The research community is overwhelmed by the complex and extensive nature of field data associated with the various factors of geotechnical phenomena. Some important challenges in processing field measurements are as follows.

(1) How can performance-related information (e.g., condition of drainage systems) be disentangled from the effects of various environmental factors (e.g., diurnal and seasonal temperature change)?

(2) Field measurements are expensive and technically difficult, especially when the monitoring is long term. How can one perform reliable estimation with insufficient sensor data without sacrificing accuracy?

(3) Extensive modeling efforts are required in current structural health monitoring practices for geotechnical structures. How can one reduce modeling efforts for geotechnical structures, whose material and structural characteristics vary widely?

(4) How can one deal with unavoidable and unpredictable sensor/instrument network problems and the loss of subsets of sensor data, which are commonly encountered in field data collection?

This paper discusses a reliable monitoring methodology for geotechnical structures that are subject to long-term environmental change and instrumented with very limited sensor measurements. The objective of the methodology is to indicate when, where, and with what confidence field engineers should be deployed to the monitoring site to address potential hazards to structural performance. The methodology should be robust enough to deal with unavoidable malfunctioning of instrumentation devices during data collection.

This paper is organized as follows. Some definitions and dilemmas in current monitoring practices are discussed in Section 2. Sensing and modeling strategies for monitoring complex geotechnical systems are discussed in Section 3. Because understanding system identification techniques is important for developing a reliable monitoring methodology, recent developments in modeling and system identification techniques are reviewed: parametric approaches in Section 4 and nonparametric approaches in Section 5. A case study was conducted to demonstrate how the monitoring methodology developed by the authors can be applied to realistic problems. The analysis results for a full-scale retaining wall subject to long-term environmental change are discussed in Section 6.

2. Some Definitions and Dilemma in Current Monitoring Practices

Inverse analysis and system identification techniques are necessary tools for evaluating the current performance of civil infrastructure systems using field measurement data. A system in inverse analysis can be expressed with a cause-response model, which consists of the causative force, the system characteristic function, and the system response, as shown in Figure 1. The causative force is usually an external force (e.g., soil pressure), and the system response is usually the resulting deformation (e.g., displacement). The system characteristic function determines the system properties through linear or nonlinear relationships between the system input and output, which are associated with the spatial and temporal variation of soil properties and highly variable soil conditions.

When earth structures are exposed to significant environmental variation (e.g., temperature and precipitation), system identification becomes more complicated because the system response reflects the combined effects of loads and environmental factors. This is where the conventional parametric approaches of system identification become difficult to implement.

The nonparametric methods, on the other hand, are data-driven identification techniques that do not require a priori knowledge on physics of target systems. Consequently, without relying on idealization and simplification in modeling, the same data processing methodology is applicable to different structure types. The nonparametric methods are also advantageous in dealing with deteriorating structures since nonparametric models are more flexible in dealing with time-varying systems than the parametric ones, which are modeled with physical assumptions and would not be valid once target structures are damaged.

So far, system identification of geotechnical structures is primarily done using the parametric methods. In long-term monitoring of geotechnical systems, however, there could be significant discrepancy between system behavior and corresponding models for two reasons. First, soil conditions are highly variable. Although high-fidelity models coupled with complex soil behavior are already available (e.g., coupled thermo-hydro-mechanical models), to collect all necessary sensor data for parametric identification is very expensive and it is usually not feasible. Due to insufficient data for sophisticated models, simpler models are often employed, which ignore many significant environmental factors. Consequently, parameter estimation becomes inaccurate due to oversimplification. Second, structures deteriorate over time. A common challenge in modeling deteriorating systems is that deterioration could result in not only changes in system parameter values but also transformation of the monitored system into different classes of nonlinear systems. Moreover, the characteristics of the damaged systems are usually unknown, so that the systems cannot be parametrically modeled prior to the occurrence of actual damage.

One drawback of existing nonparametric approaches is that the physical interpretation of identification results is not as straightforward as that of the parametric methods, whose system parameters possess physical meaning (e.g., Young's modulus). Although some nonparametric approaches have been used in geotechnical applications, obtaining important performance-related information for maintenance decisions has rarely been emphasized in this class of methods. For example, the nonparametric Artificial Neural Networks technique that will be described in Section 5.3 has been employed to identify the complex nonlinear stress-strain relationship of soil, as an alternative to parametric regression methods based on soil constitutive models (e.g., elastoplastic models), which will be described in Section 4.1. When soil strength is degraded, the nonparametric method can detect the change in soil mechanical properties but, unlike the parametric methods, it cannot indicate from the identification results what type of physical change has occurred. To overcome this dilemma in current monitoring practices, it is desirable to combine the advantages of both sides: the modeling flexibility of the nonparametric methods and the physical interpretability of the parametric methods.

3. Sensing and Modeling Strategies

To reduce the high cost of sensor data collection associated with the high degree of spatial and temporal variability of geotechnical structures, the selection of what to measure is a critical issue. Three options are possible in sensing: the causative forces, the environmental factors, and the system response in Figure 1. Measuring the system response is preferable because the other two contain no information about the system characteristics, whereas the system response carries the most abundant information about the entire system: it contains the effects of the causative forces, the environment, and the system characteristic function. Using data that contain information about the system characteristics is particularly important when one deals with deteriorating structures.

A challenge, however, in dealing with the system response data is that it is usually difficult to interpret raw sensor data directly due to interrelated effects of the components in the system. Thus, some kind of disentanglement techniques will be needed to decompose the data into more easily manageable and physically understandable forms.

To explain modeling strategies, Figure 2 summarizes the differences in system identification between parametric and nonparametric methods.

In nonparametric methods, response-only (or output-only) data are processed to find mathematical relationships embedded in the data. In order to deal with complicated raw system response (or system output) data, some disentanglement techniques will be used prior to modeling. Once the system response data are processed, additional data of the causative forces (or system input) and/or environmental factors can be used as a posteriori information for physical interpretation. In model construction, therefore, the monitoring methodology does not require explicit relationships between the system input, environment and system output, which are generally not known in geotechnical applications.

The above sensing and modeling methodology has several important advantages over existing (parametric) approaches, particularly in monitoring applications.

(1) Oversimplification problems can be avoided, especially when actual systems are complex and data are insufficient for sophisticated (parametric) input-output models, since the modeling process is solely data driven using response-only data.

(2) Modeling time and effort can be reduced significantly by using the same data processing procedures for different structure types, since the proposed approach is not limited to a specific type of structure (i.e., the model is not based on physical assumptions). For the same reason, the same procedures can be used for different sensor types.

(3) The proposed approach is more advantageous than conventional parametric approaches in dealing with deteriorating structures, which are often associated with unknown time-varying system characteristics.

4. Review of Parametric Approaches

In this section, recent developments of the parametric approaches have been reviewed to provide background of parametric modeling, estimation, and optimization techniques.

4.1. Modeling

Two parametric modeling approaches for geotechnical systems are discussed: soil constitutive models and coupled thermo-hydro-mechanical models.

4.1.1. Soil Constitutive Models

Various soil constitutive models exist. In the elastic model, the simplest constitutive model, strain is sustained only while the load is applied. The elastic strain is therefore reversible: if the applied load is removed, the material springs back to its undeformed condition. Elastoplastic models increase the level of model complexity by adding the effects of irreversible plastic strains, so that the soil is assumed to sustain both elastic and plastic strain. If the load is removed, the elastic strain is recovered, whereas the soil retains permanent plastic deformation. Consequently, a key issue in elastoplastic modeling is the description of material plasticity. One branch of plasticity modeling is based on the concept of perfect plasticity [5]. Examples include the Tresca model and the von Mises model for perfect plasticity in cohesive soils, and the Mohr-Coulomb model, Drucker-Prager model, Lade-Duncan model, Matsuoka-Nakai model, and Hoek-Brown model for perfect plasticity in frictional material.
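The difference between reversible elastic strain and permanent plastic strain can be illustrated with a minimal one-dimensional sketch of an elastic-perfectly plastic material. This is only an illustration of the concept, not any of the named soil models; the function name `elastic_perfectly_plastic` and the values of the stiffness `E` and yield stress `sigma_y` are hypothetical.

```python
def elastic_perfectly_plastic(strain_path, E=50e3, sigma_y=100.0):
    """Return (stress, plastic_strain) histories for a 1D strain path.

    E (stiffness) and sigma_y (yield stress) are hypothetical values
    chosen for illustration only.
    """
    eps_p = 0.0                                   # accumulated plastic strain
    stresses, plastic = [], []
    for eps in strain_path:
        sigma_trial = E * (eps - eps_p)           # elastic trial stress
        f = abs(sigma_trial) - sigma_y            # yield function
        if f > 0.0:                               # yielding: return to yield surface
            sign = 1.0 if sigma_trial > 0.0 else -1.0
            eps_p += sign * f / E
            sigma = sign * sigma_y
        else:                                     # purely elastic step
            sigma = sigma_trial
        stresses.append(sigma)
        plastic.append(eps_p)
    return stresses, plastic


# Load to 0.5% strain, then unload until the stress returns to zero.
loading = [i * 0.0005 for i in range(11)]
unloading = [0.005 - i * 0.0005 for i in range(1, 5)]
stress, eps_p = elastic_perfectly_plastic(loading + unloading)
print(stress[-1], eps_p[-1])
```

After unloading, the stress is back to zero but a plastic strain remains, which is the permanent deformation described above; a purely elastic material would return to zero strain.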

Another branch of plasticity modeling adopts the concept of critical states. In this modeling approach, the soil is characterized with three major parameters: the mean effective stress, shear stress, and soil volume (or void ratio) [6]. The original Cam clay model and the modified Cam clay model belong to this category. The original Cam clay model was developed by researchers at Cambridge University as the first critical-state model; it predicts unlimited soil deformation without change in stress or volume once the critical state is reached in soft soil [7]. The modified Cam clay model assumes that the voids between the solid particles are filled only with water (i.e., the soil is fully saturated). The modified Cam clay model is formulated based on plasticity theory: when the soil is loaded, pore water is expelled from the voids between the solid particles, and, consequently, significant irreversible plastic volume change occurs. Some limitations of the Cam clay models are described in Yu [5]. General descriptions of soil constitutive models can be found in Yu [5], Ling et al. [8], and Hicher and Shao [9].

4.1.2. Coupled Thermo-Hydro-Mechanical (THM) Models

Geotechnical systems subject to environmental change usually behave as complex coupled thermo-hydro-mechanical (THM) systems. Researchers in geotechnical engineering have developed a number of THM models, including (1) coupled models for heat, moisture, and/or air transfer [10–20], (2) granular-level freezing process of pore water in soil-like porous media [21–26], and (3) frost heaving in earth structures [27–38].

The THM models express the sophisticated coupled relationships of heat and moisture transfer in deformable partially saturated soil [15]. The freezing process is influenced by the interactions between water, temperature, and stresses in soil: water migrates to freezing fronts, the frozen soil can contain unfrozen water below the freezing temperature, and the water glaciation is influenced by the state of stress [38]. The formulation usually involves interrelated PDEs of thermoelasticity of solids (T-M) (interaction between the stress/strain and temperature fields through thermal stress and expansion) and poroelasticity theory (H-M) (interaction between the deformability and permeability fields of porous media). The conservation equations of mass, energy, and momentum are usually obtained with Hooke’s law of elasticity, Darcy’s law of flow in porous media, and Fourier’s law of heat conduction [39]. The effects of precipitation on the moisture content in the soil were studied by Troendle and Reuss [40], D’Odorico et al. [41], and Longobardi [42]. For the numerical solution of the conservation equations, the finite element method (FEM) is usually employed [39, 43].

4.2. Parameter Estimation

For parametric models, the cause-response system can be expressed as

$$\hat{y}_k = h_k(x_k, x_{k-1}, \ldots, x_{k-l} \mid \theta), \qquad y_k = \hat{y}_k + \eta_k + \varepsilon_k, \qquad \tilde{y}_k = y_k - \hat{y}_k, \tag{1}$$

where $y_k$: observed (or measured) system output at time step $k$, in which the dimension of $y_k$ is ($1 \times m$), and $m$ is the total number of observational points or number of sensors in in-situ measurements; $\hat{y}_k$: estimated system output based on the employed geomaterial constitutive models. In geotechnical engineering, the finite element method (FEM) is commonly used for the numerical solution of the constitutive equations, thus yielding $\hat{y}_k$; $\tilde{y}_k$: residual between the observed output $y_k$ and the estimated output $\hat{y}_k$. The residual includes the modeling error $\eta_k$ and the measurement error $\varepsilon_k$, which are combined and usually indistinguishable for field measurements. In many applications, the residual is assumed to follow $\tilde{y}_k \sim N(0; \Sigma_{\tilde{y}})$, in which $\Sigma_{\tilde{y}}$ is an ($m \times m$) covariance matrix of $\tilde{y}_k$; $h_k$: system function for a given system parameter vector $\theta$. In the most general case, $h_k$ is a stochastic, time-varying, nonlinear dynamic function; $\theta$: ($p \times 1$) system parameter vector to be estimated; $x$: known system input vector with memory of the $l$-th order. For static systems, $l = 0$.

The goal of system identification is to find the “best” estimates of the system parameters $\theta$ that minimize the residual $\tilde{y}_k$. Many optimal estimation algorithms are available for the best estimates, and they are usually classified into two approaches: parameter estimation methods and state estimation methods. The parameter estimation methods (also referred to as variational methods in some of the geotechnical literature) are described in this section, and the state estimation methods (also referred to as sequential methods in some of the geotechnical literature) will be described in Section 4.3.

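In parameter estimation, the most general objective function can be expressed as

$$\min_{\theta} J(\theta \mid \beta) = \min_{\theta} \left[ J_o(\theta) + \beta J_p(\theta) \right], \tag{2}$$

$$J_o(\theta) = \sum_{k=1}^{n} \left[ y_k - \hat{y}_k(x \mid \theta) \right]^T W_o^{-1} \left[ y_k - \hat{y}_k(x \mid \theta) \right], \tag{3}$$

$$J_p(\theta) = (\theta - \theta_p)^T W_p^{-1} (\theta - \theta_p), \tag{4}$$

where $J_o(\theta)$: objective function for the observational (or measurement) information of the system output; $J_p(\theta)$: objective function for the prior information of the system parameters; $\beta$: a positive scalar parameter, which adjusts the significance (weighting) between the observational information $J_o(\theta)$ and the prior information $J_p(\theta)$; $W_o$: covariance matrix of the measurement error, whose dimension is ($m \times m$); $W_p$: covariance matrix of the prior information error involving the system parameters, whose dimension is ($p \times p$), matching the ($p \times 1$) parameter vector $\theta$; $\theta_p$: previously known means of the system parameters $\theta$.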
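As a minimal sketch, the general objective function in (2) to (4) can be evaluated numerically as follows. The forward model `model` (a straight line), the data, and all argument names are hypothetical stand-ins; in practice the prediction would typically come from a finite element analysis. The prior covariance is taken here as a ($p \times p$) matrix.

```python
import numpy as np


def objective(theta, y, y_hat_fn, W_o, theta_p=None, W_p=None, beta=0.0):
    """Evaluate J(theta | beta) = J_o(theta) + beta * J_p(theta).

    y        : (n, m) observations, one row per time step
    y_hat_fn : callable theta -> (n, m) predictions (hypothetical forward model)
    W_o      : (m, m) covariance of the measurement error
    W_p      : (p, p) covariance of the prior information error
    """
    r = y - y_hat_fn(theta)                               # residuals, (n, m)
    J_o = np.einsum("ki,ij,kj->", r, np.linalg.inv(W_o), r)
    if beta == 0.0:                                       # no prior information used
        return float(J_o)
    d = theta - theta_p
    J_p = d @ np.linalg.inv(W_p) @ d
    return float(J_o + beta * J_p)


# Toy example: one sensor (m = 1), five time steps, linear forward model.
x = np.linspace(0.0, 1.0, 5)
model = lambda th: (th[0] * x + th[1]).reshape(-1, 1)
y_obs = model(np.array([2.0, 0.5]))                       # synthetic "measurements"
theta = np.array([1.0, 0.0])                              # trial parameter vector

J_obs_only = objective(theta, y_obs, model, W_o=np.eye(1))
J_with_prior = objective(theta, y_obs, model, W_o=np.eye(1),
                         theta_p=np.array([2.0, 0.5]), W_p=np.eye(2), beta=1.0)
print(J_obs_only, J_with_prior)
```

Setting `beta=0.0` with an identity `W_o` reduces the objective to a plain sum of squared residuals, while a nonzero `beta` adds the penalty for departing from the prior means.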

Three parameter estimation methods are usually employed in geotechnical applications: (1) least square estimation, (2) maximum likelihood estimation, and (3) Bayesian estimation.

4.2.1. The Least Square Estimation (LSE)

The objective function of the LSE corresponds to the case in which the adjusting scalar parameter $\beta = 0$ in (2) and the inverse covariance matrix of the measurement error $W_o^{-1} = I$ in (3), where $I$ is an ($m \times m$) identity matrix, thus resulting in

$$J_{\mathrm{LSE}}(\theta) = \sum_{k=1}^{n} \left[ y_k - \hat{y}_k(x \mid \theta) \right]^T \left[ y_k - \hat{y}_k(x \mid \theta) \right]. \tag{5}$$

With the condition of $\beta = 0$, no prior information of the system parameters is used during the parameter estimation. With the condition of $W_o^{-1} = I$, all observation values are weighted with the same significance. Thus, the LSE requires the least amount of information among the parameter estimation methods.

The LSE method is probably the most widely used method for geotechnical applications. Some examples of the application of LSE in geomechanical applications include the work of Gioda and Maier [44], Cividini et al. [45], Cividini et al. [46], Arai et al. [47], Arai et al. [48], Arai et al. [49], Gioda and Sakurai [50], Shoji et al. [51], Shoji et al. [52], Anandarajah and Agarwal [53], Murakami et al. [54], Beck and Woodbury [55], and Xiang et al. [56].

4.2.2. The Maximum Likelihood Estimation (MLE)

In the MLE method, the observational information of the measurements is used, and the measurement data are weighted according to their significance (i.e., $W_o^{-1} \neq I$), but no prior information of the system parameters is used in the parameter estimation (i.e., $\beta = 0$). Therefore, the LSE can be seen as a special case of the MLE. The objective function of the MLE is

$$J_{\mathrm{MLE}}(\theta) = \sum_{k=1}^{n} \left[ y_k - \hat{y}_k(x \mid \theta) \right]^T W_o^{-1} \left[ y_k - \hat{y}_k(x \mid \theta) \right], \qquad W_o^{-1} \neq I. \tag{6}$$

Some examples of using the MLE for geotechnical engineering applications are Ledesma et al. [57], Honjo and Darmawan [58], Ledesma et al. [59], Ledesma et al. [60], and Gens et al. [61].

4.2.3. The Conventional Bayesian Estimation (BE) and Extended Bayesian Estimation (EBE)

In the BE method, the system parameters are estimated using both the observational information of the measurements and the prior information of the system parameters, with the same significance given to the two sources of information (i.e., $\beta = 1$):

$$J_{\mathrm{BE}}(\theta) = J_o(\theta) + J_p(\theta). \tag{7}$$

The objective function of the EBE is more general than that of the BE, with a nonunit positive scalar adjusting parameter $\beta$:

$$J_{\mathrm{EBE}}(\theta) = J_o(\theta) + \beta J_p(\theta), \qquad \beta \neq 1, \; \beta > 0. \tag{8}$$

If the adjusting parameter $\beta$ is small, the prior information $\theta_p$ contributes less to the estimation of $\theta$, and vice versa. Optimal values of the adjusting parameter $\beta$ can be determined, for example, with the cross-validation method [62], the ridge regression method [63], and the Akaike Information Criterion (AIC) [64–66].

Some application examples of the conventional BE and EBE in geotechnical engineering include Cividini et al. [46], Gioda and Sakurai [50], Arai et al. [67], Honjo et al. [64], Honjo et al. [65], and Xiang et al. [56].

The conventional BE and EBE methods are more sophisticated than the other estimation methods, but they require more information, covering both the observational measurements and prior knowledge of the system parameters. Therefore, the availability of this information determines whether the Bayesian methods can be applied.
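The role of the adjusting parameter $\beta$ in (8) can be illustrated with a small sketch under a strong simplifying assumption that the reviewed methods do not require: a linear forward model $\hat{y} = X\theta$, for which the minimizer of $J_o + \beta J_p$ has a closed form. The function name `ebe_linear`, the synthetic data, and the prior values are hypothetical. Here the $n$ scalar observations are stacked into one vector, so the measurement weighting matrix is ($n \times n$).

```python
import numpy as np


def ebe_linear(X, y, theta_p, W_o_inv, W_p_inv, beta):
    """Minimizer of J_o + beta * J_p for the linear model y_hat = X @ theta."""
    lhs = X.T @ W_o_inv @ X + beta * W_p_inv
    rhs = X.T @ W_o_inv @ y + beta * W_p_inv @ theta_p
    return np.linalg.solve(lhs, rhs)


rng = np.random.default_rng(0)
X = np.column_stack([np.linspace(0.0, 1.0, 20), np.ones(20)])
theta_true = np.array([2.0, 0.5])
y = X @ theta_true + 0.05 * rng.standard_normal(20)   # noisy synthetic data
theta_p = np.array([1.0, 1.0])                        # deliberately biased prior mean
I_n, I_p = np.eye(20), np.eye(2)

theta_lse = ebe_linear(X, y, theta_p, I_n, I_p, beta=0.0)     # beta = 0: LSE
theta_be = ebe_linear(X, y, theta_p, I_n, I_p, beta=1.0)      # beta = 1: BE
theta_prior = ebe_linear(X, y, theta_p, I_n, I_p, beta=1e6)   # prior dominates
print(theta_lse, theta_be, theta_prior)
```

As $\beta$ grows, the estimate moves from the least-squares solution toward the prior mean, which is why a poorly chosen prior combined with a large $\beta$ can bias the result.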

4.3. State Estimation

In state estimation methods, the system can be identified by estimating its state at each time step using so-called filters. Therefore, the state estimation method is also referred to as the sequential estimation method. Among the numerous types of filters, the Kalman filter-based algorithms are probably the most widely used in geotechnical applications, including (1) the linear Kalman filter method and (2) the extended Kalman filter method. Some application examples of the Kalman filter methods for geotechnical applications are given in the work of Murakami and Hasegawa [68], Kim and Lee [69], and Zheng et al. [70]. More general descriptions and details concerning the Kalman filter can be found in Mendel [71].

4.3.1. The Linear Kalman Filter

The underlying system model of the linear Kalman filter is based on the assumption of a recursive linear dynamic system discretized in the time domain as

$$z_k = A_k z_{k-1} + B_k x_k + w_k, \tag{9}$$

where $z_k$: true internal state at time step $k$, which is evolved from the previous state $z_{k-1}$; $x_k$: known system input at time step $k$; $w_k$: stochastic process noise with a zero-mean, multivariate normal distribution, $w_k \sim N(0, \Sigma_{w_k})$; $A_k$: linear state transition matrix, which is applied to the previous state $z_{k-1}$; $B_k$: input matrix, which is applied to the current system input $x_k$.

The observational (or measured) state of the system output can be expressed as

$$y_k = C_k z_k + v_k, \tag{10}$$

where $y_k$: observational system output; $C_k$: observational matrix, which maps the true state space of $z_k$ into the observed space of $y_k$; $v_k$: observational noise, modeled as zero-mean Gaussian white noise, $v_k \sim N(0, \Sigma_{v_k})$.

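Using this underlying system model, the estimate of the state and the error covariance matrix of the estimated state can be determined as

$$\hat{z}_{k|k} = \hat{z}_{k|k-1} + K_k \tilde{y}_k, \tag{11}$$

$$P_{k|k} = (I - K_k C_k) P_{k|k-1}, \tag{12}$$

where $\hat{z}_{k|k}$: updated state at time step $k$ given observations up to and including time step $k$; $P_{k|k}$: updated error covariance matrix of $\hat{z}_{k|k}$; $\hat{z}_{k|k-1}$: predicted state at time step $k$ given observations up to and including time step $k-1$, $\hat{z}_{k|k-1} = A_k \hat{z}_{k-1|k-1} + B_k x_k$, consistent with (9); $P_{k|k-1}$: predicted error covariance matrix of $\hat{z}_{k|k-1}$, $P_{k|k-1} = A_k P_{k-1|k-1} A_k^T + \Sigma_{w_k}$; $\tilde{y}_k$: measurement residual, $\tilde{y}_k = y_k - C_k \hat{z}_{k|k-1}$; $S_k$: residual covariance matrix, $S_k = C_k P_{k|k-1} C_k^T + \Sigma_{v_k}$, in which the covariance of the observational noise in (10) enters; $K_k$: optimal Kalman gain, $K_k = P_{k|k-1} C_k^T S_k^{-1}$.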

The Kalman filter shown in (11) is an optimal estimator in the sense that it minimizes the mean-square error of $z_k - \hat{z}_{k|k}$.
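One predict-update cycle of the linear Kalman filter in (9) to (12) can be sketched as follows. This is a minimal illustration with time-invariant matrices; the function name `kalman_step` and the scalar example values are hypothetical. The sketch assumes the standard formulation in which the residual covariance uses the covariance of the observational noise $v_k$ and the prediction uses the current input $x_k$ as in (9).

```python
import numpy as np


def kalman_step(z_prev, P_prev, x_k, y_k, A, B, C, Sigma_w, Sigma_v):
    """One predict/update cycle of the linear Kalman filter."""
    # Prediction from the state model z_k = A z_{k-1} + B x_k + w_k
    z_pred = A @ z_prev + B @ x_k
    P_pred = A @ P_prev @ A.T + Sigma_w
    # Update with the measurement y_k = C z_k + v_k
    resid = y_k - C @ z_pred                     # measurement residual
    S = C @ P_pred @ C.T + Sigma_v               # residual covariance
    K = P_pred @ C.T @ np.linalg.inv(S)          # Kalman gain
    z_new = z_pred + K @ resid
    P_new = (np.eye(len(z_prev)) - K @ C) @ P_pred
    return z_new, P_new


# Scalar example: a constant state observed once with unit-variance noise.
A = np.array([[1.0]])
B = np.array([[0.0]])
C = np.array([[1.0]])
z0, P0 = np.array([0.0]), np.array([[1.0]])
z1, P1 = kalman_step(z0, P0, np.array([0.0]), np.array([2.0]), A, B, C,
                     Sigma_w=np.array([[0.0]]), Sigma_v=np.array([[1.0]]))
print(z1, P1)
```

In this example the prior and the measurement are equally uncertain, so the gain is 0.5: the updated state lies halfway between the prediction (0) and the measurement (2), and the error covariance is halved.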

4.3.2. Extended Kalman Filter (EKF)

In the EKF, the underlying linear dynamic models are extended to nonlinear models as

$$z_k = f(z_{k-1}, x_k) + w_k, \qquad y_k = h(z_k) + v_k, \tag{13}$$

where $f$ and $h$ are nonlinear functions. Instead of $A_k$ and $C_k$ in the linear Kalman filter method, the Jacobian matrices $\partial f / \partial z$ and $\partial h / \partial z$ are used in the EKF.
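A minimal sketch of one EKF cycle for the nonlinear model in (13) is given below. The Jacobians are supplied by the caller; the function name `ekf_step`, the quadratic observation function, and all numerical values are hypothetical choices for illustration.

```python
import numpy as np


def ekf_step(z_prev, P_prev, x_k, y_k, f, h, F_jac, H_jac, Sigma_w, Sigma_v):
    """One predict/update cycle of the extended Kalman filter."""
    # Prediction through the nonlinear state function f
    z_pred = f(z_prev, x_k)
    F = F_jac(z_prev, x_k)                       # df/dz at the previous estimate
    P_pred = F @ P_prev @ F.T + Sigma_w
    # Update through the nonlinear observation function h
    H = H_jac(z_pred)                            # dh/dz at the predicted state
    resid = y_k - h(z_pred)
    S = H @ P_pred @ H.T + Sigma_v
    K = P_pred @ H.T @ np.linalg.inv(S)
    z_new = z_pred + K @ resid
    P_new = (np.eye(len(z_pred)) - K @ H) @ P_pred
    return z_new, P_new


# Hypothetical scalar example: static state observed through h(z) = z^2.
f = lambda z, x: z
h = lambda z: z ** 2
F_jac = lambda z, x: np.eye(1)
H_jac = lambda z: np.array([[2.0 * z[0]]])

z1, P1 = ekf_step(np.array([1.0]), np.array([[1.0]]), None, np.array([1.5]),
                  f, h, F_jac, H_jac,
                  Sigma_w=np.array([[0.0]]), Sigma_v=np.array([[1.0]]))
print(z1, P1)
```

The structure is identical to the linear filter; only the prediction and the residual use the nonlinear functions directly, while the covariance propagation and the gain use their local linearizations.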

In summary, the system in the state estimation can be identified by estimating its state at each time step using filters. Using the Kalman filter methods, it is possible to incorporate prior information in the observation data during the state estimation. Since the underlying system model of the linear Kalman filter method is a linear dynamic system, this method is usually not applicable to nonlinear geotechnical systems. The extended Kalman filter method can be used to identify such nonlinear systems.

4.4. Optimization

Once an objective function with respect to unknown system parameters is constructed as shown in Section 4.2, the solution procedure uses standard optimization techniques to find the optimal values of the system parameters. Numerous optimization algorithms have been developed and used for general purposes of optimization in every field of science and engineering. General descriptions of optimization algorithms can be found in Bertsekas [72].

In geotechnical applications, the aim of the optimization process is usually to calibrate geotechnical models by finding a set of optimal values of the model parameters. The optimal values can be found with various optimization algorithms by minimizing the residuals between the measurement data (usually obtained from field or laboratory testing) and the synthetic data (usually obtained from finite element analysis for the numerical solution of the geotechnical models). In many geotechnical applications, however, the optimization surface is nonconvex and contains many local minima due to the complexity of material behavior and the coupled effects of temperature, moisture, and loads.

Some examples of optimization algorithms used in geotechnical studies include the Newton method [73], quasi-Newton method [53], Gauss-Newton method [56, 73], conjugate gradient method [47], simplex method [45, 54], complex method [74], random search method [75, 76], and more recently evolutionary algorithms, such as the genetic algorithm [77–80] and the particle swarm optimization method [81].
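As an illustration of one of the listed algorithms, the following is a minimal sketch of a Gauss-Newton iteration for a least-squares objective such as (5), with a simple step-halving safeguard added here for robustness. The exponential-decay model stands in for an expensive forward model and, like the function names and starting values, is a hypothetical choice. Being a local method, it converges to the nearest minimum and therefore depends on the initial guess when the surface is nonconvex.

```python
import numpy as np


def gauss_newton(residual_fn, jac_fn, theta0, n_iter=50, tol=1e-12):
    """Minimize sum(residual_fn(theta)**2) by damped Gauss-Newton steps."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        r = residual_fn(theta)
        J = jac_fn(theta)                                  # d(residual)/d(theta)
        step = np.linalg.lstsq(J, -r, rcond=None)[0]       # solves J step = -r
        alpha = 1.0
        # Halve the step until the sum of squares no longer increases
        while alpha > 1e-8 and np.sum(residual_fn(theta + alpha * step) ** 2) > np.sum(r ** 2):
            alpha *= 0.5
        theta = theta + alpha * step
        if np.linalg.norm(alpha * step) < tol:
            break
    return theta


# Hypothetical calibration problem: y = a * exp(-b * t), noise-free data.
t = np.linspace(0.0, 2.0, 20)
y_meas = 2.0 * np.exp(-1.5 * t)

residual = lambda th: th[0] * np.exp(-th[1] * t) - y_meas
jacobian = lambda th: np.column_stack([np.exp(-th[1] * t),
                                       -th[0] * t * np.exp(-th[1] * t)])

theta_hat = gauss_newton(residual, jacobian, theta0=[1.5, 1.0])
print(theta_hat)
```

For the multimodal surfaces mentioned above, such a local search is often restarted from several initial guesses or replaced by a global method such as a genetic algorithm.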

5. Review of Nonparametric Approaches

Nonparametric approaches have been also applied in different geotechnical problems. In this section, recent developments of nonparametric data processing techniques for geotechnical systems have been reviewed.

5.1. Time Series Analysis

In time series analysis, the dynamic response of target systems can be analyzed with a discrete time series expansion model of the system input and output. One kind of time series model is the autoregressive-moving average (ARMA) model, which can be formulated as

$$y_k = \sum_{i=0}^{n_b} b_i x_{k-i} - \sum_{j=1}^{n_a} a_j y_{k-j} + e, \tag{14}$$

where $x_k$: observed (or measured) system input at time step $k$; $y_k$: observed (or measured) system output at time step $k$; $n_b$: order of the moving average (MA) part, $\sum_{i=0}^{n_b} b_i x_{k-i}$; $n_a$: order of the autoregressive (AR) part, $\sum_{j=1}^{n_a} a_j y_{k-j}$; $e$: white, exogenous noise.

Using the ARMA model, the characteristics of the measurement time histories of the system input and output can be determined from the identification of the expansion coefficients (𝑎’s and 𝑏’s) based on the measured system input and output. The optimal coefficient values can be determined, using various optimization algorithms as discussed in Sections 4.2 and 4.4. A general description of time series analysis methods can be found in Box and Jenkins [82].
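A minimal sketch of the coefficient identification is shown below, using ordinary least squares on lagged inputs and outputs. It assumes the sign convention in which the autoregressive sum is subtracted and starts at lag one; the function name `fit_arma`, the synthetic input, and the true coefficients are hypothetical.

```python
import numpy as np


def fit_arma(x, y, n_a, n_b):
    """Estimate b_0..b_nb and a_1..a_na by least squares on lagged data."""
    start = max(n_a, n_b)
    rows = []
    for k in range(start, len(y)):
        lagged_inputs = [x[k - i] for i in range(n_b + 1)]
        lagged_outputs = [-y[k - j] for j in range(1, n_a + 1)]   # minus sign of the AR sum
        rows.append(lagged_inputs + lagged_outputs)
    Phi = np.array(rows)                                          # regressor matrix
    coef = np.linalg.lstsq(Phi, y[start:], rcond=None)[0]
    return coef[:n_b + 1], coef[n_b + 1:]


# Synthetic noise-free system with b = [1.0, 0.5] and a_1 = -0.7.
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = np.zeros(200)
for k in range(1, 200):
    y[k] = 1.0 * x[k] + 0.5 * x[k - 1] - (-0.7) * y[k - 1]

b_hat, a_hat = fit_arma(x, y, n_a=1, n_b=1)
print(b_hat, a_hat)
```

With noise-free data the coefficients are recovered to numerical precision; with measured data, the model orders `n_a` and `n_b` must also be selected, for example with an information criterion.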

Some application examples of the time series analysis methods for geotechnical systems include Glaser [83], Glaser and Leeds [84], Glaser and Baise [85], Baise et al. [86], and Glaser [87]. In Glaser and Baise [85], a technique for mapping the identified time series coefficients to relevant soil physical properties was discussed; this mapping is regarded as a parametric approach in their paper.

5.2. Time-Frequency Analysis

5.2.1. The Empirical Mode Decomposition and the Hilbert-Huang Transform

The empirical mode decomposition (EMD) and Hilbert-Huang transform (HHT) methods are nonparametric data processing techniques pioneered by Huang et al. [88, 89] and Huang and Attoh-Okine [90]. One advantage of these techniques lies in dealing with long-term natural processes, which are commonly observed to be nonlinear and nonstationary. The EMD and HHT are widely used in various fields of science and engineering: meteorology and atmospheric physics [91–96], earthquake engineering, structural health monitoring (SHM), and control for civil structures [97–102].

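For any arbitrary time series $x(t)$, an analytic signal $z(t)$ can be obtained using the Hilbert transform. Let $y(t)$ be the Hilbert transform of $x(t)$:

$$y(t) = \frac{1}{\pi} P \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau, \tag{15}$$

where $P$ is the Cauchy principal value, and

$$z(t) = x(t) + i y(t) = a(t) e^{i\theta(t)}, \tag{16}$$

where

$$a(t) = \sqrt{x^2(t) + y^2(t)}, \qquad \theta(t) = \tan^{-1} \frac{y(t)}{x(t)}. \tag{17}$$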

In (15), it should be noted that the Hilbert transform is the convolution of $x(t)$ with $1/t$, which emphasizes the local properties of $x(t)$. In addition, (17) provides the best local fit of $x(t)$ using the time-dependent functions $a(t)$ and $\theta(t)$. Finally, the instantaneous frequency is defined as

$$\omega(t) = \frac{d\theta(t)}{dt}. \tag{18}$$

In order to obtain physically meaningful instantaneous frequencies, Huang et al. [88] suggested decomposing a complex original time series into multiple so-called intrinsic mode functions (IMFs) that represent the oscillatory modes embedded in the original signal; the instantaneous frequencies are then determined for the decomposed IMFs. The signal $x(t)$ can be expressed using the series of IMFs as

$$x(t) = \sum_{k=1}^{m} \mathrm{IMF}_k(t) + r(t), \tag{19}$$

where $\mathrm{IMF}_k$ is the $k$-th intrinsic mode function, $m$ is the number of IMFs, and $r(t)$ is the residual.

The IMF is defined to have the properties of local zero means and the same numbers of zero crossings and extrema throughout the time series for the IMF to be only one mode of oscillation without complex riding waves. A difference from the Fourier-based signal processing methods is that the IMF is not restricted to be single banded and can be non-stationary. Several EMD algorithms have been developed using the so-called sifting process [104, 105].

The HHT is a time-frequency analysis technique; combined with the EMD, a time-frequency plot can be obtained for each IMF to visualize frequency change over time. The HHT is similar to the wavelet transform (WT) as a non-stationary data processing technique, but the HHT is not limited by the underlying basis functions as the WT is.
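The Hilbert-transform step in (15) to (18) can be sketched for a sampled signal as follows. This is a minimal illustration that builds the discrete analytic signal with the FFT and applies it to a single-tone test signal; it does not include the EMD sifting process, and the function names, sampling rate `fs`, and test frequency are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Discrete analytic signal z = x + i*y, with y the Hilbert transform of x."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                        # keep the DC term
    if n % 2 == 0:
        weights[n // 2] = 1.0               # keep the Nyquist term
        weights[1:n // 2] = 2.0             # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)  # negative frequencies are zeroed


def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from the unwrapped phase of the analytic signal."""
    z = analytic_signal(x)
    phase = np.unwrap(np.angle(z))          # theta(t)
    return np.diff(phase) * fs / (2.0 * np.pi)


fs = 200.0                                  # sampling rate in Hz
t = np.arange(400) / fs                     # 2 s record
x = np.cos(2.0 * np.pi * 5.0 * t)           # 5 Hz test tone
amplitude = np.abs(analytic_signal(x))      # a(t)
freq = instantaneous_frequency(x, fs)
print(freq.mean())
```

For a multicomponent signal, the instantaneous frequency computed this way is generally not physically meaningful, which is why the signal is first decomposed into IMFs and the procedure is applied to each IMF separately.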

5.3. Black-Box Methods

One technical difficulty in the identification of complex (nonlinear) geotechnical systems is that the system characteristic function in Figure 1 is usually unknown beforehand, so that it is not possible to establish explicit relationships between the system input and system output. This case is often encountered when the systems to be identified are under field conditions subject to various environmental effects, or when systems evolve into a different class of nonlinearity after unpredictable structural damage. The black-box methods can be used when the physical relationships between the system input and the system output are unknown.

The Artificial Neural Networks (ANNs) technique, inspired by biological neural networks, has been shown to be a powerful tool for developing model-free representations of nonlinear systems. ANNs consist of an interconnected group of artificial neurons that form the input layer, hidden layers, and output layer for arbitrary multi-input multi-output (MIMO) systems, as shown in Figure 3. Employing various optimization algorithms, the input-output relationships can be determined by finding the optimal values of the weights and biases of the artificial neurons. A detailed description of the ANN method can be found in Fausett [106] and Gurney [107].
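The idea of determining an input-output relationship by optimizing weights and biases can be sketched with a very small network: one hidden layer of tanh neurons trained by plain gradient descent. The target function, the layer size, the learning rate, and the number of iterations are all hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-input, single-output system to be represented.
x = np.linspace(-2.0, 2.0, 50).reshape(-1, 1)
y = np.sin(x)

n_hidden = 8
W1 = 0.5 * rng.standard_normal((1, n_hidden))    # input-to-hidden weights
b1 = np.zeros(n_hidden)
W2 = 0.5 * rng.standard_normal((n_hidden, 1))    # hidden-to-output weights
b2 = np.zeros(1)


def forward(inputs):
    """Return the hidden activations and the network output."""
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, hidden @ W2 + b2


losses = []
lr = 0.05
for _ in range(2000):
    hidden, y_hat = forward(x)
    err = y_hat - y
    losses.append(float(np.mean(err ** 2)))      # mean-square error
    # Backpropagation of the mean-square error
    g_out = 2.0 * err / len(x)
    gW2 = hidden.T @ g_out
    gb2 = g_out.sum(axis=0)
    g_hidden = (g_out @ W2.T) * (1.0 - hidden ** 2)
    gW1 = x.T @ g_hidden
    gb1 = g_hidden.sum(axis=0)
    # Gradient-descent update of weights and biases
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

print(losses[0], losses[-1])
```

No physical relationship between input and output is specified anywhere in the sketch, which is what makes the approach model-free; the trained weights, however, carry no direct physical meaning.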

The ANN techniques have been used in a wide range of geotechnical applications including pile capacity, settlement of foundations, characterization of soil properties and behavior, liquefaction, site characterization, earth retaining structures, slope stability, tunnels, and underground openings [103]. Some technical challenges for the ANN modeling in geotechnical engineering are discussed in Jaksa et al. [108].

5.4. Response-Only Models

Response-only methods are defined as methods that use no system information in their data processing procedures. Blind source separation (BSS) is one such method. BSS is a multivariate, nonparametric technique that separates the unknown system input (or “sources”) based on the observed system output (or “response”) with little or no information on the system input or system function. BSS includes several response-only techniques, such as the principal component analysis (PCA) for statistically uncorrelated multivariate system input and the independent component analysis (ICA) for statistically independent multivariate system input. General descriptions of the PCA and ICA methods can be found in Hyvärinen et al. [109].

The principal component analysis (PCA) method, also known as the proper orthogonal decomposition (POD) or the Karhunen-Loève transform, is a multivariate statistical technique [110]. Two algebraic solutions of the PCA are commonly used: (1) the eigenvector decomposition of the covariance matrix and (2) the singular value decomposition approach. The first solution will be described in this section. For an $(m \times n)$ observation data set $X = [x_1; \ldots; x_m]$, where $x_i$ is an $(n \times 1)$ vector associated with sensor $i$, the goal of the algebraic solution is to find the orthonormal matrix of the principal components $P$, where

$$Y = PX, \tag{20}$$

which renders the covariance matrix $C_Y$ diagonal. The covariance matrix can be determined from

$$C_Y = \frac{1}{n-1} Y Y^T = \frac{1}{n-1} P A P^T, \tag{21}$$

such that

$$A = X X^T = V \lambda V^T, \tag{22}$$

where $A$ is an $(m \times m)$ symmetric matrix, $V$ is the $(m \times m)$ matrix of eigenvectors arranged as columns, and $\lambda$ is the $(m \times m)$ diagonal matrix of the eigenvalues. The PCA is limited by its global linearity because the PCA removes linear correlations among the observed data and is only sensitive to second-order statistics [111, 112].
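The eigenvector solution in (20) to (22) can be sketched in a few lines of NumPy. Choosing $P = V^T$ gives $C_Y = \lambda/(n-1)$, which is diagonal. This is an illustrative sketch rather than the authors' code: the function name `pca_eig`, the synthetic three-sensor data, and the explicit removal of each sensor's mean (needed for $XX^T/(n-1)$ to be a covariance matrix, but not stated in the text) are assumptions.

```python
import numpy as np

def pca_eig(X):
    """PCA by eigendecomposition of A = X X^T, following (20) to (22).

    X has shape (m, n): one row per sensor, one column per sample.
    """
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)   # assumption: zero-mean rows
    A = Xc @ Xc.T                            # (m, m) symmetric matrix
    lam, V = np.linalg.eigh(A)               # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]            # reorder: largest first
    lam, V = lam[order], V[:, order]
    P = V.T                                  # rows are principal directions
    Y = P @ Xc                               # equation (20)
    C_Y = (Y @ Y.T) / (n - 1)                # equation (21)
    return P, Y, C_Y, lam

# Hypothetical data: three correlated channels plus noise (illustration only).
rng = np.random.default_rng(0)
n_samples = 500
source = rng.standard_normal(n_samples)
X = np.outer([1.0, 0.6, 0.2], source) + 0.1 * rng.standard_normal((3, n_samples))

P, Y, C_Y, lam = pca_eig(X)
energy_fraction = lam / lam.sum()            # eigenvalues normalized to their sum
print("covariance of Y:\n", np.round(C_Y, 6))
print("energy fraction per mode:", energy_fraction)
```

The normalized eigenvalues computed in the last step correspond to the energy contribution of each mode, the quantity reported for the PCA modes in the case study.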

Some geotechnical applications of the PCA include Dai and Lee [113], Komac [114], and Folle et al. [115].

6. Case Study: Monitoring for Full-Scale Retaining Walls Subject to Long-Term Environmental Change

In order to demonstrate the benefits of the nonparametric methodologies discussed in Section 2, a case study was conducted using a full-scale reinforced concrete retaining wall with a height of 13.59 m. Because the wall was located only 9.5 m from a high-rise residential apartment building, its collapse would be catastrophic.

The characteristics of the backfilled soil were not known, and the soil behavior (e.g., pore water pressure or soil temperature) was not monitored. The material properties of the reinforced concrete were also unknown, and the plan of the retaining wall was not available. The retaining wall was monitored for three years with three tilt sensors located at the upper, middle, and lower locations of the wall (13.14 m, 6.55 m, and 1.68 m from the ground). The surface temperatures were also measured at the same locations as the tilt gauges. Therefore, a total of six sensors (i.e., three tilt gauges and three surface temperature sensors) were used and wired to a data logger equipped with a digitizer and a local storage device. The sensor readings were sampled once every hour (1 sample/hr) for all channels. Consequently, because of the limited measurement types, the limited temporal and spatial resolution of the measurements, and the lack of information on the monitored structure, conventional parametric identification approaches could not be used in this study. Furthermore, although the wall surface temperature data were collected, only the tilt data were used in this analysis to demonstrate that important performance-related information on the retaining wall can be obtained using response-only data, without relying on additional data on the causative force and environment in the data processing procedures. As described in Section 3, the inverse analysis using response-only data is not based on explicit system input-output relationships, which cannot be accurately determined with limited information on structural characteristics and sensor measurements; the oversimplification problem often observed in conventional parametric approaches can therefore be avoided. The environmental measurements were used only as a posteriori information for the physical interpretation of the inverse analysis results, which is commonly not straightforward in other nonparametric approaches. If this approach is successful, the expensive data collection cost could also be reduced (Figure 4).

The tilt time histories measured from the retaining wall are shown in Figure 5. The slope is given in microradians, and the plus sign denotes a slope towards the apartment side. The slope signals at all three locations were significantly affected by seasonal and daily variation: decreasing during summer and increasing during winter, and decreasing during days and increasing during nights as reflected in the daily trends (not clearly shown in the figure due to scale). During this three-year monitoring period, the wall behavior was affected by temperature change in addition to rainfall and snowfall, freeze-thaw of the backfilled soil, soil-structure interaction, and so on.

Figure 5 also shows that the collected sensor data are partially incomplete. The lower sensor failed in Q1 2006 (after approximately one year). Data were missing for all sensors for about three months in Q4 2006 due to instrument failure. These unavoidable and unpredictable sensor and instrumentation problems are frequently encountered in long-term field measurements, and the proposed nonparametric methodology should be robust enough to handle them. The figure therefore illustrates how little data are available relative to the complexity of the given problem, a situation commonly encountered in geotechnical applications.

Three nonparametric data processing techniques were used: the empirical mode decomposition (EMD) and the Hilbert-Huang transform (HHT) for single-channel (or univariate) analysis, and the principal component analysis (PCA) for multichannel (or multivariate) analysis. A summary of the proposed nonparametric data processing approaches is provided in Table 1.

A brief description of the EMD-HHT was given in Section 5.2, and the analysis procedures of the EMD-HHT are summarized in Figure 6. Due to the complexity of the geotechnical system coupled with long-term environmental variation, the raw sensor data shown in Figure 6(a) are usually too complicated to be interpreted for performance assessment. Thus, a daily trend with a period of one day was disentangled from the raw signal using the EMD, even though three months of data were missing in the second year; a sample result is shown in Figure 6(b). The disentangled daily trend of the slope is mostly influenced by the daily fluctuation of the wall surface temperature (i.e., the wall inclined toward the apartment during daytime and toward the backfill during nighttime). Once the daily trend was disentangled, the instantaneous frequency of the daily trend was obtained using the HHT, as shown in the time-frequency plot of Figure 6(c).

The time history of the daily trend has a period of one day, and the corresponding instantaneous frequency has a baseline frequency of one cycle per day, as shown in Figures 6(b) and 6(c). Occasional amplitude reductions are observed in the time history in Figure 6(b) (e.g., 3/11, 3/15, 3/21, and 4/5 through 4/9), and during these times the corresponding instantaneous frequencies become significantly larger than the baseline frequency. Hourly precipitation records collected separately at the weather station nearest to the wall site are plotted in Figure 6(d). The precipitation data were not used in the analysis. Interestingly, the comparison with the instantaneous frequency in Figure 6(c) shows that the peaks of the instantaneous frequency coincide with precipitation events, and that the frequency decreases back to the baseline frequency (i.e., one cycle per day) when the precipitation stops.
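The instantaneous-frequency step can be illustrated with a short NumPy sketch. It is not the authors' EMD-HHT code: the EMD sifting is skipped, and a synthetic "daily trend" stands in for the extracted IMF. The five-day disturbance window, the reduced amplitude, the superimposed three-cycles-per-day ripple, and the function names are all assumptions made only to mimic the qualitative behavior described above. The analytic signal is computed with a standard FFT-based Hilbert transform.

```python
import numpy as np

SAMPLES_PER_DAY = 24  # 1 sample/hr, as in the case study

def analytic_signal(x):
    """Analytic signal via the FFT (discrete Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep the DC term
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist term
        gain[1:n // 2] = 2.0            # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)  # negative frequencies are zeroed

def instantaneous_frequency(x, fs):
    """Instantaneous frequency (cycles per unit time) from the phase slope."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Synthetic stand-in for the disentangled daily trend (assumed, not measured).
t = np.arange(40 * SAMPLES_PER_DAY) / SAMPLES_PER_DAY   # time in days
event = (t >= 20) & (t < 25)                            # hypothetical disturbance
amplitude = np.where(event, 0.1, 1.0)                   # amplitude reduction
ripple = np.where(event, 0.3, 0.0) * np.cos(2 * np.pi * 3.0 * t)
daily = amplitude * np.cos(2 * np.pi * t) + ripple

freq = instantaneous_frequency(daily, SAMPLES_PER_DAY)  # cycles per day
baseline = np.median(freq[~event[:-1]])
disturbed = np.median(freq[event[:-1]])
print("median frequency outside the event:", baseline)
print("median frequency during the event:", disturbed)
```

Outside the disturbance window the frequency stays near the baseline, while inside it the frequency rises; this is the kind of departure from the baseline that the text associates with precipitation events.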

These results demonstrate an important advantage of the nonparametric techniques over conventional parametric methods in monitoring applications. Without a priori information, physical assumptions, or oversimplification of the monitored structure, the daily trend can be disentangled from a complicated raw slope signal. When precipitation occurs, the normal pattern in a slope signal (i.e., the system response in Figure 1) is “disturbed” by the change of the structural characteristics caused by the increased water content in the backfill (i.e., the system characteristic function). Consequently, the pattern of the disentangled daily trend is also disturbed in its amplitude and frequency. After the precipitation stops, the pattern in the raw slope time history returns to the normal condition if the drainage system is working and drains away the excess water in the soil, and so does the pattern of the disentangled daily trend. If the pattern of the disentangled signal did not return to normal after the precipitation stopped (i.e., if the instantaneous frequency in Figure 6(c) did not return to the baseline frequency), it could be concluded that the drainage system is not working properly. A critical difference between using the raw and the processed signals is that the raw signal is too complicated to reveal the precipitation effect, which is overshadowed by other dominant non-performance-related effects such as temperature, as shown in Figure 6(a), whereas the important drainage-related information can be extracted from the disentangled signal, as shown in Figure 6(c).

The principal component analysis (PCA) technique was used as a multi-sensor analysis method.

A brief description of the PCA was provided in Section 5.4. In order to find the optimal window size, the statistics of the first PCA mode shape, which is associated with the largest contribution to the energy of the total wall motion, were calculated. Figure 7 shows the mean values of the eigenvectors in dashed lines with the one-standard-deviation (1𝜎) uncertainty in the shaded areas. The statistics were calculated with different window sizes (i.e., numbers of days) up to 60 days; a window of one day includes 24 data points for the given sampling rate of 1 sample/hr. Since the expectation of the PCA mode shape becomes statistically unbiased after 14 days in the figures (i.e., the mean and standard deviation values begin to saturate), a window size of two weeks was selected for the PCA in this study.
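The window-size study can be sketched as follows. This is only an illustration under stated assumptions: the windows are taken as non-overlapping, each window is mean-removed, the sign of each eigenvector is fixed so that its largest component is positive, and the three-channel data are synthetic. None of these details, nor the function name `first_mode_per_window`, comes from the paper.

```python
import numpy as np

SAMPLES_PER_DAY = 24  # 1 sample/hr

def first_mode_per_window(X, window):
    """First PCA mode shape and normalized eigenvalues for each window.

    X has shape (m, n); `window` is the window length in samples.
    Windows are non-overlapping (an assumption of this sketch).
    """
    shapes, fractions = [], []
    for start in range(0, X.shape[1] - window + 1, window):
        W = X[:, start:start + window]
        W = W - W.mean(axis=1, keepdims=True)
        lam, V = np.linalg.eigh(W @ W.T)
        order = np.argsort(lam)[::-1]              # largest eigenvalue first
        lam, V = lam[order], V[:, order]
        v1 = V[:, 0]
        v1 = v1 * np.sign(v1[np.argmax(np.abs(v1))])  # remove sign ambiguity
        shapes.append(v1)
        fractions.append(lam / lam.sum())
    return np.array(shapes), np.array(fractions)

# Hypothetical three-sensor record: common daily and seasonal motion plus noise.
rng = np.random.default_rng(1)
t = np.arange(360 * SAMPLES_PER_DAY) / SAMPLES_PER_DAY   # time in days
common = np.cos(2 * np.pi * t) + 2.0 * np.cos(2 * np.pi * t / 365.0)
gains = np.array([1.0, 0.6, 0.2])                        # assumed sensor gains
X = np.outer(gains, common) + 0.05 * rng.standard_normal((3, t.size))

stats = {}
for days in (1, 7, 14, 30, 60):
    shapes, fractions = first_mode_per_window(X, days * SAMPLES_PER_DAY)
    stats[days] = (shapes.mean(axis=0), shapes.std(axis=0))
    print(days, "days: mean", stats[days][0], "std", stats[days][1])
```

Plotting the mean and standard deviation in `stats` against the window length gives a curve of the kind shown in Figure 7, from which a window size can be read off once the statistics stop changing.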

Figure 8 shows the PCA mode shapes with error bars of one standard deviation (1𝜎). In the figure, the mode shapes of the wall slopes were converted to displacements using the known heights of the sensor locations. The 𝜇 and 𝜎 in parentheses are the mean and standard deviation of the eigenvalue corresponding to each mode, normalized to the sum of the eigenvalues of all modes. Although no information on the physical characteristics was used, Figures 8(a)-8(c) illustrate that the PCA mode shapes agree with the first, second, and third bending modes of a cantilever. The PCA eigenvalues show that the motion of the first mode is dominant: 97.3% of the entire motion energy, with a standard deviation of 2.1%. This dominant motion is clearly due to the significant daily and seasonal trends shown in Figure 5, which could be mostly due to diurnal and seasonal temperature variation. For the purpose of structural health monitoring, this dominant low-order mode is less interesting, since the important information for condition assessment is performance related, not environment related. In addition, structural damage is usually a localized phenomenon, so higher modes would have a better spatial resolution to detect it.

Figures 8(a)-8(c) were created using the data in year 2005, before the bottom tilt gauge was damaged. The same PCA procedures were applied using the data after the tilt gauge was permanently damaged in Q1 2006, and the results are compared in Figures 8(d)-8(f). The first mode after the damage in Figure 8(d) is similar to the one before the damage in Figure 8(a), except that the deviation of the mode shape increased after the damage. The comparison of Figures 8(b) and 8(e) shows an excessive amount of movement after the damage of the bottom sensor, which is unusual for a cantilever-type wall structure. The mean contribution of the first mode to the total energy of the wall motion was reduced from 97.3% (with a standard deviation of 2.1%) to 82.3% (with a standard deviation of 14.3%), and that of the second mode increased from 2.3% (with a standard deviation of 2.0%) to 14% (with a standard deviation of 14.4%), while the energy contribution and shape of the third mode remained similar, as shown in Figures 8(c) and 8(f).

Figure 9 shows the time histories of the PCA eigenvalues. In the figure, the time history of the first mode is shown as the solid line (black), the second mode as the dashed line (red), and the third mode as the dash-dot line (blue). The eigenvalues of Modes 1 and 2 changed significantly from March 2006, the time when the bottom sensor was damaged.

Based on the single-channel and multichannel analysis results discussed in Section 6, the following conclusions can be drawn for general monitoring applications of geotechnical structures.

(i) When abnormal behaviors of the wall occur can be determined from the time histories of Figures 6(c) and 9. These abnormal behaviors are related to the performance of the structure and are commonly overshadowed by the significant effects of environmental variation. The disentanglement techniques, such as the EMD and the PCA, allow the environment-related information to be filtered out so that the analysis can focus on the performance-related information.

(ii) Where the abnormal behaviors occur can also be determined, from the mode shapes at the lower sensor in Figures 8(d)-8(f) (particularly Figure 8(e)) for the PCA, or from the known location of the upper sensor for the EMD-HHT in Figure 6.

(iii) Using the statistics (e.g., error bars) of the eigenvalues and eigenvectors of the PCA modes in Figure 8, the confidence levels of detecting abnormal behaviors can be quantified in combination with standard statistical hypothesis tests or classification techniques. It should be noted that since the PCA modes are statistically uncorrelated (or statistically independent for the independent component analysis), the uncertainty quantification for statistical tests can be done with three separate single integrals (one for each of the three slope measurements) rather than one triple integral. For example, it was observed that the cross-correlation values of the PCA eigenvalues between different modes are very low (less than 0.6404), as summarized in Table 2. This property is particularly important when a large number of sensors are used.
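The remark in (iii) about single integrals can be written out as follows. This is a hedged illustration rather than a result from the paper: $y_1, y_2, y_3$ denote the coordinates of the three modes, $p_i$ their marginal probability densities, and $[a_i, b_i]$ hypothetical acceptance intervals of a statistical test. The factorization requires the modes to be statistically independent, which holds for uncorrelated PCA modes only under an additional assumption such as joint normality.

$$
\Pr\left(a_i \le y_i \le b_i,\; i = 1, 2, 3\right)
= \int_{a_1}^{b_1}\!\int_{a_2}^{b_2}\!\int_{a_3}^{b_3} p(y_1, y_2, y_3)\, dy_3\, dy_2\, dy_1
= \prod_{i=1}^{3} \int_{a_i}^{b_i} p_i(y_i)\, dy_i .
$$

With $m$ sensors the left-hand form is an $m$-fold integral, whereas the right-hand form needs only $m$ one-dimensional integrals, which is why the property matters when many sensors are used.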

7. Summary and Conclusions

The modeling procedures of the nonparametric methods are data driven and are not based on a priori physical knowledge of the monitored structure. Therefore, the methodology developed by the authors is not limited to a specific type of structure and could be applicable to a wide range of monitoring applications for different geotechnical structures. Given the diversity of the characteristics of geotechnical structures, the nonparametric methodology could significantly reduce the modeling efforts that have been a technical barrier for conventional parametric approaches in various monitoring applications.

The important performance-related information (e.g., effects of drainage or malfunctioning sensors) could be obtained using a very limited amount of response-only sensor data (i.e., three tilt time histories). The decomposition techniques used in this study could disentangle the response deformation data of the complex system subject to long-term environmental variations without information on the causative force, environment, or structural characteristics. For example, since the precipitation records were not used in the EMD-HHT, it was demonstrated that oversimplification problems could be avoided using response-only analysis techniques that are not based on exclusive input-output relationships. Therefore, the nonparametric methodology discussed in this paper could provide the important information of when, where, and how confidently engineers should be deployed to the site for potential performance hazards of monitored structures, using very little information and without sacrificing the accuracy of the inverse analysis. The common practical problem of unpredictable sensor and instrument network malfunctions could also be dealt with effectively using the nonparametric methodology.