An Improved Bayesian Structural Identification Using the First Two Derivatives of Log-Likelihood Measure

Zhou, Jin; Mita, Akira; Mei, Liu

doi:https://doi.org/10.1155/2015/236475

Journal of Structures

On this page

Abstract Introduction Conclusions References Copyright Related Articles

Research Article | Open Access

Volume 2015 | Article ID 236475 | https://doi.org/10.1155/2015/236475

An Improved Bayesian Structural Identification Using the First Two Derivatives of Log-Likelihood Measure

Jin Zhou,¹Akira Mita,¹and Liu Mei¹

Academic Editor: Elio Sacco

Received28 Nov 2014

Accepted03 Mar 2015

Published17 Mar 2015

Abstract

The posterior density of structural parameters conditioned by the measurement is obtained by a differential evolution adaptive Metropolis algorithm (DREAM). The surface of the formal log-likelihood measure is studied considering the uncertainty of measurement error to illustrate the problem of equifinality. To overcome the problem of equifinality, the first two derivatives of the log-likelihood measure are proposed to formulate a new informal likelihood measure for the sake of improving the accuracy of the estimator. Moreover, the proposed measure also reduces the standard deviation (uncertain range) of the posterior samples. The benefit of the proposed approach is demonstrated by simulations on identifying the structural parameters with limit output data and noise polluted measurements.

1. Introduction

Recent years witness the increasing desire of Bayesian estimation for structural parametric system when quantifying the inevitable uncertainties, such as measurement error or structural model error and so forth, as is reviewed by Simoen et al. [1]. In particular, Beck and Au [2] used Laplace’s method of asymptotic approximation to obtain a posterior PDF with a small-dimensional parameter space. To solve higher dimensional problems, Muto and Beck [3] developed an adaptive Markov chain Monte Carlo (MCMC) simulation for the Bayesian model updating. Gibbs sampling and transitional Markov chain Monte Carlo (TMCMC) were used by Ching and Chen [4] to obtain the posterior PDF of parameters. Cheung and Beck [5] used a hybrid Monte Carlo method, known as the Hamiltonian Markov chain, to solve higher dimensional model updating problems. Huhtala and Bossuyt [6] explored a Bayesian inference framework to solve the inverse problem of locating structural damage. An et al. [7] proposed a statistical model parameter estimation using Bayesian inference when parameters are correlated and observed that data have noise. Green [8] used a novel MCMC algorithm, data annealing, which is similar to simulated annealing, for the Bayesian identification of a nonlinear dynamic system.

The difficulty of Bayesian estimation lies in the efficiency in the convergence of posterior samples in the Markov chain to the acceptable model set. Moreover, because of the noise corrupted measurement, the surface of the prediction error lies in a hypersurface of a multidimensional parametric space. It will cause the surface of the probability density for the posterior sequences to have multiple regions of attraction and numerous local optima. It thus inevitably yields a biased estimator (no matter what is called maximum likelihood estimator, ML, or maximum a posteriori estimator, MAP). This problem is defined as the “equifinality” [9–11]. The surfaces of the prediction error, using formal likelihood measures, maximum log-likelihood (ML), are studied. From the surfaces of fitness measures, it can be concluded that the formal likelihood measure underestimates or overestimates the uncertain intervals of the posterior samples. The reason is that there are several possible models which can also give high values of likelihood around the neighborhood of the estimator.

In this paper, the bias between the ML/MAP estimator and the actual value is deduced by the Taylor expansion. It is found that the gradient and Hessian matrix of the likelihood measure can bridge the biased estimator and the actual value, which is thus proposed to improve the accuracy of the posterior samples. The parameter estimation problem is proposed as a two-step strategy. In the first step, the MAP/ML estimator is obtained by the formal Bayesian likelihood measures using the differential evolution adaptive Metropolis-Hastings (DREAM) algorithm. In the second step, a new fitness measure is proposed, which can be seen as the informal likelihood measure under the framework of the generalized likelihood uncertainty estimation (GLUE) [12–14]. Numerical examples of a linear structural system are presented, with which the effectiveness and efficiency of the proposed method are investigated.

2. Problem Statement

2.1. Least Squares (LS) Estimator for the Inverse Problem

Let denote the measured response at each time interval () and denotes the output of candidate models. The difference between the measured response and model outputs is defined as the residual error: , where , and is the number of outputs. The common measure for the inverse problem is to attempt to force the residual vector as close to zero as possible by tuning the model parameter vector, . Thus, the fitness measure can be defined as follows:

This is an -dimensional optimization issue which maximizes the likelihood measure of SSR (equivalent to minimize the measure of LS formulation). But such measure can only provide an estimate of optimal value of . If we need to quantify the uncertainty of the estimator, it would be a desire to estimate the underlying posterior PDF of parameter, , which is under the framework of Bayesian probabilistic estimation.

2.2. Bayes Estimate Using Formal Log-Likelihood (LL) Measures

In the Bayesian estimation framework, the model set, , is a class of probabilistic models, each of which predicts the response of the actual system. The identification problem is to infer the plausibility of each candidate model with a posterior density distribution conditioned by the measured data, ; it is not a quest for the true structural parameters. is a stochastic parameter vector defining each possible model in the model set. The model set, M, is defined by random parameters, (, is the random variables in probability space ), where is the number of parameters for model and is the number of stochastic samples. The initial plausibility of each model parameterized by is defined as a prior density function, . The updated plausibility of I/O model using Bayes’ theorem is as follows:

is the likelihood measure, . If the measurement error is considered as obeying the Gaussian distribution with a constant variance, (th available observed response), the posterior PDF in (2) is thus as follows:where is the evidence of model class (), which is a high-dimensional integral. The difficulty in estimating the posterior PDF is none other than approximating the model evidence. Strives to overcome this challenge are the purpose of the Metropolis-Hastings algorithm. For simplicity, (2) is rewritten as . The MH algorithm generates the posterior PDF with four steps. Start with initial samples, , and compute the likelihood measures . The updated samples, , are produced by the jumping distribution, , which is the probability of returning a value of given a previous value of . The restriction on the jumping is that the transition is probability symmetric, . The acceptance ratio at the updated candidate () and the source posterior samples () is . If the acceptance ratio, , accept the candidate sample, ; if the jump decreases the density, , then it rejects the updating and keeps the current samples,. The acceptance ratio is as follows:

It is clear that the advantage of MH algorithm lies in the fact that when computing acceptance ratio there is no need to obtain the model evidence since the constant cancels out. The transition of samples generates a Markov chain (). Following a burn-in period, the Markov chain approaches its stationary distribution, and the samples after the burn-in period converge into the posterior PDF, , as that in (2). From (4), it can be found that the Bayesian estimate relies on the likelihood measure, . It is more convenient to use the logarithm of the likelihood measures rather than the likelihood function itself:

Either the log-likelihood measure, as is in (5), or the least square measure, as is in (1), obeys the rule of “goodness-of-fit.” This is because only the model with high probabilistic value of likelihood in the MH method will be accepted.

2.3. The Surface of the Likelihood Measures

To illustrate the problem of “equifinality,” the surfaces of the common-used likelihood measure, the as is in (5), are simulated in identifying the stiffness parameters of a 2-DOF linear dynamic system. The state space of the system is written as follows: where , , and are the mass, damping, and the stiffness matrices, is identity matrix, and is position vector. and are state space vectors, respectively, representing the displacement and velocity, and is the input of the system. The system output is an acceleration which is assumed to be contaminated by Gaussian white noise . The measured output vector is thus

The mass and stiffness of each DOF are defined as 100 kg and 1000 N/m. Equation (7) includes a Rayleigh damping Mita [15], where the first two-modal damping ratio is set as 5%:

The parametric domain is meshed by 5% deviation of the true value. The output acceleration (acc.) with different noise levels (7) was used in the simulation, in which the noise level (nl.) was defined as . The contour plots of the likelihood measure, as is in (5), in the scenarios of noise-free and different noise level scenarios are exhibited in Figure 1.

(a) (No noise)

(b) (10% noise)

(c) (30% noise)

(d) (100% noise)

From Figure 1, we can find that only when the measurement error is ignored, the center of the posterior samplings model set (the ML/MAP estimator) will be unbiased; however, taking the noise into account, the optimal solution with maximum PDF is biased. Moreover, around the search neighborhoods of the optimal solution, there are many local optimums. It can conclude that when considering the measurement error the common likelihood measures as in (1) and in (5) are weak to solve the problem of equifinality. The bias of the estimator will increase with adding the number of the parametric dimensions and the noise level. It is thus necessary to improve the identified ML/MAP estimator.

3. The Proposed Accuracy-Improving Method

3.1. The First Two Deviations of the Likelihood Measure

With Taylor’s expansion, the likelihood measure can be deduced aswhere denotes . With the derivative of (9) to and ignoring the high order derivative series, one obtains:where is the first order derivative of likelihood measure, which is the gradient matrix of ; is the second order derivative of likelihood measure, which is the Hessian matrix of . and are the gradient and Hessian matrix at the point of true value, . Since , then it yields that

As seen in (11), the bias, , is equal to the negative product of the gradient and the inverse Hessian matrix at the actual value. Equation (11) is thus proposed in this study to formulate the informal likelihood measure. Let denote the bias between the MAP/ML estimator and each of the posterior samples. The proposed Bayesian updating of the posterior samples using DREAM algorithm can be divided into two steps. The first step is to find the ML/MAP estimator, . The second step is to search the optimal point, , of the proposed criteria around the neighborhood of the ML/MAP estimator. The informal likelihood measure can be written as follows: where denote the credible range, which can be decided by the user as the stop criteria.

3.2. Illustration of the Proposed Likelihood Measure

From (12), it can be found that the extreme points of the likelihood measure, , are the actual value, , and the MAP/ML estimator, , because the former extreme value is the null point of (11) and the latter item meets the condition of . Since the MAP/ML estimator, , has already been obtained in the first step after the distribution of the posterior samples has become stationary, the posterior samples on the Markov chains in the second step will converge into the neighborhood around the actual value due to the proposed likelihood measure. The accuracy of the estimator is thus improved and the uncertain range of the posterior samples will be narrowed around the actual value. To verify the proposed likelihood measure, as is in (12), the surface and contour for the 2-DOF linear system, of which the simulation in 100% noise scenario is the same as that in Section 2.3, are shown in Figure 2. The accuracy of the posterior samples, seen in Figure 2, is improved comparing with the MAP estimator, comparing with Figure 1(b).

(a)

(b)

3.3. Two Steps of the DREAM Based Bayesian Estimation

3.3.1. Step 1: MAP Estimator Using DREAM Algorithm

The DREAM algorithm combines the DE mutation strategy into the updating of Markov chains. In the transition of Markov chains, the posterior samples are updated by the revised DE mutation strategy. Let denote the jump scale between the updating state and current state of the th of the Markov chain. The samples are updated as follows: where is a small random vector that is drawn from ; is the number of chosen pairs; and and are, respectively, different and random integers that are chosen from the integer set . The term signifies the -dimensional identity matrix, and signifies a small random vector drawn from a uniform distribution to assure the ergodicity of the samples on the Markov chain. The scaling factor is decided by the values of and , where d is the parametric dimension. The DREAM algorithm also explores the DE crossover strategy in d dimensions of the current samples,, the updated posterior samples,, and a trial sample is generated with where is the th independent random number uniformly distributed in the range of . CR is a crossover probability defined by the user. The DREAM accepts the candidate state with probability and keeps the current state with probability. The MH acceptance probability, , is computed by (4). The posterior samples are updated by (13) and (14) and approaches to a stationary after a burn-in period. The convergence of posterior samples is checked by the Gelman-Rubin criteria [16]. The -statistic,, is computed by using the last 50% of the samples in each chain. The convergence criteria, , are where is the number of the used posterior samples; equals , where is the variance of the sequence; the posterior variance is estimated as , where is the variance between the sequences; can be computed as .

3.3.2. Step 2: Density Updating of the Samples That Satisfy the Proposed Criteria

When in (14) for each dimension is less than 1.2, the posterior samples on the MC chain converged into a stationary distribution. The MAP estimator, , is obtained with the maximum posterior PDF. Also, the standard deviation of the posterior samples, , can be obtained. Then, the algorithm turns into the second step. The gradient and Hessian matrix of the posterior samples within the boundary of are calculated at each iteration. The proposed informal likelihood measure as in (11) will be used for the updating of the posterior samples. The estimation procedure will stop till the prescribed precision, , is satisfied. The gradient matrix and the Hessian matrix at each sample can be written as where denotes the th posterior sample and is the parametric dimension. The diagonal and off-diagonal elements of the can be given by (16) and the following equation:

3.4. Identification Procedures

The procedure of the proposed posterior density estimation is as follows.

Step 1. Use the LHS method to generate sequences for the initial state of MC chains, respecting the prescribed limits of the search space. The likelihood measure of each sample is obtained by (5).

Step 2. Update the posterior sample of the Markov chain by mutation strategy using (13) and by the crossover probability using (14). Calculate the density for the updated samples.

Step 3. The Metropolis acceptance (4) is used for choosing the accepted posterior samples.

Step 4. Return to Step 2 and to Step 3, after the burn-in period and calculating the convergence criteria using (15) for each dimension of the structural parameter. If the convergence criteria of the MC chain are met (), the MAP estimator and the standard deviation of the samples are obtained; otherwise, return to Step 2.

Step 5. Calculate the gradient and Hessian matrix at the point of each sample within the interval of by (16) and (17).

Step 6. The informal likelihood measure as in (12) is used for the transition of the posterior samples till the predefined stop condition is satisfied.

4. Numerical Study

4.1. Identification of a 5-DOF LTI System

Numerical simulation of a 5-DOF LTI system was carried out to verify the proposed method. The structural system is simulated as (6) and the measured signal is as (7). The input was an El-Centro wave lasting 40 s and the sampling frequency was 100 Hz (Figure 3). The measurement noise of the 5th DOF in the 100% noise level scenarios is shown in Figure 3. Table 1 shows the structural properties of the dynamic system.

(a)

(b)

The influence of the limited availability of measurements on the proposed method is also assessed in this study. In the “full output” scenario, measurements of all DOFs are available, whereas, in the “partial output” case, only data from DOFs 3 and 5 are available. The mass is assumed to be known; hence, an -DOF system with -available measurements is described by a model set, of which the interested parameterized vector is as follows:

The parameters of the DREAM algorithm were set as follows: the number of Markov chain samples was 20, the crossover probability was 0.85, and the number of sample pairs () was 5. The search domain was taken to be 0.5–2.0 times the true value. The identified results are exhibited in Table 2.

From Table 2, the conclusion that the proposed method can find the optimal value, , in the scenarios of both “full outputs” and “partial outputs,” even when considering large noise level, can be drawn. Ignoring the measured noise error, the proposed method can identify the structural parameters with no bias. The coefficient variance (cov.) of posterior samples is identified as zero, which demonstrates that the inverse problem under the noise-free scenario is deterministic. When the acceleration of each DOF is available for measure, the maximum relative error ranged from 0.494% in the 30% noise case to 0.726% in the 100% noise case. While considering that the measurements of the 3rd and the 5th DOFs are available, the maximum relative error ranged from 0.701% in the 30% noise case to 1.044% in the 100% noise case. The credible range of the posterior samples is depicted by the coefficient variance of the MC sequences. It is clear that the parametric uncertainty was additive with the increasing of measurement error. In the “full outputs” scenario, the maximum coefficient variance (cov.) of the posterior samples ranged from 0.897% in the 30% noise level to 2.397% in the 100% noise case. Correspondingly increasing with noise level in the partial outputs scenario, the maximum cov. rises from 2.423% to 4.197%.

The convergence progress of the MC samples for each dimensional structural parameter in the scenario of partial output and the consideration 100% noise level are shown in Figure 4. From Figure 4, it is found that the probability density of the posterior samples in Markov chain approached to be stationary after 1500 iterations, which is clearly shown in Figure 4(b). The proposed likelihood measure used for the transition of posterior samples to find the optimal works after 2500 iterations. The posterior uncertain range that assures a reliability of 95% can be obtained from the posterior samples of the model class which denotes the plausibility of each I/O system. Figure 5 shows the uncertain range of the response with 95% assurance and with measurement errors at each time interval by incorporating the identified standard deviation of the prediction error (time history of initial 10 s is shown). The percentage of the response considering a 100% measurement error within the uncertain range that considers prediction error is 94.76%.

(a)

(b)

4.2. The Identification of a 10-DOF LTI System

To show the advantage of the proposal, the identification results of a 10-DOF LTI system using the improved method, as is proposed in (12), are compared with the estimation solutions using the formal log-likelihood measure, as is in (5). For simplicity, the mass of each floor is assumed to be 50 kg, the stiffness of each DOF is equal to 5000 N/m, and the first two-mode shape damping ratio is assumed to be 0.05. The input excitation is the same as that used in the simulation of the 5-DOF LTI system. The white noise is added to the measured response, simulated by (11), where the noise level is assumed to be 10%, 30%, and 100%. When considering the scenario of partial measurements, only the even DOFs (2nd, 4th, 6th, 8th, and 10th) are assumed to be available for Bayesian updating. The results are shown in Figure 6.

(a) Case 1: full outputs

(b) Case 2: partial outputs

In the case of “full outputs,” the maximum relative error of ranges from zero to 0.617% in 10% noise level case, 1.175% in 30% noise level, and 1.766% in 100% noise level. Correspondingly, the maximum relative error of using the traditional likelihood measure increases from zero in the case of ignoring measurement noise to 0.754% in 10% noise level, 1.999% in 30% noise level, and 6.491% in 100% noise level. The improvement is also clear in the scenario of partial outputs. The maximum relative error of using the proposed method increases from zero in noise-free case to 1.479% in 10% noise level, 2.175% in 30% noise level, and 3.543% in 100% noise level. While the maximum relative error of using the traditional method rises from zero in no-noise case to 2.621% in 10% noise level, 5.974% in 30% noise level, and 8.451% in 100% noise level, it can be found that the minimum and maximum relative errors of mean posterior samples in model set are all reduced using the proposed method. Moreover, it can be found from Figure 6 that using (12) for the parametric uncertainty (the cov.) becomes smaller than that obtained by the formal log-likelihood measure as in (5). For instance, when considering the case of 100% noise level and partial outputs are available, the maximum coefficient variance of the MC samples using the proposed method is 6.198% compared with that obtained by traditional method which is 11.09%. Comparing the identification results shown in Figure 6, it is clear that the accuracy of the estimator using the proposed likelihood measure is improved.

5. Conclusions

To improve the accuracy of the MAP estimator obtained by the traditional likelihood measure using the DREAM based structural identification, the gradient and Hessian of the log-likelihood measure are proposed to formulate the generalized likelihood measure for the density transition of Markov chains. Compared with the formal likelihood function, the relative error of the MAP estimator and the uncertain range of the posterior samples using the proposed method becomes smaller. Numerical simulations of a 5-DOF and 10-DOF LTI system demonstrated their effectiveness in solving identification problems with a high noise level and loss of measurement data. In conclusion, DREAM based Bayesian estimation using the proposed improvement has the potential to solve the problem of “equifinality” especially when considering large level of measurement error.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported in part by a Grant for the project “Implementation of Shelter Guidance System for Commuters Who Are Unable to Return Home Based on Structural Health Monitoring of Tall Buildings after Large-Scale Earthquake” (FY2013–2016, PI: A. Mita) from the Japan Science and Technology Agency.

References

E. Simoen, G. De Roeck, and G. Lombaert, “Dealing with uncertainty in model updating for damage assessment: a review,” Journal of Mechanical Systems and Signal Processing, vol. 56-57, pp. 123–149, 2015.
View at: Publisher Site | Google Scholar
J. L. Beck and S.-K. Au, “Bayesian updating of structural models and reliability using Markov chain Monte Carlo simulation,” Journal of Engineering Mechanics, vol. 128, no. 4, pp. 380–391, 2002.
View at: Publisher Site | Google Scholar
M. Muto and J. L. Beck, “Bayesian updating and model class selection for hysteretic structural models using stochastic simulation,” Journal of Vibration and Control, vol. 14, no. 1-2, pp. 7–34, 2008.
View at: Publisher Site | Google Scholar
J. Ching and Y.-C. Chen, “Transitional Markov Chain Monte Carlo method for Bayesian model updating, model class selection and model averaging,” Journal of Engineering Mechanics, vol. 133, no. 7, pp. 816–832, 2007.
View at: Publisher Site | Google Scholar
S. H. Cheung and J. L. Beck, “Bayesian model updating using hybrid Monte Carlo simulation with application to structural dynamic models with many uncertain parameters,” Journal of Engineering Mechanics, vol. 135, no. 4, pp. 243–255, 2009.
View at: Publisher Site | Google Scholar
A. Huhtala and S. Bossuyt, “A Bayesian approach to vibration based structural health monitoring with experimental verification,” Journal of Structural Mechanics, vol. 44, no. 4, pp. 330–344, 2011.
View at: Google Scholar
D. An, J.-H. Choi, and N. H. Kim, “Identification of correlated damage parameters under noise and bias using Bayesian inference,” Structural Health Monitoring, vol. 11, no. 3, pp. 293–303, 2012.
View at: Publisher Site | Google Scholar
P. L. Green, “Bayesian system identification of a nonlinear dynamical system using a novel variant of Simulated Annealing,” Mechanical Systems and Signal Processing, vol. 52-53, pp. 133–146, 2015.
View at: Publisher Site | Google Scholar
K. J. Beven, “A manifesto for the equifinality thesis,” Journal of Hydrology, vol. 320, no. 1-2, pp. 18–36, 2006.
View at: Publisher Site | Google Scholar
J. R. Stedinger, R. M. Vogel, S. U. Lee et al., “Apprisal of the generalized likelihood uncertainty estimation (GLUE) method,” Water Resources Research, vol. 44, no. 12, pp. 1–17, 2008.
View at: Google Scholar
M. Sadegh and J. A. Vrugt, “Approximate Bayesian computation in hydrologic modeling: equifinality of formal and informal approaches,” Hydrology and Earth System Sciences Discussions, vol. 10, pp. 4739–4797, 2013.
View at: Publisher Site | Google Scholar
K. Beven and J. Freer, “Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology,” Journal of Hydrology, vol. 249, no. 1–4, pp. 11–29, 2001.
View at: Publisher Site | Google Scholar
Y. Zhang, H. H. Liu, and J. Houseworth, “Modified generalized likelihood uncertainty estimation (GLUE) methodology for considering the subjectivity of likelihood measure selection,” Journal of Hydrologic Engineering, vol. 16, no. 6, pp. 558–561, 2011.
View at: Publisher Site | Google Scholar
Z. Li, Q. Chen, Q. Xu, and K. Blanckaert, “Generalized likelihood uncertainty estimation method in uncertainty analysis of numerical Eutrophication models: take bloom as an example,” Mathematical Problems in Engineering, vol. 2013, Article ID 701923, 9 pages, 2013.
View at: Publisher Site | Google Scholar
A. Mita, Structural Dynamics for Health Monitoring, Sankeisha, Nagoya, Japan, 2003.
A. Gelman and D. B. Rubin, “Inference from iterative simulation using multiple sequence,” Statistical Science, vol. 7, no. 4, pp. 457–472, 1992.
View at: Publisher Site | Google Scholar

Copyright

Copyright © 2015 Jin Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

711

Downloads

255

Citations