Abstract

Recent earthquakes in Korea with magnitudes greater than 5.4, the strongest on instrumental record, have raised public interest in the safety and resilience of critical infrastructure. Many nuclear power plants are located near the epicenters of these earthquakes. Nuclear power plants are essential infrastructure that provides a stable and sufficient energy supply; at the same time, the systems that ensure their safety and security are critical because of the risk that nuclear accidents pose to public health and the environment. The nuclear industry uses probabilistic risk assessment to estimate the risk of structures, systems, and components in nuclear power plants, and the evaluation of the fragility curve is a key step in this assessment. The challenges of a seismic fragility analysis lie in estimating the influence of various uncertainties in material, geometry, and earthquake and in improving the existing fragility analysis methods, which require time-consuming nonlinear time history analysis. This paper therefore conducts a multivariate seismic fragility analysis using surrogate models for reinforced concrete squat shear walls and proposes a simplified closed-form equation for a chosen surrogate seismic demand model. The surrogate models are trained and validated with four approaches: response surface method, support vector machine, Gaussian process regression, and neural network. In addition, a correlation analysis is used to evaluate the relative importance of the variables to the seismic demand so that the surrogate model can be simplified further. Finally, simplified surrogate models based on the importance of the variables are proposed as closed-form polynomials, and the performance of these models in the fragility analysis is evaluated.

1. Introduction

The Great East Japan Earthquake severely damaged infrastructure such as telecommunication facilities, power plants, bridges, and buildings. On the day of the earthquake, 8.5 million households lost electricity, 1.5 million were still without power on March 13, and 0.3 million had no power supply even a week after the earthquake. Due to the lack of power, communication service outages were widespread [1]. Korea has also recently experienced its strongest earthquakes, with magnitudes greater than 5.4, in Gyeongju and Pohang; these were beyond-design earthquakes, as shown in Figure 1. They are the largest earthquakes since the beginning of the instrumental earthquake records in 1978. These earthquakes caused loss of life and property damage: the Gyeongju earthquake resulted in 111 victims, 23 injured people, and 5,368 damaged properties, and the Pohang earthquake resulted in 135 injured people and 57,000 damaged properties [2, 3]. In addition, Korea has the highest density of nuclear power plants (NPPs), and many NPPs in Korea are operated on the country's east coast near the epicenters of the earthquakes, as shown in Figure 1. In particular, the Kori NPP is located in Busan metropolitan city, which is the second largest city with a population of 3.5 million, and the Wolseong NPP in Gyeongju City is located within around 20 km of Ulsan metropolitan city with a population of over 1.1 million. As a result, Korean society pays close attention to the safety of NPPs. A probabilistic seismic risk assessment of the structures, components, and systems in an NPP is important for estimating their seismic fragility and the overall risk of the NPP. The overall risk is estimated by the convolution of hazard and fragility curves as the probability of occurrence of each consequence; therefore, an accurate evaluation of the risk is necessary to define and quantify the levels of structural safety and resilience.

Reinforced concrete (RC) squat shear walls are the most typical structural systems used in NPPs because they effectively resist lateral forces such as those caused by earthquakes and wind. The NPPs consist of various types of squat shear walls, as shown in Figure 1. The squat shear wall has a height-to-length ratio of less than two, resulting in shear-dominant lateral behaviors with small lateral deformation [4]. In the seismic risk assessment, a key process is calculating a fragility curve, defined as the conditional probability of a failure exceeding a specific capacity for a given ground motion intensity measure (IM). The fragility analysis must include all sources of uncertainties, such as ground motions and the geometry and material properties of structures. However, there is a lack of studies on the seismic fragility analysis of the squat shear wall. Many earlier studies [5–10] mainly focused on experiment and simulation works for investigating and predicting the nonlinear behavior of squat shear walls under monotonic and cyclic loading. Furthermore, while Syed and Gupta [11] conducted a fragility analysis for an RC squat shear wall with uncertainty in ground motions, uncertainties in the geometry of the shear wall were not included.

Cloud analysis (CA) and multiple stripe analysis (MSA) are the conventional methods used to develop seismic fragility curves in many studies [11–16]. Both analyses employ nonlinear time history analysis (NTHA) to collect the seismic demands. In the cloud analysis, a single-parameter demand model is estimated by linear regression between demands and unscaled ground motions in a lognormal space. In the MSA, ground motions scaled to the IM levels of interest are used, and the probability of a failure at a given IM level is calculated as the ratio of the number of samples in which the seismic demand obtained from NTHA exceeds the capacity to the total number of samples. However, recent fragility analysis studies have shown the need for a multiparameter demand model using surrogate models because the conventional single-parameter demand model has limitations [17–19]. In addition, new NTHAs must be run to reestimate a fragility curve whenever the input variables are changed or updated, and NTHA is a very time-consuming process. Several researchers have attempted to generate multiparameter surrogate models for a seismic demand model [16–19]. However, these studies have focused on highway bridges, and few studies have considered RC squat shear walls. The seismic demand characteristics of RC squat shear walls are quite different from those of highway bridge piers, decks, and bearings. Accordingly, a seismic demand model needs to be developed efficiently for the fragility of the RC squat shear wall. Also, the current multiparameter surrogate model studies for highway bridges have developed black-box surrogate models (i.e., implicit functions) with a sole focus on the model accuracy; thus, such models cannot be accessed easily by other engineers and researchers, which hinders their use for design and safety assessment in a comprehensible manner.

The purposes of this study are to (1) develop multiparameter surrogate models of a box-type RC squat shear wall for the ultimate shear forces, (2) determine a best-fitting surrogate model, (3) estimate the fragility curves of the box-type shear walls from the surrogate models and compare them with the ones obtained from the traditional approaches mentioned above, and (4) evaluate, through a correlation analysis, the relative importance of the variables to the seismic demand and suggest reduced parameter surrogate models. The surrogate models consider the geometric uncertainty of the shear walls as well as the material and ground motion uncertainties. They are developed using four different techniques: second-order response surface method (RSM), support vector machine (SVM), Gaussian process regression (GPR), and neural network (NN). As a result, the multiparameter surrogate models help to efficiently generate the fragility curve for the box-type RC shear wall and to update the seismic fragility curve when the input variables vary, without time-consuming simulations.

2. Finite Element Model of Box-Type Shear Wall for Surrogate Model and Fragility Analysis

For training surrogate models and conducting fragility analysis for the shear walls, developing a finite element model and running NTHAs are essential. This study uses a finite element model of box-type RC shear walls, which was developed and validated with the experimental data [20–24]. The validated finite element model is analyzed considering uncertainties, and the corresponding analytical results are used to develop surrogate models and conduct a fragility analysis in the following sections. An experiment on the box-type RC shear walls was conducted by the Nuclear Power Engineering Corporation (NUPEC) to assess the ultimate seismic capacity of the shear walls, as shown in Figure 2(a) [8]. The specimen consisted of an upper slab, a lower slab, and four rectangular shear walls with an aspect ratio of 0.67. The box-type shear walls were subjected to multiaxial cyclic loading, and vertical preloading was applied to the upper slab, resulting in a typical value of compressive stress at the bottom of shear walls for the lower story of a building in NPPs. The rectangular shear walls had a double layer of D6 rebar with a rebar ratio of 1.2% in both the horizontal and vertical directions. Figure 3 shows the dimensions of the box-type shear wall system.

The finite element model, consisting of a box wall, a loading slab, and a bottom slab, was developed using ABAQUS. An analysis of the shear wall under cyclic loading was performed with the ABAQUS implicit solver. The developed model is shown in Figure 2(b) [20–22]. The walls were modeled by a 4-node shell element with reduced integration (S4R) and a mesh size of 100 mm, assuming that the rebar is perfectly bonded with the concrete. In the experimental observations, there was no damage to the upper and lower slabs, so an elastic isotropic concrete model was applied to the slabs. The bottom slab is fixed at the ground, and multiaxial cyclic loading is applied to the loading slab. To represent the nonlinear behavior of the system, a concrete damage plasticity model proposed by Lubliner et al. [23] and Lee and Fenves [24] was applied to the rectangular shear walls. It has been used in many studies on the nonlinear behavior of concrete structures [20–22, 25–32]. In addition, the uniaxial concrete tension stiffening model and compression hardening model that were proposed by Maekawa and Okamura [33] and Izumo [34], respectively, were used in this study. Tables 1 and 2 summarize the material properties of the shear wall and parameters of the concrete damage plasticity model [20], where $\sigma_{b0}$ and $\sigma_{c0}$ are the biaxial compressive yield stress and the uniaxial compressive yield stress, $K_c$ is the ratio of the second invariant of the tensile meridian to that of the compressive meridian at initial yield for any given value of the first stress invariant, and $w_c$ and $w_t$ represent the compression and tension recovery factors.

Figure 4(a) shows that the experimentally obtained load-deformation backbone curve coincides well with the corresponding curve obtained from the finite element analysis, particularly at the ultimate (maximum) shear strength, which is the quantity of interest in this study. In addition, the coefficient of determination ($R^2$) between the experimental and analytical shear forces is 0.9959, as presented in Figure 4(b); therefore, this finite element model can be used to collect ultimate shear forces for developing surrogate demand models and for conducting a fragility analysis in the following sections.

3. Uncertainty in Material, Geometry, and Ground Motion

All sources of uncertainties must be considered in a fragility analysis. This study considers uncertainties in the material, the geometry of the box-type shear walls, and earthquakes. The uncertainties in the material properties and geometry are presented in Table 3. and are included as the material uncertainty. The material properties of concrete, such as , , and , are correlated in nature, thereby being considered by the following equations [35]:

The degree of uncertainties in steel-related strength parameters is much less compared to the corresponding strength parameters in concrete [11]; thus, in this study, for the uncertainty in the steel is only considered. The geometry uncertainties, such as the aspect ratio of the shear wall (), ratio of length to thickness (), vertical preloading (), and horizontal and vertical rebar ratio and , are considered, and the correlations between the parameters are not considered because of the inherent variability in structural geometry across different structural types and purposes. The geometric uncertainties were determined by research from the Multidisciplinary Center for Earthquake Engineering Research [36]. In the research, a database was assembled comprising data from experiments with 434 squat walls to improve the current state of knowledge on squat wall response and develop improved empirical equations for ultimate shear strength for shear walls [36]. The database for the 150 rectangular wall experiments is used in this study to consider uncertainties in the geometry of the rectangular shear walls. All input variables, including material and dimensional properties, would be randomly extracted based on the defined distribution in Table 3. Furthermore, 20 pairs of artificial ground motions that were generated in a previous study [37] are selected to account for the uncertainty in earthquakes. The artificial ground motions consist of two horizontal and a vertical earthquake time history, and they are compatible with a design response spectrum anchored to 0.3 g for NPPs based on Regulatory Guide 1.60 [38], as shown in Figure 5. The entire ground motions are scaled to 10 different peak ground acceleration (PGA): 0.2 g, 0.4 g, 0.6 g, 0.8 g, 1.0 g, 1.2 g, 1.4 g, 1.6 g, 1.8 g, and 2.0 g for generating surrogate models and estimating fragility curves. 
A total of 200 sets of random variables for the material and geometry of the box-type shear walls are generated by the Latin hypercube sampling technique, and the sets are randomly matched with the selected ground motions. As a result, 200 NTHAs are performed to obtain the ultimate shear forces of the shear walls for given input variables.
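
The sampling step can be illustrated with a minimal Python sketch of Latin hypercube sampling followed by random pairing with ground-motion sets. The variable ranges, the uniform marginals, the random seeds, and the helper name `latin_hypercube` are assumptions made only for illustration; the distributions actually used are those defined in Table 3.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Return an (n_samples, n_vars) array with one sample per stratum and variable."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)          # shape (n_vars, 2): [low, high]
    n_vars = bounds.shape[0]
    # One uniform draw inside each of n_samples equal-probability strata.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_vars))) / n_samples
    # Shuffle the strata independently for every variable.
    for j in range(n_vars):
        u[:, j] = rng.permutation(u[:, j])
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

# Hypothetical uniform ranges for three variables (NOT the values of Table 3).
bounds = [(0.5, 1.5), (10.0, 30.0), (0.005, 0.02)]
samples = latin_hypercube(200, bounds)

# Randomly match every sample set with one of 20 ground-motion sets.
gm_index = np.random.default_rng(1).integers(0, 20, size=200)
```

Each column of `samples` covers its range with exactly one point per stratum, which is the property that distinguishes Latin hypercube sampling from plain random sampling.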

4. Surrogate Models for Seismic Demand Model

Probabilistic risk assessments provide a realistic and subjective risk index of structures that describes losses as a function of hazard intensity and the probability of occurrence of each consequence. In the probabilistic risk assessment, the fragility curve is used to evaluate the seismic performance of structures and estimate the probability of a specific failure given earthquake intensity. In addition, the fragility curve can be used to estimate potential damage to structures during an earthquake. The conventional fragility analysis employs NTHA to collect the seismic demands. However, recent fragility analysis studies have shown the need for a multiparameter demand model using surrogate models because the existing method requires repetitive NTHAs to reestimate a fragility curve when the input parameters are changed or updated. In contrast, pretrained surrogate models substantially reduce the computational time that NTHAs require for reestimating the fragility curves and for identifying the impact of each parameter on the fragility curve. Thus, this study uses common machine learning algorithms, such as the response surface method, support vector machine, Gaussian process regression, and neural network, to develop surrogate models for the seismic demand of box-type shear walls.

4.1. Response Surface Method (RSM)

RSM consists of a group of mathematical and statistical techniques used to develop an adequate functional relationship between a response of interest and several input variables [39]. The relationship can be approximated with a first-order (linear) polynomial function or a second-order polynomial function, as follows [39]:

$$\hat{y} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i \tag{3}$$

$$\hat{y} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j \tag{4}$$

where $\hat{y}$, $x_i$, and $\beta$ represent the predicted response, the input variables, and the regression coefficients, and $k$ is the number of input variables. The coefficients are estimated using the least square method, which minimizes the gap between the response surface (polynomial function) and the seismic demands obtained from the NTHAs.
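
A second-order response surface of this kind can be fitted with ordinary least squares. The sketch below is only illustrative: the helper names (`quadratic_features`, `fit_rsm`, `predict_rsm`) and the noiseless synthetic data are assumptions, not the data or code of this study.

```python
import numpy as np
from itertools import combinations

def quadratic_features(X):
    """Design matrix with columns 1, x_i, x_i^2, and x_i * x_j (i < j)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Least-squares estimate of the regression coefficients."""
    beta, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return beta

def predict_rsm(beta, X):
    return quadratic_features(X) @ beta

# Synthetic two-variable example with known coefficients.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.3 * X[:, 0] * X[:, 1]
beta = fit_rsm(X, y)   # ordered as [1, x1, x2, x1^2, x2^2, x1*x2]
```

Because the fitted model is an explicit polynomial, the coefficient vector `beta` is all that is needed to reuse it, which is what makes a closed-form surrogate convenient.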

4.2. Support Vector Machine (SVM)

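SVM is a learning method to define a hyperplane for data classification and regression [18, 40–42]. In the regression case, the goal of the SVM is to define the hyperplane close to as many of the data points as possible [40]. The objective of the SVM is to choose a hyperplane with a small norm while simultaneously minimizing the sum of the distances from the data points outside of the $\varepsilon$-tube to the hyperplane, and the linear function and the objective function are written as follows [40]:

$$f(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + b, \qquad \min_{\mathbf{w},\, b,\, \xi,\, \xi^{*}} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \left(\xi_i + \xi_i^{*}\right) \tag{5}$$

subject to $y_i - f(\mathbf{x}_i) \le \varepsilon + \xi_i$, $f(\mathbf{x}_i) - y_i \le \varepsilon + \xi_i^{*}$, and $\xi_i, \xi_i^{*} \ge 0$, where $\mathbf{w}$, $\mathbf{x}$, and $b$ are the $n$-dimensional vector orthogonal to the hyperplane, the $n$-dimensional input vector, and a scalar. $C$ is the positive penalty parameter, and the slack variables ($\xi_i$ and $\xi_i^{*}$) represent the deviation of data points from the $\varepsilon$-tube. In the SVM, the error of the data points within the $\varepsilon$-tube is ignored. Figure 6 shows a schematic of the support vector regression model.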

When it is difficult to define a linear hyperplane in the low-dimensional space, a kernel function is used; it implicitly transforms the input variables into a high-dimensional space so that a linear hyperplane can be defined in that space. The kernel function used for multiple input variables is the Gaussian basis function with standard deviation ($\sigma$), as follows [40]:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}}{2\sigma^{2}}\right) \tag{6}$$
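
Two ingredients of support vector regression, the Gaussian kernel and the error measure that ignores points inside the tube, can be written in a few lines. This is a sketch of those two pieces only, not a full SVM solver; the function names and the numerical values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """K[i, j] = exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the eps-tube, linear outside (the slack of each data point)."""
    diff = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, diff - eps)

K = gaussian_kernel([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0]], sigma=1.0)
loss = eps_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.3, 0.6], eps=0.1)
```

The first prediction lies inside the tube and contributes no loss, while the other two are penalized only by the amount by which they leave the tube.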

4.3. Gaussian Process Regression (GPR)

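GPR is a nonparametric Bayesian regression combined with the properties of Gaussian processes [43–45]. Its goal is to obtain the distribution of predicted responses at given input variables, in contrast to a conventional regression, which fits a single function to the actual responses. In GPR, it is assumed that a function $f(\mathbf{x})$ is distributed as a Gaussian process, which is a probability distribution over functions that fit the data, as follows [43]:

$$f(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\right) \tag{7}$$

where $m(\mathbf{x})$ is the mean and $k(\mathbf{x}, \mathbf{x}')$ represents the covariance function between each pair in $\mathbf{x}$ (input variables). In this study, the squared exponential kernel function is used, as written in the following [43]:

$$k(\mathbf{x}, \mathbf{x}') = \sigma_f^{2} \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}}{2 l^{2}}\right) \tag{8}$$

where $\sigma_f$ and $l$ are hyperparameters. The kernel function represents the similarity of data, which means that similar input variables produce similar outputs. The multivariate Gaussian distribution of $\mathbf{f}$ and $\mathbf{f}_*$ is expressed as follows [43]:

$$\begin{bmatrix} \mathbf{f} \\ \mathbf{f}_* \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} m(X) \\ m(X_*) \end{bmatrix},\; \begin{bmatrix} K(X, X) & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix} \right) \tag{9}$$

where $\mathbf{f}$ and $\mathbf{f}_*$ are output functions obtained from observed and expected data, evaluated at the observed inputs $X$ and the new inputs $X_*$. The conditional distribution of $\mathbf{f}_*$ given $\mathbf{f}$ represents a posterior distribution given new data and is written as follows [43]:

$$\mathbf{f}_* \mid X_*, X, \mathbf{f} \sim \mathcal{N}\left( m(X_*) + K(X_*, X) K(X, X)^{-1} \left(\mathbf{f} - m(X)\right),\; K(X_*, X_*) - K(X_*, X) K(X, X)^{-1} K(X, X_*) \right) \tag{10}$$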
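
The posterior of a Gaussian process with a squared exponential kernel can be computed directly with linear algebra. The following sketch assumes a zero prior mean, a small jitter term `noise` for numerical stability, fixed (not optimized) hyperparameters, and toy one-dimensional data; all names are illustrative.

```python
import numpy as np

def sq_exp_kernel(A, B, sigma_f=1.0, length=1.0):
    """Squared exponential covariance between the rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return sigma_f ** 2 * np.exp(-sq_dist / (2.0 * length ** 2))

def gpr_posterior(X, f, X_new, noise=1e-8, **kern):
    """Posterior mean and covariance at X_new for a zero-mean GP (assumption)."""
    K = sq_exp_kernel(X, X, **kern) + noise * np.eye(len(X))
    K_s = sq_exp_kernel(X_new, X, **kern)
    K_ss = sq_exp_kernel(X_new, X_new, **kern)
    L = np.linalg.cholesky(K)                        # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f))
    v = np.linalg.solve(L, K_s.T)
    return K_s @ alpha, K_ss - v.T @ v

# Toy data: three observations of sin(x).
X = np.array([[0.0], [1.0], [2.0]])
f = np.sin(X).ravel()
mean, cov = gpr_posterior(X, f, X)                   # predict at the training inputs
```

Predicting at the training inputs returns the observed values with nearly zero variance, which is a quick sanity check; it also hints at why an interpolating GPR can overfit noisy training data unless a noise term is included.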

4.4. Neural Networks (NN)

NN is one of the widely used machine learning techniques inspired by the structure of the human brain [41, 44, 46, 47]. The NN consists of an input layer, an output layer, and one or more hidden layers between the input and output layers, as shown in Figure 7. The layers are connected to each other, and each has neurons (nodes) that are assigned numerical values. Nodes in the input layer collect input data and transmit the data multiplied by the corresponding weight value to the hidden layer. The weighted sum of the data from the input layer is stored at each node in the hidden layer, and an activation function in the hidden layers determines the value to be transmitted to a subsequent hidden layer or the output layer. The activation function plays a significant role in defining nonlinear relationships between the nodes. Finally, the output layer produces the seismic demand for given input variables. Commonly, the backpropagation process is applied for training the model to minimize the error between actual and predicted outputs by modifying the weight and bias values. In this process, the model is retrained based on the error in the output. If the error does not satisfy a predefined accuracy threshold, the process returns to the input layer to adjust the weights and biases, reestimates the output, and repeats until the desired accuracy is achieved.
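
The forward pass and the backpropagation update described above can be sketched for a network with a single hidden layer. The layer size, the tanh activation, the learning rate, the number of epochs, and the synthetic data are all assumptions for illustration and do not reflect the architecture used in this study.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 3))             # synthetic inputs
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2])[:, None]      # synthetic target

# One hidden layer with 8 nodes (hypothetical size).
W1, b1 = rng.normal(0.0, 0.5, (3, 8)), np.zeros(8)
W2, b2 = rng.normal(0.0, 0.5, (8, 1)), np.zeros(1)
lr, losses = 0.1, []

for epoch in range(500):
    # Forward pass: weighted sums followed by the activation function.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y
    losses.append(float(np.mean(err ** 2)))
    # Backward pass: propagate the output error to every weight and bias.
    g_pred = 2.0 * err / len(X)
    g_W2, g_b2 = h.T @ g_pred, g_pred.sum(axis=0)
    g_h = (g_pred @ W2.T) * (1.0 - h ** 2)            # derivative of tanh
    g_W1, g_b1 = X.T @ g_h, g_h.sum(axis=0)
    # Gradient-descent update of the weights and biases.
    W1 -= lr * g_W1
    b1 -= lr * g_b1
    W2 -= lr * g_W2
    b2 -= lr * g_b2
```

Here the loop runs for a fixed number of epochs; stopping once `losses[-1]` falls below a chosen threshold would mirror the accuracy criterion mentioned in the text.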

5. Fragility Analysis of Box-Type RC Squat Shear Wall

5.1. Evaluation for the Accuracy of Surrogate Models

Surrogate models using various techniques, such as RSM, SVM, GPR, and NN, are generated for predicting the ultimate shear forces of box-type shear walls with different input data. The accuracy of the generated surrogate models is evaluated, and a best-fitting model is identified. To avoid overfitting in the surrogate models, the input data of the shear walls are randomly divided into a training set and a test set. The ratio of the test data to the training data is 0.333 in the current study. The training set is used to generate surrogate models, and the test set is used to evaluate the performance of the surrogate models. The accuracy of the surrogate models is evaluated with the following performance indices: coefficient of determination ($R^2$), root mean square error (RMSE), mean absolute error (MAE), and variance accounted for (VAF). The predicted and actual ultimate shear forces are compared in Figures 8 and 9. Table 4 presents the performance indices for the surrogate seismic demand models. In terms of $R^2$, the surrogate models show the highest prediction of 88% and the lowest prediction of 74%. Results from the GPR- and NN-based models show overfitting, whereas results from the RSM- and SVM-based models present slight underfitting. To select the best-fitting model, the average values of the training and test sets for each performance index are calculated, as presented in Table 4; as a result, the RSM-based surrogate model is found to produce the best performance.
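
For reference, the four performance indices can be computed as in the sketch below. The formulas are the commonly used definitions and are assumed here, since the paper does not state them explicitly; the numbers are made up.

```python
import numpy as np

def performance_indices(y_true, y_pred):
    """R2, RMSE, MAE, and VAF (in percent) using common definitions (assumed)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    return {
        "R2": 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2),
        "RMSE": np.sqrt(np.mean(resid ** 2)),
        "MAE": np.mean(np.abs(resid)),
        "VAF": (1.0 - np.var(resid) / np.var(y_true)) * 100.0,
    }

# Made-up actual and predicted values.
idx = performance_indices([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

Evaluating the same function on the training set and on the test set, and comparing the two results, is what reveals overfitting (training much better than test) or underfitting.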

5.2. Fragility Curve Using Traditional Approach
5.2.1. Cloud Analysis (CA)

A seismic fragility curve represents the conditional probability of a failure, that is, the probability that a seismic demand exceeds a specific capacity for a given IM, and it is generally estimated in conjunction with the NTHA of a finite element model. The IM may be the spectral acceleration or the peak ground acceleration (PGA). In this study, PGA is used as the IM for consistency with previous studies [11, 21, 22] on the fragility analysis of shear walls. The fragility function is often defined as a lognormal cumulative distribution function of the IM. Cornell et al. [48] suggested the conditional probability of a failure as follows:

$$P[D \ge C \mid IM] = \Phi\left(\frac{\ln\left(S_d / S_c\right)}{\sqrt{\beta_{d|IM}^{2} + \beta_c^{2}}}\right) \tag{11}$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function, $S_d$ and $S_c$ are the medians of the demand and capacity, $\beta_{d|IM}$ is the dispersion of the demand given the IM, and $\beta_c$ is the dispersion of the capacity. $S_d$ follows a linear regression in the logarithmic space, and it can be expressed as follows:

$$\ln S_d = \ln a + b \ln(IM) \tag{12}$$

where $a$ and $b$ are regression coefficients obtained from NTHA. $\beta_{d|IM}$ is calculated as follows:

$$\beta_{d|IM} = \sqrt{\frac{\sum_{i=1}^{N}\left[\ln d_i - \ln\left(a\, IM^{b}\right)\right]^{2}}{N - 2}} \tag{13}$$

where $N$ is the number of simulations and $d_i$ is the seismic demand obtained from the $i$th NTHA at a given IM. Figure 10 shows the seismic demands plotted against the IM, as obtained from the NTHAs of the FE models of the shear walls. It includes the assumed linear regression for the median of seismic demand, and the linear regression model shows a low coefficient of determination ($R^2$) of about 40%. The capacities of the 200 shear walls are calculated based on the ACI 349 design code [49] to estimate the fragility curve of the shear walls. Finally, the fragility curve through the CA is estimated in conjunction with Equations (11)–(13), and Figure 11 presents the results.
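
The cloud analysis procedure can be sketched as a log-log regression followed by the evaluation of the lognormal fragility function. The demand data, the median capacity, and the capacity dispersion used below are synthetic placeholders, and the function name `cloud_fragility` is hypothetical.

```python
import math
import numpy as np

def cloud_fragility(im, demand, capacity_median, beta_c):
    """Fit ln(S_d) = ln(a) + b ln(IM) and return the failure probability function."""
    ln_im, ln_d = np.log(im), np.log(demand)
    b, ln_a = np.polyfit(ln_im, ln_d, 1)             # slope, intercept
    resid = ln_d - (ln_a + b * ln_im)
    beta_d = math.sqrt(np.sum(resid ** 2) / (len(im) - 2))

    def p_fail(x):
        z = (ln_a + b * math.log(x) - math.log(capacity_median)) \
            / math.sqrt(beta_d ** 2 + beta_c ** 2)
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF

    return p_fail, (math.exp(ln_a), b, beta_d)

# Synthetic cloud of 200 (PGA, demand) pairs; all values are hypothetical.
rng = np.random.default_rng(0)
im = rng.uniform(0.2, 2.0, 200)
demand = 500.0 * im ** 0.9 * np.exp(rng.normal(0.0, 0.3, 200))
p_fail, (a, b, beta_d) = cloud_fragility(im, demand, capacity_median=600.0, beta_c=0.2)
```

The whole curve depends on a single regression line, so a poor fit in the logarithmic space propagates directly into the fragility estimate.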

5.2.2. Multiple Stripe Analysis (MSA)

In the traditional MSA, the probability of a failure at a specific IM is estimated from the NTHA results and is expressed as the ratio of the number of failures to the total number of simulations; in this study, the probability of a failure for the box-type shear walls is estimated at 10 different PGAs. However, the fragility curve cannot be estimated directly from these discrete data, and the maximum likelihood estimation (MLE) [50] is therefore used to estimate the fragility function. Figure 12 presents the fragility curve fitted by MLE compared with the MSA results at the different PGAs, and the conditional probability of a failure is written as follows:

$$P(\text{failure} \mid IM = x) = \Phi\left(\frac{\ln\left(x / \theta\right)}{\beta}\right) \tag{14}$$

where $\theta$ is the median of IM and $\beta$ is the standard deviation of $\ln IM$.
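
The MLE fit of the lognormal fragility function to stripe data can be sketched as follows. The sketch assumes the binomial likelihood commonly used for multiple stripe data and replaces a numerical optimizer with a simple grid search; the failure counts, the grid ranges, and the function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_fragility_mle(im_levels, n_fail, n_total):
    """Grid-search MLE of the median (theta) and dispersion (beta)."""
    im_levels = np.asarray(im_levels, dtype=float)
    n_fail = np.asarray(n_fail, dtype=float)
    best = (-np.inf, None, None)
    for theta in np.linspace(0.1, 3.0, 146):          # candidate medians (g)
        for beta in np.linspace(0.05, 1.5, 73):       # candidate dispersions
            p = norm_cdf(np.log(im_levels / theta) / beta)
            p = np.clip(p, 1e-12, 1.0 - 1e-12)        # avoid log(0)
            log_like = np.sum(n_fail * np.log(p) + (n_total - n_fail) * np.log(1.0 - p))
            if log_like > best[0]:
                best = (log_like, theta, beta)
    return best[1], best[2]

pga = np.arange(0.2, 2.01, 0.2)                       # 10 stripes from 0.2 g to 2.0 g
# Hypothetical numbers of failures out of 20 analyses per stripe.
n_fail = np.array([0, 1, 3, 7, 10, 13, 16, 18, 19, 20])
theta, beta = fit_fragility_mle(pga, n_fail, n_total=20)
```

The same fit applies unchanged when the failure counts come from surrogate-model predictions instead of NTHAs, which is why the surrogate-based and MSA fragility curves can be compared parameter by parameter.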

5.3. Fragility Curve Using Surrogate Models

The fragility curves of the box-type shear walls using the generated surrogate models are estimated in this section. To estimate the fragility curves, the seismic demands at each PGA level obtained from the surrogate models are compared with the capacity of each shear wall model. The probability of a failure at each specific PGA is calculated, and the fragility curve is estimated by MLE. Figures 13 and 14 present a comparison of the fragility curves between the traditional approaches and each surrogate model, and Table 5 summarizes the estimates of the fragility curves.

The fragility curve obtained with the CA gives the most overestimated fragility values, and its estimates differ from those of the MSA by more than 50%. This shows that the assumption of the CA that the median of the demands follows a linear regression in the logarithmic space leads to unrealistic and overestimated results. Thus, the CA is not appropriate for a fragility analysis that considers uncertainties in extensive input variables such as material, geometry, and earthquake. On the other hand, the RSM-based surrogate model produces the fragility curve most similar to that of the MSA. Also, the GPR and NN show errors within 10% in the probability of a failure at each PGA. However, the SVM-based surrogate model results in greatly underestimated fragility curves, and the median value is less than 25% compared with the MSA. The dispersion of the fragility curves generated by the surrogate models tends to be slightly larger than that generated by the MSA; however, the overall trend of the fragility curves is quite similar to the MSA result, as shown in Figure 14. As a result, the RSM-based surrogate model is the best-fitted seismic demand model for box-type shear walls.

6. Reduced Multiparameter Surrogate Model Using RSM

While the RSM-based surrogate model is generated using the eight variables in the previous section, the model is simplified through a correlation analysis, which measures the correlation or dependence between variables, as presented in this section. Figure 15 shows the correlation between the variables. It is found that while the seismic demand () is rarely affected by , , and , five other variables (, , , , and ) have quite an effect on it. Moreover, three of the five variables (, , and ) have the greatest influences on the seismic demand of the shear walls. To simplify the generated surrogate model, two cases are considered in this study: in case 1, the , , and are eliminated, and only five variables are used as the input parameters. Case 2 is that only the three most important variables are used as the input. As a result, the two surrogate models are trained by RSM with different input variables (case 1 and case 2) and the accuracy of the surrogate models is presented in Figures 16 and 17. The RSM-based surrogate model with five parameters shows around 80% accuracy and has little difference from the original eight-parameter model; however, the three-parameter model is less accurate than the other model, at around 70%. In terms of the fragility curve, the results estimated by the reduced parameter surrogate models are shown in Figure 18 and compared with the original RSM-based model and MSA. The median of a fragility curve from the five-parameter model represents the closest value of MSA and RSM original, and the three-parameter model provides an overestimated median. Nevertheless, the three-parameter model still provides a better result than the CA. The estimated parameters of lognormal fragility curves are summarized in Table 6.
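
A variable-screening step of this kind can be sketched by ranking the inputs by the magnitude of their correlation with the demand. The use of the Pearson coefficient is an assumption (the type of correlation is not stated here), and the variable names and synthetic data are placeholders.

```python
import numpy as np

def rank_by_correlation(X, y, names):
    """Sort variables by |Pearson correlation| with the response, largest first."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-np.abs(r))
    return [(names[j], float(r[j])) for j in order]

# Synthetic example: only the first and third variables drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
ranking = rank_by_correlation(X, y, ["x1", "x2", "x3", "x4"])
```

Keeping only the top-ranked variables and refitting the polynomial surrogate yields a reduced model; a linear correlation measure can miss purely nonlinear effects, so the reduced model should still be checked against the full one.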

Although there is a slight difference in a fragility curve between full parameters and reduced parameters, the advantage of the RSM-based surrogate model with reduced parameters is that the model could be derived as a simple form due to the assumption of the polynomial regression. As a result, it is easily used with little understanding of surrogate models. Therefore, the seismic demand and fragility curves can be conveniently estimated based on the closed forms originating from Equations (3) and (4). The five-parameter and three-parameter models are written as follows:

7. Conclusion

While existing fragility methods require repetitive NTHAs to reestimate the fragility curves of structures, surrogate model-based fragility methods are expected to substantially reduce the computational time and to make reestimating the fragility curve convenient. This study generates multiparameter surrogate models of a box-type RC squat shear wall for a fragility analysis. The summary and conclusion of this study are the following:
(1) By comparing four performance indices, the RSM-based surrogate model is selected as the best-fitting demand model of the box-type shear walls, with an $R^2$ of 87%.
(2) The RSM-based surrogate model produces the fragility curve most similar to that of the MSA. In addition, the GPR- and NN-based models lead to errors within 10% in terms of the probability of a failure at each PGA. On the other hand, the SVM-based surrogate model results in considerably underestimated fragility curves, and the median value is less than 25% compared with the MSA.
(3) Based on a correlation analysis, simplified surrogate models are expressed as closed forms of polynomials. The simplified surrogate model with five parameters yields a fragility curve similar to the results of the MSA and the RSM-based surrogate model. However, the simplified surrogate model with three parameters produces an overestimated fragility curve.
(4) The suggested surrogate model and the reduced parameter closed forms help to reestimate and update the seismic shear force and fragility of the box-type RC squat shear walls without the time-consuming NTHAs when the input variables of the shear walls change (e.g., degradation of material properties and discrepancy in the variables between design and construction stages).

While surrogate models with multiple parameters for estimating the seismic demand and fragility are generated in this study, the characteristics of the earthquakes are defined only by PGA. Further studies are needed to consider other characteristics of the earthquakes (e.g., spectral acceleration at a period of 1.0 s and at the fundamental frequency of structures and the duration of strong ground motion). In addition, the accuracy of the surrogate models is around 87%; therefore, efforts to improve the performance of the surrogate models are needed in further studies.

Data Availability

Data are available on request.

Additional Points

Highlights. (i) This study presents surrogate models for a shear wall seismic demand. (ii) The seismic demand data for the shear wall are collected based on an experimentally validated finite element model. (iii) The surrogate models are developed using several machine learning methods. (iv) The reduced parameter surrogate model is further proposed. (v) The efficient and accurate closed form equation is finally devised.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A2C1010278 and No. RS-2022-00144328).