Special Issue: Mathematical Foundation of Probabilistic Preference Theory and Applications in Engineering
Research Article | Open Access
Xudong Wang, Jitao Yao, "Linear Regression Estimation Methods for Inferring Standard Values of Snow Load in Small Sample Situations", Mathematical Problems in Engineering, vol. 2020, Article ID 3753417, 10 pages, 2020. https://doi.org/10.1155/2020/3753417
Linear Regression Estimation Methods for Inferring Standard Values of Snow Load in Small Sample Situations
The aim of this paper is to establish a new method for inferring standard values of snow load in small sample situations. Because meteorological data are incomplete in some areas, engineers often have to infer the standard values of snow load from small samples. The point estimation methods of classical statistics adopted so far do not take the influence of statistical uncertainty into account, so their results tend to be on the aggressive (unsafe) side. To overcome this shortcoming, linear regression estimation methods for inferring standard values of snow load in small sample situations are proposed. They are built with the least squares method on the basic principles of the best linear unbiased estimation and invariant estimation of the parameters and quantiles of the type I minimum distribution. Two cases are covered, no parameter information and a known coefficient of variation, and an inference formula for the standard value is given for each. Numerical tables of the related coefficients are established through numerical integration and Monte Carlo simulation, which makes the inference formulas convenient to apply directly. Theoretical analysis and examples show that the classical point estimation methods always give results on the small side when the sample is small, whereas the linear regression estimation method is suitable for small samples and gives more reasonable results.
When the linear regression estimation is used in practice and the coefficient of variation is unknown, an upper limit of the coefficient of variation can still be set from experience. The estimation is then carried out for both cases, no parameter information and known coefficient of variation, and the smaller of the two values is taken as the final estimate. The method can be extended to the statistical inference of the standard values of other variable loads, such as wind load and floor load.
Snow load is one of the main loads on buildings, and the inference of its standard value is the basis for establishing structural design and evaluation methods. At present, the standard value of snow load is generally inferred by fitting the maximum type I distribution to the annual maximum snow pressure; the standard value is then obtained for a given guarantee rate from the relation between the representative value and the estimated distribution parameters. The distribution parameters are estimated by the moment method or the maximum likelihood method [2–13]. These are point estimation methods that are mainly suitable for large samples and require sufficient meteorological data; it is generally believed that at least 30 years of data are required [14, 15]. In engineering practice, however, the standard value of snow load must sometimes be inferred when the data are insufficient and the actual sample size is very limited, which significantly reduces the accuracy of parameter estimation. If the current point estimation methods are still used in such small sample cases, the inferred value is often too small because the influence of statistical uncertainty is ignored. A more reasonable choice is to use a small sample inference method [16–19].
For dead loads, the standard for appraisal of reliability of civil buildings proposes a small sample method for inferring the standard value; however, snow load usually obeys the maximum type I distribution, whose form differs from the probability distribution of dead load, so the same method cannot be used for snow load. A method for estimating the parameters and quantiles of the maximum type I distribution has been proposed in an earlier paper, which lays a theoretical foundation for the small sample inference of the standard values of snow load without parameter information, so the corresponding inference formula can be established. In some cases, however, the coefficient of variation of the snow load distribution is known, or its upper limit can be set based on experience. Using this additional information significantly reduces the statistical uncertainty in the estimation process, and more favorable inference results can be obtained under the same conditions. It is therefore necessary to study how to infer the standard values of snow load from small samples when the coefficient of variation is known, to provide a better choice for the small sample inference of the standard values of snow load.
Based on the above analysis, this paper first establishes the probability distribution model of snow load. The standard value of snow load is usually expressed as a quantile of the distribution, and the type I maximum and type I minimum distributions belong to the same extreme value family and can be converted into each other. Therefore, using the least squares method and building on the existing best linear unbiased estimation and invariant estimation principles for the parameters of the type I minimum distribution, a linear regression estimation method for the standard value of snow load under small sample conditions is proposed [22–31].
The inference formula for the standard value of snow load is given for both cases, no parameter information and a known coefficient of variation. Through numerical integration and Monte Carlo numerical simulation, numerical tables of the related coefficients are established to facilitate the application of the inference formula, and conclusions and suggestions are given by comparing the results with those of the traditional large sample methods. The method can be extended to the statistical inference of the standard values of other variable loads, such as wind load and floor load.
2. Probability Distribution Model of Snow Load
A probability distribution model of snow load is established by using a stationary binomial random process.
It is assumed that the design reference period of the building structure is T and that it is divided into equal time segments. The model is described by the average time for the snow load to change once (the segment length), the probability that the load acts in each segment, and the probability distribution function of the load in each segment; the random variables in different time segments are independent of each other. Establishing the snow load model requires identifying these three key parameters.
The Unified Standard for Reliability Design of Building Structures uses a limit state design method based on probability theory, that is, a first-order second-moment method that considers the probability distribution type of the basic variables. Since the basic variables are treated as random variables, the random process of the load must be converted into a random variable. Replacing the random process with the random variable at an arbitrary time point would be unsafe, so the maximum load appearing in the design reference period is used for statistical analysis instead. The basic steps to convert the random process into the maximum load within T years are as follows:
Step 1: establish the probability distribution function of the load in any time segment from the load distribution function at an arbitrary time point.
Step 2: establish the probability distribution function of the maximum load in the design reference period. Let m denote the average number of times the snow load occurs in T years. When the load acts in every segment, the result follows directly; otherwise, an approximate relationship is used, which holds when the relevant term in (2) is sufficiently small. In both cases, the probability distribution function of the maximum load in the design reference period equals the m-th power of the probability distribution function of the load at an arbitrary time point, and the quantities involved can be obtained from statistical data.
Step 3: fit the probability distribution of snow load with the maximum type I distribution.
The distribution function and the probability density function of the maximum type I distribution are each defined by two parameters, the scale parameter and the location parameter of the distribution.
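As a hedged illustration of the model above, the following sketch assumes the common location-scale form of the maximum type I (Gumbel) distribution, F(x) = exp{-exp[-(x - loc)/scale]}. The parameterization and the names `loc`, `scale`, `p_occ`, and `r` are assumptions made for this example and are not taken from the source. The sketch also compares the binomial-process expression for the period maximum with the m-th power rule, where m = `p_occ` × `r`.

```python
import math


def gumbel_max_cdf(x, loc, scale):
    """CDF of the maximum type I distribution (assumed location-scale form)."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_max_pdf(x, loc, scale):
    """Probability density corresponding to gumbel_max_cdf."""
    z = (x - loc) / scale
    return math.exp(-z - math.exp(-z)) / scale


def gumbel_max_quantile(p, loc, scale):
    """Value that is not exceeded with probability p (0 < p < 1)."""
    return loc - scale * math.log(-math.log(p))


def period_max_cdf_binomial(f_any_time, p_occ, r):
    """CDF of the maximum over r independent segments, each loaded with
    probability p_occ; f_any_time is the any-time CDF value at the level
    of interest (hypothetical names)."""
    return (1.0 - p_occ * (1.0 - f_any_time)) ** r


def period_max_cdf_power(f_any_time, m):
    """m-th power rule for the CDF of the maximum in the reference period."""
    return f_any_time ** m
```

For this family, raising the distribution function to the m-th power leaves the scale unchanged and shifts the location by scale·ln m, which is one reason the type I form is convenient when moving from the any-time load to the maximum in the design reference period.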
3. Linear Regression Estimation Method for Standard Values of Snow Load
3.1. In the Condition of Unknown Parameter Information
The value of the snow load at any time point obeys the maximum type I distribution with two distribution parameters. The standard value of snow load is usually expressed as a lower quantile of this random variable: the probability that the snow load does not exceed the standard value equals the guarantee rate of the standard value.
The samples of the random variable are arranged in ascending order, together with their test values.
A maximum type I variable can be converted into a variable that obeys the minimum type I distribution with two parameters, so the order statistics and the upper quantile of the converted variable correspond to those of the original variable.
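The conversion between the two type I distributions can be sketched as follows. This is a minimal illustration that assumes the conversion is a sign change (Z = -X) and uses the location-scale parameterization; both choices, and all function names, are assumptions for this example rather than the paper's notation.

```python
import math


def gumbel_max_quantile(p, loc, scale):
    """Lower p-quantile of a maximum type I variable X."""
    return loc - scale * math.log(-math.log(p))


def gumbel_min_cdf(z, loc, scale):
    """CDF of a minimum type I variable Z."""
    return 1.0 - math.exp(-math.exp((z - loc) / scale))


def gumbel_min_quantile(q, loc, scale):
    """Lower q-quantile of a minimum type I variable Z."""
    return loc + scale * math.log(-math.log(1.0 - q))


def to_min_sample(max_sample):
    """Negate the sample and sort it again, so that the converted sample
    is in ascending order like the original one."""
    return sorted(-x for x in max_sample)
```

Under this assumption, if X has location `loc` and scale `scale`, then Z = -X has location `-loc` and the same scale, and the lower p-quantile of X is the negative of the value that Z exceeds with probability p, that is, `-gumbel_min_quantile(1 - p, -loc, scale)`.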
To obtain the characteristic value of a variable action, the upper limit estimate is usually taken as the inference result. When the load effect is favorable to the structure, the lower limit estimate should be taken instead, but this case is rarely found in the most unfavorable combinations of action effects. Because the quantiles of the two distributions correspond to each other, the upper and lower limit estimates of the quantile of the converted variable can be used to infer the upper and lower limit estimates of the characteristic value.
Thus, the upper and lower limit estimates of the characteristic value of the variable action are obtained, with coefficients that are determined from numerical tables. The existing numerical tables, however, give values only for certain tabulated cases, so the coefficients needed here usually cannot be read from them directly. Therefore, this paper proposes a new approximate method that uses the existing numerical tables to infer the required coefficients.
A second guarantee rate, the one for which the existing numerical tables are available, is introduced together with the corresponding lower quantile of the random variable. Using the correspondence between the two guarantee rates and their quantiles, the required coefficients are approximated from the tabulated ones.
In practical applications, the tabulated guarantee rate closest to the required one should be selected. Calculation shows that, within the ranges of the parameters considered, the errors of the two coefficients are −0.017 to 0.017 and −0.031 to 0.036, respectively. This method is more convenient and more accurate than interpolation.
3.2. In the Condition of Known Coefficient of Variation
3.2.1. Linear Unbiased Estimation of Distribution Parameters
It is assumed that the coefficient of variation of the snow load at any time point is known. The distribution parameters can then be expressed through the mean, the standard deviation, and the coefficient of variation of the snow load, together with the Euler constant. The samples are arranged in ascending order, which defines the order statistics; the probability density function of each order statistic and the joint probability density function of each pair of order statistics follow from the parent distribution.
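The relations between the moments and the distribution parameters can be sketched as follows. The sketch assumes the location-scale form of the maximum type I distribution, for which the standard deviation equals π·scale/√6 and the mean equals loc + γ·scale, with γ the Euler constant. The function names and the input values in the usage note are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant


def gumbel_params_from_mean_cov(mean, cov):
    """Location and scale of a maximum type I distribution with the given
    mean and coefficient of variation (assumed location-scale form)."""
    std = cov * mean
    scale = math.sqrt(6.0) * std / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def numeric_mean_std(loc, scale, steps=200_000):
    """Check the relations by midpoint integration of the quantile function
    over (0, 1); this is an independent numerical check, not a fit."""
    h = 1.0 / steps
    s1 = 0.0
    s2 = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        x = loc - scale * math.log(-math.log(u))
        s1 += x
        s2 += x * x
    mean = s1 * h
    return mean, math.sqrt(s2 * h - mean * mean)
```

For example, `gumbel_params_from_mean_cov(0.35, 0.6)` returns the parameters for a hypothetical mean of 0.35 and coefficient of variation of 0.6, and `numeric_mean_std` applied to the result should reproduce a mean close to 0.35 and a standard deviation close to 0.21. When the coefficient of variation is known, the mean is the only quantity left to estimate, which is what reduces the statistical uncertainty.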
The standardized variable then obeys the standard maximum type I distribution, and its order statistics, their probability density functions, and the joint probability density functions of pairs of order statistics are defined in the same way.
The means and variances of the standardized order statistics and the covariances between them are obtained from these density functions, and the mean of each order statistic of the snow load follows from them.
The covariance matrix of the order statistics and its inverse are then formed.
According to the least squares method of parameter estimation, a weighted sum of squares is formed. When the coefficient of variation is known, only one parameter remains unknown, and its least squares estimate is a linear combination of the order statistics.
The mean of this estimate equals the parameter itself, that is, the estimate is a linear unbiased estimator. Substituting the test values of the order statistics gives the linear unbiased estimate.
Compared with the corresponding coefficient in the existing linear unbiased estimate, the coefficient here accounts for the influence of the coefficient of variation. Since analytical expressions of equations (28)–(30) are difficult to obtain, the means, variances, and covariances of the order statistics must be determined by numerical integration, after which the coefficient can be determined from (36). The values for n = 10 are listed in Table 1; the others are omitted.
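The numerical integration step can be illustrated for the simplest of these quantities, the means of the order statistics of the standard maximum type I distribution. The sketch below is an assumption-laden illustration, not the authors' procedure: it uses the standard identity that the i-th order statistic of n uniform variables has a Beta(i, n - i + 1) density, and integrates the quantile function against that density with a midpoint rule. Variances and covariances would need analogous single and double integrals, which are not shown.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant


def expected_order_statistic(i, n, steps=100_000):
    """Mean of the i-th smallest of n standard maximum type I variables,
    computed as the integral over (0, 1) of Q(u) times the Beta(i, n-i+1)
    density, where Q(u) = -ln(-ln u) is the standard quantile function."""
    coef = i * math.comb(n, i)  # equals n! / ((i-1)! (n-i)!)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        q = -math.log(-math.log(u))
        total += q * u ** (i - 1) * (1.0 - u) ** (n - i)
    return coef * total * h
```

Two known results give a quick check: the means of all n order statistics sum to n·γ, and the mean of the largest one equals γ + ln n, because the maximum of n standard type I variables is again type I with its location shifted by ln n.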
3.2.2. Interval Estimation of the Quantile
For the standard values of snow load, a relatively large quantile is usually selected under the condition of a small sample, and the upper limit in the interval estimate should be used as its estimate.
A statistic is constructed whose distribution is independent of the distribution parameters. Based on the probability distribution of this statistic, an upper limit is obtained in the interval estimate from a lower quantile of the statistic at the chosen confidence degree, which takes a relatively large value. The standard value estimated from this upper limit then follows.
It is almost impossible to determine this quantile analytically, so Monte Carlo numerical simulation is used. In each simulation, random numbers that obey the standard maximum type I distribution are generated first; sorting them gives a set of sample values of the order statistics, and the sample value of the statistic is obtained from equation (41). When the number of simulations is sufficient, any quantile can be obtained by statistics. Here, the number of simulations is 50,000. Table 2 lists part of the resulting numerical table.
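The Monte Carlo procedure can be sketched as follows. Because the exact statistic of equation (41) is not reproduced here, the sketch substitutes a simpler parameter-free statistic as a stand-in: the ratio of the sample mean to the true mean when the coefficient of variation is known. The sample mean replaces the paper's weighted combination of order statistics, so no sorting is needed, and the resulting numbers are not those of Table 2. All names are hypothetical, and the coefficient of variation is assumed moderate enough that the ratio stays positive.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler constant


def standard_gumbel(rng):
    """One draw from the standard maximum type I distribution."""
    u = rng.random()
    while u <= 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def pivot_quantile(n, cov, level, n_sim=50_000, seed=0):
    """Empirical lower quantile (at the given level) of mean_hat / mean,
    simulated from standard draws; free of the unknown mean."""
    c = math.sqrt(6.0) * cov / math.pi
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sim):
        y_bar = sum(standard_gumbel(rng) for _ in range(n)) / n
        ratios.append(1.0 - EULER_GAMMA * c + c * y_bar)
    ratios.sort()
    idx = min(n_sim - 1, max(0, math.ceil(level * n_sim) - 1))
    return ratios[idx]


def upper_limit_standard_value(sample, cov, p, confidence, n_sim=50_000, seed=0):
    """Upper confidence limit of the p-quantile under the stand-in pivot."""
    c = math.sqrt(6.0) * cov / math.pi
    kappa = 1.0 - EULER_GAMMA * c - c * math.log(-math.log(p))
    mean_hat = sum(sample) / len(sample)
    t = pivot_quantile(len(sample), cov, 1.0 - confidence, n_sim, seed)
    return kappa * mean_hat / t
```

The design point carried over from the text is that the statistic depends only on standard draws, so its quantiles can be tabulated once per sample size and confidence degree and then reused for any site.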
4. How to Choose the Confidence Degree in Engineering Practice
The confidence degree represents the level of trust in the inference results and has a direct impact on them: the higher the confidence degree, the higher the upper limit estimate and the lower the lower limit estimate, and vice versa. At present, the literature offers no suggestion on how to choose the confidence degree for inferring representative values and design values of variable actions.
Generally speaking, the larger the variability of the random variable, the wider the range of the distribution parameters and quantiles in the inference. Moreover, changes in the confidence degree have a great influence on the upper and lower limit estimates when the confidence degree is high: as it increases, the upper limit estimate rises relatively quickly and the lower limit estimate falls relatively quickly. When the variability of the random variable is large, a high confidence degree makes the inference results too conservative, so a relatively low confidence degree should be selected. Conversely, when the variability is small, a relatively high confidence degree should be selected to avoid overly aggressive inference results.
The National Standard of the People's Republic of China, Standard for Appraisal of Reliability of Civil Buildings (GB50292-1999), gives suggestions on the confidence degree for inferring the standard value of material strength, with separate values for steel, concrete, and masonry, and a further value for inferring the standard value of permanent action. The coefficients of variation of permanent action and of the material strength of steel, concrete, and masonry are, in that order, 0.07, 0.06–0.10, 0.16–0.23, and 0.20–0.24. The confidence degrees in the national standard therefore conform to the above rules. The coefficient of variation of snow load is larger; according to the above rules, a relatively low confidence degree should be selected, but the value should not be less than 0.5; otherwise, the lower limit estimate will be higher than the upper limit estimate. With the same confidence degree and all other conditions equal, the inference results of the moment method and the linear regression estimation method can be compared. The result of the linear regression estimation method is slightly higher than that of the moment method, and it is close to the result obtained without considering the influence of statistical uncertainty. When the confidence degree is not high, changes in it have no great influence on the inference results, so a slightly higher confidence degree can be selected, which takes the influence of statistical uncertainty into account more fully and avoids overly aggressive inference results. This article therefore suggests a specific confidence degree for inferring the standard values of snow load; its suitability is illustrated by the example in Section 5, where the confidence degree is 0.75.
In this section, we use the established linear regression estimation method to estimate the standard values of snow load. The data used in this article are the actual snowfall data from 1989 to 2008 in Shuyang County, and the specific values are shown in Table 3.
To compare the inference results under different sample sizes, the sample values are divided into three groups: 1999–2008 (10 samples), 1994–2008 (15 samples), and 1989–2008 (20 samples) [41–45]. The results are calculated with each of the inference methods.
A guarantee rate is selected for the standard value, and the confidence degree in the interval estimate is 0.75.
By using the moment method, the estimate of the standard value of snow load is obtained by calculation.
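A minimal sketch of the moment method for a maximum type I fit is given below. It is an illustration under stated assumptions, not the authors' calculation: it uses the location-scale parameterization, the sample standard deviation with n - 1 in the denominator, and hypothetical input values, because the data of Table 3 and the guarantee rate are not reproduced here.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler constant


def moment_method_standard_value(sample, p):
    """Point estimate of the p-quantile of a maximum type I distribution
    whose parameters are matched to the sample mean and standard deviation."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)  # n - 1 denominator (an assumption)
    scale = math.sqrt(6.0) * std / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc - scale * math.log(-math.log(p))


# Hypothetical annual maximum snow pressures in kN/m^2 (not the Table 3 data)
example_sample = [0.12, 0.05, 0.30, 0.18, 0.09, 0.22, 0.15, 0.07, 0.26, 0.11]
example_value = moment_method_standard_value(example_sample, p=0.98)
```

The value p = 0.98 corresponds to a 50-year return period and is used here only as an example. Because this is a pure point estimate, it carries no allowance for the statistical uncertainty of a 10-value sample, which is the shortcoming the linear regression method is meant to address.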
According to the maximum likelihood estimation method, the parameter estimation formulas are derived by introducing the likelihood function.
The estimated values of the parameters are then obtained by an iterative method, and the standard value of snow load is finally obtained by calculation.
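The iterative solution can be sketched as follows. This illustration assumes the location-scale form of the maximum type I distribution, for which the likelihood equations reduce to one equation in the scale; the scale is found here by bisection, which is a swapped-in choice for robustness and not necessarily the iteration used by the authors. The function name is hypothetical.

```python
import math


def gumbel_max_mle(sample, tol=1e-12, max_iter=200):
    """Maximum likelihood estimates (loc, scale) of a maximum type I
    distribution. The scale solves
        scale = mean(x) - sum(x * w) / sum(w),  w = exp(-x / scale),
    and the location then follows in closed form."""
    n = len(sample)
    x_min, x_max = min(sample), max(sample)
    if x_max == x_min:
        raise ValueError("sample values must not all be equal")
    mean = sum(sample) / n

    def score(scale):
        # Shift by x_min so the exponentials stay within [0, 1].
        w = [math.exp(-(x - x_min) / scale) for x in sample]
        weighted_mean = sum(x * wi for x, wi in zip(sample, w)) / sum(w)
        return scale - mean + weighted_mean

    # score < 0 for a very small scale and score > 0 at scale = range.
    lo, hi = (x_max - x_min) * 1e-6, (x_max - x_min)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    scale = 0.5 * (lo + hi)
    avg_w = sum(math.exp(-(x - x_min) / scale) for x in sample) / n
    loc = x_min - scale * math.log(avg_w)
    return loc, scale
```

The standard value for a guarantee rate p then follows as `loc - scale * math.log(-math.log(p))`. Like the moment method, this is a point estimate and does not reflect the statistical uncertainty of a small sample.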
By using the linear regression estimation method for the standard value of snow load in the condition of unknown parameter information (equations (13), (14), and (18)), the estimate is obtained by calculation.
The values of the coefficients are listed in Table 4.
For the case of a known coefficient of variation, a value of the coefficient of variation is selected to facilitate comparison under the same conditions, and the moment method is applied first.
By using the maximum likelihood estimation method, we can obtain the estimated values of the parameters, respectively, by the iterative method, and finally obtain the standard values of snow load:
By using the linear regression estimation method for a known coefficient of variation (equations (39) and (42)), the estimate is obtained with the coefficient values shown in Table 1; Table 5 shows the calculation process.
To facilitate comparative analysis, Table 6 lists the statistical extrapolation results of different estimation methods and different sample sizes in two cases where the coefficient of variation is known and the parameter information is unknown.
A comparative analysis of the calculations presented in Table 6 shows the following:
(1) Regardless of whether the coefficient of variation is unknown or known, the results estimated by the maximum likelihood method are the smallest, followed by those of the moment method. These two classical statistical methods do not take the influence of statistical uncertainty into account, so their results are always on the aggressive side.
(2) Regardless of whether the coefficient of variation is unknown or known, the relative differences between the results of the linear regression method and those of the classical statistical methods (the moment method and the maximum likelihood method) gradually decrease as the sample size increases, because the influence of statistical uncertainty decreases with increasing sample size.
(3) With the linear regression method, the estimates obtained when the coefficient of variation is known are smaller at every sample size, because the uncertainty of the statistic in the estimation process is significantly reduced.
Due to space limitation, this paper only compares the extrapolation values of different inference methods in three groups of sample sizes. By calculating and comparing the extrapolation of the standard values of snow loads in other sample sizes, the same results can be obtained.
(1) When the current statistical inference methods are used to infer the standard value of snow load from small samples, the results are always on the aggressive side because the influence of statistical uncertainty is not taken into account.
(2) The linear regression estimation method presented in this paper can reduce the influence of statistical uncertainty when the sample is small, and it is applicable to the statistical inference of the standard values of snow load in small sample conditions. By considering the two cases of no parameter information and a known coefficient of variation, it gives more reasonable inference results.
(3) A specific confidence degree is suggested for inferring the standard value of snow load (0.75 is used in the example).
(4) In practical application, even if the coefficient of variation is unknown, its upper limit can be set based on experience. The estimation is then made for both cases, no parameter information and known coefficient of variation, and the smaller of the two values is taken as the final estimate. The research in this paper can provide a theoretical basis for adjusting snow load.
The key data used in the calculations are listed in the article; other detailed data can be obtained by contacting the corresponding author if necessary.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This project was supported by the National Natural Science Foundation of China (nos. 50678143 and 51278401) and the Education Department of Shaanxi (no. 17JK0440).
- GB50009-2012, Load Code for the Design of Building Structure, China Architecture & Building Press, Beijing, China, 2012.
- J. Aitchison and S. D. Silvey, “Maximum-likelihood estimation of parameters subject to restraints,” The Annals of Mathematical Statistics, vol. 29, no. 3, pp. 813–828, 1958.
- D. D. Dorfman and E. Alf, “Maximum-likelihood estimation of parameters of signal-detection theory and determination of confidence intervals-rating-method data,” Journal of Mathematical Psychology, vol. 6, no. 3, pp. 487–496, 1969.
- S. I. Aihara, “Regularized maximum likelihood estimate for an infinite-dimensional parameter in stochastic parabolic systems,” SIAM Journal on Control and Optimization, vol. 30, no. 4, pp. 745–764, 1992.
- S. S. Li, “Estimation method for parameters of extreme value I distribution,” Journal of Fuzhou University, vol. 16, no. 1, pp. 79–84, 1998.
- Z. D. Duan and D. C. Zhou, “A comparative study on parameter estimate method for extremal value distribution,” Journal of Harbin Institute of Technology, vol. 36, no. 12, pp. 1605–1609, 2004.
- G. Aneiros-Pérez and P. Vieu, “Semi-functional partial linear regression,” Statistics & Probability Letters, vol. 76, no. 11, pp. 1102–1110, 2006.
- P. Hall and J. L. Horowitz, “Methodology and convergence rates for functional linear regression,” The Annals of Statistics, vol. 35, no. 1, pp. 70–91, 2007.
- I. Naseem, R. Togneri, and M. Bennamoun, “Linear regression for face recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 11, pp. 2106–2112, 2010.
- F. Cribari-Neto and W. B. da Silva, “A new heteroskedasticity-consistent covariance matrix estimator for the linear regression model,” AStA Advances in Statistical Analysis, vol. 95, no. 2, pp. 129–146, 2011.
- Y. Zhao, X. Meng, and H. Yang, “Jackknife empirical likelihood inference for the mean absolute deviation,” Computational Statistics & Data Analysis, vol. 91, pp. 92–101, 2015.
- G. C. Manjunath Patel, P. Krishna, and M. B. Parappagoudar, “Squeeze casting process modeling by a conventional statistical regression analysis approach,” Applied Mathematical Modeling, vol. 40, no. 15-16, pp. 6869–6888, 2016.
- T. Sun, L. Cheng, and H. Jiang, “Greedy method for robust linear regression,” Neurocomputing, vol. 243, pp. 125–132, 2017.
- M. O’Rourke, A. DeGaetano, and J. Tokarczyk, “Analytical simulation of snow drift loading,” Journal of Structural Engineering, vol. 131, no. 4, pp. 660–667, 2005.
- Y. N. Zhang, Y. Zhang, Y. Q. Wang, and Y. J. Shi, “Calculation and analysis of basic snow pressure in Liaoning province,” Journal of South China University of Technology (Natural Science Edition), vol. 38, no. 9, pp. 108–112, 2010.
- S. Basu, “Improved small sample inference procedures for epidemiological parameters under cross-sectional sampling,” Journal of the Royal Statistical Society: Series D (The Statistician), vol. 50, no. 3, pp. 309–319, 2001.
- S. Paul and X. Zhang, “Small sample GEE estimation of regression parameters for longitudinal data,” Statistics in Medicine, vol. 33, no. 22, pp. 3869–3881, 2014.
- T. P. Talafuse and E. A. Pohl, “Small sample discrete reliability growth modeling using a grey systems model,” Grey Systems: Theory and Application, vol. 8, no. 3, pp. 246–271, 2018.
- S.-K. Fan, C.-H. Jen, and J.-X. Lee, “Profile monitoring for autocorrelated reflow processes with small samples,” Processes, vol. 7, no. 2, p. 104, 2019.
- GB50292-1999, Standard for Appraiser of Reliability of Civil Buildings, China Architecture & Building Press, Beijing, China, 1999.
- J. T. Yao and X. D. Wang, “Minor sample estimation of the parameters and fractile of largest extreme value distribution type I,” Statistics and Decision, vol. 19, pp. 11–14, 2014.
- S. Emilio and B. Sen, “Nonparametric least squares estimation of a multivariate convex regression function,” The Annals of Statistics, vol. 39, no. 3, pp. 1633–1657, 2011.
- S. Aihara, “Consistency property of extended least-squares parameter estimation for stochastic diffusion equation,” Systems & Control Letters, vol. 34, no. 5, pp. 249–256, 1998.
- D. Kumar, S. F. Ali, and A. Arockiarajan, “Structural and aerodynamics studies on various wing configurations for morphing,” IFAC-PapersOnLine, vol. 51, no. 1, pp. 498–503, 2018.
- B. Keshtegar and P. Hao, “A hybrid self-adjusted mean value method for reliability-based design optimization using sufficient descent condition,” Applied Mathematical Modelling, vol. 41, pp. 257–270, 2017.
- Y.-G. Zhao and T. Ono, “Moment methods for structural reliability,” Structural Safety, vol. 23, no. 1, pp. 47–75, 2001.
- X. Shi, A. P. Teixeira, J. Zhang, and C. Guedes Soares, “Structural reliability analysis based on probabilistic response modelling using the maximum entropy method,” Engineering Structures, vol. 70, pp. 106–116, 2014.
- X. Zhang and M. D. Pandey, “Structural reliability analysis based on the concepts of entropy, fractional moment and dimensional reduction method,” Structural Safety, vol. 43, pp. 28–40, 2013.
- M. Koliou and A. Filiatrault, “Development of wood and steel diaphragm hysteretic connector database for performance-based earthquake engineering,” Bulletin of Earthquake Engineering, vol. 15, no. 10, pp. 4319–4347, 2017.
- Z. Xi, C. Hu, and B. D. Youn, “A comparative study of probability estimation methods for reliability analysis,” Structural and Multidisciplinary Optimization, vol. 45, no. 1, pp. 33–52, 2012.
- S. S. Mao, Statistics Handbook, Science Press, Beijing, China, 2003.
- SJ/T11099-1996, Tables for Best Linear Unbiased Estimate (BLUE) (Extreme-Value Distribution, Weibull Distribution), China Architecture & Building Press, Beijing, China, 1996.
- GB50068-2001, Unified Standard for Reliability Design of Building Structures, China Architecture & Building Press, Beijing, China, 2001.
- GB50153-2008, Unified Standard for Reliability Design of Engineering Structures, China Architecture & Building Press, Beijing, China, 2009.
- S. S. Dai and H. L. Fei, Reliability Test and Statistical Analysis (First Book), National Defence Industry Press, Beijing, China, 1983.
- Research Department of Machinery Industry Standard Fourth, Table for Reliability Test, National Defence Industry Press, Beijing, China, 1979.
- GB12282 1-90, Tables for Life Testing Tables for Best Linear Unbiased Estimate (BLUE) (Extreme-Value Distribution, Weibull Distribution), Publishing House of Electronics Industry, Beijing, China, 1996.
- Z. Chang and X. M. Xie, “Research and suggestions on wind and snow live load value of buildings,” Building Science, vol. 27, no. 1, pp. 83–85, 2011.
- J. T. Yao and X. D. Wang, “Linear regression estimation of representative values of variable actions,” Journal of Building Structures, vol. 35, no. 10, pp. 98–103, 2014.
- C. J. Kang, W. F. Du, X. Yang, and Z. Zhu, “Calculation and analysis of snow loads based on the maximum likelihood method,” Journal of Henan University (Natural Science), vol. 46, no. 2, pp. 220–225, 2016.
- A. Q. Baig, M. Naeem, and W. Gao, “Revan and hyper-Revan indices of octahedral and icosahedral networks,” Applied Mathematics and Nonlinear Sciences, vol. 3, no. 1, pp. 33–40, 2018.
- M. Dewasurendra and K. Vajravelu, “On the method of inverse mapping for solutions of coupled systems of nonlinear differential equations arising in nanofluid flow, heat and mass transfer,” Applied Mathematics and Nonlinear Sciences, vol. 3, no. 1, pp. 1–14, 2018.
- Y. Qin, Y. Luo, Y. Zhao, and J. Zhang, “Research on relationship between tourism income and economic growth based on meta-analysis,” Applied Mathematics and Nonlinear Sciences, vol. 3, no. 1, pp. 105–114, 2018.
- J. F. Gómez-Aguilar and A. Atangana, “Time-fractional variable-order telegraph equation involving operators with Mittag-Leffler kernel,” Journal of Electromagnetic Waves and Applications, vol. 33, no. 2, pp. 165–177, 2019.
- D. P. Ahokpossi, A. Atangana, and P. D. Vermeulen, “Hydro-geochemical characterizations of a platinum group element groundwater system in Africa,” Journal of African Earth Sciences, vol. 138, pp. 348–366, 2018.
Copyright © 2020 Xudong Wang and Jitao Yao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.