Research Article  Open Access
Prediction Model of Collapse Risk Based on Information Entropy and Distance Discriminant Analysis Method
Abstract
The prediction and risk classification of collapse is an important issue in highway construction in mountainous regions. Based on the principles of information entropy and Mahalanobis distance discriminant analysis, we have produced a collapse hazard prediction model. We used the entropy measure method to reduce the influence indexes of collapse activity and extracted the nine main indexes affecting collapse activity as the discriminant factors of the distance discriminant analysis model (i.e., slope shape, aspect, gradient, and height, along with exposure of the structural face, stratum lithology, relationship between weakness face and free face, vegetation cover rate, and degree of rock weathering). We employed post-earthquake collapse data from the construction of the Yingxiu-Wolong highway, Hanchuan County, China, as training samples for analysis. The results were checked using the back substitution estimation method, which showed high accuracy and no errors, and they agreed with the prediction result of the uncertainty measure method. The results show that the classification model based on information entropy and distance discriminant analysis achieves the purpose of index optimization and has excellent performance, high prediction accuracy, and a zero false-positive rate. The model can be used as a tool for future evaluation of collapse risk.
1. Introduction
Collapse is a geological phenomenon whereby rock and soil on a steep slope suddenly fail, move downslope, and accumulate at the foot of the slope as a result of gravity and other external forces [1–8]. With the ongoing development of railways, highways, and other projects in western China, slope collapses and landslides are also increasing. During and following earthquakes, high and steep slopes are prone to collapse. The Wenchuan earthquake of 12 May 2008 (the "5.12" earthquake), for example, resulted in subsequent collapses [9], causing serious damage to highways and other transportation infrastructure. Therefore, a quantitative assessment of damage to highways caused by collapse would provide strong support for a risk assessment of regional geological disasters and lay the foundation for the sustainable development of the region of interest.
Many methods of disaster risk assessment relating to collapse have been developed. For example, the analytic hierarchy process (AHP) and the fuzzy comprehensive evaluation method were applied to evaluate the risk of collapse by Liu [10] and Xue [11], who proposed a risk evaluation model of collapse hazard based on the comprehensive integration of extenics and fuzzy theory. Gao et al. [12] developed a model based on a geographic information system. He et al. [13] established a comprehensive evaluation model of collapse risk based on uncertainty measurement models, and Liu [14] used the probability method and a Newmark displacement calculation model to evaluate the landslide hazards in the Changbai Mountains area. Traditional evaluation methods are simple and fast, but the choice of factors in each case is subjective. In some cases significant calculations are required, which limits the application of these approaches. Because geological conditions vary by location, evaluation methods built on different factors and criteria cannot be used interchangeably [15]. The discriminant analysis method, also known as the "resolution method," is a multivariate statistical method that determines the category to which a sample belongs from its various characteristic values. It has been widely used in many fields of the natural and social sciences since its development. In China, the distance discriminant analysis model was first introduced into practical geotechnical engineering by Gong and Li [15, 16], with good results. Here we consider the factors affecting collapse activity during construction and develop a risk prediction and evaluation model of collapse hazard based on the distance discriminant analysis method. This approach provides a new method of risk prediction and evaluation of collapse hazard.
In practice, the impact indexes of collapse activity are more complex than can be modelled, owing to variations in both the internal characteristics of the collapse body (internal causes), such as elevation, slope, lithology, soil type, and land utilization, and the exterior factors that induce collapse disaster (external causes), such as groundwater, precipitation, vibration, and human activities. Some of these factors have no influence on the results of the present study. Therefore, we applied the entropy reduction method when analyzing the collapses on both sides of the Yingxiu-Wolong highway in Hanchuan County, Sichuan Province, after the 5.12 Wenchuan earthquake. In this process, we reduced the initial indexes, obtained the major evaluation index system affecting collapse activity, and excluded uncorrelated indexes. On this basis, we collected further data and used the distance discriminant model to perform a comprehensive evaluation of the collapse risk.
2. Entropy Measurement
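We obtained an evaluation matrix $R = (r_{ij})_{m \times n}$ by evaluating $n$ indexes of $m$ schemes. The term $r_{ij}$ is the assessment value of index $j$ for scheme $i$. For a given $j$, the greater the differences among the $r_{ij}$, the greater the role that index $j$ plays in comparing the schemes, and the more decision-making information it contains and transmits. Information entropy can measure this information intensity; that is,
$$E_j = -k \sum_{i=1}^{m} p_{ij} \ln p_{ij}, \quad j = 1, 2, \ldots, n,$$
where $p_{ij} = r_{ij} / \sum_{i=1}^{m} r_{ij}$, $k > 0$ is a constant, $\ln$ is the natural logarithm, and $E_j \ge 0$.
This entropy formula shows that, for a given $j$, the smaller the differences among the $r_{ij}$, the greater $E_j$. When all $r_{ij}$ are equal, $p_{ij} = 1/m$ and $E_j$ is at its maximum value. That is, if $k = 1/\ln m$, then $E_j = E_{\max} = 1$. For the purpose of scheme comparison, index $j$ then has no distinguishing ability.
Conversely, the greater the differences among the evaluation values of the schemes under index $j$, the greater the role of that index in comparing the schemes, and the stronger its distinguishing ability.
The total entropy of the evaluation matrix is defined as $E = \sum_{j=1}^{n} E_j$. With $d_j = (1 - E_j)/(n - E)$ for $j = 1, 2, \ldots, n$, so that $0 \le d_j \le 1$ and $\sum_{j=1}^{n} d_j = 1$, $d_j$ is a measure of the ability of each index to distinguish [17, 18].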
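As a minimal sketch of the entropy measure, the following Python code computes the normalised entropy of each index and its distinguishing ability. It assumes the common form $E_j = -k \sum_i p_{ij} \ln p_{ij}$ with $p_{ij} = r_{ij}/\sum_i r_{ij}$ and $k = 1/\ln m$, and the ability $d_j = (1 - E_j)/(n - E)$. The function names and the score matrix are hypothetical and are not taken from the collapse data.

```python
import math

def entropy_per_index(matrix):
    """Return the normalised entropy E_j of each column (index) of matrix."""
    m = len(matrix)            # number of schemes (rows)
    n = len(matrix[0])         # number of indexes (columns)
    k = 1.0 / math.log(m)      # scaling so that 0 <= E_j <= 1
    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        e = 0.0
        for r in column:
            p = r / total
            if p > 0:          # treat 0 * ln(0) as 0
                e -= k * p * math.log(p)
        entropies.append(e)
    return entropies

def distinguishing_ability(entropies):
    """Return d_j = (1 - E_j) / (n - E), where E is the sum of all E_j."""
    n = len(entropies)
    denom = n - sum(entropies)  # zero only if every index is constant
    return [(1.0 - e) / denom for e in entropies]

# Hypothetical scores: 4 schemes (rows) rated on 3 indexes (columns).
scores = [
    [4, 2, 1],
    [3, 2, 4],
    [2, 2, 1],
    [1, 2, 4],
]
E = entropy_per_index(scores)
d = distinguishing_ability(E)
```

In this made-up matrix the second index takes the same value for every scheme, so its entropy is at the maximum and its distinguishing ability is essentially zero, which is the situation in which an index becomes a candidate for removal.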
3. Classification Model of the Distance Discriminant Analysis Method
The discriminant analysis method is typically based on sample data of each category collected in the past, from which it summarizes the law of objective classification. This establishes specific criteria for determining which category a new sample belongs to. In discriminant analysis, the Euclidean distance does not take into account the dispersion characteristics of the overall distribution. The Mahalanobis distance, first suggested by Mahalanobis in 1936, does account for this dispersion [15, 16, 19–21]. The basic principle is to compare the Mahalanobis distances between the sample and each population and to assign the sample to the population to which it is nearest.
3.1. Mahalanobis Distance
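Let $G$ be an $m$-dimensional population with sample $X = (x_1, x_2, \ldots, x_m)^{T}$, and set $\mu_i = E(x_i)$, $i = 1, 2, \ldots, m$. The population mean vector is $\mu = (\mu_1, \mu_2, \ldots, \mu_m)^{T}$. The covariance matrix of the population $G$ is
$$\Sigma = \operatorname{cov}(G) = E\left[(X - \mu)(X - \mu)^{T}\right].$$
Let $X$ and $Y$ be two samples from the population $G$. The square of the Mahalanobis distance between $X$ and $Y$ is
$$d^{2}(X, Y) = (X - Y)^{T} \Sigma^{-1} (X - Y).$$
The square of the Mahalanobis distance between the sample $X$ and the population $G$ is
$$d^{2}(X, G) = (X - \mu)^{T} \Sigma^{-1} (X - \mu).$$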
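To illustrate how the squared Mahalanobis distance $(X - \mu)^{T} \Sigma^{-1} (X - \mu)$ reflects dispersion, the following sketch evaluates it in plain Python. Rather than forming $\Sigma^{-1}$ explicitly, it solves $\Sigma z = X - \mu$ by Gaussian elimination. The helper names and the two-dimensional example values are assumptions made for illustration only.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance between sample x and a population (mu, cov)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)            # z = cov^{-1} (x - mu)
    return sum(d * zi for d, zi in zip(diff, z))

# Hypothetical population: variance 4 along the first axis, 1 along the second.
mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]
d2_a = mahalanobis_sq([2.0, 0.0], mu, cov)   # offset along the high-variance axis
d2_b = mahalanobis_sq([0.0, 2.0], mu, cov)   # offset along the low-variance axis
```

Both test points lie at the same Euclidean distance from the mean, yet the point displaced along the low-variance axis is farther in the Mahalanobis sense, which is the property the discriminant model relies on.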
3.2. Distance Discriminant of Two Populations
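Let the means of the two populations $G_1$ and $G_2$ be $\mu_1$ and $\mu_2$, respectively, and their covariance matrices be $\Sigma_1$ and $\Sigma_2$, respectively ($\Sigma_1, \Sigma_2 > 0$). Let $X$ be a new sample whose population is to be determined. The distances from $X$ to $G_1$ and $G_2$ are $d(X, G_1)$ and $d(X, G_2)$, and $X$ is assigned according to the following criteria:
$$X \in G_1 \ \text{if} \ d^{2}(X, G_1) \le d^{2}(X, G_2); \qquad X \in G_2 \ \text{if} \ d^{2}(X, G_1) > d^{2}(X, G_2).$$
When $\Sigma_1 = \Sigma_2 = \Sigma$, the discriminant can be simplified as follows:
$$d^{2}(X, G_2) - d^{2}(X, G_1) = (X - \mu_2)^{T} \Sigma^{-1} (X - \mu_2) - (X - \mu_1)^{T} \Sigma^{-1} (X - \mu_1) = 2 X^{T} \Sigma^{-1} (\mu_1 - \mu_2) - (\mu_1 + \mu_2)^{T} \Sigma^{-1} (\mu_1 - \mu_2) = 2 (X - \bar{\mu})^{T} a,$$
where $\bar{\mu} = (\mu_1 + \mu_2)/2$ and $a = \Sigma^{-1} (\mu_1 - \mu_2)$.
Note that a real number is equal to its transpose, so
$$(X - \bar{\mu})^{T} a = a^{T} (X - \bar{\mu}).$$
Set $W(X) = a^{T} (X - \bar{\mu})$; therefore, the discriminant rule is
$$X \in G_1 \ \text{if} \ W(X) \ge 0; \qquad X \in G_2 \ \text{if} \ W(X) < 0.$$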
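For two populations with a common covariance matrix, the rule reduces to the sign of a linear function. The sketch below assumes the standard form $W(X) = a^{T}(X - \bar{\mu})$ with $a = \Sigma^{-1}(\mu_1 - \mu_2)$ and $\bar{\mu} = (\mu_1 + \mu_2)/2$, and it is restricted to two indexes so that the inverse can be written in closed form. The means, covariance, and function names are hypothetical.

```python
def inv2(s):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] by the adjugate formula."""
    (a, b), (c, d) = s
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def linear_discriminant(x, mu1, mu2, cov):
    """W(X) = a^T (X - mu_bar), with a = cov^{-1} (mu1 - mu2)."""
    inv = inv2(cov)
    delta = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    a = [inv[0][0] * delta[0] + inv[0][1] * delta[1],
         inv[1][0] * delta[0] + inv[1][1] * delta[1]]
    mu_bar = [(mu1[0] + mu2[0]) / 2, (mu1[1] + mu2[1]) / 2]
    return a[0] * (x[0] - mu_bar[0]) + a[1] * (x[1] - mu_bar[1])

def classify(x, mu1, mu2, cov):
    """Assign x to G1 when W(X) >= 0 and to G2 otherwise."""
    return "G1" if linear_discriminant(x, mu1, mu2, cov) >= 0 else "G2"

# Hypothetical population means and shared covariance matrix.
mu1, mu2 = [1.0, 1.0], [3.0, 3.0]
cov = [[1.0, 0.5], [0.5, 1.0]]
near_g1 = classify([1.2, 0.8], mu1, mu2, cov)
near_g2 = classify([3.0, 3.5], mu1, mu2, cov)
```

Because $W(X)$ is half the difference of the two squared distances, checking its sign gives the same assignment as comparing the distances directly while requiring only one vector $a$ to be computed per pair of populations.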
In practice, the population means and covariance matrices are typically unknown, and the only data available are training samples from the two populations; the sample means and sample covariance matrices are then used in place of the population means and covariance matrices.
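Only the two sample covariance matrices, $S_1$ and $S_2$, can be determined. When the two population covariance matrices are equal, we therefore estimate the common covariance matrix by
$$S = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2},$$
where $n_1$ and $n_2$ are the sizes of the two samples.
When $\Sigma_1 \ne \Sigma_2$,
$$W(X) = d^{2}(X, G_2) - d^{2}(X, G_1) = (X - \mu_2)^{T} \Sigma_2^{-1} (X - \mu_2) - (X - \mu_1)^{T} \Sigma_1^{-1} (X - \mu_1).$$
The discriminant rule is
$$X \in G_1 \ \text{if} \ W(X) \ge 0; \qquad X \in G_2 \ \text{if} \ W(X) < 0.$$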
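When the two populations are taken to share one covariance matrix, it has to be estimated from both training samples together. The following sketch assumes the usual pooled estimate, $[(n_1 - 1) S_1 + (n_2 - 1) S_2]/(n_1 + n_2 - 2)$, computed here from the two scatter matrices (each scatter matrix equals $(n_i - 1) S_i$). The sample values and function names are hypothetical.

```python
def mean_vector(samples):
    """Column-wise mean of a list of equal-length sample vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def scatter_matrix(samples):
    """Sum of (x - mean)(x - mean)^T over all samples, i.e. (n - 1) * S."""
    mu = mean_vector(samples)
    p = len(mu)
    s = [[0.0] * p for _ in range(p)]
    for x in samples:
        d = [xi - mi for xi, mi in zip(x, mu)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s

def pooled_covariance(sample1, sample2):
    """Pooled covariance estimate with n1 + n2 - 2 degrees of freedom."""
    a1, a2 = scatter_matrix(sample1), scatter_matrix(sample2)
    dof = len(sample1) + len(sample2) - 2
    p = len(a1)
    return [[(a1[i][j] + a2[i][j]) / dof for j in range(p)] for i in range(p)]

# Hypothetical training samples from two populations, two indexes each.
g1 = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
g2 = [[6.0, 1.0], [8.0, 0.0]]
pooled = pooled_covariance(g1, g2)
```

The sample means of each group, obtained with `mean_vector`, would stand in for $\mu_1$ and $\mu_2$ in the same way that `pooled` stands in for $\Sigma$.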
3.3. Multipopulation Distance Discrimination
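For $k$ $m$-dimensional populations $G_1, G_2, \ldots, G_k$ with mean vectors $\mu_1, \mu_2, \ldots, \mu_k$, the covariance matrices are $\Sigma_1, \Sigma_2, \ldots, \Sigma_k$. Therefore, the square of the Mahalanobis distance from a sample $X$ to each population is
$$d^{2}(X, G_j) = (X - \mu_j)^{T} \Sigma_j^{-1} (X - \mu_j), \quad j = 1, 2, \ldots, k.$$
The discriminant rule is $X \in G_i$, if $d^{2}(X, G_i) = \min_{1 \le j \le k} d^{2}(X, G_j)$.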
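The multipopulation rule assigns a sample to the population with the smallest squared Mahalanobis distance. The sketch below assumes that the inverse covariance matrix of each population has already been computed and is passed in directly; the four labelled means and the identity inverse covariances are hypothetical values chosen only to keep the arithmetic easy to follow.

```python
def quad_form(diff, inv_cov):
    """Evaluate diff^T inv_cov diff for a vector and a square matrix."""
    p = len(diff)
    return sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(p) for j in range(p))

def classify(x, means, inv_covs):
    """Return the nearest population label and all squared distances."""
    distances = {}
    for label, mu in means.items():
        diff = [xi - mi for xi, mi in zip(x, mu)]
        distances[label] = quad_form(diff, inv_covs[label])
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical mean vectors for four risk levels, two indexes each.
means = {"I": [4.0, 4.0], "II": [3.0, 3.0], "III": [2.0, 2.0], "IV": [1.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
inv_covs = {label: identity for label in means}  # assumption: unit covariances
label, dist = classify([2.2, 1.9], means, inv_covs)
```

With identity inverse covariances the distance reduces to the squared Euclidean distance; replacing `identity` with estimated inverse covariance matrices gives the general rule.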
3.4. Evaluation of Discrimination Criteria
To investigate the properties of the above-mentioned criterion, the back substitution estimation method, based on the training samples, is used to calculate the error.
Consider two populations, $G_1$ and $G_2$, with training samples $X_1^{(1)}, X_2^{(1)}, \ldots, X_{n_1}^{(1)}$ from $G_1$ and $X_1^{(2)}, X_2^{(2)}, \ldots, X_{n_2}^{(2)}$ from $G_2$, where $n_i$ is the number of samples taken from $G_i$, so that the sizes of the two training sets are $n_1$ and $n_2$. Each of the training samples is treated in turn as a new sample ($n = n_1 + n_2$ in total) and substituted into the established discriminant criterion to determine its membership, a process known as back discrimination. Let $n_{12}$ denote the number of samples from population $G_1$ misjudged as belonging to population $G_2$, and $n_{21}$ the number of samples from population $G_2$ misjudged as belonging to population $G_1$. The back substitution estimate of the error rate is
$$\hat{\eta} = \frac{n_{12} + n_{21}}{n_1 + n_2}.$$
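The back substitution estimate can be sketched as a simple count of misjudged training samples, assuming the estimate is the ratio $(n_{12} + n_{21})/(n_1 + n_2)$. The labels 1 and 2 stand for the two populations, and the example label lists are hypothetical.

```python
def misjudgement_counts(actual, predicted):
    """Return (n12, n21): G1 samples judged as G2, and G2 samples judged as G1."""
    n12 = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 2)
    n21 = sum(1 for a, p in zip(actual, predicted) if a == 2 and p == 1)
    return n12, n21

def back_substitution_error(actual, predicted):
    """Share of training samples misjudged when re-classified by the fitted rule."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    n12, n21 = misjudgement_counts(actual, predicted)
    return (n12 + n21) / len(actual)

# Hypothetical true and back-discriminated labels for five training samples.
actual = [1, 1, 1, 2, 2]
predicted = [1, 2, 1, 2, 1]
error = back_substitution_error(actual, predicted)
```

An error of zero means that every training sample is returned to its own population; because the same samples are used for fitting and checking, this estimate tends to be optimistic.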
4. Evaluation Index System of Highway Collapse Risk
The factors contributing to the risk of collapse are varied and complex [22–28]. We propose here a highway collapse hazard evaluation system based on 15 indexes. Each index is assigned one of four evaluation grades: I (extremely high risk), II (high risk), III (moderate risk), and IV (low risk). The specific grading standards and descriptions are listed in Table 1, and the classification of series is listed in Table 2.


5. Typical Examples of Collapse along the Yingxiu-Wolong Highway following an Earthquake
The Yingxiu-Wolong highway (highway S303) is located in Hanchuan County, Sichuan Province, along the northwest margin of the Sichuan Basin. The highway's total length is 45.5 km and it is an important trunk road. The highway was fully paved prior to the 5.12 Wenchuan earthquake. It follows the Longmenshan tectonic belt from the Beichuan-Yingxiu Fault to the Houshan Fault, and the geological conditions are complex. It is the nearest highway to the epicenter in the Hanchuan disaster area, and seismic hazards in the region can cause serious damage. Many rock mass collapses caused by the earthquake buried and damaged the road itself, bridges, and a tunnel entrance and exit. Because research after the earthquake focused on disaster assessment and on the prevention of further slope collapses along the highway, a wealth of data regarding collapse risk has been accumulated.
The collapse data used here were collected along the Yingxiu-Wolong highway after the earthquake [13, 23]. Fifteen collapses were chosen and assigned values for each influencing factor. Each qualitative index is valued by the classification standard quantitative method: the index is divided into four grades, namely, I, II, III, and IV, indicating extremely high risk, high risk, moderate risk, and low risk, respectively, and these four grades are given the numerical values 4, 3, 2, and 1, respectively. Each quantitative index is valued by its measured value. The basic evaluation data of each collapse are listed in Table 3.
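The valuation step can be sketched as a small mapping: graded (qualitative) indexes are converted to the scores 4, 3, 2, and 1, while measured (quantitative) indexes keep their values. The index names, the split into qualitative and quantitative indexes, and the example record below are hypothetical and are not taken from Table 3.

```python
# Scores for grades I-IV as described in the text.
GRADE_TO_SCORE = {"I": 4, "II": 3, "III": 2, "IV": 1}

def quantify(record, qualitative):
    """Map graded indexes to 4..1 and pass measured indexes through as floats."""
    values = {}
    for name, raw in record.items():
        if name in qualitative:
            values[name] = GRADE_TO_SCORE[raw]
        else:
            values[name] = float(raw)
    return values

# Hypothetical record for one collapse site.
record = {"slope_shape": "II", "weathering_degree": "I", "slope_height_m": 42}
qualitative = {"slope_shape", "weathering_degree"}
row = quantify(record, qualitative)
```

Each collapse then becomes one numeric row of the evaluation matrix, with one column per index.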

5.1. Reduction of the Evaluation Index System Based on Information Entropy
In the case of collapse BT05, the information entropy calculation is as follows: . The information entropy values of other indicators are listed in Table 4.

According to the total entropy formula of the valuation matrix, , where and the distinguishing ability of index . Similarly, the distinguishing ability of other indexes can be obtained (Table 5).

The collapse hazard distinguishing abilities of indexes 5, 8, 10, 11, 14, and 15 (Table 5) are small, so these indexes can be considered for exclusion. These six indexes are removed from the evaluation model. Because they are no longer among the most important factors at the time of evaluation, they can instead be set as "threshold" or "critical value" conditions for the preliminary screening of alternatives. Based on the above analysis, the nine remaining indexes form the basis of the evaluation index system of highway collapse risk. The investigation statistics of the index system after reduction are listed in Table 6.
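The reduction step can be sketched as ranking the indexes by distinguishing ability and dropping the weakest ones. No numerical cut-off is stated in the text, so removing a fixed number of lowest-ranked indexes is an assumption, and the ability values below are hypothetical rather than those of Table 5.

```python
def drop_weakest(abilities, n_drop):
    """Return the 1-based numbers of the indexes kept after removing the n_drop weakest."""
    ranked = sorted(range(len(abilities)), key=lambda j: abilities[j])
    dropped = set(ranked[:n_drop])
    return [j + 1 for j in range(len(abilities)) if j not in dropped]

# Hypothetical distinguishing abilities for five indexes.
abilities = [0.30, 0.05, 0.25, 0.02, 0.38]
kept = drop_weakest(abilities, 2)
```

Applied to 15 abilities with `n_drop` set to 6, the same function would return the nine index numbers that are retained.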

5.2. Distance Discriminant Analysis Model for Collapse Hazard Level Discriminant
Of the collapse data, ten samples were used as training data and the remaining five samples were treated as unknown samples to be classified. Nine factors (slope shape, aspect, gradient, and height and exposed structural face, strata lithology, relationship between weakness face and free face, rainfall erosion, and weathering degree of rock) were taken as the discrimination factors for the distance discriminant analysis model. The collapse hazard was divided into four levels (I, II, III, and IV), indicating extremely high, high, moderate, and low risk, respectively.
The I, II, III, and IV collapse risk categories were treated as four different populations, and we assume that the covariance matrices of the four populations are equal. On this basis, a model of distance discriminant analysis was established. The model has nine input nodes, one for each discriminant factor, and four output nodes, corresponding to the four collapse risk levels. The distance discriminant analysis model employed in this study is shown in Figure 1.
5.3. Test of the Discriminant Model
Ten groups of measured data were used as training samples and five groups as discriminant samples, so that 15 samples in total were analyzed with the distance discriminant model. The ten training samples were classified and compared with the actual results (Table 7).
 
Discriminant samples of collapse. 
Ten historical training datasets in the study area were analyzed and compared with the actual collapse risk types. For all ten groups, the discriminant type was the same as the actual collapse risk type, and the error rate was zero. These results show the reliability of the distance discriminant model after training.
Five groups of measured data for the discriminant samples were analyzed using distance discriminant prediction by statistical software, and the prediction results were compared with the actual situation (Table 7), obtained through field investigation and comprehensive analysis. The prediction result is in accordance with observed results, and the accuracy rate is 100%. The results indicate that the risk level prediction and identification method of collapse hazard established in this study is effective.
6. Conclusions
(1) This distance discriminant analysis model for collapse hazard prediction is based on results from previous works and considers factors influencing hazard uncertainty and information entropy.
(2) We have used methods of entropy measurement to reduce the number of indexes, by excluding irrelevant indexes, to achieve the aim of index optimization. The nine principal indexes affecting collapse activity were extracted as the discriminant factors of the distance discriminant analysis model to develop discriminant prediction. These indexes are slope shape, aspect, gradient, and height, along with exposed structural face, strata lithology, relationship between weakness face and free face, vegetation cover rate, and weathering degree of rock.
(3) Results show that the model achieves the aim of index optimization and has good learning performance, a zero error rate, and high prediction accuracy. It is an effective method for the prediction of collapse hazard and provides a new way to evaluate the risk of collapse.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was financially supported by a major program of the National Natural Science Foundation of China (41072223), geological survey projects of the Geological Survey Bureau of China (1212010914205), the Natural Science Foundation of Shaanxi Province (2016JM4001), the Special Fund for Basic Scientific Research of Central Colleges (310827153408 and 310827173702), and the Open Fund for State Key Laboratory of Water Resource Protection and Utilization in Coal Mining (SHJT1630.19). The authors would like to thank their colleagues for their support during this study.
References
[1] Y. Yue, Study on hazard evaluation landslide geological disaster based GIS, Tongji University, Shanghai, China, 2008.
[2] J. M. Liu, Risk Assessment of Collapse Disaster along NanYan Section of Provincial Road S219, China University of Geosciences, Beijing, China, 2010.
[3] S. Li, Study on classification and formation mechanism of highway slope landslide disaster after earthquake, Chang'an University, Xi'an, China, 2011.
[4] H. J. Qiu, Study on the Regional Landslide Characteristic Analysis and Hazard Assessment: A case study of Ningqiang County, Northwest University, Xi'an, China, 2012.
[5] K. Silhan, R. Prokesova, A. Medved'ova, and R. Tichavsky, “The effectiveness of dendrogeomorphic methods for reconstruction of past spatiotemporal landslide behavior,” Catena, vol. 147, pp. 325–333, 2016.
[6] S. Mead, C. Magill, and J. Hilton, “Rain-triggered lahar susceptibility using a shallow landslide and surface erosion model,” Geomorphology, vol. 273, pp. 168–177, 2016.
[7] E. Yalcinkaya, H. Alp, O. Ozel et al., “Near-surface geophysical methods for investigating the Buyukcekmece landslide in Istanbul, Turkey,” Journal of Applied Geophysics, vol. 134, pp. 23–35, 2016.
[8] H. L. Qi, W. P. Tain, and J. C. Li, “Regional risk evaluation of highway collapse induced by continuous rain in Shaanxi province,” Journal of Changan University (Natural Science Edition), vol. 36, no. 3, pp. 7–12, 2016.
[9] W. Xu, G. H. Chen, G. H. Yu et al., “Seismogenic structure of Lushan earthquake and its relationship with Wenchuan earthquake,” Earth Science Frontiers, vol. 20, no. 3, pp. 11–20, 2013.
[10] L. Liu, Study on Landslip Disaster Risk of Chengkun Railway K242-K331, Southwest Jiaotong University, Chengdu, China, 2010.
[11] K. X. Xue, Theory and Application of Risk Assessment for Extreme Rainfall-induced Geological Hazards of Mountain Road, Chongqing University, Chongqing, China, 2011.
[12] K. C. Gao, P. Cui, C. Y. Zhao, and F. Q. Wei, “Landslide hazard evaluation of Wanzhou based on GIS information value method in the three gorges reservoir,” Yanshilixue Yu Gongcheng Xuebao/Chinese Journal of Rock Mechanics and Engineering, vol. 25, no. 5, pp. 991–996, 2006.
[13] H. J. He, S. R. Su, X. J. Wang, and P. Li, “Study and application on comprehensive evaluation model of landslide hazard based on uncertainty measure theory,” Journal of Central South University (Science and Technology), vol. 44, no. 4, pp. 1564–1570, 2013.
[14] X. Liu, Hazard Assessment of Landslide and Collapse induced by Volcanic Eruption in Changbai Mountains, Jilin University, Changchun, China, 2016.
[15] F. Q. Gong and X. B. Li, “Application of distance discriminant analysis method to classification of engineering quality of rock masses,” Chinese Journal of Rock Mechanics and Engineering, vol. 26, no. 1, pp. 190–194, 2007 (Chinese).
[16] F. Q. Gong and X. B. Li, “Distance discriminant analysis to the classification of the grade of shrink and expansion for the expansive soils,” Yantu Gongcheng Xuebao/Chinese Journal of Geotechnical Engineering, vol. 29, no. 3, pp. 463–466, 2007.
[17] W. Q. Li, L. N. Zhang, and W. Q. Meng, “Comprehensive evaluation on MIS based on information entropy and unascertained measure model,” Journal of Hebei Institute of Architectural Science and Technology, vol. 22, no. 1, pp. 49–52, 2005.
[18] W. L. Cai and Y. S. Huang, “Study on supply chain evaluation index reduction based on information entropy research,” Science and Technology Innovation Herald, no. 36, pp. 174–176, 2007.
[19] H. X. Gao, Multivariate statistical analysis, Peking University Press, Beijing, China, 2005.
[20] C. L. Mei and J. C. Fan, Data Analysis Method, Higher Education Press, Beijing, China, 2006.
[21] Z. H. Zhu, F. J. Zhao, and Z. Y. Ye, “Prediction of rock burst in mining based on distance discriminant analysis method,” China Safety Science Journal, vol. 18, no. 3, pp. 41–46, 2008.
[22] A. Iosifidis, A. Tefas, and I. Pitas, “Multidimensional sequence classification based on fuzzy distances and discriminant analysis,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 11, pp. 2564–2575, 2013.
[23] H. J. He and C. X. Qu, “Evaluation of highway landslide hazard based on information entropy and uncertainty measure theory,” Disaster Advances, vol. 7, no. 4, pp. 57–64, 2014.
[24] J. Jia, Q. Ruan, and Y. Jin, “Geometric Preserving Local Fisher Discriminant Analysis for person re-identification,” Neurocomputing, vol. 205, pp. 92–105, 2016.
[25] Z. W. Chen, Z. Q. Chen, and C. W. Yang, “Study and analysis of statistical law of slope collapse based on the Dujiangyan to Yingxiu section of national highway G213,” Highway, vol. 60, no. 11, pp. 51–56, 2015.
[26] H. Q. Guo, L. K. Yao, and S. W. Sun, “Potential disaster prediction of seismic high-locality landslide on line project,” Journal of Harbin Institute of Technology, vol. 49, no. 3, pp. 25–32, 2016.
[27] M. H. Hu, The Risk Assessment of Mountain Collapse in HuaibeiYanqi of Beijing, China University of Geosciences, Beijing, China, 2016.
[28] S. Sorbolini, G. Gaspa, R. Steri et al., “Use of canonical discriminant analysis to study signatures of selection in cattle,” Genetics Selection Evolution, vol. 48, no. 1, pp. 1–13, 2016.
Copyright
Copyright © 2017 Hujun He et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.