Piper-PCA-Fisher Recognition Model of Water Inrush Source: A Case Study of the Jiaozuo Mining Area
Discriminating the source of mine water is important for guiding water prevention and control in mines. To accurately determine the water inrush source in a mine in the Jiaozuo mining area, a Piper trilinear diagram based on hydrochemical experimental data from stratified groundwater sampling in the area was used to select typical water samples. Principal component analysis (PCA) was then used to reduce the dimensionality of the conventional hydrochemical variables and to extract mutually independent variables. The Piper-PCA-Fisher water inrush source recognition model was established by combining the Piper trilinear diagram, PCA, and Fisher discrimination theory. The screened typical samples were used for back-discrimination verification of the model. The results showed that 28 typical water samples from different aquifers were selected through the Piper trilinear diagram as the training set. After PCA was carried out, the first five principal components covered 98.92% of the information in the original data and could effectively represent the original samples. When the 28 groups of training samples were rediscriminated one by one with the Piper-PCA-Fisher model, a correct discrimination rate of 100% was achieved. When 13 further samples were predicted, one water sample was misdiscriminated; hence, the correct prediscrimination rate was 92.3%. Compared with the traditional Fisher water source recognition model, the Piper-PCA-Fisher model established in this study had higher accuracy in both rediscrimination and prediscrimination. Thus, it had a strong ability to discriminate water inrush sources.
1. Introduction
Mine water inrush accidents pose a prominent threat to coal mining and often cause partial or even complete submergence of a mining area. Consequently, mine production slows or stops, which brings about enormous economic loss. The Jiaozuo mining area is a well-known mining area with large water inflow in China and a typical North China-type coal field. Large-volume and frequent water inrush during production results in submerging accidents and increases both the drainage cost and the production cost per ton of coal. Water damage can also lead to major injuries, property loss, and loss of life. The main cause of water inrush in coal mines is that fractured rock channels connect aquifers with mining roadways, so that a large volume of groundwater from the aquifer in the coal seam floor or roof rushes into the roadway.
Water-conducting passages and water-bearing structures concealed ahead of the roadway driving face may easily lead to water bursts in coal mines. Therefore, rapid and accurate recognition of the water inrush source has important practical significance for water prevention and control in coal mines.
Methods of water inrush source recognition used in mines in recent years include hydrochemical characteristic component analysis, isotope analysis, artificial tracing, multivariate statistical analysis, multiclass clustering functions, environmental isotopes [1–12], and other relatively typical analysis methods. Within multivariate statistical analysis, the most widely applied methods are the hierarchical clustering linear discriminant method based on Fisher theory [13–15] and the hydrochemical concentration forecast method based on artificial neural networks [16–18]. A distance discriminant analysis model using six discrimination factors was established from measured data to predict the sources of mine water inrush at the base of the Jiaozuo mining area. As production systems have developed, extensive experimental studies have been carried out in recent years on the application of nonlinear mathematical and statistical analysis methods. For instance, Pen et al. used the fuzzy comprehensive judgment method to recognize water inrush sources in mines. Yang et al. carried out corresponding research on water inrush grading in the Hebi Mining Bureau. Yan et al. used support vector machine theory to analyze the water inrush sources of mines. Xu et al. applied neural networks to water inrush source evaluation in practice, and Zhang et al. applied quantification theory to discriminate mine water inrush sources.
Numerous scholars have carried out considerable research on water inrush source models and applied them successfully in practice. However, present discriminant methods have not considered the complicated information superposition between hydrochemical variables, which causes the established models to misdiscriminate in practical application, so their discrimination accuracy still needs improvement. Therefore, in this paper, a new discriminant method (Piper-PCA-Fisher) is presented. It extracts and compresses the information in the hydrochemical data, transforms the original data into mutually independent new variables without information superposition, and combines them with a mathematical classification method to establish the discriminant model of the water inrush source. In this way, a high-accuracy water source recognition model can be trained. The Piper-PCA-Fisher water inrush source recognition model has important practical value in the Jiaozuo mining area, and it can also provide theoretical guidance for the prevention and control of water damage in other North China coal mines.
2. Description of the Study Area
The Jiaozuo mining area is in the northwestern part of Henan Province, south of the Taihang Mountains and north of the Yellow River. This area has a temperate continental monsoon climate with four distinct seasons and an annual average temperature of approximately 13.2°C. Influenced by meteorological conditions and topographic factors, the annual average rainfall in the northern mountainous area is about 700 mm, while that in the southern piedmont plain is about 600 mm. Rainfall mainly occurs in summer and autumn. The ground elevation of the northwestern mountainous area is 200–1790 m, with undulating relief. The ground elevation of the piedmont alluvial plain in the southeast is 80–200 m. This topography exerts a noticeable control on groundwater runoff.
The strata in the Jiaozuo mining area form an overall monoclinal structure that dips to the southeast and strikes toward the southwest. The dip angle is generally 6°–16° and locally 25°–30°. Faults are well developed in this area, mostly normal faults, and folds are few. The principal fracture is a strong runoff zone of karst groundwater with mature karst development; it intersects secondary faults to form a three-dimensional fracture network that is staggered horizontally and vertically. As a result, groundwater in this area is closely connected in both the vertical and horizontal directions, and multiple aquifers are hydraulically connected to different degrees, which produces complicated hydrogeological conditions (Figure 1).
From top to bottom, the main aquifers can be divided into four categories according to the lithology, thickness, water-bearing features, and burial conditions of the strata. The first category is the Quaternary aquifer made up of Quaternary sandstone, clay, calcareous nodules, and a conglomerate bottom, which is the main aquifer for the segment. The second category is the coal-bearing sandstone aquifer, which consists of sandstone, siltstone, and shale layers with a permeability coefficient of 0.1–0.3 m/d. The third category is the Carboniferous limestone aquifer composed of Taiyuan limestone. The fourth category is the Ordovician limestone aquifer, which is the sedimentary basement of the coal measure strata. The groundwater of the Jiaozuo coal-mining area is accordingly classified as Ordovician limestone aquifer groundwater (type I), Carboniferous limestone aquifer groundwater (type II), coal sandstone aquifer groundwater (type III), and Quaternary aquifer groundwater (type IV).
3. Material and Methods
3.1. Data Analysis of Modeling Experiment
Water samples collected in the study area covered groundwater from the different aquifers in the coal-mining area (Figure 1). In total, 38 groups of water samples were collected from the coal-mining district, including 8 groups from type I, 10 groups from type II, 9 groups from type III, and 11 groups from type IV. Water samples were collected in clean 550 ml plastic bottles for the determination of hydrochemical ions. A further 13 groups of data (numbers A1–A13) were quoted from Wang et al. and were used to test the model. Table 1 shows the results of the water sample analysis. The chemical composition of the water samples was determined in the State Key Laboratory of Hydrology, Henan Polytechnic University, using a Shimadzu CTO-10Avp ion chromatograph and ICP-MS with a relative error of 1%, while HCO3− was determined using dilute sulfuric acid-methyl orange titration.
3.2. Piper Trilinear Diagram Analysis
The hydrochemical composition of groundwater in an aquifer changes as the groundwater moves. Different aquifers generally have different hydrochemical features, and the composition within a single aquifer can also vary from place to place because of varying hydrogeological conditions. However, the chemical components maintain a dynamic equilibrium through a series of physical and chemical reactions, so water samples from the same aquifer should in principle present similar hydrochemical features. In practice, the composition of samples from the same aquifer sometimes varies greatly because the samples are influenced by the hydraulic connection between different aquifers, the movement of groundwater, and so on. Hence, to establish a high-accuracy water inrush source discrimination model, typical water samples that best represent the hydrochemical composition of each aquifer must be identified. The Piper trilinear diagram can be used for this purpose. Water samples that deviated significantly from the formation center in the Piper trilinear diagram were judged abnormal and excluded from this study.
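The screening idea can be sketched numerically. The following is a minimal illustration, not the procedure used in this study (which relied on inspecting the Piper diagram): it converts concentrations in mg/L to milliequivalent percentages, the quantities plotted on the two Piper triangles, and flags samples that lie far from the centroid of their aquifer group. The function names, the sample values, the lumping of Na+ and K+ as Na+, and the distance threshold are all assumptions made for illustration.

```python
import math

# Equivalent weights in g per equivalent (molar mass divided by charge).
# "Na+K" is treated as Na+ here, which is a simplifying assumption.
EQ_WEIGHT = {
    "Ca": 20.04, "Mg": 12.15, "Na+K": 22.99,
    "HCO3": 61.02, "SO4": 48.03, "Cl": 35.45,
}
CATIONS = ("Ca", "Mg", "Na+K")
ANIONS = ("HCO3", "SO4", "Cl")

def piper_fractions(sample_mg_l):
    """Return meq% of each cation (within cations) and anion (within anions)."""
    meq = {ion: sample_mg_l[ion] / EQ_WEIGHT[ion] for ion in EQ_WEIGHT}
    cation_total = sum(meq[i] for i in CATIONS)
    anion_total = sum(meq[i] for i in ANIONS)
    fractions = [100.0 * meq[i] / cation_total for i in CATIONS]
    fractions += [100.0 * meq[i] / anion_total for i in ANIONS]
    return fractions

def flag_atypical(samples, threshold):
    """Indices of samples farther than `threshold` (in meq% units) from the
    group centroid. This numeric rule is a hypothetical stand-in for the
    visual screening on the Piper diagram."""
    points = [piper_fractions(s) for s in samples]
    centroid = [sum(col) / len(points) for col in zip(*points)]
    return [i for i, p in enumerate(points) if math.dist(p, centroid) > threshold]

# Hypothetical samples (mg/L) from one aquifer; the last one is a different water type.
group = [
    {"Ca": 80, "Mg": 24, "Na+K": 12, "HCO3": 300, "SO4": 50, "Cl": 15},
    {"Ca": 78, "Mg": 25, "Na+K": 14, "HCO3": 295, "SO4": 55, "Cl": 16},
    {"Ca": 82, "Mg": 23, "Na+K": 11, "HCO3": 305, "SO4": 48, "Cl": 14},
    {"Ca": 20, "Mg": 5, "Na+K": 150, "HCO3": 120, "SO4": 260, "Cl": 20},
]
atypical = flag_atypical(group, threshold=40.0)
```

Because the centroid is computed with the outlier included, a robust centre (for example, a per-component median) may be preferable when several samples in a group are suspect.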
3.3. Principal Component Analysis
Principal component analysis (PCA) is a multivariate statistical analysis method whose basic idea is dimensionality reduction. Multiple observed variables that originally contain superposed (correlated) information are transformed through an orthogonal transformation into a few mutually uncorrelated aggregate variables that carry the feature information. In this way, multiple correlated variables are simplified with the least possible loss of information.
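As a minimal sketch of this transformation, assuming NumPy is available, the example below performs PCA on standardized variables by eigendecomposition of the correlation matrix and reports the cumulative share of variance explained. The synthetic data, the function name `pca_correlation`, and the 95% retention threshold are illustrative assumptions and do not reproduce the data or settings of this study.

```python
import numpy as np

def pca_correlation(X):
    """PCA on standardized variables (eigendecomposition of the correlation matrix).

    Returns eigenvalues in descending order, loadings (one column per
    component), component scores, and the cumulative explained-variance share.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # z-scores
    R = np.corrcoef(Z, rowvar=False)                  # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)              # R is symmetric
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                              # mutually uncorrelated variables
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, eigvecs, scores, cumulative

# Synthetic stand-in for a 28-sample, 6-variable hydrochemical table.
rng = np.random.default_rng(0)
base = rng.normal(size=(28, 5))
# The sixth variable is strongly correlated with the third, mimicking
# information superposition between two ions.
X = np.column_stack([base, 0.9 * base[:, 2] + rng.normal(scale=0.1, size=28)])

eigvals, loadings, scores, cumulative = pca_correlation(X)
# Smallest number of components whose cumulative share reaches 95% (assumed threshold).
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)
```

The covariance matrix of `scores` is diagonal, which is the sense in which the new variables carry no superposed information.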
3.4. Fisher Mathematical Principle
The basic idea of Fisher discriminant analysis is projection. Specifically, it projects high-dimensional points onto a low-dimensional space and, following the idea of univariate analysis of variance, establishes linear discriminant functions according to the criterion of maximum between-class distance and minimum within-class distance. The class of a sample is then determined by the corresponding criterion. Fisher discriminant analysis can thus avoid the “curse of dimensionality” and solve high-dimensional problems through one-dimensional projections.
3.4.1. Solving Method of Fisher Discriminant Analysis
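Suppose there are $k$ totalities $G_1, G_2, \ldots, G_k$ with corresponding mean values $\mu_1, \mu_2, \ldots, \mu_k$ and covariance matrices $\Sigma_1, \Sigma_2, \ldots, \Sigma_k$, respectively.
Samples with capacity $n_i$ are extracted from totality $G_i$ as follows:
$$x_1^{(i)}, x_2^{(i)}, \ldots, x_{n_i}^{(i)}, \quad i = 1, 2, \ldots, k.$$
Then, $y = a^{\mathrm{T}} x$ is a projection of $x$ on the $a$ axis, where the vector $a = (a_1, a_2, \ldots, a_p)^{\mathrm{T}}$ represents one direction in $p$-dimensional space, and $a^{\mathrm{T}} x$ is the inner product of $a$ and $x$.
The projection of $x_j^{(i)}$ on the $a$ axis is
$$y_j^{(i)} = a^{\mathrm{T}} x_j^{(i)}, \quad j = 1, 2, \ldots, n_i,$$
where $\bar{y}^{(i)} = a^{\mathrm{T}} \bar{x}^{(i)}$ and $\bar{y} = a^{\mathrm{T}} \bar{x}$, and $\bar{x}^{(i)}$ and $\bar{x}$ are the sample mean of $G_i$ and the total sample mean, respectively. Therefore, the inner-group difference is as follows:
$$e = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_j^{(i)} - \bar{y}^{(i)} \right)^2 = a^{\mathrm{T}} E a, \quad E = \sum_{i=1}^{k} S_i,$$
where $S_i = \sum_{j=1}^{n_i} \left( x_j^{(i)} - \bar{x}^{(i)} \right) \left( x_j^{(i)} - \bar{x}^{(i)} \right)^{\mathrm{T}}$ is the sample dispersion matrix of the samples in $G_i$, and the intergroup difference is
$$b = \sum_{i=1}^{k} n_i \left( \bar{y}^{(i)} - \bar{y} \right)^2 = a^{\mathrm{T}} B a, \quad B = \sum_{i=1}^{k} n_i \left( \bar{x}^{(i)} - \bar{x} \right) \left( \bar{x}^{(i)} - \bar{x} \right)^{\mathrm{T}}.$$
The equation is
$$\Delta(a) = \frac{a^{\mathrm{T}} B a}{a^{\mathrm{T}} E a}.$$
To make $\Delta(a)$ the maximum and make the solution unique, $a^{\mathrm{T}} E a = 1$ is set.
Therefore, the problem is transformed into solving for $a$, which causes $a^{\mathrm{T}} B a$ to reach the maximum under $a^{\mathrm{T}} E a = 1$. The Lagrange multiplier method is used, and the following is set:
$$\varphi(a) = a^{\mathrm{T}} B a - \lambda \left( a^{\mathrm{T}} E a - 1 \right).$$
The partial differential of the above equation is solved and set as 0; that is,
$$\frac{\partial \varphi}{\partial a} = 2 \left( B - \lambda E \right) a = 0.$$
Through further arrangement, the following equation is obtained:
$$E^{-1} B a = \lambda a.$$
The equation shows that $\lambda$ should be the maximum eigenvalue of $E^{-1} B$, and $a$ is the eigenvector corresponding to $\lambda$. Hence, the Fisher discriminant function can be solved.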
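The eigenvalue problem above can be sketched in a few lines, assuming NumPy is available. The example builds the within-group and between-group dispersion matrices from labelled samples, solves for the eigenvectors of the product of the inverse within-group matrix and the between-group matrix, and rescales each direction so that the within-group sum of squares of the projected values equals 1. The function name `fisher_directions` and the two-group synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fisher_directions(X, labels):
    """Return eigenvalues (descending) and Fisher directions (as columns)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    E = np.zeros((p, p))  # within-group (inner-group) dispersion
    B = np.zeros((p, p))  # between-group (intergroup) dispersion
    for g in np.unique(labels):
        Xg = X[labels == g]
        mean_g = Xg.mean(axis=0)
        centered = Xg - mean_g
        E += centered.T @ centered
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Eigenvectors of inv(E) @ B; the eigenvalues are real in theory,
    # so any tiny imaginary round-off is discarded.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(E, B))
    eigvals, eigvecs = eigvals.real, eigvecs.real
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Rescale every direction a so that a' E a = 1 (the uniqueness constraint).
    norms = np.sqrt(np.einsum("ij,jk,ki->i", eigvecs.T, E, eigvecs))
    return eigvals, eigvecs / norms

# Two synthetic groups separated along the first variable.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], size=(20, 2)),
    rng.normal(loc=[5.0, 0.0], size=(20, 2)),
])
labels = np.array([0] * 20 + [1] * 20)

eigvals, A = fisher_directions(X, labels)
y = X @ A[:, 0]  # values of the first discriminant function
```

With the constraint applied, the between-group sum of squares of `y` equals the largest eigenvalue, which is exactly the maximized ratio.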
3.4.2. Verification of Fisher Discrimination Effect
To ascertain whether the above criterion was suitable, the back-substitution estimation method was used for rediscrimination to estimate the misdiscrimination rate. For training samples with capacity $n_i$ from totality $G_i$ (where $i = 1, 2, \ldots, k$), all training samples were successively substituted into the established discriminant function, and the corresponding criterion was used for water source recognition. The total misdiscrimination number was $N$, and the misdiscrimination rate through back-substitution estimation was as follows:
$$\hat{\alpha} = \frac{N}{n_1 + n_2 + \cdots + n_k}.$$
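A small sketch of this estimate follows. The function name and the label assignment are hypothetical; only the counts (13 samples, one Quaternary sample assigned to the Ordovician class) follow the prediction case reported in this paper.

```python
def misdiscrimination_rate(true_labels, predicted_labels):
    """Share of samples whose predicted class differs from the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    n_wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return n_wrong / len(true_labels)

# Hypothetical split of 13 samples over the four water types.
true_labels = ["I"] * 3 + ["II"] * 3 + ["III"] * 3 + ["IV"] * 4
predicted_labels = list(true_labels)
predicted_labels[-1] = "I"  # one type IV sample assigned to type I

rate = misdiscrimination_rate(true_labels, predicted_labels)
success_percent = round(100 * (1 - rate), 1)
```

One error in 13 samples gives a misdiscrimination rate of 1/13, that is, a success rate of about 92.3%.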
4. Results and Discussion
4.1. Screening of Typical Water Samples
The Piper trilinear diagram (Figure 2) of the Ordovician water samples showed that samples 2 and 7 deviated significantly from the formation center. Hence, these samples were judged abnormal and excluded from this study. The remaining six groups were selected as typical water samples of the Ordovician aquifer. In these six groups, the contents of the cations Ca2+, Mg2+, and Na+ were comparatively stable. Among the anions, SO42− had a relatively large variation range, whereas the variation ranges of both Cl− and HCO3− were small. The main hydrochemical types of this aquifer were Ca–Mg–HCO3, Ca–Mg–HCO3–SO4, and Na–SO4, because the Ordovician limestone aquifer had favorable runoff and discharge conditions. Under the effect of dissolution and leaching, carbonate minerals, mainly calcite and dolomite, produced a hydrochemical type with Ca2+ and Mg2+ as the main cations and HCO3− as the main anion. Meanwhile, oxidation of pyrite (FeS2) at the bottom of the coal strata in this area lowered the water pH and released SO42− ions, which subsequently entered the Ordovician limestone aquifer and increased its SO42− content.
The Piper trilinear diagram of the Carboniferous limestone water samples (Figure 2) showed that Ca–Mg–HCO3, which is typical of carbonate karst water, was the main hydrochemical type in the Carboniferous limestone aquifer. In addition, samples 9, 10, and 14 were far from the formation center and clearly outside the cluster of this aquifer. Thus, these samples were excluded, and the remaining seven groups were selected as typical water samples of the Carboniferous limestone aquifer.
Water samples numbers 28, 29, and 38 of the Quaternary aquifer (Figure 2) were very far from the formation center and were thus excluded. The remaining eight groups were regarded as typical water samples of the Quaternary aquifer.
Piper trilinear diagram analysis of water samples in the sandstone aquifer (Figure 2) showed that the main hydrochemical types of water samples in this formation were Na–Ca–HCO3, Na–HCO3, Ca–Mg–HCO3, and Na–HCO3–SO4. The sample number 19 was excluded. Clay content was high and underground water flowed slowly, which contributed to the sufficient exchange and accumulation of Na+ in the Permian sandstone aquifer formation. Hydrochemical types centering on Na–HCO3 were formed in underground water. The distributions of Na+, K+, Ca2+, and were uniform. Na+ and K+ ions were noticeably higher than those in other aquifer formations, and the distributions of Mg2+, Cl−, and ions were concentrated.
Therefore, 28 typical water samples (1, 3, 4, 5, 6, 8, 11, 12, 13, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 27, 30, 31, 32, 33, 34, 35, 36, and 37) in Table 1 were finally confirmed as water samples for training the water inrush source recognition model in the next step.
Considering the differences between water sources and the reliability of the data for each component, six components, namely, K+ + Na+, Mg2+, Ca2+, HCO3−, Cl−, and SO42−, were selected and compressed into five principal components through PCA. These principal components were then used as the indexes of the water inrush source recognition model of the mine to establish the Fisher water source recognition model.
4.2. PCA Treatment of Data
PCA was carried out for the sample data in Table 1. The six water source components were clearly correlated; for example, the correlation coefficient between Mg2+ and Ca2+ was 0.901, which indicated noticeable information superposition in the data. PCA treatment of the sample data was therefore necessary to ensure the accuracy of the water inrush source discrimination model. The first five principal components covered about 98.92% of the information in the original data. Therefore, these five principal components could effectively represent the original sample data.
According to the PCA component matrix, each of the five extracted principal components was expressed as a linear combination of the six original variables.
4.3. Fisher Discriminant Analysis
(a) Piper-PCA-Fisher Model. The five principal components obtained through PCA were taken as the input variables of the Fisher discriminant analysis model. Three Fisher discriminant functions (the first, second, and third discriminant functions) were solved, each a linear function of the five principal components.
Table 2 shows the central values of the first, second, and third discriminant functions for each water source. Taking the first discriminant function as an example, the central values of the type I, II, III, and IV water sources were −3.126, −4.828, 7.812, and −0.266, respectively. Water source recognition was implemented by comparing the distances from the discriminant function values of the water sample to be discriminated to the central values of each water source; the sample was assigned to the source with the nearest central values.
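The nearest-centre rule can be sketched as follows. Only the central values of the first discriminant function are quoted in the text, so this illustration uses that single function; the model itself compares all three functions, and the sample scores below are hypothetical values chosen for illustration.

```python
import math

# Central values of the first discriminant function, as quoted in the text.
CENTERS_F1 = {"I": -3.126, "II": -4.828, "III": 7.812, "IV": -0.266}

def nearest_source(score, centers):
    """Return the water type whose centre is closest (Euclidean distance).

    `score` and every centre are sequences holding one value per
    discriminant function, so the same code works for one or three functions.
    """
    return min(centers, key=lambda source: math.dist(score, centers[source]))

# Wrap the single-function centres as 1-element tuples.
centers = {source: (value,) for source, value in CENTERS_F1.items()}

label_a = nearest_source((6.9,), centers)   # hypothetical score
label_b = nearest_source((-0.5,), centers)  # hypothetical score
label_c = nearest_source((-4.5,), centers)  # hypothetical score
```

The centres of types I and II are close on the first function (−3.126 and −4.828), which illustrates why the second and third functions are needed to separate them reliably.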
(b) Fisher Model. The original hydrochemical component data were taken directly as the input variables of the Fisher discriminant analysis model. Four discriminant functions were solved, one for each of types I, II, III, and IV. Each was a linear function of the contents of K+ + Na+, Mg2+, Ca2+, HCO3−, Cl−, and SO42−, and the final item of each discriminant function was a constant.
4.4. Verification of Water Inrush Source Recognition Model
To verify the accuracy of the established PCA-based Fisher water source recognition model, the 28 groups of training samples in Table 1 were rediscriminated one by one. The results showed that all water samples were discriminated correctly, with a correct discrimination rate of 100%. In contrast, the traditional Fisher water source recognition model, established without PCA preprocessing, made multiple errors in rediscrimination, and its correct discrimination rate was less than 90%. Therefore, the Fisher water source recognition model based on PCA was more accurate, was more stable, and could meet the practical requirements of water inrush source recognition.
In practical application, 13 water samples to be discriminated from the Jiaozuo mining area were substituted into the trained Piper-PCA-Fisher water source recognition model (Table 3). Except for Quaternary water sample number 11, which was misdiscriminated as Ordovician water, the predicted classes of all samples agreed with their actual classes, and the prediscrimination success rate was 92.3%. The traditional Fisher water source recognition model, however, misdiscriminated several of these samples, and its prediscrimination success rate was less than 80%. Overall, the Piper-PCA-Fisher water source recognition model was more accurate and more widely applicable than the traditional Fisher model.
4.5. Application of Water Inrush Source Recognition Model
The identification method of water inrush source can be used in coal mines and can be carried out according to the following steps.
Step 1. Collect water sample data and aquifer information from the coal mine and select the representative water samples based on the Piper diagram.
Step 2. Analyze the hydrochemical ion concentration of the representative water samples through PCA and obtain the key principal component data.
Step 3. Use the Fisher classification model to train the key principal component data and establish the water inrush source recognition model of the mine.
Step 4. Test the water sample using the established water inrush source recognition model.
Step 5. Present the identification results of the water inrush source.
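Steps 2 to 5 can be sketched end to end as follows, assuming NumPy is available and that Step 1 has already produced a screened training set. This is an illustrative outline under stated assumptions rather than the implementation used in this study: the function names `fit` and `predict`, the integer class codes 0 to 3 standing in for types I to IV, and the synthetic data are all hypothetical. The Fisher step uses a Cholesky-based symmetric formulation, which gives the same directions as the eigenvectors of the product of the inverse within-group matrix and the between-group matrix.

```python
import numpy as np

def fit(X, y, n_components):
    """Train a PCA + Fisher model with a nearest-centre decision rule."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    # Step 2: PCA on the correlation matrix, keeping the leading components.
    w, V = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    V = V[:, np.argsort(w)[::-1][:n_components]]
    P = Z @ V
    # Step 3: Fisher directions from within-group (E) and between-group (B) dispersion.
    classes = np.unique(y)
    E = np.zeros((n_components, n_components))
    B = np.zeros_like(E)
    for c in classes:
        Pc = P[y == c]
        dev = Pc - Pc.mean(axis=0)
        E += dev.T @ dev
        d = Pc.mean(axis=0) - P.mean(axis=0)
        B += len(Pc) * np.outer(d, d)
    L_inv = np.linalg.inv(np.linalg.cholesky(E))    # E = L L'
    lam, U = np.linalg.eigh(L_inv @ B @ L_inv.T)    # symmetric eigenproblem
    keep = np.argsort(lam)[::-1][: len(classes) - 1]
    A = L_inv.T @ U[:, keep]                        # each column satisfies a' E a = 1
    centers = {c: (P[y == c] @ A).mean(axis=0) for c in classes}
    return {"mu": mu, "sd": sd, "V": V, "A": A, "centers": centers}

def predict(model, X):
    """Steps 4 and 5: score new samples and assign the nearest class centre."""
    Z = (np.asarray(X, dtype=float) - model["mu"]) / model["sd"]
    scores = Z @ model["V"] @ model["A"]
    labels = list(model["centers"])
    dists = np.array([[np.linalg.norm(s - model["centers"][c]) for c in labels]
                      for s in scores])
    return np.array(labels)[dists.argmin(axis=1)]

# Synthetic training set: 4 classes, 7 samples each, 6 variables.
rng = np.random.default_rng(42)
means = np.zeros((4, 6))
means[1, 0] = means[2, 1] = means[3, 2] = 20.0
y_train = np.repeat(np.arange(4), 7)
X_train = means[y_train] + rng.normal(size=(28, 6))

model = fit(X_train, y_train, n_components=5)
resubstitution_accuracy = (predict(model, X_train) == y_train).mean()
new_label = predict(model, means[[2]])[0]
```

New samples are standardized with the training means and standard deviations before projection, so the discriminant scores of training and test samples are on the same scale.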
When the Piper-PCA-Fisher water source recognition model proposed in this study was applied in the Jiaozuo mining area, only one water sample, from the Quaternary water source, was misdiscriminated. The prediscrimination success rate was 92.3%, which indicated that the established water source recognition model was successful. The misdiscrimination of this water sample was primarily caused by the hydraulic connection between the Quaternary and Ordovician aquifers in the mining area.
The model was established with the whole mining area as the study object. Therefore, it might not recognize water inrush sources accurately for an individual mine. To apply the model to a single mine, the hydrogeological conditions of that mine must be studied sufficiently and a complete hydrochemical database of its aquifers must be established.
In this study, the discrimination of mine water inrush sources was based on a limited data set and was therefore affected by the randomness, representativeness, and accuracy of the data. Thus, measured data should be collected extensively and a corresponding training sample database established to enhance the applicability of this model.
5. Conclusions
Stratified sampling and experimental analysis of water quality were carried out based on the hydrogeological conditions of the mining area. The Piper trilinear diagram was then used to analyze and select typical water samples. Finally, the Fisher water source recognition model based on PCA was established. The following conclusions were drawn:
The overall hydrochemical features of the different aquifers were significantly different. Individual water samples from the same aquifer could nevertheless differ noticeably in hydrochemical type, so atypical samples had to be excluded.
Compared with the traditional Fisher water source recognition model, the Piper-PCA-Fisher water source recognition model established in this work had higher accuracy in both rediscrimination and prediscrimination.
The Piper-PCA-Fisher water source recognition model could meet the requirements of water inrush prevention and control in modern mines.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was financially supported by the National Natural Science Foundation of China (Grant 41672240), Science and Technology Key Research Project of the Education Department of Henan, China (nos. 13A170313, 14A510022), China Postdoctoral Science Foundation (2017M612395), Innovation Scientists and Technicians Troop Construction Projects of Henan Province (Grant CXTD2016053), Henan Province’s Technological Innovation Team of Colleges and Universities (Grant 15IRTSTHN027), Fundamental Research Funds for the Universities of Henan Province (NSFRF1611), Scientists and Technicians Projects of Henan Province (Grant 172107000004).
References
P.-H. Huang, J.-S. Chen, and C. Ning, “The analysis of hydrogen and oxygen isotopes in the ground water of Jiaozuo mine area,” Journal of the China Coal Society, vol. 37, no. 5, pp. 770–775, 2012.
G. Y. Pan, S. N. Wang, and X. Y. Sun, “Application of isotopic technique in identification of mine water inrush source,” Mining Safety & Environmental Protection, vol. 2008, no. 4, pp. 7–9, 2009.
Z. Chen, D. Li, and F. Jiang, “The prediction model of ground water inrush from floor in jiaozuo coal mine,” Coal Geology & Exploration, vol. 4, pp. 38–40, 1996.
X. Xu, B. Guo, and G. Z. Wang, “Application of artificial neural network for recognition of multiple water sources in mine,” Journal of Safety Science and Technology, vol. 12, no. 1, pp. 181–185, 2016.
J. Qian, L. Wang, L. Ma, Y. Lu, W. Zhao, and Y. Zhang, “Multivariate statistical analysis of water chemistry in evaluating groundwater geochemical evolution and aquifer connectivity near a large coal mine, Anhui, China,” Environmental Earth Sciences, vol. 75, no. 9, article no. 747, 2016.
P.-H. Huang and J.-S. Chen, “Fisher indentify and mixing model based on multivariate statistical analysis of mine water inrush sources,” Journal of the China Coal Society, vol. 36, no. S1, pp. 131–136, 2011.
J. T. Lu, X. B. Li, and F. Q. Gong, “Recognizing of mine water inrush sources based on principal components analysis and fisher discrimination analysis method,” China Safety Science Journal, vol. 22, no. 7, pp. 109–115, 2012.
H. Chen J, X. B. Li, A. H. Liu et al., “Identifying of mine water inrush sources by Fisher discriminant analysis method,” Journal of Central South University, vol. 40, pp. 1114–1120, 2009.
X.-Y. Wang, T. Xu, and D. Huang, “Application of distance discriminance in identifying water inrush resource in similar coalmine,” Journal of the China Coal Society, vol. 36, no. 8, pp. 1354–1358, 2011.
X. D. Pen, Y. H. Guo, Y. W. Jie et al., “Application and discussion of fuzzy comprehensive evaluation in identifying mine inrush water source,” Mining Safety and Environmental Protection, vol. 33, no. 3, pp. 57–59, 2006.
Y. G. Yang, B. T. Li, and K. Q. Shang, “Fuzzy comprehensive evaluation of water gush and its prediction in coal mines of hebi mining bureau,” Journal of China University of Mining & Technology, vol. 27, no. 2, pp. 204–208, 1998.
Z.-G. Yan, P.-J. Du, and D.-Z. Guo, “SVM models for analysing the headstreams of mine water inrush,” Journal of the China Coal Society, vol. 32, no. 8, pp. 842–847, 2007.
Z. J. Xu, Y. G. Yang, and L. Tang, “Application of BP neural network in evaluation of water source in mine,” in Safety in Coal Mines, pp. 4–6, 2 edition, 2007.
X. L. Zhang, Z. X. Zhang, and S. P. Peng, “Application of the second theory of quantification in identifying gushing water sources of coal mines,” Journal of China University of Mining & Technology, vol. 32, no. 3, pp. 251–254, 2003.
I. M. Farnham, K. J. Stetzenbach, A. K. Singh, and K. H. Johannesson, “Deciphering groundwater flow systems in Oasis Valley, Nevada, using trace element chemistry, multivariate statistics, and geographical information system,” Mathematical Geology, vol. 32, no. 8, pp. 943–968, 2000.