A Support Vector Machine Model with Hyperparameters Optimised by Mind Evolutionary Algorithm for Assessing Permeability of Rock
In this paper, a database of rock permeability was compiled from the existing literature. Based on this database, a Support Vector Machine (SVM) model with hyperparameters optimised by the Mind Evolutionary Algorithm (MEA) was proposed to predict the permeability of rock. Meanwhile, Genetic Algorithm- (GA-) and Particle Swarm Optimisation- (PSO-) SVM models were constructed to compare the improving effect of MEA on the predictive accuracy of machine learning models with those of GA and PSO. The following conclusions were drawn. MEA remarkably increases the predictive accuracy of the constructed machine learning models within a few iterations and has better optimisation performance than GA and PSO. MEA-SVM has the best forecasting performance, followed by PSO-SVM, while GA-SVM is the least accurate. The proposed MEA-SVM model can accurately predict the permeability of rock, indicating that the model has a satisfactory generalization and extrapolation capacity.
Natural geological formations that mainly consist of rock are optimal sites for many engineering projects, such as hydropower stations, oil and gas reservoirs, and coal mines [1–3]. Rock can enhance the safety of engineering projects. For example, rock can prevent the leakage of oil and gas from oil and gas reservoirs. Additionally, for large hydropower stations, rock can prevent water under a high water level from being forced into the rock mass, which would reduce the stability of rock masses. Hence, a sound understanding of the permeability properties of rock is necessary for ensuring the safety and stability of engineering projects in this type of geological environment.
As a porous material, rock contains cracks and pores, which are the main seepage channels for fluid flow. In nature, water inevitably exists in the small pores and cracks of rock [8–10]. For instance, rock in underground oil and gas reservoirs is often subjected to artificial water injection to improve the tightness of the reservoirs, and for the host rock of hydropower stations, water is forced into the pores and cracks of rock under a hydrostatic load [11, 12]. In actual engineering projects, the host rock is under stress. The small cracks and pores in the rock close or open with the changing confining pressure and seepage pressure, significantly affecting the flow and distribution of the water inside the rock, as well as the permeability of the rock. To provide comprehensive guidance for the construction and operation of engineering projects involving rock, it is worthwhile to investigate how the permeability of rock changes.
According to the results of physical experiments, empirical equations have been proposed to predict the permeability of rock [14–16]. Although these empirical equations have been used with success in some cases, they are confined to a narrow application range and have low forecasting accuracy owing to the small number of input parameters [17–19]. In recent years, powerful machine-learning algorithms have often been utilized to estimate the properties of rock [20–27]. However, few studies have applied machine-learning techniques to predicting the permeability of rock [28–32].
In general, machine learning models that are not combined with optimisation algorithms are inefficient [22, 33, 34]. Hence, optimisation algorithms such as the Genetic Algorithm (GA) and the Particle Swarm Optimisation (PSO) algorithm have been applied by some researchers to optimise the initial parameters of machine learning models for evaluating the permeability of rock, and the resulting increase in both the predictive accuracy and the convergence speed of the constructed models has been demonstrated [35, 36]. However, GA and PSO still have some inherent drawbacks. For example, their computational efficiency is low, with long operational times, and they cannot guarantee that the obtained result is the global optimum [37, 38], which limits their optimisation effects. To address these shortcomings, many studies have been carried out, among which is the Mind Evolutionary Algorithm (MEA), proposed by Chengyi et al. to overcome the aforementioned defects of GA and PSO to some extent and improve the optimisation effects [40, 41]. The superior performance of MEA over GA and PSO in increasing the predictive accuracy of machine learning models has also been demonstrated by researchers in the engineering field [37, 42, 43]. However, to the best of the authors' knowledge, the application of MEA to improve the performance of machine learning models for predicting the permeability of rock has not yet been reported.
In this study, a database was developed based on data collected from the existing literature. Using this database, an SVM model with hyperparameters optimised by MEA was established to predict the permeability of rock. Meanwhile, SVM models with hyperparameters optimised by GA and PSO, respectively, were constructed to compare the improving effect of MEA on predictive accuracy with those of GA and PSO.
2. Database and Preprocessing Data
The database was developed based on the data collected from the existing literature. It consists of 616 groups of data, which were divided randomly into 493 groups of training data (80%) for training the machine-learning models and 123 groups of testing data (20%) for testing the trained models. In this program, each data group was randomly assigned a unique number ranging from 1 to 616. The data groups numbered 1 to 493 were selected as the training dataset, and the data groups numbered 494 to 616 were selected as the testing dataset. In each data group, pore pressure (Ps), confining pressure (Pc), moisture saturation (S), and the confining pressure loading or unloading state (L) in the permeability tests were taken as the input parameters, and the corresponding permeability (k) of compact rock was taken as the output parameter. The statistical parameters and data types of the input and output parameters are tabulated in Table 1. The data distribution of the established database for each input parameter is presented in Figure 1, where the x-axis represents the value of the input parameter and the y-axis represents the number of data groups in the database corresponding to that value. As shown in Figure 1, the distribution of data is uniform, which avoids the adverse influence of an uneven data distribution on the predictive performance of the built machine learning models.
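The paper does not give its splitting code; as a minimal sketch of the random 80/20 partition of the 616 numbered data groups (with hypothetical variable names and seed), it could look like:

```python
import random

# Shuffle the 616 group numbers, then take the first 493 (80%) for
# training and the remaining 123 (20%) for testing.
random.seed(42)  # hypothetical seed, for reproducibility only
group_ids = list(range(1, 617))
random.shuffle(group_ids)

train_ids = group_ids[:493]  # training dataset
test_ids = group_ids[493:]   # testing dataset
```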
The input parameters for the machine learning models had different dimensions, which may affect the training time and prediction accuracy of the models. To improve the forecasting accuracy and operational efficiency of the machine learning models, the input and output parameters were normalized to the range of 0-1 using the following equation:

$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$

where $x_{\mathrm{norm}}$ represents the normalized value, x represents the original value, $x_{\min}$ represents the minimum value, and $x_{\max}$ represents the maximum value.
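The min-max normalization described above can be sketched as follows (a hypothetical helper, not the authors' code):

```python
def min_max_normalize(values):
    """Scale a sequence of numbers linearly to the range [0, 1]."""
    x_min, x_max = min(values), max(values)
    return [(x - x_min) / (x_max - x_min) for x in values]
```

For example, `min_max_normalize([2, 4, 6])` maps the minimum to 0, the maximum to 1, and intermediate values proportionally.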
3. Machine-Learning Algorithm and Optimisation Algorithms
SVM is a binary predictive model that divides and predicts sample data to achieve structural risk minimization according to the maximum margin principle. In this case, the radial basis function was selected as the kernel function of the SVM model, with the penalty parameter c and the parameter g of the kernel function being optimised.
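The radial basis function kernel has the standard form K(x, z) = exp(−g‖x − z‖²), where g is the kernel parameter tuned alongside the penalty parameter c. A plain-Python sketch (hypothetical helper name):

```python
import math

def rbf_kernel(x, z, g):
    """Radial basis function kernel: K(x, z) = exp(-g * ||x - z||^2).
    g is the kernel parameter optimised together with the penalty c."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)
```

Identical inputs give a kernel value of 1, and the value decays towards 0 as the inputs move apart, faster for larger g.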
GA introduces the biological evolutionary principle of “survival of the fittest” into a coded population formed according to the optimisation parameters. Individuals are selected from the population according to their fitness values through the operations of selection, crossover, and mutation, such that individuals with high fitness values are retained and individuals with low fitness values are eliminated. The new generation of individuals inherits information from the previous generation and is superior to it. This process is repeated until the predetermined termination criterion is satisfied.
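The selection-crossover-mutation loop described above can be sketched as a minimal real-coded GA (a hypothetical simplification with tournament selection and blend crossover; in the paper the fitness would call a cross-validated SVM):

```python
import random

def ga_minimize(fitness, bounds, pop_size=30, generations=60,
                mutation_rate=0.1, seed=0):
    """Minimal GA sketch: tournament selection, arithmetic crossover,
    and gaussian mutation clipped to the parameter bounds."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    pop = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Tournament of two: the fitter individual survives.
            a, b = rng.choice(pop), rng.choice(pop)
            return a if fitness(a) < fitness(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            alpha = rng.random()
            # Crossover: blend the two parents.
            child = [alpha * x + (1 - alpha) * y for x, y in zip(p1, p2)]
            # Mutation: occasional gaussian perturbation inside the bounds.
            child = [min(max(g + rng.gauss(0, 0.05 * (hi - lo)), lo), hi)
                     if rng.random() < mutation_rate else g
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = children
    return min(pop, key=fitness)
```

Run on a toy 1D quadratic, the population contracts around the minimum over successive generations.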
PSO is inspired by the flocking behaviour of birds. In PSO, there is a population of particles. Each particle represents a potential solution to the target problem and corresponds to a fitness value determined by the fitness function. The particle's velocity determines the direction and distance of its movement and is dynamically adjusted according to the previous movement of the particle itself and of other particles, so as to optimise individual solutions in the solution space.
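The velocity update described above can be sketched as a minimal PSO (a hypothetical simplification with conventional inertia and acceleration coefficients, not the authors' implementation):

```python
import random

def pso_minimize(fitness, bounds, n_particles=20, iterations=50, seed=0):
    """Minimal PSO sketch: each particle is pulled towards its own best
    position (pbest) and the swarm's best position (gbest)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=fitness)

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            if fitness(pos[i]) < fitness(pbest[i]):
                pbest[i] = pos[i][:]
        gbest = min(pbest + [gbest], key=fitness)
    return gbest
```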
4. Hyperparameters Optimisation
In this research, MEA was adopted to optimise the hyperparameters of the constructed SVM model, with the Root-Mean-Square Error (RMSE) as the fitness function, as expressed in equation (2):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (2)$$

where n is the number of sample data, $y_i$ is the measured value, and $\hat{y}_i$ is the predicted value. MEA is a new heuristic evolutionary intelligence algorithm, and previous researchers have demonstrated that MEA is more capable of improving the predictive performance of machine learning models than other evolutionary intelligence algorithms [42, 49]. Compared with traditional intelligence algorithms such as GA and PSO, MEA has several main advantages: (i) its computational efficiency is high because the similartaxis and dissimilation operations are computed in parallel; (ii) MEA can retain evolutionary information from more than one generation, which provides beneficial guidance on the directions of the similartaxis and dissimilation operations; (iii) the similartaxis and dissimilation operations in MEA avoid damaging the original information of individuals.
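The RMSE fitness function of equation (2) can be written directly (a hypothetical helper; in the paper it would be evaluated on SVM predictions):

```python
import math

def rmse(measured, predicted):
    """Root-mean-square error, used as the MEA fitness function."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2
                         for m, p in zip(measured, predicted)) / n)
```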
The specific procedure for adopting MEA to optimise the SVM models is as follows: (1) randomly generate individuals composed of different hyperparameter values in the solution space; (2) score the individuals by the fitness values (RMSE) obtained by calling the corresponding SVM model, taking the individuals with low RMSE as superior individuals and the others, with high RMSE, as temporary individuals; (3) take the superior and temporary individuals as centres and generate new individuals around each centre to obtain superior subgroups and temporary subgroups, respectively; (4) perform similartaxis operations within each subgroup until the subgroup is mature (the RMSE of the subgroup remains unchanged for six consecutive iterations), taking the RMSE of the optimal (centre) individual of each subgroup as the RMSE of that subgroup; (5) when the subgroups are mature, post the RMSE of each subgroup on the global bulletin board, as shown in Figure 2, and conduct dissimilation operations between the superior and temporary subgroups, including replacing or abandoning subgroups, releasing the individuals of abandoned subgroups, and supplying new subgroups; (6) carry out similartaxis operations in the newly supplied subgroups, and repeat Steps (4) and (5) until the RMSE of each newly supplied subgroup is higher than those of the superior subgroups; (7) take the centre individual of the superior subgroup with the lowest RMSE as the global superior individual, and assign its hyperparameter values as the initial hyperparameter values of the established SVM model; and (8) train the built SVM model and conduct the prediction. The detailed optimisation process is shown in Figure 3.
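The similartaxis/dissimilation loop can be sketched as a heavily simplified, hypothetical MEA (fixed iteration counts instead of the paper's maturity criterion; the fitness would call a cross-validated SVM):

```python
import random

def mea_minimize(fitness, bounds, n_superior=3, n_temporary=3,
                 subgroup_size=30, iterations=20, seed=0):
    """Minimal MEA sketch: local search (similartaxis) inside subgroups,
    then competition between superior and temporary subgroups
    (dissimilation)."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def similartaxis(centre):
        # Generate a subgroup around the centre; keep the best individual.
        best = centre
        for _ in range(subgroup_size):
            cand = [min(max(c + rng.gauss(0, 0.1 * (hi - lo)), lo), hi)
                    for c, (lo, hi) in zip(centre, bounds)]
            if fitness(cand) < fitness(best):
                best = cand
        return best

    # Initial scattering: rank individuals into superior and temporary sets.
    pool = sorted((sample() for _ in range(n_superior + n_temporary)),
                  key=fitness)
    superior, temporary = pool[:n_superior], pool[n_superior:]

    for _ in range(iterations):
        superior = [similartaxis(c) for c in superior]
        temporary = [similartaxis(c) for c in temporary]
        # Dissimilation: temporary subgroups that beat a superior one
        # replace it; the losers are abandoned and re-seeded.
        merged = sorted(superior + temporary, key=fitness)
        superior = merged[:n_superior]
        temporary = [sample() for _ in range(n_temporary)]

    return min(superior, key=fitness)
```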
In this case, the population size was set to 300, and the numbers of superior subgroups and temporary subgroups were the same, set to 3. The size of each subgroup is 30, and the maximum number of iterations is 20. Additionally, the fitness value (RMSE) of individuals in MEA was obtained by 10-fold cross-validation (k-CV) on the corresponding SVM model during hyperparameter optimisation. To compare the optimisation effects of MEA with those of traditional optimisation algorithms, SVM models with hyperparameters optimised by GA and PSO were also constructed, and the predictive results of the GA- and PSO-SVM models were compared with those of MEA-SVM. Details of the optimised hyperparameters of the built SVM models and their optimisation ranges are provided in Table 2.
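The 10-fold split underlying the cross-validated fitness can be sketched as follows (a hypothetical index helper; each fold serves once as the validation set while the rest train the SVM):

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices 0..n_samples-1 into k roughly equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

The cross-validated fitness of a hyperparameter candidate would then be the average RMSE over the k validation folds.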
5. Quality Assessment
The predictive precision of the established machine-learning models was assessed using two indices: RMSE and the Mean Absolute Percentage Error (MAPE), as expressed in equations (2) and (3), respectively:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \qquad (3)$$

where n represents the number of sample data, $y_i$ represents the measured value, and $\hat{y}_i$ represents the predicted value.
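MAPE, the second assessment index, can be written directly (a hypothetical helper mirroring the standard definition):

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no measured value is zero."""
    n = len(measured)
    return 100.0 / n * sum(abs((m - p) / m)
                           for m, p in zip(measured, predicted))
```

For example, predictions of 110 and 180 against measured values of 100 and 200 are each 10% off, giving a MAPE of 10%.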
6. Results and Analysis
6.1. Results of Hyperparameter Optimisation
As aforementioned, MEA was utilized to optimise the hyperparameters of the machine learning algorithms, with RMSE as the fitness function. For each machine learning algorithm, three initial superior subgroups and three initial temporary subgroups were generated. The optimisation process of the subgroups during the similartaxis and dissimilation operations was recorded, as shown in Figure 4.
Based on Figure 4, for the initial superior and temporary subgroups, after several similartaxis operations, the RMSE of each subgroup tends to become steady, which indicates that the subgroups are mature. Then, the dissimilation operations were conducted. In the dissimilation operation, the RMSE of the temporary subgroups and superior subgroups was compared, and superior subgroups were replaced by temporary subgroups with lower RMSE. The remaining temporary subgroups with high RMSE were abandoned, and the individuals in them were released. The released individuals were then regrouped to form new temporary subgroups, and the similartaxis operations were performed again on the subgroups, as shown in Figure 5. The RMSE of the new superior subgroups remains stable because they are already mature, as shown in Figure 5. Comparing the RMSE of the new temporary subgroups with that of the new superior subgroups shows that the RMSE of each superior subgroup is lower than that of the temporary subgroups, which meets the end criterion. Therefore, the dissimilation operations need not be performed again, and the hyperparameters of the centre individual in the superior subgroup with the lowest RMSE were assigned as the initial parameters of the corresponding machine learning algorithms.
Based on the aforementioned analysis, the RMSE of most subgroups decreases markedly and becomes stable within 15 iterations. This indicates that MEA was efficient in the hyperparameter optimisation of the established SVM models and can remarkably enhance their predictive accuracy with high efficiency.
To compare the optimisation effects of MEA with those of GA and PSO, the optimisation processes of the SVM model using GA and PSO are shown in Figure 5.
According to Figures 4 and 5, when MEA was adopted to optimise the SVM model, the RMSE becomes stable within 18 iterations, markedly fewer than for GA (80 iterations) and PSO (70 iterations). Additionally, after optimisation by MEA, the RMSE of the SVM model falls to 4.69, whereas for GA and PSO it is 8.52 and 7.99, respectively. This demonstrates that the optimisation performance of MEA on the SVM model is better than that of GA and PSO, in terms of both optimisation efficiency and the improvement in predictive accuracy.
6.2. The Prediction on the Training Dataset
According to Figures 6–8, for the training dataset, the forecasting performance of the SVM model with hyperparameters optimised by MEA is the best among the constructed models in terms of the statistical metrics RMSE and MAPE. More specifically, the MEA-SVM model achieved the lowest RMSE (4.61) and MAPE (9.15%) among the models, followed by PSO-SVM, with GA-SVM having the poorest predictive accuracy.
6.3. The Prediction on the Testing Dataset
As shown in Figures 7–9, for the testing dataset, the SVM model with hyperparameters optimised by MEA still has the highest prediction precision among the models, with the lowest RMSE (6.69) and MAPE (11.07%), followed by PSO-SVM. The predictive performance of the GA-SVM model is the worst among them.
These results indicate that, for both the training and testing datasets, the MEA-SVM model has the highest precision in estimating the permeability of compact rock, and that MEA improves the estimating accuracy of the machine learning model more than GA and PSO do.
7. Engineering Application
Based on the foregoing analysis, the reliability of the established MEA-SVM model for predicting the permeability of rock has been validated. To further evaluate the generalization and extrapolation capacity of the model for predicting the permeability of rock from other engineering sites, the established MEA-SVM model was adopted to estimate the permeability of another type of rock. This was performed by inputting the confinement pressure, seepage pressure, and moisture saturation of the rock into the developed MEA-SVM model. In the existing literature, researchers have investigated the effects of confinement pressure, seepage pressure, and moisture saturation on the permeability of rock.
The permeability measured by other researchers in the existing literature was compared with the permeability predicted using the proposed MEA-SVM model to validate its applicability to the prediction of the permeability of another type of rock. The predicted and measured permeability are presented in Figure 10. To evaluate the forecasting performance of the established MEA-SVM model, its prediction precision was assessed according to MAPE and RMSE, as shown in Figure 10.
As indicated in Figure 10, the permeability predicted using the proposed MEA-SVM model matches the permeability measured by other researchers well, with an RMSE of 4.00 and a MAPE of 10.14%. This indicates that the proposed MEA-SVM model can accurately predict the permeability of another type of rock, with a satisfactory generalization and extrapolation capacity.
8. Conclusions
In this paper, based on a database collected from the existing literature, an SVM model with hyperparameters optimised by MEA was proposed to predict the permeability of rock. Meanwhile, GA- and PSO-SVM models were constructed to compare the improving effect of MEA on predictive accuracy with those of GA and PSO. The following conclusions were drawn. (1) MEA remarkably increases the predictive accuracy of the constructed machine learning models within a few iterations and has better optimisation performance than GA and PSO. (2) For both the training and testing datasets, MEA-SVM has the best forecasting performance in terms of the statistical criteria RMSE and MAPE, followed by PSO-SVM, while GA-SVM is the least accurate. (3) The proposed MEA-SVM model can accurately predict the permeability of another type of rock, indicating that the model has a satisfactory generalization and extrapolation capacity.
Data Availability
All data used to support the findings of this study are derived from public-domain resources and are available within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
Zhiming Chao and Guotao Ma are the joint corresponding authors.
References
W. Liu, Y. Li, C. Yang, J. J. K. Daemen, Y. Yang, and G. Zhang, “Permeability characteristics of mudstone cap rock and interlayers in bedded salt formations and tightness assessment for underground gas storage caverns,” Engineering Geology, vol. 193, pp. 212–223, 2015.
X. Liu, C. A. Tang, L. Li, and P. Lv, “Microseismic monitoring and stability analysis of the right bank slope at Dagangshan hydropower station after the initial impoundment,” International Journal of Rock Mechanics and Mining Sciences, vol. 108, pp. 128–141, 2018.
B. Moradi, P. Pourafshary, F. Jalali, M. Mohammadi, and M. A. Emadi, “Experimental study of water-based nanofluid alternating gas injection as a novel enhanced oil-recovery method in oil-wet carbonate reservoirs,” Journal of Natural Gas Science and Engineering, vol. 27, pp. 64–73, 2015.
A. Weller, L. Slater, A. Binley, S. Nordsiek, and S. Xu, “Permeability prediction based on induced polarization: insights from measurements on sandstone and unconsolidated samples spanning a wide permeability range,” Geophysics, vol. 80, no. 2, pp. D161–D173, 2015.
J.-Q. Shi and S. Durucan, “Near-exponential relationship between effective stress and permeability of porous rocks revealed in Gangi’s phenomenological models and application to gas shales,” International Journal of Coal Geology, vol. 154-155, pp. 111–122, 2016.
W. Chen, M. Hasanipanah, H. N. Rad, D. J. Armaghani, and M. Tahir, “A new design of evolutionary hybrid optimization of SVR model in predicting the blast-induced ground vibration,” Engineering with Computers, vol. 14, pp. 1–17, 2019.
E. Ebrahimi, M. Monjezi, M. R. Khalesi, and D. J. Armaghani, “Prediction and optimization of back-break and rock fragmentation using an artificial neural network and a bee colony algorithm,” Bulletin of Engineering Geology and the Environment, vol. 75, no. 1, pp. 27–36, 2016.
K. O. Akande, T. O. Owolabi, S. O. Olatunji, and A. AbdulRaheem, “A hybrid particle swarm optimization and support vector regression model for modelling permeability prediction of hydrocarbon reservoir,” Journal of Petroleum Science and Engineering, vol. 150, pp. 43–53, 2017.
X. Shi, J. Wang, G. Liu, L. Yang, X. Ge, and S. Jiang, “Application of extreme learning machine and neural networks in total organic carbon content prediction in organic shale with wire line logs,” Journal of Natural Gas Science and Engineering, vol. 33, pp. 687–702, 2016.
A. Saghatforoush, M. Monjezi, R. Shirani Faradonbeh, and D. Jahed Armaghani, “Combination of neural network and ant colony optimization algorithms for prediction and optimization of flyrock and back-break induced by blasting,” Engineering with Computers, vol. 32, no. 2, pp. 255–266, 2016.
K. Xie, Y. Du, and C. Sun, “Application of the mind-evolution-based machine learning in mixture-ratio calculation of raw materials cement,” in Proceedings of the 3rd World Congress on Intelligent Control and Automation (Cat. No. 00EX393), Hefei, China, July 2000.
I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, Burlington, MA, USA, 2016.
C. R. Houck, J. Joines, and M. G. Kay, “A genetic algorithm for function optimization: a Matlab implementation,” NCSU-IE TR, vol. 95, no. 09, pp. 1–10, 1995.
R. Eberhart and J. Kennedy, “Particle swarm optimization,” in Proceedings of the IEEE International Conference on Neural Networks, Perth, Western Australia, December 1995.
R. V. Hogg, J. McKean, and A. T. Craig, Introduction to Mathematical Statistics, Pearson Education, London, UK, 2005.