Abstract

Porosity is an important parameter for oil and gas storage and reflects the geological characteristics of different historical periods. Logging parameters recorded from deep to shallow strata capture the stratigraphic sedimentary characteristics of different geological periods, so there is a strong nonlinear mapping relationship between porosity and logging parameters. Making full use of logging parameters to predict the shale content and porosity of a reservoir is therefore very important for precise reservoir description. Deep neural networks have a strong ability to mine data structure and have been applied to shale content prediction in recent years. The gated recurrent unit (GRU) neural network has a further advantage in processing sequential data. Therefore, this study proposes a method to predict porosity from multiple logging parameters based on the GRU neural network. First, a correlation measure based on the Copula function is used to select the logging parameters most relevant to porosity. Then, the GRU neural network is used to identify the nonlinear mapping relationship between logging data and porosity. Application results from an exploration area of the Ordos Basin show that the method outperforms multiple regression analysis and the recurrent neural network method, which indicates that the GRU neural network is more effective in predicting reservoir parameters such as porosity.

1. Introduction

Porosity is an important physical property parameter reflecting the storage capacity of a reservoir. Accurate calculation of reservoir porosity is a key task in geological interpretation and in oil exploration and development. Each logging parameter carries porosity information to a varying degree, and the relationship between porosity and logging parameters is a typical multiparameter nonlinear mapping. Making full use of the available logging parameters to predict porosity comprehensively is therefore of great significance to oil and gas exploration and development.

Reservoir porosity is affected by many geological factors, such as burial depth, structural location, sedimentary environment, lithology change, and diagenetic degree. From the perspective of rock geophysics, there is a typical nonlinear relationship between reservoir porosity and logging parameters [1, 2]. Bakhorji et al. consider the porosity obtained by petrophysical analysis of core samples to be the most accurate [3]. Since then, researchers have carried out a great deal of related work: Tao et al. [4] reconstructed the pore-fracture system of different macrolithotypes, Tao et al. [5] constructed a continuous distribution model of pore space for coal reservoirs, and Tao et al. [6] determined the pore and fracture system by using the low-field nuclear magnetic resonance technique. However, the cost of sampling and testing is too high for large-scale industrial application. In conventional logging interpretation, the quantitative calculation of porosity usually adopts the theoretical porosity models of density, acoustic time difference, and compensated neutron logging, or establishes a regional empirical porosity model combined with core analysis [7–9]. However, in terms of the interpretation model, parameter selection, and mathematical processing method, it is difficult to establish a good mapping relationship between the calculated porosity and core analysis data, and the results are greatly affected by human factors. Nuclear magnetic resonance (NMR) porosity logging is largely unaffected by the rock skeleton and only detects the signal of the fluid contained in pores. Therefore, this method has a strong advantage in predicting formation porosity [10], but due to the limitation of equipment and cost, it cannot cover the logging work of the whole research area.
In other words, if we can make full use of various logging parameters to comprehensively model and predict porosity, we can avoid not only the errors caused by subjective human factors but also the use of expensive special methods such as petrophysical experiments and nuclear magnetic resonance techniques. In short, a porosity prediction model built on logging parameters is expected to provide accurate reservoir porosity information with high efficiency and at low cost.

At present, some conventional machine learning algorithms have been applied to reservoir evaluation parameter prediction, such as the BP neural network [11–14], support vector machine [15, 16], and other shallow machine learning algorithms [17–21]. However, shallow machine learning methods often require manually extracted feature parameters, which demands strong domain knowledge and experience. Moreover, with limited samples the ability of shallow machine learning to represent complex functions is limited, and so is its generalization ability for complex nonlinear problems [22]. The traditional BP neural network also suffers from slow convergence and a tendency to fall into local optima. The most important difference between deep learning and shallow learning is the increased depth of the network structure. A deep neural network usually has more than three hidden layers. Through layer-by-layer feature extraction and transformation, the samples are transformed from the original feature space into a new high-dimensional feature space for representation and description, thus simplifying classification or regression problems and improving their prediction accuracy. Hinton et al. [23] revealed that the greatest value of neural networks lies in automatic feature extraction, which avoids the tedious work of manual feature extraction and can automatically find complex and effective high-order features.

At present, the commonly used deep learning methods mainly include the convolutional neural network (CNN), recurrent neural network (RNN), and stacked autoencoder (SAE). These methods have been successfully applied in the fields of image processing and speech recognition [24–26]. Compared with shallow machine learning methods, deep learning methods have higher prediction accuracy. Because the sedimentary process of strata is gradual in time, and porosity is a response to the sedimentary characteristics of strata, porosity has certain time series characteristics. When a machine learning or deep learning method that does not model sequences is used to predict physical property parameters, it is easy to ignore the variation trend of porosity with reservoir depth and the correlation between historical data of different formation parameters. The recurrent neural network (RNN) is a typical deep neural network structure. Compared with fully connected neural networks, the biggest difference is that the hidden layer units are not independent of each other: each hidden layer unit is related not only to the current input but also to the sequential input received before the current time step. This feature is of great help in processing time series data. The long short-term memory (LSTM) network is an improvement of the conventional RNN. It solves the gradient vanishing problem of the conventional RNN through a carefully designed recurrent structure and is one of the most successful recurrent neural networks. LSTM is very suitable for solving time series problems, but it still has some drawbacks, such as a complex network structure, many training parameters, and slow convergence during training.
The gated recurrent unit (GRU) neural network is an optimization of the LSTM network. It has the same function as the LSTM network but converges faster. The GRU network has been applied in the fields of power, transportation, and finance [27–30], but it has not been used to predict reservoir porosity.

In summary, this study uses the GRU neural network to predict reservoir porosity on the basis of conventional logging parameters. First, correlation analysis (CA) based on the Copula function is used to quantitatively calculate the nonlinear correlation between logging curves and porosity, and the logging parameters with a higher degree of correlation with porosity are selected. Then, based on the selected logging parameters, a nonlinear mapping model between logging data and porosity is established by using the GRU neural network. In addition, in order to demonstrate the advantages of the CA-GRU model in sequential porosity prediction, RNN, GRU, and MLR models are established as comparison models. Finally, the proposed model is tested on actual data to verify its prediction accuracy and robustness.

2. Theory and Methodology

2.1. Correlation Analysis

Logging curves and porosity parameters reflect the characteristics of strata at different depths. There is a certain correlation between porosity and the logging curves, but the measured data usually contain a variety of parameters that reflect different formation information from different angles. In practical application, if all sample data are used directly to establish the mapping model between logging curves and porosity, the model becomes more complex, and useful information may be lost or redundant information added, which reduces the prediction accuracy. When physical parameters need to be predicted, it is therefore necessary to consider the influence of different logging curves on the prediction accuracy. For example, linear correlation analysis can be used to select reliable, representative, and sensitive logging curves as the modeling input, and the Pearson linear correlation coefficient is often used for this purpose [13]. However, Pearson correlation analysis only captures linear correlation and often ignores the nonlinear relationship between porosity and logging curves. Therefore, when the relationship between logging data and the predicted parameters is nonlinear, the linear correlation coefficient is not a reliable measure of correlation. If the Copula function is used to analyze the correlation between logging data and the predicted parameters, the adverse effect of nonlinear correlation between parameters on the analysis can be reduced to a certain extent. Based on the Copula function and its derived correlation indices, the nonlinear and asymmetric correlation between logging curves and the predicted physical parameters can be measured accurately.
Therefore, the Kendall rank correlation coefficient τ and the Spearman rank correlation coefficient ρ based on the Copula function are used to quantitatively analyze the correlation between logging curves and porosity. The Kendall rank correlation coefficient τ measures the degree of consistent change between logging parameters and porosity, and the Spearman rank correlation coefficient ρ measures the degree of monotonic correlation between logging curves and porosity. The results are compared with those calculated by the Pearson linear correlation coefficient.

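Copula function theory can accurately describe nonlinear and asymmetric correlation between variables. The details are as follows: suppose that the marginal probability distribution functions of an N-dimensional joint distribution function $H$ are $F_1(x_1), F_2(x_2), \ldots, F_N(x_N)$, respectively, where $(x_1, x_2, \ldots, x_N)$ is an N-dimensional random variable. Then there is a Copula function $C$ which satisfies the following condition:

$$H(x_1, x_2, \ldots, x_N) = C\big(F_1(x_1), F_2(x_2), \ldots, F_N(x_N)\big), \tag{1}$$

where the N-dimensional function $C$ is defined as follows [31, 32]:

$$C(u_1, \ldots, u_N; \rho, k) = \int_{-\infty}^{t_k^{-1}(u_1)} \cdots \int_{-\infty}^{t_k^{-1}(u_N)} \frac{\Gamma\left(\frac{k+N}{2}\right)}{\Gamma\left(\frac{k}{2}\right) (\pi k)^{N/2}\, |\rho|^{1/2}} \left(1 + \frac{1}{k}\, x^{T} \rho^{-1} x\right)^{-\frac{k+N}{2}} dx_1 \cdots dx_N, \tag{2}$$

where $\rho$ is the N-order symmetric positive definite matrix whose diagonal elements are all 1, $|\rho|$ is the determinant of matrix $\rho$, $\Gamma(\cdot)$ is the gamma function, $k$ is the degree of freedom of the function with N variables, $t_k^{-1}(\cdot)$ is the inverse function of the univariate t distribution with $k$ degrees of freedom, and $x = (x_1, \ldots, x_N)^{T}$ is the input independent variable.

The Kendall rank correlation coefficient $\tau$ is used to measure the degree of consistent change between x and y. Suppose $(x_1, y_1)$ and $(x_2, y_2)$ are independent and identically distributed vectors, $x_1, x_2 \in x$, $y_1, y_2 \in y$. Then, there is

$$\tau = P\big[(x_1 - x_2)(y_1 - y_2) > 0\big] - P\big[(x_1 - x_2)(y_1 - y_2) < 0\big], \tag{3}$$

where $P[\cdot]$ denotes probability. After deriving the above formula, we get

$$\tau = 2P\big[(x_1 - x_2)(y_1 - y_2) > 0\big] - 1. \tag{4}$$

Suppose that the Copula function corresponding to $(x, y)$ is $C(u, v)$. Then, the Kendall rank correlation coefficient can be obtained from the corresponding Copula function as follows:

$$\tau = 4\int_{0}^{1}\!\!\int_{0}^{1} C(u, v)\, dC(u, v) - 1. \tag{5}$$

For the Spearman rank correlation coefficient $\rho$, suppose that the joint distribution function of $(x, y)$ is $H(x, y)$ and the marginal distribution functions of x and y are $F(x)$ and $G(y)$, respectively; if $H(x, y) = F(x)G(y)$, then the random variables x and y are independent of each other. If $x_0 \sim F(x)$, $y_0 \sim G(y)$, then $x_0$ and $y_0$ are independent of each other. If $(x, y)$ and $(x_0, y_0)$ are independent of each other, then there is

$$\rho = 3\big\{P\big[(x - x_0)(y - y_0) > 0\big] - P\big[(x - x_0)(y - y_0) < 0\big]\big\}. \tag{6}$$

Suppose that the Copula function of $(x, y)$ is $C(u, v)$, where $u = F(x)$ and $v = G(y)$; the Spearman rank correlation coefficient can also be obtained from the corresponding Copula function as follows:

$$\rho = 12\int_{0}^{1}\!\!\int_{0}^{1} u v\, dC(u, v) - 3 = 12\int_{0}^{1}\!\!\int_{0}^{1} C(u, v)\, du\, dv - 3. \tag{7}$$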
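To illustrate why rank-based coefficients are preferred over the Pearson coefficient for nonlinear relationships, the following minimal sketch computes sample versions of the three coefficients in plain Python. It is an illustration only, not the procedure used in this study: the data are synthetic, the function names are hypothetical, and the empirical estimators (Kendall tau-a from concordant and discordant pairs, Spearman ρ as the Pearson coefficient of the ranks) stand in for the Copula-based integrals.

```python
import math
import random


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


# Synthetic, strictly monotonic but strongly nonlinear relation (not well data).
random.seed(0)
x = [random.uniform(0.0, 5.0) for _ in range(200)]
y = [math.exp(2.0 * v) for v in x]

print(f"Pearson  R   = {pearson(x, y):.3f}")
print(f"Spearman rho = {spearman(x, y):.3f}")
print(f"Kendall  tau = {kendall(x, y):.3f}")
```

Because y increases strictly with x, both rank coefficients equal 1, whereas the Pearson coefficient is noticeably lower. A curve whose Pearson coefficient is modest may therefore still be strongly related to porosity.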

2.2. Recurrent Neural Network (RNN)

Recurrent neural network (RNN) is a kind of neural network which is used to process sequence data [33]. In different time steps, RNN circularly shares weights and makes connections across time steps. The RNN structure with only one hidden layer is shown in Figure 1. Compared with multilayer perceptron, the RNN hidden layer is not only connected with the output layer but also connected with the hidden layer nodes. That is, the output of the hidden layer is transmitted not only to the output layer but also to the hidden layer itself. This makes RNN not only reduce the number of parameters but also establish a nonlinear relationship between the sequence data at different times. Therefore, RNN has unique advantages in dealing with nonlinear and time series problems.
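The weight sharing and the hidden-to-hidden connection described above can be sketched in a few lines. The example assumes the common formulation $h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$, which is not written out in this paper; the function names, weights, and inputs are illustrative only.

```python
import math


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t


def rnn_forward(sequence, W_x, W_h, b):
    """Apply the same weights at every step of the sequence."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states


# Toy example: 2 inputs per step, 3 hidden units, illustrative weights.
W_x = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
W_h = [[0.1, 0.0, 0.2], [0.0, 0.1, -0.1], [0.3, -0.2, 0.1]]
b = [0.0, 0.1, -0.1]
sequence = [[1.0, 0.5], [0.8, 0.4], [0.6, 0.9]]

states = rnn_forward(sequence, W_x, W_h, b)
print(states[-1])
```

The same three arrays are reused at every step, so the number of parameters does not grow with the sequence length, while each hidden state still depends on all earlier inputs through `h_prev`.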

2.3. Long Short-Term Memory (LSTM) Network

The long short-term memory (LSTM) network is an important improvement of RNN. It can effectively alleviate the gradient vanishing and gradient explosion problems of RNN and gives the network a stronger memory ability. In addition, the LSTM network can remember longer historical information because it has not only the external RNN recurrence but also an internal “LSTM cell” recurrence (self-loop). Therefore, LSTM does not simply impose an element-wise nonlinearity on the affine transformation of inputs and recurrent units. Like an ordinary recurrent network, each unit has the same inputs and outputs, but it also has more parameters and a system of gating units that controls the flow of information. The structure of the LSTM hidden layer is shown in Figure 2. $C_{t-1}$ is the node state of the hidden layer at the previous step, $h_{t-1}$ is the output of the hidden layer node at the previous step, $x_t$ is the input of the hidden layer node at the current step, $C_t$ is the node state of the hidden layer at the current step, $h_t$ is the output of the hidden layer node at the current step, σ is the sigmoid nonlinear activation function, and tanh is the hyperbolic tangent function.

Compared with RNN, the LSTM network is better at learning long-term dependence in sequence data, but it has a complex structure, many parameters, and slow convergence.

2.4. Gated Recurrent Unit (GRU) Neural Network

The gated recurrent unit (GRU) neural network, an important variant of the LSTM network, is an optimization and improvement of the LSTM network. It inherits the ability of the LSTM network to deal with nonlinear and time series problems. Moreover, it retains the memory function of the LSTM network while simplifying the structure and reducing the number of parameters, thus greatly improving the training speed [34]. The structure of the GRU neural network is shown in Figure 3, where $Z_t$ represents the update gate state, $R_t$ represents the reset gate state, and $H_t$ represents the pending output of the current neuron. The GRU neural network improves the design of the “gates” of the LSTM network: the original cell structure composed of three gates is simplified to a cell structure composed of two gates. In short, the gated recurrent unit consists of a reset gate and an update gate.

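The states of the reset gate and update gate at time t are defined as $r_t$ and $z_t$, respectively:

$$r_t = \sigma\big(W_r \cdot [h_{t-1}, x_t]\big), \qquad z_t = \sigma\big(W_z \cdot [h_{t-1}, x_t]\big), \tag{8}$$

where $W_r$ and $W_z$ are weight matrices and $x_t$ is input data. The hidden state $h_t$ and the candidate hidden state $\tilde{h}_t$ can be calculated according to the following formula:

$$\tilde{h}_t = \tanh\big(W_{\tilde{h}} \cdot [r_t \odot h_{t-1}, x_t]\big), \qquad h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t, \tag{9}$$

where $W_{\tilde{h}}$ is the weight matrix of the candidate hidden state, $h_{t-1}$ is the hidden state at the previous time step, and $\odot$ denotes element-wise multiplication.

Two different activation functions in equations (8) and (9) can be defined as follows:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{10}$$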
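The gate computations can be made concrete with a minimal forward-pass sketch of a single GRU unit. It assumes the widely used bias-free formulation $r_t = \sigma(W_r[h_{t-1}, x_t])$, $z_t = \sigma(W_z[h_{t-1}, x_t])$, $\tilde{h}_t = \tanh(W_{\tilde{h}}[r_t \odot h_{t-1}, x_t])$, and $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$. Some references swap the roles of $z_t$ and $1 - z_t$, so this convention is an assumption. The weights are random placeholders and all function names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU step on the concatenated vector [h_prev, x_t] (no biases)."""
    concat = h_prev + x_t
    r_t = [sigmoid(a) for a in matvec(W_r, concat)]      # reset gate
    z_t = [sigmoid(a) for a in matvec(W_z, concat)]      # update gate
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t   # [r_t * h_prev, x_t]
    h_cand = [math.tanh(a) for a in matvec(W_h, gated)]  # candidate state
    # Blend the previous state and the candidate state with the update gate.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden = 4, 3   # e.g. four logging curves per depth sample
W_r = random_matrix(n_hidden, n_hidden + n_in, rng)
W_z = random_matrix(n_hidden, n_hidden + n_in, rng)
W_h = random_matrix(n_hidden, n_hidden + n_in, rng)

h = [0.0] * n_hidden
for _ in range(5):      # five consecutive depth samples (random placeholders)
    x_t = [rng.random() for _ in range(n_in)]
    h = gru_step(x_t, h, W_r, W_z, W_h)
print(h)
```

Only three weight matrices are needed per unit, compared with four in a standard LSTM cell, which is the source of the reduced parameter count mentioned above.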

2.5. Structure of GRU Neural Network Prediction Model

In the prediction of reservoir porosity, the logging curves from shallow to deep reflect the formation characteristics of different geological periods, and their trends contain important information about physical parameters [14]. When traditional statistical analysis or conventional machine learning methods are used to predict porosity, the potential internal relations in the historical series of logging parameters are easily destroyed, which reduces the accuracy of the prediction results. Unlike these methods, the GRU neural network has long-term memory ability [35]. By handling the long-term dependence in series data, the GRU neural network can effectively preserve such relationships, and its internal gating mechanism can automatically learn time series features [36]. Figure 4 shows the structure of the three-layer GRU neural network model.

As can be seen from Figure 4, the GRU neural network model includes an input layer, a hidden layer, and an output layer, of which the hidden layer is the core part of the network structure. During training, it is necessary to tune the hyperparameters of the GRU neural network model structure, including the number of hidden layers and the number of hidden layer neurons. Theoretically, the more hidden layers and neurons a network has, and the deeper and more complex it is, the better the model performance and the higher the prediction accuracy. However, some studies [13, 18] have shown that too many hidden layers and neurons lead to training difficulties and overfitting, which reduces the prediction accuracy of the model. If the network is too shallow and simple, it easily underfits and fails to meet the expected requirements. Therefore, the selection of the number of hidden layers and the number of neurons is very important to the prediction performance of the network. The learning ability of the network must be balanced against the complexity of training and the required prediction accuracy, and the optimal numbers of nodes and hidden layers are determined from experience and repeated experiments. In addition, tuning training parameters such as the learning rate, batch size, and maximum number of iterations can reduce the complexity of the model to a certain extent and improve its convergence speed and prediction accuracy.

The training process of the GRU neural network can be roughly divided into three steps as follows:
Step 1: input the training data into the network, calculate the output of the GRU neural network units from the shallow layer to the deep layer along the forward propagation direction, and obtain the predicted output value corresponding to the input data at the current time point.
Step 2: calculate the error of each neuron along the back-propagation direction. Error back-propagation in the GRU neural network includes propagation along the time sequence and propagation layer by layer between the network layers.
Step 3: calculate the gradient of each weight from the back-propagated error, and update the network parameters according to these gradients by using the adaptive learning rate optimization algorithm (Adam algorithm).
Finally, repeat the above steps to iteratively optimize the network.
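The parameter update in Step 3 can be sketched with the standard Adam rule. The example below is a simplified illustration rather than the training code of this study: it minimizes a toy quadratic loss instead of back-propagating through a GRU, the function names are hypothetical, the learning rate 0.005 is the value reported later for the experiments, and the remaining settings are the commonly used Adam defaults.

```python
import math


def adam_minimize(grad_fn, theta, lr=0.005, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=5000):
    """Minimize a function with Adam, given a function returning its gradient."""
    m = [0.0] * len(theta)   # first-moment (mean) estimate of the gradient
    v = [0.0] * len(theta)   # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            m_hat = m[i] / (1.0 - beta1 ** t)   # bias-corrected moments
            v_hat = v[i] / (1.0 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Toy loss L(w) = (w0 - 3)^2 + (w1 + 1)^2 with gradient 2 * (w - target).
def grad(w):
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


w = adam_minimize(grad, [0.0, 0.0])
print(w)   # close to the minimizer [3, -1]
```

Each parameter is scaled by its own running gradient statistics, which is what gives every weight an individual adaptive learning rate.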

2.6. Prediction Model Based on CA-GRU

The modeling process of the combined prediction model based on CA-GRU is shown in Figure 5 and mainly includes the following six steps:
Step 1: the obtained logging curves and porosity parameters are used as the database. The Kendall rank correlation coefficient τ and Spearman rank correlation coefficient ρ based on the Copula function, together with the Pearson linear correlation coefficient R, are used to quantitatively calculate and analyze the degree of correlation between them, and the logging curves sensitive to the predicted parameter are selected to form new sample data.
Step 2: the new sample data are normalized and divided into a training set and a test set according to a certain proportion.
Step 3: the GRU neural network model is constructed for porosity prediction, and the network parameters are initialized. The number of network layers and the number of hidden layer neurons are determined by experiment.
Step 4: during training, the network structure is continuously optimized until the training error of the model reaches the preset target, and then the model is saved.
Step 5: the trained GRU neural network model is tested on the test set, and the predicted values of the model are denormalized to obtain predicted porosity values that can be compared with the actual values.
Step 6: the predicted values are compared with the actual values, and the error is analyzed. The prediction performance of the model is evaluated according to the corresponding evaluation indices.

3. Data Processing and Analysis

3.1. Data Preparation

As shown in Figure 6(a), the Ordos Basin is a large superimposed basin in the central part of China with huge oil and gas resources. It is considered one of the basins with the greatest potential for growth in oil and gas reserves and production in China, and it is a petroliferous basin in which stratigraphic and lithologic traps are the main trap types. The cumulative explored natural gas reserves in the Ordos Basin are about 2.7 × 10¹² m³, and the geological resources still to be explored are about 12.5 × 10¹² m³, which indicates that the Ordos Basin is still in the early stage of exploration. In addition, the Ordos Basin is rich in unconventional oil and gas resources such as coalbed methane, shale gas, and tight sandstone gas. The geological resources of coalbed methane in the basin are about 9.86 × 10¹² m³, of which about 1.79 × 10¹² m³ is recoverable. The geological resources of shale gas in the basin are about 5.3 × 10¹² m³, of which about 2.91 × 10¹² m³ is recoverable. The geological resources of tight sandstone gas in the basin are about 7.84 × 10¹² m³, of which about 3.53 × 10¹² m³ is recoverable [18]. For both conventional and unconventional oil and gas resources, the Ordos Basin therefore has great exploration potential, and a fast and accurate method is needed to obtain porosity, which is an important parameter for oil and gas exploration.

As shown in Figure 6(b), well E1 is a shale gas well in the northeastern part of the Ordos Basin, China. Preparing a database of logging data is an important step before the model is constructed. In this study, the available well logs of well E1 include spontaneous potential (SP), compensated neutron log (CNL), compressional wave slowness (DTC), resistivity (RT), uranium (U), natural gamma ray (GR), bulk density (DEN), potassium (K), and thorium (TH). Table 1 summarizes the recorded logging parameter data for well E1.

Figure 7 presents the logging parameter plot for well E1. Obtaining porosity from logging parameters is very important in reservoir evaluation, so this study is meaningful for oil and gas exploration.

3.2. Data Analysis

Selecting suitable logging inputs is very important for accurate porosity prediction. In this study, the Kendall rank correlation coefficient τ and Spearman rank correlation coefficient ρ based on the Copula function, together with the Pearson linear correlation coefficient R, are used to quantitatively calculate the correlation between logging parameters and porosity. The absolute values of the three correlation coefficients are compared in Figure 8.

As can be seen from Figure 8, Pearson correlation analysis often ignores the nonlinear correlation between logging parameters and porosity. The correlation coefficients of DTC, DEN, and CNL with porosity are relatively high, indicating that DTC, DEN, and CNL are strongly correlated with porosity. Although the Pearson correlation coefficient between GR and porosity is relatively low, the correlation analysis based on the Copula function yields higher τ and ρ, which shows that the linear correlation between GR and porosity is weak but the nonlinear correlation is strong. In summary, both linear and nonlinear correlation analysis methods are used to select conventional logging parameters in this study. Finally, four logging parameters, DTC, DEN, CNL, and GR, are selected as the independent variables for network modeling, and a porosity prediction model is established.

4. Results and Discussion

A set of 5165 data points from well E1 has been used to build the model. This data set is divided into training and test subsets by depth. The training subset consists of the first 3874 data points (about 75%), while the test subset consists of the remaining 1291 data points. For comparison, RNN, GRU, CA-GRU, and multiple linear regression (MLR) models were built and applied to predict porosity.
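The depth-ordered split can be sketched as follows. The split sizes (5165 samples, first 3874 for training) come from the text. The sliding-window step is an assumption about how sequences are fed to a recurrent network, using the time step of 50 reported for the experiments. The function names and placeholder data are hypothetical.

```python
def depth_ordered_split(samples, n_train):
    """Keep depth order: the first n_train samples train, the rest test."""
    return samples[:n_train], samples[n_train:]


def sliding_windows(features, targets, time_step):
    """Pair each window of `time_step` consecutive samples with the target
    at the last depth point of the window."""
    X, y = [], []
    for end in range(time_step, len(features) + 1):
        X.append(features[end - time_step:end])
        y.append(targets[end - 1])
    return X, y


# Placeholder data standing in for the 5165 depth samples of well E1.
n_total, n_train, time_step = 5165, 3874, 50
features = [[float(i), float(i) * 0.5] for i in range(n_total)]
targets = [float(i) * 0.1 for i in range(n_total)]

train_f, test_f = depth_ordered_split(features, n_train)
train_t, test_t = depth_ordered_split(targets, n_train)
X_train, y_train = sliding_windows(train_f, train_t, time_step)

print(len(train_f), len(test_f))       # 3874 1291
print(len(X_train), len(X_train[0]))   # 3825 50
```

Splitting before windowing keeps every training window inside the training depth interval, so no test sample leaks into training.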

First, all data are normalized to the range from zero to one. Equation (11) is applied to the training and test data sets used in the RNN, GRU, and CA-GRU models. This preprocessing helps the neural networks converge and increases the calculation speed of the network methods:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \tag{11}$$

where $x'$ is the normalized data; $x$ is the original data; the maximum and minimum of the original dataset are represented by $x_{\max}$ and $x_{\min}$, respectively.
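A minimal sketch of min-max normalization, $x' = (x - x_{\min})/(x_{\max} - x_{\min})$, and of its inverse, which is needed to convert network outputs back to porosity units, is given below. The values are hypothetical and the function names are illustrative. Whether the minimum and maximum are taken from the whole data set or only from the training subset is not stated in the text, so the choice is left to the caller here.

```python
def minmax_fit(values):
    """Return the minimum and maximum used for scaling."""
    return min(values), max(values)


def minmax_scale(values, v_min, v_max):
    """Map values to [0, 1] with x' = (x - min) / (max - min)."""
    return [(v - v_min) / (v_max - v_min) for v in values]


def minmax_inverse(scaled, v_min, v_max):
    """Undo the scaling: x = x' * (max - min) + min."""
    return [s * (v_max - v_min) + v_min for s in scaled]


# Hypothetical porosity values (%), not taken from well E1.
porosity = [2.0, 4.0, 6.0, 10.0]
p_min, p_max = minmax_fit(porosity)
scaled = minmax_scale(porosity, p_min, p_max)
restored = minmax_inverse(scaled, p_min, p_max)

print(scaled)    # [0.0, 0.25, 0.5, 1.0]
print(restored)  # [2.0, 4.0, 6.0, 10.0]
```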

In this study, several statistical indices, including the correlation coefficient (R), mean absolute error (MAE), variance accounted for (VAF), and root mean square error (RMSE), are used to compare prediction performance. These indices indicate how closely the predictions of a model match the actual values. They are defined by the following equations.

The R criterion can be described as follows:

$$R = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}. \tag{12}$$

The MAE criterion can be described as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|. \tag{13}$$

VAF is usually used to assess the accuracy of a model by comparing the measured values with the output of the model, and it can be computed as follows:

$$\mathrm{VAF} = \left[ 1 - \frac{\operatorname{var}(y - \hat{y})}{\operatorname{var}(y)} \right] \times 100. \tag{14}$$

RMSE is traditionally applied to monitor the quality of the error function of the model. The performance of the model improves as RMSE decreases. RMSE can be computed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \tag{15}$$

where for equations (12)–(15), $y_i$ is the measured porosity; $\hat{y}_i$ denotes the assessed porosity; $\bar{y}$ and $\bar{\hat{y}}$ are their mean values; $\operatorname{var}(\cdot)$ denotes the variance; and n is the number of testing data points.
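The four indices can be computed with the short sketch below. It assumes the standard definitions: R is the Pearson correlation between measured and predicted values, MAE is the mean of $|y_i - \hat{y}_i|$, VAF is $[1 - \operatorname{var}(y - \hat{y})/\operatorname{var}(y)] \times 100$, and RMSE is the square root of the mean squared error. The porosity values are hypothetical and the function names are illustrative.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation_r(y, y_hat):
    """Pearson correlation between measured and predicted values."""
    my, mh = mean(y), mean(y_hat)
    num = sum((a - my) * (b - mh) for a, b in zip(y, y_hat))
    den = math.sqrt(sum((a - my) ** 2 for a in y) *
                    sum((b - mh) ** 2 for b in y_hat))
    return num / den


def mae(y, y_hat):
    return mean([abs(a - b) for a, b in zip(y, y_hat)])


def vaf(y, y_hat):
    residuals = [a - b for a, b in zip(y, y_hat)]
    return (1.0 - variance(residuals) / variance(y)) * 100.0


def rmse(y, y_hat):
    return math.sqrt(mean([(a - b) ** 2 for a, b in zip(y, y_hat)]))


# Hypothetical porosity values (%), not taken from well E1.
measured = [8.0, 10.0, 12.0, 14.0]
predicted = [9.0, 11.0, 13.0, 15.0]   # every prediction is 1.0 too high

print(correlation_r(measured, predicted))  # 1.0
print(mae(measured, predicted))            # 1.0
print(vaf(measured, predicted))            # 100.0
print(rmse(measured, predicted))           # 1.0
```

A constant bias leaves R and VAF at their ideal values while MAE and RMSE register the error, which is one reason to report several indices together.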

Robustness is a key characteristic in machine learning. To examine whether the way the training and testing sets are selected affects the robustness of GRU, we randomly selected ten pairs of training and testing sets from the 5165 data points of well E1. The randomly selected sets are shown in Figure 9(a), where the color represents the depth of each sample. For each of these ten cases, a GRU model was built and its RMSE on the training set and on the testing set was calculated; the results are shown in Figure 9(b).

As shown in Figure 9(b), different training samples lead to different prediction errors of GRU, so the selection of training samples does change the GRU prediction error. To analyze this influence further, we applied a statistical test: a one-way analysis of variance (significance level of 5%) was carried out on the RMSE data of the ten cases, and the results are shown in Table 2. The P value is 0.569 > 0.05, indicating that there are no significant differences between the RMSE data of the ten cases. We can therefore conclude that different training samples produce different GRU prediction errors, but the differences are not statistically significant. From a statistical point of view, the method of selecting training and testing samples has little influence on the robustness of GRU. Therefore, this study divides the training set and the test set according to depth order, which is statistically justified as well as feasible and practical.

Determining the parameters of the neural network model is a prerequisite for constructing it successfully. In this study, the adaptive learning rate optimization algorithm (Adam algorithm) is used to optimize the network. The Adam algorithm combines the advantages of the RMSProp and AdaGrad algorithms and provides an independent adaptive learning rate for each parameter. According to the tests, the best learning rate is 0.005, the batch size is 10, and the time step is 50. Previous studies on neural networks [18, 37] show that the number of hidden layers and the number of hidden layer nodes have a great influence on prediction performance, and that suitable values differ between research fields. Choosing the optimal number of hidden layers and hidden layer nodes is the key to ensuring the prediction accuracy of the neural network model. Therefore, using an exhaustive (traversal) search, this study sets the range of the number of hidden layers to [1, 10] and the range of the number of hidden layer nodes to [1, 100]. By comparing the root mean square error (RMSE) of the model under different numbers of hidden layers and hidden layer nodes, the combination that gives the minimum RMSE is obtained. The optimization test results are shown in Figure 10. It can be seen from the figure that too many or too few hidden layers and neurons lead to drastic changes in the RMSE of the prediction results and thus to lower prediction accuracy. Through traversal optimization, the optimal numbers of hidden layers and nodes for this study are 3 and 41, respectively.
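The traversal search over the two structural hyperparameters can be sketched as a simple exhaustive loop. In the sketch, `placeholder_rmse` is a hypothetical stand-in for "train a GRU with this configuration and return its test RMSE"; its bowl shape and its minimum at an arbitrary configuration are invented purely so that the example runs, and they carry no information about the actual experiments.

```python
def traverse(layer_range, node_range, score_fn):
    """Evaluate every (layers, nodes) pair and keep the lowest RMSE."""
    best = None
    for n_layers in layer_range:
        for n_nodes in node_range:
            rmse = score_fn(n_layers, n_nodes)
            if best is None or rmse < best[0]:
                best = (rmse, n_layers, n_nodes)
    return best


def placeholder_rmse(n_layers, n_nodes):
    """Hypothetical stand-in for training a GRU and returning its test RMSE.
    The minimum at (4 layers, 25 nodes) is arbitrary."""
    return 1.0 + 0.05 * (n_layers - 4) ** 2 + 0.001 * (n_nodes - 25) ** 2


# Search ranges from the text: hidden layers in [1, 10], nodes in [1, 100].
best_rmse, best_layers, best_nodes = traverse(
    range(1, 11), range(1, 101), placeholder_rmse)
print(best_rmse, best_layers, best_nodes)
```

The full grid contains 10 × 100 = 1000 configurations, each requiring one training run, so in practice the cost of the search is dominated by the scoring function.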

According to the above correlation analysis results, four logging parameters that are strongly correlated with porosity, DTC, DEN, CNL, and GR, are selected as the independent variables of the porosity prediction model. Figure 11 compares the actual porosity with the porosity predicted by the RNN, GRU, CA-GRU, and MLR models along the depth of well E1, together with a locally enlarged view. The results of the MLR model differ considerably from the measured porosity, while the results of the RNN, GRU, and CA-GRU models are very consistent with the measurements.

Finally, the statistical indices described above were used to compare the models, and the results are presented in Table 3. Table 3 shows that the three porosity prediction models based on deep learning are far superior in prediction accuracy to the model established with the traditional multiple linear regression (MLR) method. Among the three deep learning models (RNN, GRU, and CA-GRU) established in this study, the GRU and CA-GRU models are superior to the RNN model in prediction precision, and CA-GRU achieves the best precision, with the highest R of 0.9423 and VAF of 88.7578 and the lowest MAE of 0.2101 and RMSE of 1.1412. The difference in prediction precision between CA-GRU and GRU is only slight.

To conclude, both the CA-GRU and GRU models predict porosity successfully, and Table 3 shows that the CA-GRU model built in this study achieves the higher precision of the two. The verification results in well E1 indicate that the CA-GRU model with optimal inputs can be regarded as an efficient tool for predicting porosity, particularly in areas where high-precision porosity data are required.

5. Conclusions

Logging parameters reflect the acoustic, radioactive, and electrical characteristics of sediments deposited in different geological periods. Porosity is a characteristic response of different formations and has strong time series characteristics. Predicting porosity directly from logging parameters avoids the high cost of special methods such as rock physical analysis and nuclear magnetic resonance and can provide an accurate, low-cost basis for decision-making in petroleum exploration and development in oil and gas fields. Deep learning can learn the nonlinear relationship between different parameters entirely from the data, which makes it well suited to nonlinear geophysical interpretation problems. It can make full use of the response characteristics of various logging parameters to different formations at the same time and is free of the linear-prediction limitations of traditional empirical formulas.

Considering the time series characteristics of logging parameters and porosity parameters, a reservoir porosity prediction method based on the GRU neural network is proposed in this study. The method uses correlation analysis based on the Copula function to select sensitive logging parameters and then uses a GRU neural network to build the prediction model. It therefore considers not only the influence of strongly correlated sample data on porosity prediction but also the nonlinear mapping relationship between porosity parameters and logging curves, as well as the trend and correlation of logging information with depth. The correlation measure based on the Copula function selects the well logging curves that are sensitive to porosity parameters, reduces the dimension of the model input, eliminates redundancy between variables, and improves the overall prediction performance of the model. The research results show that the GRU neural network model has strong feature extraction ability and can effectively extract deep characteristics reflecting porosity parameters from logging data. Compared with traditional methods such as multiple linear regression analysis, it predicts porosity parameters more accurately and has strong robustness and anti-interference ability. This study provides a new idea for accurate interpretation of logging data in oil and gas field exploration and development.

Abbreviations

DNN: Deep neural network
GRU: Gated recurrent unit
NMR: Nuclear magnetic resonance
CNN: Convolutional neural network
RNN: Recurrent neural network
SAE: Stack autoencoder
U: Uranium
DEN: Bulk density
TH: Thorium
MAE: Mean absolute error
RMSE: Root mean square error
LSTM: Long short-term memory
CA: Correlation analysis
MLR: Multiple linear regression
SP: Spontaneous potential
CNL: Compensated neutron log
DTC: Compressional wave slowness
RT: Resistivity
GR: Natural gamma-ray
K: Potassium
R: Correlation coefficient
VAF: Variance accounted for
CA-GRU: Correlation analysis-gated recurrent unit.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors are grateful for the financial support provided by the National Natural Science Foundation of China (41504041). The authors would also like to thank Xiaoyan Deng for assisting in the preparation of this manuscript and are particularly grateful to her for checking its grammar, style, and syntax.