The Triassic Yanchang Formation in the Pengyang area of Ordos Tianhuan depression is an important oil and gas formation. However, most of the oil pays in the study formations are low-resistivity or low-contrast reservoirs with low permeability, bringing challenges to the reservoir identification and evaluation by well logs. In this paper, we first measured the nuclear magnetic resonance (NMR), phase permeability, cation exchange capacity (CEC), and X-ray diffraction for core samples. Then, the genetic types for the low-resistivity pays were analyzed based on the experiment results, water analysis, and well log data collected. It was found that large variations of formation water salinity, high irreducible water saturation, and clay conductivity are the primary genetic types. Further, the random forest (RF) algorithm with sensitive parameter inputs was used to identify the oil, oil and water, and water layers. The anomaly of spontaneous potential (∆SP) that characterizes water salinity, the relative value of gamma ray log (∆GR) that describes the bound water content, resistivity, density, and acoustic logs were taken as sensitive logs according to the genetic analysis. Finally, this identification method was verified by comparison with the traditional crossplot method and oil test results. The identification accuracy of the RF is 90%, far higher than that by the crossplot method.

1. Introduction

Oil and gas exploration in China has entered the stage of complex and unconventional reservoirs. The geological conditions of exploration targets are complex, and exploration is becoming increasingly difficult. The Mesozoic in the southern part of the Ordos Tianhuan depression is an important oil and gas reservoir belt. However, the pore structure is complex, and the water properties of the reservoirs are variable [1–3], causing a large number of low-contrast or low-resistivity reservoirs to develop. The log response characteristics of oil and water layers are difficult to interpret unambiguously, which poses challenges to the identification and evaluation of oil reservoirs based on well logs.

In the middle of the last century, Tixier et al. [4] of Schlumberger first put forward the concept of the low-resistivity reservoir, and researchers then began to investigate its definition. From the perspective of oil saturation and resistivity index, Zemanek [5] proposed that the resistivity index of a low-resistivity reservoir should be less than three and the oil saturation less than 50%. However, many scholars prefer the resistivity ratio of oil and water layers as the criterion. Ouyang et al. [6] considered an oil layer to be a low-resistivity reservoir if its resistivity is less than twice that of the water layers. The main genetic types of low resistivity in oil reservoirs are clay conductivity, high-salinity formation water, high irreducible water saturation, deep invasion of high-salinity filtrate, and small amounts of conductive minerals [7–12]. For the Chang 8 member in the Ordos Basin, Bai et al. [13] concluded that water salinity and irreducible water saturation are the main reasons for the low resistivity of oil reservoirs.
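The two definitions quoted above can be written as simple checks. This is a minimal sketch: the function names and the example resistivities are illustrative assumptions, and only the thresholds (resistivity index below 3, oil saturation below 50%, oil-to-water resistivity ratio below 2) come from the text.

```python
def meets_index_criterion(resistivity_index, oil_saturation):
    """Criterion attributed to Zemanek [5]: index < 3 and oil saturation < 50%."""
    return resistivity_index < 3.0 and oil_saturation < 0.5


def meets_ratio_criterion(rt_oil_layer, rt_water_layer, max_ratio=2.0):
    """Criterion attributed to Ouyang et al. [6]: oil-layer resistivity is
    less than twice that of the neighbouring water layers (both in ohm-m)."""
    return rt_oil_layer / rt_water_layer < max_ratio


# Hypothetical layers, for illustration only.
low_contrast = meets_ratio_criterion(rt_oil_layer=12.0, rt_water_layer=9.0)
normal_pay = meets_ratio_criterion(rt_oil_layer=40.0, rt_water_layer=9.0)
```

The ratio form is convenient in practice because it needs only log readings from adjacent oil and water layers, not a saturation estimate.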

Many scholars have studied the fluid identification of low-resistivity reservoirs. The classical methods include overlapping, crossplots, nuclear magnetic resonance logs, and mathematical statistics methods [14–17]. Among these, the most commonly used are the overlapping and crossplot methods [18, 19]. Although nuclear magnetic resonance (NMR) logging plays a vital role in fluid identification [20], almost no NMR logging data are available in some old oilfields. Ren et al. [21] defined a fluid indicator factor based on the double-porosity overlap method. By constructing the crossplot of this fluid identification factor with porosity, they predicted the fluid properties of low-porosity and low-permeability reservoirs in the Subei Basin with high accuracy. Sun et al. [22] established a double-porosity model of tight sandstone based on the Voigt–Reuss–Hill model and used the elastic parameters predicted by this model to construct a fluid identification chart suitable for low-porosity and low-permeability reservoirs. Based on the correlation coefficient for the upper Paleozoic tight sandstone reservoir in the Linxing-Shenfu area of the Ordos Basin, Hou et al. [23] proposed a fluid discrimination method that uses density porosity and resistivity logging data.

With the advent of the era of big data, scholars have tried to process and interpret logging data through various machine learning methods to improve production efficiency and accuracy [24–26]. Machine learning seeks an objective function from training data. Commonly used algorithms include decision trees, random forests, logistic regression, support vector machines, neural networks, and cluster analysis. These methods mine the complex nonlinear relationships in data through complex transformations, which makes them more effective for problems that traditional physical or empirical models cannot solve well. Combining logs with production data, Zhang et al. [27] accurately identified oil, gas, and water layers in an area where low-resistivity oil layers coexist with high-resistivity water layers by using the support vector machine method. Based on electrical and physical properties, acoustic parameters, or gas logging parameters, Liu et al. [28] extracted oil- and gas-sensitive parameters and applied four mathematical algorithms (decision tree, radial basis function, neural network, and cluster analysis) to the comprehensive evaluation of fluid properties. Their identification of carbonate reservoirs in the Jidong exploration area was more accurate than identification based on a single type of information. Chen et al. [29] applied the AdaBoost M2 machine learning algorithm to the fluid recognition of sandy conglomerates. The K-class fluid identification problem was decomposed into binary classification problems, and a decision tree was used as the weak learner to automatically obtain the classifiers for fluid discrimination. Tan et al. [30] used a committee machine method to identify the fluid types of tight sandstones from GR, resistivity, and porosity logs. Luo et al. [31] used a long short-term memory (LSTM) network and a convolutional neural network (CNN) to characterize the sequential characteristics of log curves and the correlation between multiple log curves, respectively. The recognition accuracy of oil-bearing reservoirs was significantly improved by using a weighted cross-entropy loss function, and the multilayer fluid recognition method improved the recognition accuracy of oil, oil and water, and water layers.

Random forest (RF) is an algorithm for regression prediction and classification prediction based on multiple decision trees [32]. Numerous decision trees are established by random repeated sampling technology and node random splitting technology. The prediction results of many decision trees are combined and output as a whole. The random forest algorithm has the advantages of highly parallel training, fast training speed for large samples of big data, a small variance of the trained model, strong generalization ability, and insensitivity to the lack of some features [33, 34].

In this paper, we collected a large amount of core porosity and permeability, water analysis, and well log data. We also measured the NMR, phase permeability, cation exchange capacity (CEC), and X-ray diffraction of core samples. Based on the well log and core analysis data, the possible reasons for the low-resistivity oil pay are analyzed and discussed. Further, the RF algorithm, with sensitive input parameters chosen according to the genesis analysis, was used to identify the oil, oil and water, and water layers. Finally, this identification method was verified by comparison with crossplots and oil test results.

2. Geological Characteristics of the Chang 8 Member in the Pengyang Area

The Ordos Basin is the second largest sedimentary basin in China. It covers a large area of about 330,000 km² and has broadly distributed resources with significant potential and economic reserves [35]. It is a multicycle superimposed petroliferous basin that can be divided into six secondary structural units: the Yimeng Uplift, Weibei Uplift, Yishan Slope, Western Fold-Thrust Belt, Tianhuan depression, and Jinxi Fault Fold Belt [36]. The target area, the Pengyang area, is located in the southern Tianhuan depression (Figure 1). The Triassic Yanchang Formation is one of the main exploration horizons.

The data were taken from the Chang 8 member of the Triassic Yanchang Formation in the Pengyang area, Ordos Basin, China. The target layers are characterized by low-porosity and low-permeability reservoir. The core porosity and permeability of Chang 8 member are shown in Figure 2. The porosity is mainly distributed between 6% and 21%, with an average of 15.59%, and the permeability is mainly distributed from 0.1 to 30 mD, with an average of 1.45 mD.

3. Experiments

To investigate the genetic mechanisms of the low-resistivity oil pay, we measured the NMR, phase permeability, cation exchange capacity (CEC), and X-ray diffraction (XRD) minerals of rock samples drilled from the target reservoirs. The NMR T2 spectra of twenty samples were measured, while CEC and XRD were measured in ten of them, and phase permeability was measured in two.

The nuclear magnetic resonance measurements were made with the MesoMR23-060H-I NMR analysis and imaging system produced by Niumet. The resonance frequency is 21 MHz, the waiting time Tw was set to 3 s, and the interecho spacing TE was 0.1 ms. The echo number (NECH) was set to 10000, and the number of scans (NS) was 32.

For the measurement of the phase permeability curves, the core samples were dried at 110°C for 24 h, cooled to room temperature in a drying dish, weighed, and measured with vernier calipers. The dry core samples were evacuated for four hours and saturated with water under pressure for 24 h, after which oil was imbibed at atmospheric conditions until no more water flowed out. The water remaining in the core at this point is the irreducible water. Then, the water flooding experiment was carried out at a constant rate, and the time, pressure difference, and oil and water volumes were recorded. Finally, the data were substituted into the calculation formula to obtain the oil and water relative permeabilities and saturations, and the relative permeability curves were plotted.

The CEC and the cation exchange capacity per unit pore volume (Qv) of reservoir rock are two critical physical parameters that characterize the conductivity of clay in rock. The CEC of clay minerals is the total number of cations that can be exchanged by the clay minerals at pH 7, that is, the number of cations that the clay minerals can adsorb and exchange. CEC is a measure of the number of negative charges on clay minerals. Its unit is mmol/100 g, that is, the number of millimoles of cations exchanged per 100 g of dry sample. Qv is the amount of exchangeable cations per unit pore volume, in milliequivalents per liter. In the CEC measurement, carried out according to the standard SY/T 6352-2013, the reagents include 1 mol/L ammonium acetate solution (pH 7.0), 77.09 g ammonium acetate solution (CH3COONH4, chemically pure), ethanol solution (for industrial use, must be free of NH4+), and 0.05 mol/L hydrochloric acid standard solution.
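The link between CEC and Qv can be illustrated with a short calculation. The conversion below is the relation commonly used in shaly-sand analysis, Qv = CEC · ρg · (1 − φ) / (100 · φ); it is not stated in this paper, so it should be read as an assumption, as should the grain density and the example CEC and porosity values. With CEC in meq per 100 g and grain density in g/cm³, Qv comes out in meq per cm³ of pore volume.

```python
def qv_from_cec(cec_meq_per_100g, porosity, grain_density_g_cm3=2.65):
    """Cation exchange capacity per unit pore volume (meq/cm^3).

    Assumed conversion (not given in the paper):
        Qv = CEC * rho_g * (1 - phi) / (100 * phi)
    """
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must be a fraction between 0 and 1")
    return cec_meq_per_100g * grain_density_g_cm3 * (1.0 - porosity) / (100.0 * porosity)


# Same hypothetical CEC at two porosities: Qv rises as porosity falls.
qv_porous = qv_from_cec(cec_meq_per_100g=2.0, porosity=0.15)
qv_tight = qv_from_cec(cec_meq_per_100g=2.0, porosity=0.08)
```

Under this relation, the same clay content produces a larger Qv in a tighter rock, which is one way to read why clay conductivity matters more in low-porosity reservoirs.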

For the XRD measurement, the core sample goes through the steps of clay particle separation, sample preparation, glycerol treatment, and heat treatment at 550°C. The instrument used for XRD analysis is a Porter Philips 00186 diffractometer. The anode of the X-ray tube is copper (Cu Kα radiation) with a wavelength of 1.5406 Å. The X-ray tube was operated at 50 kV and 30 mA.

4. Genetic Mechanisms of Low-Resistivity Oil Pay

4.1. The Effect of the Formation Water Salinity

Formation water salinity is one of the main factors affecting rock resistivity. In general, the water within a single formation is relatively stable, and its salinity changes little. However, the water analysis data show that the formation water salinity of the target formation in the study area varies widely. Figure 3 shows the histogram of the formation water salinity collected from 27 wells. The formation water salinity of the Yanchang Formation is mainly distributed between 20 g/L and 90 g/L, but a few wells have salinity greater than 100 g/L. The formation water resistivity varies from about 0.1 Ω·m to 1.8 Ω·m at 25°C. This variation in formation water salinity results in low contrast between oil and water layers. The difference in salinity weakens, conceals, or even cancels the contribution of oil content to the electrical properties, resulting in a blurred boundary between oil and water layers and a low coincidence rate of log interpretation.

Figure 4 shows the logs of Well M1. In this figure, the gamma-ray (GR), caliper, and spontaneous potential (SP) logs are displayed in track 1 (from left). The second track displays the array induction resistivity logs, including AT90, AT60, AT30, AT20, and AT10. The detection depth of AT90 is the deepest, while that of AT10 is the shallowest. The porosity logs, namely density (DEN), compensated neutron (CNL), and acoustic (AC) logs, are presented in track 3. Track 4 is measured depth. Tracks 5 to 7 present the porosity, permeability, and shale volume content, respectively. The blue line in the depth track indicates the oil test layer. Figure 5 shows the logs of another well, Well M2. The oil test layer of Well M1 is the interval 2439–2442 m, with a resistivity of 8–10 Ω·m, and that of Well M2 is the interval 2265–2268 m, with a resistivity of 5–8 Ω·m. However, according to the oil test results, the fluid type of M1 is water and that of M2 is oil. The formation water salinities of the two wells are about 20.5 g/L and 73.8 g/L, respectively, which makes it difficult to identify the fluid types of the target reservoirs from resistivity alone.
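The masking effect of salinity can be made concrete with Archie's equation for clean sandstone, Rt = a · Rw / (φ^m · Sw^n). The sketch below is illustrative only: the paper does not use this calculation, the Archie parameters (a = 1, m = 2, n = 2) and the water resistivities and saturations are hypothetical, and clay conductivity is ignored.

```python
def archie_rt(rw, porosity, sw, a=1.0, m=2.0, n=2.0):
    """Formation resistivity (ohm-m) from Archie's equation for clean sand.

    rw: formation water resistivity (ohm-m), porosity and sw as fractions.
    a, m, n are assumed default Archie parameters, not values from the paper.
    """
    return a * rw / (porosity ** m * sw ** n)


# Hypothetical water layer with fresher (more resistive) formation water.
rt_water_layer = archie_rt(rw=0.30, porosity=0.15, sw=1.0)

# Hypothetical oil layer with saltier (more conductive) formation water.
rt_oil_layer = archie_rt(rw=0.10, porosity=0.15, sw=0.6)
```

With these assumed inputs, the oil layer reads roughly 12.3 Ω·m and the water layer roughly 13.3 Ω·m, so the oil layer is the less resistive of the two, which is the kind of inversion described for Wells M1 and M2.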

Several factors cause the formation water properties to differ between wells. (1) Sedimentation: the lithology of fluvial sedimentary facies varies greatly, and reservoirs with high mud content and fine-grained lithology retain high-salinity water during diagenesis. (2) Hydrocarbon charging: in fine-grained reservoirs, free water in large pore throats is driven away during hydrocarbon migration and accumulation, while immobile high-salinity water is retained in small pore throats. (3) Tectonics: frequent tectonic movement destroys complete, closed traps, so that water from the reservoir bottom or from rock and mineral filtration migrates into the reservoir again. Surface water can also penetrate the primary reservoir through open faults, changing the properties of the reservoir fluid. High formation water salinity is controlled by oil accumulation. As burial depth increases, the hydrocarbon source rock is compacted, and the high-salinity formation water and oil in the source rock pores are squeezed out. Under excess pressure, they migrate downward into the pores of the formation, where they displace or mix with the original pore water. Low formation water salinity is controlled by the tectonic setting and by fracture and fault development, which connect the upper low-salinity formation water with the lower formation. The low-salinity water moves along fractures or faults into the reservoir pores, where it displaces or mixes with the original pore water, so the formation water salinity decreases.

4.2. The Impact of Irreducible Water Saturation

High bound water saturation may also be one of the main causes of low-resistivity reservoirs, so it needs to be analyzed. Reservoir bound water usually consists of three parts: (1) film water retained on the surface of (nonclay) rock particles due to wettability, (2) capillary water retained in pores, and (3) water adsorbed on clay particles.

Twenty samples from Chang 8 reservoir in the Pengyang area in 7 wells were tested for the NMR saturation experiment. The saturated NMR T2 spectra of each sample are shown in Figure 6. It shows that the T2 spectra of Chang 8 reservoir in the study area are dominated by small pores and high content of bound water. Table 1 presents the bound water saturation from NMR T2 spectra. From Table 1, it can be seen that the bound water saturation of the reservoir is mainly distributed between 40% and 80%, with an average of 73.74%. T2 logarithmic mean (T2lm) values vary from 0.48 to 7.79 ms. The average of the T2lm is 2.03 ms. Figure 7 shows the relative permeability curves of samples T2 and T3. From this figure, it is seen that the irreducible water saturation is about 40%. The high irreducible water saturation could reduce the resistivity of oil layers.
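The two quantities reported from the T2 spectra can be computed as follows. This is a sketch under stated assumptions: T2lm is taken as the amplitude-weighted geometric mean of T2, bound water saturation as the fraction of signal below a T2 cutoff, and the cutoff and the four-bin spectrum are hypothetical (the paper does not state the cutoff it used).

```python
import math


def t2_log_mean(t2_ms, amplitudes):
    """Amplitude-weighted logarithmic (geometric) mean of a T2 spectrum, in ms."""
    total = sum(amplitudes)
    weighted_log = sum(a * math.log(t) for t, a in zip(t2_ms, amplitudes))
    return math.exp(weighted_log / total)


def bound_water_saturation(t2_ms, amplitudes, t2_cutoff_ms):
    """Fraction of the total NMR signal with T2 below the cutoff."""
    total = sum(amplitudes)
    bound = sum(a for t, a in zip(t2_ms, amplitudes) if t < t2_cutoff_ms)
    return bound / total


# Hypothetical spectrum dominated by short relaxation times (small pores).
t2_bins = [0.1, 1.0, 10.0, 100.0]
signal = [1.0, 2.0, 1.0, 0.0]

t2lm = t2_log_mean(t2_bins, signal)
swb = bound_water_saturation(t2_bins, signal, t2_cutoff_ms=5.0)
```

A spectrum concentrated at short T2, as in this toy case, gives both a low T2lm and a high bound water fraction, which matches the pattern described for the Chang 8 samples.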

Figure 8 shows an example in which different irreducible water saturations lead to different produced fluid types. In this figure, unlike Figures 4 and 5, MSIGTA, MPHITA, and MFFI are, respectively, total porosity, free fluid porosity, and bound water porosity, obtained from NMR T2 spectra. The last track presents the T2 spectra and T2lm logs. The resistivity of the upper layer (2370–2374.5 m) is about 30 Ω·m, and the oil test shows that it is an oil and water layer. The lower layer (2384.5–2386.5 m) has resistivity values ranging from 10 to 20 Ω·m, and the oil test shows that it is a good oil layer. The irreducible water saturation of the lower layer is close to its total water saturation; thus, it is a pure oil layer. In contrast, the water saturation of the upper layer is greater than its irreducible water saturation, so oil and water may be produced at the same time.
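The reasoning in this example, comparing total water saturation with irreducible water saturation, can be sketched as a simple rule. The function name, the tolerance, and the saturation values are hypothetical and are not taken from the paper.

```python
def expected_production(sw, swirr, tolerance=0.05):
    """Qualitative produced fluid for an oil-bearing layer.

    If total water saturation is close to the irreducible value, the water is
    immobile and only oil flows; otherwise some water is movable.
    The tolerance is an illustrative assumption.
    """
    movable_water = max(sw - swirr, 0.0)
    return "oil" if movable_water <= tolerance else "oil and water"


# Hypothetical saturations echoing the two layers discussed above.
lower_layer = expected_production(sw=0.62, swirr=0.60)
upper_layer = expected_production(sw=0.55, swirr=0.35)
```

The rule shows why the lower-resistivity layer can still be the better oil producer: what matters is the movable water, not the total water.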

4.3. Conductivity of Clay

Clay minerals in sandstone reservoirs generally have additional electrical conductivity. The electrical conductivity of clay-bearing sandstone is very different from that of pure sandstone, which is one of the important causes of low-contrast or low-resistivity oil layers. Clay-bearing sandstone formations all contain a certain amount of clay, and the surface of clay particles is usually negatively charged. Under normal circumstances, the cations adsorbed on the negatively charged surface of the clay particles cannot move, but the adsorption is not very tight. Under the action of an electric field, the adsorbed cations can exchange positions with other hydrated ions in the pore solution, producing electrical conduction. This conductive behavior produced by the cation exchange of clay minerals is called the additional conductivity of clay minerals.

The clay types and contents derived from X-ray diffraction analysis (see Figure 9 and Table 2) show that the clays in the study area are dominated by chlorite, followed by mixed-layer illite/smectite, accompanied by small amounts of illite, kaolinite, and mixed-layer chlorite/smectite. There is no independent smectite mineral. As shown in the scanning electron microscope images in Figure 10, the hydromica (illite) in the interstitial material of the Chang 8 reservoir in the study area is honeycomb-like or filamentous. The particle surface has a honeycomb illite/smectite clay film (Figure 10(a)). Figure 10(b) shows pores filled with filamentous illite clay. The chlorite film attached to the surface of rock particles adsorbs formation water, which improves the conductive network of the reservoir and reduces its resistivity.

Experiments worldwide have indicated that disordered mixed-layer illite/smectite clay minerals have a strong cation exchange capacity. The cation exchange capacity (CEC) and the cation exchange capacity per unit pore volume (Qv) of reservoir rock are two important physical parameters that characterize the additional conductivity of clay. In this area, ten core samples were selected for CEC measurement, and the results are shown in Table 3. The Qv values are small, all less than 1.0. In conventional sandstone oil and gas reservoirs, such values would not have a great impact on the conductivity of the rock. However, they have a more significant influence on the Chang 8 member reservoirs, which have small porosity and complex pore structure. Especially in the oil layers of the target reservoirs, the equivalent cation exchange capacity Qv enhances the additional conductivity of clay. Therefore, the additional conductivity of clay minerals is one of the reasons for the formation of the low-contrast oil layers in the Chang 8 member of the Yanchang Formation.

5. Fluid Identification Based on RF Algorithm

5.1. RF Algorithm

The random forest (RF) is an ensemble learning algorithm for classification, regression, and other tasks in geology and geophysics [37]. RF uses multiple decision trees as base learners and combines their predictions by majority voting or averaging to provide accurate results. As a special form of the bagging algorithm, RF randomly selects both samples and features to generate a series of different sample subsets [32]. Hundreds of independent base decision trees are then constructed from these subsets. In this study, the final output is obtained by majority voting because fluid typing is a classification task (Figure 11). Although a single decision tree has poor accuracy, the accuracy of the combined decision can be very high because each decision tree is well trained on its own subset.

The basic steps of the algorithm are as follows [38]: (1) bootstrap sampling with replacement is used to generate multiple sample subsets from the sample data, and each subset is used to train one decision tree. (2) When a decision tree splits a node, the optimal feature is determined from the sample subset, and the tree keeps branching on the optimal features until it can no longer grow. (3) The final result is obtained by voting over the predictions of the individual decision trees.
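The three steps can be sketched in a few lines of plain Python. To keep the example short, each base learner here is a depth-one tree (a decision stump) rather than a fully grown tree, so this is a simplified stand-in and not the model used in the study; the function names and the six toy samples are illustrative assumptions.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, feature_ids):
    """Step 2 (simplified): best single split among the candidate features."""
    best = None
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0  # midpoint between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:  # no split possible: always predict the majority label
        lab = majority(y)
        return (0, float("inf"), lab, lab)
    return best[1:]


def predict_stump(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab


def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1: bootstrap sample, drawn with replacement.
        idx = [rng.randrange(len(X)) for _ in X]
        # Random feature subset for this base learner.
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    # Step 3: majority vote over all base learners.
    return majority([predict_stump(s, row) for s in forest])


# Toy, well-separated data: two hypothetical log features per layer.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [5.0, 30.0], [5.5, 31.0], [4.8, 29.0]]
y = ["WL", "WL", "WL", "OL", "OL", "OL"]
forest = fit_forest(X, y)
```

Because every stump sees a different bootstrap sample and feature, individual stumps can disagree, and the vote is what stabilizes the prediction.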

The main advantage of RF is that each decision tree is built from only part of the sample data and a subset of the features. The resulting independent decision trees give RF higher accuracy, better generalization, and superior stability. For a high-dimensional, small-sample classification problem such as fluid identification in low-resistivity pay zones, the stability and generalization of the model are especially important. RF has been successfully applied in many areas, such as lithology identification [39], hydrocarbon source rock evaluation [40], and reservoir parameter prediction [41].

5.2. Data Description and Preprocessing

To illustrate the capability of random forest for fluid identification in tight sandstone reservoirs, we collected data from the Chang 8 member of the Triassic Yanchang Formation in the Pengyang area, Ordos Basin, China. Oil testing data of sixty layers were used as sample data in this study. According to the above analysis of genetic mechanisms, the complicated water salinity distribution, the high irreducible water saturation, and the additional conductivity of clay are the main causes of the low-resistivity oil reservoirs, and different oil test layers may have different causes of low resistivity. The array induction resistivity log with a detection depth of 90 in. (RT) indicates the oil- and gas-bearing property to some extent. The density log (DEN) and sonic transit-time log (AC) represent the porosity of rocks. The anomaly of spontaneous potential (∆SP), which equals the difference between the SP and the shale baseline, characterizes water salinity. The relative value of the natural gamma ray log (∆GR) describes the bound water content and CEC. Therefore, according to the petrophysical analysis of the reservoirs, RT, DEN, AC, ∆SP, and ∆GR were chosen as the input parameters of the random forest.
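The two derived inputs can be computed from the raw curves as in the sketch below. The ∆SP definition follows the text (SP minus the shale baseline). The ∆GR form, a GR reading scaled between a clean-sand value and a shale value, is a common convention and is an assumption here, since the paper does not give its formula; all numeric values are hypothetical.

```python
def delta_sp(sp_mv, shale_baseline_mv):
    """SP anomaly: difference between the SP reading and the shale baseline (mV)."""
    return sp_mv - shale_baseline_mv


def delta_gr(gr_api, gr_clean_api, gr_shale_api):
    """Relative gamma ray value between 0 (clean sand) and 1 (shale).

    Assumed definition: (GR - GR_clean) / (GR_shale - GR_clean).
    """
    return (gr_api - gr_clean_api) / (gr_shale_api - gr_clean_api)


# Hypothetical readings for one layer.
layer_features = {
    "dSP": delta_sp(sp_mv=-35.0, shale_baseline_mv=-10.0),
    "dGR": delta_gr(gr_api=80.0, gr_clean_api=40.0, gr_shale_api=120.0),
}
```

Using relative rather than raw SP and GR values reduces well-to-well shifts in baseline, which helps when layers from many wells are pooled into one training set.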

The sample data include feature data from well logs and label data from the fluid types of the reservoirs. Table 4 shows the statistics of the sample data, including the maximum, minimum, average, and standard deviation of the well logs. The label data include three types of layers in the study region based on the oil testing results, namely, oil layer (OL), oil and water layer (OWL), and water layer (WL). Each label is encoded as a one-hot vector. For example, the oil layer is (1, 0, 0), which indicates that the probability of an oil layer is 1 and the probability of the other types is 0. The oil and water layer is (0, 1, 0), and the water layer is (0, 0, 1).
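The label encoding described above can be written as a small helper. The class abbreviations follow the text; the function names and the example probability vector are illustrative assumptions.

```python
CLASSES = ("OL", "OWL", "WL")  # oil, oil and water, water


def one_hot(label):
    """Encode a fluid type as a vector with a single 1, e.g. OL -> (1, 0, 0)."""
    return tuple(1 if label == c else 0 for c in CLASSES)


def decode(probabilities):
    """Return the class with the highest predicted probability."""
    best = max(range(len(CLASSES)), key=lambda i: probabilities[i])
    return CLASSES[best]


encoded = [one_hot(label) for label in CLASSES]
predicted = decode((0.2, 0.7, 0.1))
```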

Due to the difference in dimension and order of magnitude of each parameter, the min-max normalization method is utilized to scale these sensitive data. Data normalization can eliminate the unit difference between different logs and improve the convergence speed of the prediction process [42]. In this study, data is normalized in the range [0, 1] based on the following equation:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$

where $x'$ is the normalization processing result, $x$ is the raw data, $x_{\min}$ is the minimum value from the original data, and $x_{\max}$ is the maximum value of the raw data.
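A minimal sketch of min-max scaling for one log curve follows. The handling of a constant curve (returning zeros) and the example acoustic values are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Scale a list of log readings to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant curve: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical acoustic log readings for three layers.
ac_scaled = min_max_normalize([200.0, 230.0, 260.0])
```

When the trained model is applied to new wells, one reasonable design is to reuse the minimum and maximum from the training data rather than recompute them, so that the same raw reading always maps to the same scaled value.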

5.3. Hyperparameter Selection and Model Establishment

To boost the generalization ability of the random forest, the cross-validation method with grid search is used to optimize the hyperparameters (Figure 12). The cross-validation method divides the sample data into K portions, each of which is called a fold. The data of (K − 1) folds are used to train the model, and the remaining fold is used to test it. These steps are repeated K times until each fold has been used once as the testing fold. All candidate parameter values are arranged into a grid of parameter combinations. Each combination is passed in turn to the cross-validation procedure for performance evaluation, and the combination with the highest accuracy is selected.
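The procedure can be sketched without any machine learning library. In the sketch below, the `fit` and `predict` callables, the toy threshold classifier, and the data are all hypothetical; folds are taken contiguously for simplicity, whereas shuffling or stratifying the samples first is usually preferable.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(X, y, k, fit, predict, params):
    """Mean test accuracy over k folds for one parameter combination."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], **params)
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


def grid_search(X, y, k, fit, predict, grid):
    """Evaluate every combination in the grid and keep the most accurate one."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        score = cross_val_accuracy(X, y, k, fit, predict, params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy classifier: the "model" is just a resistivity threshold (hypothetical).
def fit_threshold(X, y, threshold):
    return threshold


def predict_threshold(model, row):
    return "OL" if row[0] > model else "WL"


X = [[v] for v in (1, 2, 3, 4, 6, 7, 8, 9)]
y = ["WL"] * 4 + ["OL"] * 4
best_params, best_score = grid_search(
    X, y, 4, fit_threshold, predict_threshold, {"threshold": [2.5, 5.0, 7.5]}
)
```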

In this study, the following hyperparameters are optimized using the 10-fold cross-validation method with grid search: the number of trees in the random forest (n_estimators), the maximum depth of the tree (max_depth), the minimum number of samples required to split an internal node (min_samples_split), the minimum number of samples required at a leaf node (min_samples_leaf), the maximum number of features considered when looking for the best split (max_features), and the number of samples used to train each base estimator (max_samples). For the random forest algorithm, these hyperparameters have a great effect on the prediction result. The n_estimators parameter controls the strength of the random forest. The max_features and max_samples parameters determine the diversity of the base decision trees. The max_depth, min_samples_split, and min_samples_leaf parameters control the complexity of the base decision trees. In our practice, the candidate values of n_estimators are 50, 100, 200, 300, and 500; the candidate values of max_depth range from 1 to 10; and the candidate values of min_samples_leaf are 1, 2, 5, 10, 15, 20, 30, 40, and 60. The min_samples_split, max_features, and max_samples parameters are set to 2, 60%, and 2/3, respectively. The hyperparameter settings obtained from the 10-fold cross-validation are listed in Table 5.
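To give a sense of the size of this search, the candidate values listed above can be expanded into explicit combinations. The parameter names follow the text (with the scikit-learn spelling min_samples_split assumed for min_sample_split); the paper does not state which library was used, so treating these as keyword arguments of a particular implementation is an assumption.

```python
from itertools import product

# Candidate values quoted in the text.
search_space = {
    "n_estimators": [50, 100, 200, 300, 500],
    "max_depth": list(range(1, 11)),
    "min_samples_leaf": [1, 2, 5, 10, 15, 20, 30, 40, 60],
}

# Values held fixed in the text (60% of features, 2/3 of samples).
fixed = {"min_samples_split": 2, "max_features": 0.6, "max_samples": 2 / 3}

combinations = [
    {**fixed, **dict(zip(search_space, values))}
    for values in product(*search_space.values())
]

n_folds = 10
n_model_fits = len(combinations) * n_folds
```

The grid contains 5 × 10 × 9 = 450 combinations, so 10-fold cross-validation trains 4500 forests, which is inexpensive for a sample of sixty layers.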

Finally, the trained random forest model is used to predict the fluid property of tight sandstone reservoirs. To analyze the prediction result, a confusion matrix is calculated from the actual and predicted types of the samples. The precision and recall of each sample type can be calculated from the confusion matrix as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN},$$

where $TP$ is the number of true-positive samples, $FP$ is the number of false-positive samples, and $FN$ is the number of false-negative samples. Table 6 presents the confusion matrix of the classification. Taking the oil and water layer in Table 6 as an example, its $TP$ is 8, its $FP$ is 2, and its $FN$ is 4.
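As a worked example, the sketch below reads the three counts quoted in the text (8, 2, and 4) as TP, FP, and FN for the oil and water layer, which is an interpretation of the wording rather than a reproduction of Table 6. The small confusion matrix is a toy example with rows as actual classes and columns as predicted classes; it is not Table 6.

```python
def precision(tp, fp):
    return tp / (tp + fp)


def recall(tp, fn):
    return tp / (tp + fn)


def class_counts(matrix, k):
    """TP, FP, FN for class k (rows = actual, columns = predicted)."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp
    fn = sum(matrix[k]) - tp
    return tp, fp, fn


def accuracy(matrix):
    """Overall coincidence rate: correctly classified layers over all layers."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Counts quoted in the text, read as TP, FP, FN for the oil and water layer.
owl_precision = precision(tp=8, fp=2)
owl_recall = recall(tp=8, fn=4)

# Toy 3-class matrix (OL, OWL, WL), for illustration only.
toy = [[3, 1, 0],
       [0, 2, 1],
       [0, 0, 3]]
toy_counts = class_counts(toy, 1)
toy_accuracy = accuracy(toy)
```

Read this way, the oil and water layer has a precision of 0.8 and a recall of about 0.67, so recall is the weaker of the two measures for this class.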

From Table 6, the overall classification accuracy of the random forest is approximately 90.0%, which is a general coincidence rate, namely, the ratio of the correct number of layers to the total number of layers. The recall value of the oil and water layer is relatively low, because some oil and water layer samples are predicted as water layers due to the similar well log responses. Figure 13 presents a crossplot of with DEN, which is utilized to identify the fluid type of reservoirs in the study region. The red, green, and blue symbols represent the dots of oil, oil and water, and water layers, respectively. However, different types of fluids overlap, making it difficult to identify the fluid types of target reservoirs based on this crossplot. The random forest algorithm provides a better prediction performance than the traditional crossplot method.

To illustrate the effectiveness of the random forest algorithm in the study region, we present the well logs, oil testing layers, and predicted results of Well J (Figure 14). In this figure, the array induction resistivity logs are M2R9, M2R6, M2R3, M2R2, and M2R1, which are presented in track 2. The detection depth of M2R9 is the deepest, while that of M2R1 is the shallowest. The results predicted by the RF algorithm are displayed in track 5. Tracks 6 to 9 present the porosity, permeability, water saturation, and shale volume content, respectively. The predicted fluid type of layers 29 and 30 is oil, and that of layers 31 and 32 is water. The oil test of layer 29 is shown inside the yellow box. The blue oval in Figure 13 marks the point for layer 29. In Figure 13, layer 29 lies in the area where oil layers and water layers mix, which makes it hard to distinguish oil from water with the crossplot. However, the RF algorithm predicts oil for this layer, which verifies the effectiveness of the RF algorithm with sensitive parameters selected from the genetic mechanism of low-resistivity pay.

6. Conclusions

(1) The porosity of the Chang 8 member of the Yanchang Formation in the Pengyang area of the Ordos Basin, China, is mainly distributed between 6% and 21%, with an average of 15.59%. The permeability varies from 0.1 mD to 30 mD, with an average of 1.45 mD.

(2) The genetic types of the low-resistivity pay zones of the target formation are large variations of formation water salinity, high irreducible water saturation, and clay conductivity. The water salinity mainly varies from 20 g/L to 90 g/L. The irreducible water saturation obtained from NMR and phase permeability is mainly distributed between 40% and 80%. The Qv values vary from 0.27 to 0.98, which is relatively high for low-porosity oil reservoirs.

(3) The RF algorithm, with sensitive input parameters chosen according to the genesis analysis, was used to identify the oil, oil and water, and water layers. The anomaly of spontaneous potential (∆SP), which characterizes water salinity; the relative value of the gamma ray log (∆GR), which describes the bound water content and CEC; and the resistivity, density, and acoustic logs were taken as sensitive logs.

(4) The identification results by RF were verified by comparison with oil test results. The accuracy of the identification is 90%, far higher than that of the crossplot method.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.


Acknowledgments

This paper is supported by the National Natural Science Foundation of China (42004087); the Strategic Cooperation Technology Projects of China National Petroleum Corporation and China University of Petroleum, Beijing (ZLZX2020-03); the Science Foundation of China University of Petroleum, Beijing (2462020BJRC001); and Natural Science Foundation of Jiangxi Province (20202BABL211020).