Abstract

Gas reservoir development and the estimation of rock properties heavily rely on lithology classification, which can be difficult, time-consuming, and prone to errors. In this study, a novel deep learning-based approach has been developed for the rapid, accurate, and efficient prediction of lithology in a gas field from conventional well-log data. The well-logs, referred to as numerical well-logs (NWLs), are transformed into two-dimensional images through two proposed approaches: shallow images (SIs) and deep images (DIs). In these images, the pixels effectively represent the relationships between different logs. For this purpose, we developed residual convolutional neural networks (ResCNN) named SIs-ResCNN 2D and DIs-ResCNN 2D. The feed data for DIs-ResCNN 2D are the DIs, which are formed from a vector in which the log order is repeated such that each pairwise combination of logs occurs exactly once. As a result, the connections between the logs are encoded in the pixels of the generated images, alongside the unique binary combinations of the logs. We compared the proposed models, including SIs-ResCNN 2D, DIs-ResCNN 2D, and NWLs-ResCNN 1D, with baseline methods such as random forest (RF), K-nearest neighbor (KNN), and support vector machine (SVM). Based on the evaluation metrics, DIs-ResCNN 2D outperformed the other proposed and baseline methods on the test dataset. A balanced DIs-ResCNN 2D model achieved 93% accuracy and an F1-score of 80% on a test well, highlighting the importance of data balancing during CNN model training.

1. Introduction

Identifying the sedimentary rock that forms a reservoir is vital in the oil and gas industry, since carbonate rocks such as dolomite host a large share of petroleum resources. Lithological analyses can be used to estimate resources based on a variety of petrophysical characteristics. The classification of lithology has traditionally been accomplished by analyzing well-logs, which can be challenging, time-consuming, and error-prone [1–3]. Reliable lithology interpretation in mining plays a crucial role in the identification and characterization of geological formations, thereby aiding in exploration and mineral extraction processes. This enables targeted exploration efforts, leading to increased efficiency in recognizing areas with high mineral content. An accurate evaluation of lithology assists in well-informed decision-making for mining operations, ultimately optimizing the extraction of valuable resources [4–6].

The development of lithology classification models in the oil and gas industry is motivated by several factors: (i) correct and quick diagnosis of lithology plays an important role in economic terms, given the high daily cost of oil and gas operations, and reduces both costs and time. It also aids petroleum engineers in making better decisions on issues such as well perforation location selection, which leads to maximum and optimal production. (ii) The correct classification of lithology leads to a more accurate determination of reservoir petrophysical properties such as porosity, permeability, and saturation. A correct assessment of these properties is vital because they have a direct impact on reservoir productivity, development, and production [4, 7, 8].

There are three common approaches for classifying facies using artificial intelligence:

(1) Images extracted from geological core plugs, which contain valuable and rich geological information, are utilized. Various studies have employed computer vision and deep learning (DL) techniques to discern significant features and patterns within images. For example, Faria et al. [9] used a multilayer feed-forward neural network to determine lithology by taking core images as input to the network; Bayesian optimization was employed to determine the model hyperparameters, resulting in an accuracy of 83% on the test data. In other studies [10, 11], machine vision has been used for facies classification. However, due to the high cost of core sampling in the oil and gas industry, collecting images from the entire wellbore column is not economical, which is a limitation of this approach.

(2) Seismic data, which are measurements of ground vibrations generated by seismic sources and recorded by detectors, serve as features for the classification process. Chaki et al. [12] utilize five attributes to classify different types of rock formations and introduce a probabilistic neural network (PNN) framework, considered effective due to its ability to handle such classification scenarios and process data quickly. By analyzing well-log data, the researchers categorize lithology into four classes and evaluate their findings using seismic data collected from an Indian hydrocarbon field. Seismic-based classification has also been developed [13] to identify lithofacies, where hidden Markov models (HMMs) are utilized. HMMs consider conditional probabilities and vertical transitions, yielding accurate predictions for most lithologies, including thin layers. Comparisons with independent methods highlight the spatial correlation advantage of HMMs, while a real case study reveals the need for further research on lateral geological relationships.

(3) Conventional well-logs, collected after the completion of drilling, are often used in the oil and gas industry to relate these logs to dependent variables. By analyzing well-logs such as gamma ray, resistivity, and sonic, experts can gain insights into reservoir properties and formation characteristics. He et al.'s [14] work centers on improving reservoir lithology identification using well-logging data. They introduce LSTM-FCN, a hybrid model that outperforms conventional methods, and enhance it with particle swarm optimization. These conventional well-logs are also used in other applications, such as predicting porosity [15], permeability [16, 17], and water saturation [18].

Among the methods discussed, utilizing well-logs proves to be a more practical and cost-effective approach in the oil and gas industry compared to relying on seismic data and core images.

Machine learning (ML), a subbranch of artificial intelligence and data science, enables models to be trained without manual programming or the definition of precise rules. In Zhang et al. [19], multiple machine learning models were evaluated to automatically interpret well-log data, employing techniques like cross-validation and Bayesian optimization to fine-tune hyperparameters. The outcomes indicated a preference for ensemble methods, with XGBoost and random forest (RF) standing out due to the highest overall accuracy of 0.882 and AUC of 0.947 in the classification of lithologies. In Deng et al. [20], support vector machines (SVM) were improved using synthetic data techniques (SMOTE and Borderline-SMOTE) to tackle imbalanced data, surpassing neural networks in predicting lithology. In Yin et al. [21], a Class-Rebalancing Self-Training (CReST) framework is presented, leveraging well-logging data and limited labels to attain resilient lithology classification. Four high-performing algorithms, namely, Bagging, Extra Trees, RF, and SVM Classifier, are selected from 25 options for constructing CReST models. CReST is shown through experimental results to effectively handle challenges associated with label scarcity and class imbalance, leading to a considerable increase in accuracy, especially for categories with limited data samples. Other studies have applied XGBoost [22, 23], SVM [24], and RF [25, 26]. However, on complex and difficult problems, ML usually performs worse than deep learning. The main reason is the ability of deep learning models to capture complex patterns and dependencies in the data by leveraging the hierarchical structure of multiple layers.

DL has been more successful in extracting complex relationships in various fields, especially when dealing with big data. Imamverdiyev and Sukhostat [27] aim to develop an effective DL model for classifying geological facies in wells. They propose a new 1D-CNN model trained with various optimization algorithms using input data such as photoelectric effect, gamma ray, resistivity logging, and more. Jiang et al. [28] address the challenge of lithology identification in reservoir characterization. They observe three key characteristics in previous studies: the limited consideration of stratigraphic sequence information, the neglect of neighboring formation influence, and the lack of publicly available well-log data for comparison. Their experiments demonstrate that the inclusion of geologic constraints significantly improves model performance, with RNN-based networks exhibiting better and more consistent results. Other DL studies for rock facies classification using conventional well-log data include 1D-CNN [29–31] and RNN [32–34]. These studies have employed one-dimensional layers to learn the model. In this study, however, a unique approach has been taken to process numerical well-logs (NWLs). Instead of analyzing the numerical data directly, these well-logs have been transformed into image data. This transformation represents the numerical values as pixels in an image grid, effectively converting numerical information into visual data.

The core of the workflow design in this research is the CRoss Industry Standard Process for Data Mining (CRISP-DM), a methodology for conducting data mining across various industries. This process encompasses six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment [35–37]. The utilization of the CRISP-DM methodology in rock classification brings several benefits. It aids researchers in ensuring the reliability and repeatability of their classification models, allowing for more accurate and consistent results. Additionally, the emphasis on cost-effectiveness and faster project completion enhances the efficiency of rock classification projects, making them more practical and feasible.

The reviewed studies did not explore the relationships between different features. This study introduces a novel method that incorporates these relationships into image generation for individual samples. Specifically, the approach integrates the feature relationships into the pixels of shallow images (SIs) and deep images (DIs). These two-dimensional images, SIs and DIs, are generated from the NWLs. The resultant images serve as inputs for the SIs-ResCNN 2D and DIs-ResCNN 2D models, enhancing the classification accuracy of lithology. This improvement is achieved through the utilization of residual blocks with regular and reduce shortcuts. The proposed image generation methods offer notable benefits, including faster learning and better performance. The research highlights the significance of employing log-derived images with physical meaning in training CNN models. Additionally, it demonstrates the vital contribution of modified architectures, such as residual architectures, to improving the accuracy and overall performance of the model.

2. Study Pipeline and Materials

As shown in Figure 1, this paper can be divided into three general parts: (i) data collection and preparation, (ii) image creation methods (SIs and DIs) and modified convolutional neural network (CNN) architecture, and (iii) classification evaluation. The continuation of this section focuses on data collection and preparation, where the data and its statistical properties are explored in Section 2.1. Additionally, the preprocessing steps applied to the data are described in Section 2.2.

Moving on to the methodology, the creation of SIs and DIs is explored in Section 3.1. Section 3.2 covers the residual structure, while Section 3.3 presents the suggested modified CNN architecture. Section 4 of the study addresses classification evaluation methods. The results are presented and analyzed in Section 5. Finally, Section 6 concludes the paper.

2.1. Dataset

The gas field under investigation is located near Asaluyeh, Tabnak, Varavi, and Shanoul in Iran. The estimated recoverable gas reserves in the field amount to 200 billion cubic meters (BCM) [38]. The field's specific location is indicated by the marker X in Figure 2. Geological data were obtained from well-logs, core samples, and petrophysical charts. Logging data were recorded at 0.1-meter intervals within the wells. The geological formations within the gas field exhibit distinct layers. The Aghar shale section is characterized by reddish-brown shale interspersed with dolomite layers. The upper part of the Dalan formation consists of brownish limestone and cream-colored dolomite, occasionally accompanied by traces of anhydrite. In the lower segment of the Dalan formation, white limestone predominates, with light brown dolomite and small amounts of anhydrite. Transitioning to the Nahr section, the composition shifts to light brown to light grayish-brown dolomite and anhydrite. The lithological composition of each reservoir layer is constant and uniform along the layer and across the extent of the reservoir. For example, the alternation of anhydrite and dolomite layers throughout the Dashtak formation is constant; only in one well has the thickness of the Dashtak formation increased, due to a fault or folding. The lithology of the Kangan formation is also constant throughout the reservoir and consists of limestone.

Evaluatable information was found in 9 of the 14 production wells in this field, resulting in 44,521 recorded depths with well-log measurements. Among these 9 wells, one well containing all four types of rock facies was designated as the test well and excluded from the training stage. The data from the remaining 8 wells were selected as training data for learning the model. The training dataset consisted of four rock facies, ANHYDR (anhydrite), CALCITE (calcite), DOLOM (dolomite), and ILLITE (illite), with 9,643, 15,923, 15,071, and 1,043 instances, respectively. The test dataset contains all four rock facies, and the test-to-train dataset ratio is six percent.

The collection of data comprises nine specifically chosen well-log characteristics: CGR (capture gamma ray), DFL (deep induction resistivity log), DT (sonic log), GR (gamma ray), HDRS (deep resistivity), HMRS (medium resistivity), NPHI (neutron porosity), PEF (photoelectric effect), and RHOB (bulk density). The total porosity (PHIT, $\phi_T$) and effective porosity (PHIE, $\phi_e$) were calculated from the well-logs using the following equations:

$$\phi_T = \frac{\rho_{ma} - \rho_b}{\rho_{ma} - \rho_f}, \qquad \phi_e = \phi_T \left(1 - V_{sh}\right),$$

where $\rho_{ma}$ is the matrix density, $\rho_b$ is the bulk density, $\rho_f$ is the density of the fluid in the pore space (which depends on the water saturation $S_w$), and $V_{sh}$ is the shale volume.

A comprehensive explanation of the well-logging data and the calculated porosities ($\phi_T$ and $\phi_e$) is presented in Table 1.

These statistics offer valuable insights into the distribution and range of the data. For example, the mean CGR is 18.14 with a standard deviation of 19.63, indicating a moderate spread of values around the mean. Similarly, other variables exhibit varying levels of dispersion, as captured by their respective standard deviations. These statistical measures provide a comprehensive summary of the input well-log data, enabling a detailed analysis and interpretation of the dataset. Moreover, a visual representation in the form of a heat map correlation plot enhances our understanding of the relationships between different variables. The correlations between the features are shown in Figure 3.

Strong positive Pearson correlation coefficients indicate direct linear relationships between variables. For instance, the positive correlations between GR and CGR (0.72) and between NPHI and CGR (0.60) suggest a strong association between these parameters. Conversely, negative correlations, such as RHOB and PHIT (-0.79) and PEF and NPHI (-0.39), signify inverse relationships. These correlations play a crucial role in understanding subsurface conditions and optimizing decision-making in resource exploration.
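
As an illustration, a heat map like the one in Figure 3 can be produced along the following lines; the stand-in data frame and plotting choices are assumptions, not the authors' code:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Stand-in frame; in practice this would hold the eleven logs of Table 1.
cols = ["CGR", "GR", "NPHI", "PEF", "RHOB", "PHIT"]
logs = pd.DataFrame(np.random.rand(200, len(cols)), columns=cols)

corr = logs.corr(method="pearson")            # pairwise Pearson coefficients
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation between well-log features")
plt.tight_layout()
plt.show()
```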

2.2. Preprocessing

Preprocessing is pivotal for ensuring the quality and reliability of the collected NWLs dataset. A series of steps is performed to prepare the data for analysis; each step is described briefly below (a code sketch of these steps follows the list):

(1) The dataset is carefully examined for missing values. To address this issue, imputation is used, where missing values are replaced with a representative value (e.g., the mean) of the corresponding feature. This preserves the overall structure of the data while minimizing the impact of missing values on the analysis.

(2) Normalization techniques are applied to standardize the scales of all features within the dataset, ensuring equal contribution to the analysis. The chosen method, StandardScaler, adjusts each feature's values to have a mean of 0 and a standard deviation of 1. This prevents any single feature from dominating the analysis due to its magnitude.

(3) The local outlier factor (LOF) method was selected for outlier handling due to its higher test accuracy of 87% compared to other methods, such as isolation forest and elliptic envelope, based on a K-nearest neighbor (KNN) base model. LOF evaluates the clustering pattern of data points and identifies outliers based on their deviation from this pattern. By eliminating outliers, the analysis is less influenced by extreme values, thereby enhancing the model's reliability. The removal of outliers from the training data resulted in the exclusion of 4,168 samples.
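
A minimal scikit-learn sketch of these three steps, assuming mean imputation and default LOF settings (the paper states neither); the data frame here is synthetic:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import LocalOutlierFactor

features = ["CGR", "DFL", "DT", "GR", "HDRS", "HMRS",
            "NPHI", "PEF", "RHOB", "PHIE", "PHIT"]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, len(features))), columns=features)
df.iloc[::50, 0] = np.nan                     # inject a few missing values

# (1) Imputation: mean imputation shown here; the paper's exact statistic is not stated.
df[features] = df[features].fillna(df[features].mean())

# (2) Standardization to zero mean and unit variance (StandardScaler).
X = StandardScaler().fit_transform(df[features])

# (3) Outlier removal with the local outlier factor (-1 marks an outlier).
mask = LocalOutlierFactor(n_neighbors=20).fit_predict(X) == 1
X_clean = X[mask]
print(f"removed {len(X) - len(X_clean)} outlier samples")
```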

With the completion of these preprocessing steps, the NWL dataset is now primed for analysis. Missing values have been addressed, the data has been normalized using StandardScaler, and outliers have been identified and removed using LOF. Consequently, the dataset’s quality and reliability are assured, facilitating accurate and robust analysis.

3. Methodology

In this section, the proposed method for classifying lithology is discussed. Our approach relies on two key components: SIs and DIs, as outlined in Section 3.1. The concept of residual bottleneck structures is introduced in Section 3.2. Then, in Section 3.3, the core elements of our approach, namely, NWLs-ResCNN 1D, SIs-ResCNN 2D, and DIs-ResCNN 2D, are presented. These components form the foundation of our methodology and enable effective lithology classification.

3.1. Generate Images

Before feeding well-log data into a two-dimensional CNN, it needs to be preprocessed. Well-logs typically consist of NWLs recorded at various depths. To create images, these well-logs need to be converted into a 2D format. This section provides an explanation of how to generate shallow and deep images from well-logs.

3.1.1. Shallow Images (SIs)

The process begins by extracting the values of the selected features at a given depth, which are used to form a vector $v = [x_1, x_2, \ldots, x_{11}]$ of length 11 (representing the number of selected columns). Next, a matrix $A$ is generated by vertically stacking $v$ 11 times. As a result, matrix $A$ will have dimensions $(n, n)$, where $n = 11$. This process can be expressed as

$$A = \begin{bmatrix} v \\ v \\ \vdots \\ v \end{bmatrix} \in \mathbb{R}^{11 \times 11}.$$

The final symmetric matrix $S$ is obtained by performing element-wise multiplication between the matrix $A$ and its transpose ($A^{T}$), resulting in a symmetric matrix with dimensions $(11, 11)$. The element-wise multiplication is calculated as

$$S_{ij} = A_{ij} \cdot A^{T}_{ij},$$

where $A_{ij}$ and $A^{T}_{ij}$ represent the elements at row $i$ and column $j$ in matrices $A$ and $A^{T}$, respectively. The resulting matrix $S$ will also be of size $(11, 11)$.

This multiplication ensures that the interaction between feature $i$ and feature $j$ is the same as between feature $j$ and feature $i$, since $S_{ij} = x_i \cdot x_j = S_{ji}$, resulting in a matrix that is symmetrical. This method allows us to represent relationships among the selected features in matrices, which can provide valuable insights into geological and petrophysical measurements for analysis. Figure 4 illustrates the process of creating the shallow image (SI) at a depth of 3550 m.
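
A minimal numpy sketch of this SI construction (the feature values are stand-ins for one depth's standardized logs):

```python
import numpy as np

def shallow_image(v: np.ndarray) -> np.ndarray:
    """Build the 11x11 shallow image for one depth from its feature vector v."""
    A = np.tile(v, (len(v), 1))   # stack v vertically 11 times -> matrix A
    return A * A.T                # element-wise product with the transpose

v = np.random.rand(11)            # stand-in for the 11 standardized logs at one depth
si = shallow_image(v)             # si[i, j] == v[i] * v[j] == si[j, i]
```

Because every row of $A$ equals $v$, the element-wise product $A \circ A^{T}$ reduces to the outer product np.outer(v, v), which makes the symmetry explicit.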

3.1.2. Deep Images (DIs)

The algorithm provided by Jiang and Yin [39] was employed to generate a series of numbers. It starts with a value of 1 and checks each pair of consecutive numbers. If the two numbers have not been paired before, the second number is added to the sequence. This process continues until the first and second numbers become the same. Throughout each iteration, the algorithm maintains a list to keep track of all the generated numbers.

The algorithm confines the sequence within the range of 1 to 11, each number denoting a distinct well-log. If the second number in a pair exceeds this range, it loops back to 1. Moreover, the algorithm avoids including pairs of numbers that have already been incorporated in the sequence. By iterating through these steps, the algorithm generates a unique sequence of numbers characterized by alternating increments and specific skips. The resulting sequence is as follows: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 3, 5, 7, 9, 11, 2, 4, 6, 8, 10, 2, 5, 8, 11, 4, 7, 10, 3, 6, 9, 2, 6, 10, 4, 8, 2, 7, 11, 5, 9, 3, and 7. The generated sequence adheres to a pattern of alternating ascending numbers, punctuated by specific skips before repeating. In the context of the well-log data, the sequence is represented as a column vector $c$, as illustrated in

$$c = \left[c_1, c_2, \ldots, c_{44}\right]^{T}.$$

Here, $c$ is a column vector with 44 rows and 1 column, composed of coefficients $c_1$ through $c_{44}$.

Equations (8)–(10) define three $44 \times 44$ square matrices built from the vector $c$. These matrices involve various combinations of the coefficients $c_1$ through $c_{44}$, both element-wise and transposed.

The presented content combines mathematical matrix operations with an algorithm that dynamically generates a sequence of indices based on specific rules. This integration illustrates how the theoretical concepts translate into a practical computational implementation: the element-wise equations define the matrix structures built from the coefficients, while the algorithm provides a systematic approach to index selection. Figure 5 illustrates the procedure for generating the deep image (DI) at a depth of 3550 m.
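
The DI construction can be sketched as follows; the index sequence is copied verbatim from the text, while the stack-and-multiply step is an assumption that mirrors the SI procedure:

```python
import numpy as np

# The 44-element index sequence reproduced verbatim from the text (1-based).
SEQ = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 3, 5, 7, 9, 11,
                2, 4, 6, 8, 10, 2, 5, 8, 11, 4, 7, 10, 3, 6, 9,
                2, 6, 10, 4, 8, 2, 7, 11, 5, 9, 3, 7])

def deep_image(v: np.ndarray) -> np.ndarray:
    """Build a 44x44 deep image for one depth from its 11 feature values."""
    c = v[SEQ - 1]                # reorder the logs along the sequence (0-based)
    C = np.tile(c, (len(c), 1))   # stack c vertically 44 times
    return C * C.T                # element-wise product, mirroring the SI step

v = np.random.rand(11)
di = deep_image(v)                # shape (44, 44), one pixel per log pair
```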

Both SIs and DIs for all depths were created by combining the previously mentioned methods. In Figure 6(a), a data table containing logs recorded at various depths is shown. SIs, depicted in Figure 6(b), are generated using eleven well-logs, resulting in an 11-by-11-dimensional image. Additionally, DIs, as shown in Figure 6(c), were constructed using the described algorithm, resulting in a 44-by-44-dimensional image.

3.2. Residual Bottleneck

The layered structure of CNNs enables them to automatically recognize features and patterns in images, resulting in more effective learning, for example, in image classification [40, 41], transfer learning [42], object detection [43], segmentation [44–46], and a variety of other visual applications. CNNs have also demonstrated their impact in surveillance systems, facial recognition, and medical imaging, where MRI and CT scans can be analyzed to identify and diagnose diseases [47, 48].

He et al. [49] introduced a technique in which gradients can be directly backpropagated to earlier layers. This method uses skip connections, which enable the flow of information from early layers to later ones, creating an alternative path for the gradient to traverse. An additional advantage of skip connections is that they enable the model to learn an identity function by directly passing the input, ensuring that a layer performs at least as effectively as the preceding one. The two signals, namely, the skip connection and the main path, are then combined, as shown in Figure 7. The direct path ensures a straightforward flow of information to the output, while the second path passes through the block to enable the extraction and integration of additional features. This enhances the model's ability to capture complex patterns and improves overall performance [50].

The combination of the skip connection and convolutional layers is called a residual block. A residual module consists of two branches: a shortcut path and a main path. The main path consists of three convolutional layers with ReLU activations, each followed by batch normalization to reduce overfitting and accelerate training. In the reduce shortcut variant, a bottleneck layer (a $1 \times 1$ convolution followed by batch normalization) is added to the shortcut path, as depicted in Figure 8(a). On the other hand, if the shortcut path does not include the bottleneck layer, the two paths are added directly, as depicted in Figure 8(b); this is called the regular shortcut. A minimal code sketch of this block follows.
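
The sketch below is one Keras reading of this block; the exact layer ordering, filter counts, and kernel sizes are assumptions based on the description above, not the authors' published code:

```python
from tensorflow.keras import layers

def residual_block(x, filters, kernel_size=3, reduce_shortcut=False):
    shortcut = x
    if reduce_shortcut:
        # Reduce shortcut: 1x1 bottleneck convolution + batch normalization.
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
        shortcut = layers.BatchNormalization()(shortcut)
    # Main path: three convolutional layers, each with batch normalization.
    y = layers.Conv2D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    # Add both paths, then apply the final ReLU.
    y = layers.Add()([y, shortcut])
    return layers.Activation("relu")(y)
```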

Modifications were applied to the original CNN structure by incorporating a residual connection (Figure 9) to enhance its efficiency in addressing the specific problem of lithology classification.

3.3. Proposed Residual Convolutional Neural Networks

The proposed models, illustrated in Figures 10–12, have been enhanced through the integration of residual structures. In SIs-ResCNN 2D, the input signal $x$ is passed along the shortcut path and then added to the output of the main path, $F(x)$. The ReLU activation is then applied to this sum to generate the output signal $a = \mathrm{ReLU}\left(F(x) + x\right)$. The output of the second add-both-paths stage becomes $a' = \mathrm{ReLU}\left(F'(a) + a\right)$, as illustrated in the architecture shown in Figure 10.

The DIs-ResCNN 2D model receives DIs as input, as shown in Figure 11. The features of these images after applying the reduce shortcut and the ReLU activation function can be represented as

$$a_1 = \mathrm{ReLU}\left(F(x) + W_s x\right),$$

where $W_s$ denotes the $1 \times 1$ bottleneck convolution on the shortcut path.

Finally, by applying the regular shortcut, the features are flattened according to

$$z = \mathrm{Flatten}\left(\mathrm{ReLU}\left(F'(a_1) + a_1\right)\right).$$

The NWLs-ResCNN 1D model receives NWL data as input, as shown in Figure 12. The features after applying the regular shortcut and the ReLU activation function can be represented as

$$z = \mathrm{Flatten}\left(\mathrm{ReLU}\left(F(x) + x\right)\right).$$

These flattened features, denoted as $z$, are then fed into a dense layer. This layer consists of four neurons with the softmax activation function, which outputs probabilistic values. These values represent the probabilities of each class for each sample. To determine the most probable class for each sample, the Argmax function is applied to these probabilities (Argmax is a mathematical operation that identifies the argument or input value that yields the maximum value of a given target function).
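
A brief sketch of this classification head, with illustrative shapes (the feature-map shape is a stand-in):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Head: flattened residual features -> 4-way softmax (one neuron per facies).
inputs = keras.Input(shape=(11, 11, 32))            # stand-in feature-map shape
z = layers.Flatten()(inputs)
outputs = layers.Dense(4, activation="softmax")(z)  # class probabilities
head = keras.Model(inputs, outputs)

# Argmax turns per-class probabilities into the predicted facies index.
probs = head.predict(np.random.rand(3, 11, 11, 32))
predicted = np.argmax(probs, axis=1)                # e.g. array([2, 0, 1])
```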

In these proposed architectures, focus was placed on two hyperparameters of the model's architecture: the number of filters ($f_1$, $f_2$, and $f_3$) and the kernel size ($k_1$, $k_2$, and $k_3$). These hyperparameters were tuned in Section 5.1 to obtain their respective values.

4. Model Evaluation

4.1. Metrics

Precision and recall are two metrics used to evaluate the performance of a classifier for individual classes. Precision is the probability that a sample truly belongs to a particular class given that it was assigned to that class, while recall is the probability that a sample of a given class is correctly classified. The F1-score is a combined measure of precision and recall, providing a single measure of the relevance of classifier results [51]. Precision and recall are defined as

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}.$$

The abbreviations $TP$, $TN$, $FP$, and $FN$ stand for true positive, true negative, false positive, and false negative, respectively. The F1-score and accuracy are calculated as

$$F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
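
These metrics can be computed directly with scikit-learn; the labels below are toy values for illustration only:

```python
# Macro averaging treats the four facies equally, which matters for the
# minority ILLITE class.
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

y_true = [0, 1, 2, 2, 3, 1, 0, 2]   # toy labels: ANHYDR=0, CALCITE=1, DOLOM=2, ILLITE=3
y_pred = [0, 1, 2, 1, 3, 1, 2, 2]

print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall:   ", recall_score(y_true, y_pred, average="macro"))
print("F1-score: ", f1_score(y_true, y_pred, average="macro"))
print("accuracy: ", accuracy_score(y_true, y_pred))
```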

4.2. Class Prediction Error Plot

The visual representation in Figure 13 uses a bar graph to show the support for each class in a classification model. Each bar is divided to indicate the percentage of predictions assigned to each class, including false negatives and false positives (similar to a confusion matrix). This plot assists us in comprehending both the strengths and weaknesses of models as well as specific challenges present in the dataset. Additionally, the class prediction error plot provides a way to assess how accurately the classifier predicts relevant classes; it is like a version of a confusion matrix that clearly highlights prediction errors. The error plot is particularly useful in multiclass classification problems, allowing a better understanding of the classification errors made by the model. As depicted in Figure 13, changes in recall and precision for the facies CALCITE and DOLOM can be easily tracked, and the overall model performance can be evaluated in comparison to other models.
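
For reference, such a plot can be produced with, e.g., Yellowbrick's ClassPredictionError visualizer; the paper does not name its plotting tool, and the data below are stand-ins:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from yellowbrick.classifier import ClassPredictionError

X = np.random.rand(400, 11)                       # stand-in features and labels
y = np.random.randint(0, 4, 400)
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

viz = ClassPredictionError(RandomForestClassifier(),
                           classes=["ANHYDR", "CALCITE", "DOLOM", "ILLITE"])
viz.fit(X_train, y_train)
viz.score(X_test, y_test)    # stacks predicted-class counts for each actual class
viz.show()
```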

5. Results and Discussions

In this section, hyperparameters are first considered, and their tuning process is explained. Subsequently, the results of the original and proposed models are compared. Following this, the best model is compared with baseline models such as support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF). Additionally, the stability of the model and the impact of data balancing are investigated. Finally, the assessment of feature importance on the NWL data is conducted.

5.1. Hyperparameter Tuning

An objective of hyperparameter tuning is to maximize the validation set's evaluation score without overfitting [52]. The optimal selection of the number of filters and the kernel size significantly influences deep learning model performance. More filters extract more intricate features, which in turn requires greater computation. Kernel size also affects feature extraction: larger kernels encompass more context but require more computation. A careful balance is crucial, because excessive filters or overly large kernels can lead to overfitting, while insufficient filters can result in underfitting. For the three proposed models, the hyperparameters were tuned by dividing the training data into a 70-30 ratio. Subsequently, each model was trained on the entire training dataset and evaluated using the held-out test data. The hyperparameters and their corresponding tuned values for each model are detailed in Tables 2 and 3, respectively.

Bayesian optimization differs from grid search and random search in that it adapts the search process based on previous trials. This assists the tuner in choosing which hyperparameters to test next, focusing on configurations that seem promising for better performance, and significantly cuts down the number of trials required, saving both time and computational resources. Bayesian optimization is therefore employed to improve the efficiency of hyperparameter tuning. Setting max trials to 30 means that Keras Tuner explores and evaluates 30 different combinations of hyperparameters to find the best configuration (Table 3). The tuner performs multiple training runs, adjusting hyperparameters such as the learning rate and the number of filters for each trial, and then compares the performance of these models based on a specified evaluation metric (e.g., validation accuracy).
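
A hedged sketch of this search with Keras Tuner; only max_trials = 30 and the 70-30 validation split come from the text, while the model body and search ranges are illustrative:

```python
import numpy as np
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    model = keras.Sequential([
        keras.layers.Conv2D(hp.Int("filters", 16, 128, step=16),
                            hp.Choice("kernel_size", [3, 5]),
                            activation="relu", input_shape=(44, 44, 1)),
        keras.layers.Flatten(),
        keras.layers.Dense(4, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(hp.Float("lr", 1e-4, 1e-2, sampling="log")),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

# 30 Bayesian-optimization trials, scored on validation accuracy (70-30 split).
tuner = kt.BayesianOptimization(build_model, objective="val_accuracy", max_trials=30)

X_train = np.random.rand(200, 44, 44, 1)      # stand-ins for the prepared DIs/labels
y_train = np.random.randint(0, 4, 200)
tuner.search(X_train, y_train, validation_split=0.3, epochs=10)
best_hps = tuner.get_best_hyperparameters(1)[0]
```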

5.2. Overall Performance

Table 4 provides a comparative overview of model performance using SIs and DIs in the context of well-log data analysis. SIs are produced by extracting specific features, creating a vector, and generating a corresponding matrix. DIs, in contrast, are built from a sequence of numbers between 1 and 11 with unique patterns formed by alternating and skipping steps.

The different models are evaluated based on precision, recall, and F1-score metrics for both the training and testing datasets. The baseline models, NWLs-CNN 1D, SIs-CNN 2D, and DIs-CNN 2D, represent the original approaches, while the proposed models, NWLs-ResCNN 1D, SIs-ResCNN 2D, and DIs-ResCNN 2D, incorporate refinements and enhancements. Comparing the original and proposed models, it is clear that the residual convolutional neural network (ResCNN) architecture significantly improves the performance across all methods. Notably, the DIs-ResCNN 2D model achieves remarkable precision, recall, and F1-score values on both training and testing datasets, indicating its effectiveness in handling the complexities of the well-log data. These findings highlight the importance of utilizing structured SIs and DIs in combination with advanced deep learning architectures such as ResCNN to enhance the accuracy and reliability of well-log data analysis. The distinct patterns generated by DIs contribute to the improved performance of the models, demonstrating the potential of this approach in the field of geological data processing and interpretation. The proposed DIs-ResCNN 2D model had the highest metrics among all the models on both the training and test sets: on the training set, a precision of 93%, recall of 98%, and F1-score of 95%; on the test set, a precision of 78%, recall of 81%, and F1-score of 79%.

Figure 14 illustrates bar charts comparing the original models with the proposed models in terms of the accuracy and F1-score evaluation metrics on both the training and testing datasets. The utilization of the DI technique has clearly resulted in improved performance for the DIs-CNN 2D and DIs-ResCNN 2D models in comparison to the other models. This highlights the effectiveness of generating images using the proposed DI method.

Figures 15 and 16 visually display the quality and accuracy of the NWLs-ResCNN 1D and DIs-ResCNN 2D models in determining the facies of the well excluded from training (the test well). As is clear, especially in the parts marked with a dashed line, converting NWLs to DIs with the presented approach and using the generated images as input to the CNN has resulted in a significant improvement in the determination of the ANHYDR and ILLITE facies.

Figures 17 and 18 show the class prediction error plots for the original models, and Figures 19 and 20 show the proposed models. As mentioned earlier, these plots provide better insights into the model’s performance compared to the confusion matrix. They also offer a simpler analysis, particularly when dealing with multiple classification problems. In error plot visualizations, the horizontal axis represents the actual lithological facies. The accompanying legend enables a clear distinction between accurately identified and misclassified facies by the model. These insights are imperative for refining the model and enhancing its predictive capabilities, thereby ensuring more accurate facies predictions in complex geological scenarios.

In Figures 17 and 18, the DIs-CNN 2D model has a higher recall in predicting illite than the other two models, namely, NWLs-CNN 1D and SIs-CNN 2D. It also exhibits higher precision and recall in determining the dolomite facies. Comparing Figures 18(c) and 20(c), the most obvious change is the trade-off between precision and recall, which, according to Table 4, has improved the F1-score. The remaining models are comparable by the same analysis in terms of precision, recall, F1-score, and accuracy.

Figure 21 depicts the loss and accuracy values of the presented models, namely, DIs-ResCNN 2D, SIs-ResCNN 2D, and NWLs-ResCNN 1D, on the training data over 100 epochs. As is evident, accuracy values exhibit a gradual increase, while loss values decrease, for all three models. The model trained with DIs demonstrates the effectiveness of DIs in enhancing the performance of the proposed model. Additionally, the learning process of DIs-ResCNN 2D and SIs-ResCNN 2D proceeds more rapidly in the initial epochs, indicating an enhancement in the models' ability to recognize patterns.

5.3. Comparison to Baseline Models

In this study, a method for generating SIs and DIs, as well as residual convolutional neural networks, was introduced. The performance of the proposed model was evaluated by comparing it to established baseline models in this section (Table 5). The primary objective was to assess the effectiveness and superiority of our approach in addressing the problem under investigation.

Baseline models serve as reference points when evaluating new approaches. By comparing our model to these baseline models, the aim is to demonstrate its ability to surpass, or at least match, the state of existing techniques. The analysis results highlight the strengths of our proposed model and identify possible limitations and areas for further improvement. The DIs-CNN 2D model has demonstrated improvements over the baseline models in terms of test accuracy as well as F1-score (Figure 22).

The DIs-ResCNN 2D outperformed the other models, achieving a training accuracy of 97% and a test accuracy of 92%. Additionally, it obtained a test F1-score of 79%, indicating its proficiency in handling both precision and recall. The incorporation of residual connections (residual blocks) in the architecture likely contributed to efficient training and improved generalization. Both the DIs-CNN 2D and DIs-ResCNN 2D models exhibit better performance compared to the other proposed and baseline models. The application of images generated from the data and the incorporation of residual connections have proven to be successful techniques for this specific lithology classification task. Using DIs and residual CNNs improves accuracy and F1-score, measures that matter for ensuring accurate predictions and managing imbalanced categories.

5.4. Model Robustness

Incorporating the Gaussian noise technique into the model design enhances the robustness and performance of the system. This technique serves as a regularization method by adding controlled noise to the input data during training, simulating real-world variations and preparing the model to handle unexpected inputs. By integrating Gaussian noise within the layers of our network using Keras [53], we introduce an element of controlled randomness that encourages the model to learn resilient and adaptable features. This approach prevents overfitting and equips the model to make accurate predictions on unseen data, ultimately resulting in improved overall performance and reliability. Figure 23 illustrates the impact of Gaussian noise on the DI at different levels. To evaluate the robustness of the proposed approach, Gaussian noise was deliberately introduced to the original data with standard deviations ranging from 0 to 0.10. This variation in standard deviations represents diverse intensities of noise, facilitating a comprehensive assessment of the model's performance across varying degrees of noise interference.
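
Both uses of Gaussian noise described here can be sketched as follows; the noise levels match the 0-0.10 sweep, while the test array is a stand-in:

```python
import numpy as np
from tensorflow.keras import layers

# Inside the network: a GaussianNoise layer perturbs inputs only during training.
noise_layer = layers.GaussianNoise(stddev=0.05)

# For the robustness sweep: add noise of increasing strength to held-out data.
def add_noise(X, std):
    return X + np.random.normal(0.0, std, size=X.shape)

X_test = np.random.rand(100, 44, 44, 1)       # stand-in for the prepared test DIs
for std in (0.0, 0.02, 0.05, 0.10):
    X_noisy = add_noise(X_test, std)          # evaluate the trained model on X_noisy
```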

Controlled randomness is encouraged by incorporating Gaussian noise into the input, promoting the learning of robust features, preventing overfitting, and enabling predictions on new, noisy datasets. Increased model resilience in challenging scenarios has been observed through validation. Although accuracy decreases with increasing levels of noise, the model still performs well overall, indicating improved generalization (Figure 24).

5.5. Effect of Data Balancing

Data balancing is a crucial technique that involves redistributing data in a dataset to achieve a balanced representation across various classes. This is especially important when dealing with imbalanced data [54]. In supervised learning, it is crucial to train a model using balanced data to ensure that the model is equally informed about all classes [55].

The investigated data is unbalanced; in other words, the numbers of samples per facies are unequal. Thus, another aspect to consider is balancing the data. The synthetic minority oversampling technique (SMOTE) is a proven approach for dealing with imbalanced datasets in machine learning tasks, particularly when one class has far more samples than the others. The number of samples for the minority classes was increased by utilizing SMOTE to balance the dataset. This led to an improvement in the F1-score for all classes, as shown in Table 6 (before SMOTE) and Table 7 (after SMOTE).
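
A minimal imbalanced-learn sketch of this balancing step; the class counts are illustrative stand-ins loosely echoing Section 2.1 (ILLITE is the rare facies):

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

y_train = np.repeat([0, 1, 2, 3], [96, 159, 151, 10])   # imbalanced toy labels
X_train = np.random.rand(len(y_train), 11)

X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)
print(Counter(y_train))   # imbalanced input
print(Counter(y_bal))     # all four facies equally represented after SMOTE
```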

The improvement in F1-scores for all classes demonstrates the positive impact of SMOTE on the learning process. However, SMOTE's impact on model performance can vary depending on the dataset and classification problem. While similar improvements cannot always be assumed, it has been beneficial in this case, and it is essential to evaluate its effectiveness on a case-by-case basis. The integration of SMOTE into this lithology classification task has effectively resolved the problem of class imbalance, leading to improved performance metrics for the individual classes as well as overall accuracy.

5.6. Impact of Attention Mechanism

Attention is a mechanism in neural networks that enables the model to focus on specific parts of the input sequence when making predictions [56]. The key principle behind attention is to allocate different levels of importance or weights to different elements in the input sequence, which allows the model to weigh the contributions of each element differently. In the context of image classification or computer vision, attention mechanisms can be employed spatially to focus on specific pixels of an image [57, 58].
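
The paper does not specify its attention variant; the following sigmoid-gated spatial attention is one common construction, shown purely for illustration:

```python
from tensorflow import keras
from tensorflow.keras import layers

def spatial_attention(feature_map):
    # A 1x1 convolution yields a per-pixel weight in (0, 1) ...
    weights = layers.Conv2D(1, kernel_size=1, activation="sigmoid")(feature_map)
    # ... which rescales the feature map so salient pixels dominate.
    return layers.Multiply()([feature_map, weights])

inputs = keras.Input(shape=(44, 44, 32))   # e.g. an intermediate DI feature map
attended = spatial_attention(inputs)
```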

Table 6 provides insights into the performance of the DIs-ResCNN 2D model before the implementation of an attention mechanism. The effect of the attention mechanism on the DIs-ResCNN 2D model has been investigated, and its results are shown in Table 8.

Before using the attention mechanism, the model achieved an F1-score of 0.74 for class 0 (ANHYDR) and 0.55 for class 3 (ILLITE). After implementing the attention mechanism, there is an improvement, particularly in classes 0 and 3. The F1-score for class 0 increased to 0.75, and for class 3, it improved significantly to 0.64. The macro F1-score also rose from 0.79 to 0.82 after incorporating the attention mechanism. This improvement in macro F1-score indicates a better balance in the model's ability to correctly classify all classes, emphasizing a reduction in the bias towards specific classes.

5.7. Feature Importance

The SHapley Additive exPlanations (SHAP) technique is used to assess the importance of features in ML and DL models. One of its advantages is that it provides a comprehensive framework for interpreting different types of models [59]. It effectively handles interactions between variables and provides accurate estimates of feature importance. Unlike methods such as permutation importance, which struggle with complexity, SHAP is particularly useful for tree-based models and neural networks. It can evaluate both the importance of features across the entire dataset and their specific contributions in individual cases. An interesting aspect of SHAP is that it assigns scores indicating how features impact predictions [60, 61].

Figure 25 presents a bar chart showing the importance ranking of the eleven well-logs (the NWL features). The significance of these variables is determined by their SHAP values. This bar chart allows us to easily identify the most influential features across the entire dataset. According to the SHAP values, the two most crucial features for lithology classification based on well-logs are RHOB and PEF. These features play a role in identifying the lithology and mineralogy of rocks, which holds importance in the oil and gas industry.

Figure 26 displays the SHAP values for each class in lithology classification based on well-logs. These SHAP values represent the contribution of each feature across all combinations of features, with the features ranked from top to bottom. In this representation, blue dots represent low variable values, while red points indicate high variable values. Analyzing this information enables us to understand how the model behaves and responds to combinations of features for each class in lithology classification, knowledge that is extremely valuable when developing models for lithology classification based on well-logs. For example, Figure 26(c) highlights the impact of the model features on class 2 (DOLOM). The PEF and PHIT features have a significant impact, while the HDRS feature has a small influence. It is interesting to note that as PEF increases, the SHAP value decreases, and vice versa.
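
A hedged SHAP sketch for a small Keras model on the eleven NWL features; DeepExplainer is standard SHAP usage, not necessarily the authors' exact configuration (on newer TF/SHAP versions, GradientExplainer is a drop-in alternative):

```python
import numpy as np
import shap
from tensorflow import keras

X = np.random.rand(300, 11).astype("float32")   # stand-in for standardized NWLs
model = keras.Sequential([keras.layers.Dense(16, activation="relu", input_shape=(11,)),
                          keras.layers.Dense(4, activation="softmax")])

explainer = shap.DeepExplainer(model, X[:100])   # background reference samples
shap_values = explainer.shap_values(X[100:200])  # one array per class (version-dependent)

feature_names = ["CGR", "DFL", "DT", "GR", "HDRS", "HMRS",
                 "NPHI", "PEF", "RHOB", "PHIE", "PHIT"]
# Global ranking of the eleven logs (bar chart, as in Figure 25) ...
shap.summary_plot(shap_values, X[100:200], feature_names=feature_names, plot_type="bar")
# ... and a per-class beeswarm plot, as in Figure 26 (class 2 = DOLOM).
shap.summary_plot(shap_values[2], X[100:200], feature_names=feature_names)
```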

6. Conclusion

In this comprehensive research, we have designed an advanced deep learning framework specifically developed for the accurate identification of rock facies in the oil and gas industry. Our distinctive methodology utilizes well-log data obtained during logging operations, significantly improving the accuracy and efficiency of rock facies classification.

The main elements of our framework include the use of imaging techniques, namely, shallow images and deep images, which expertly transform sensor data into synthetic images. These transformed data visualizations serve as feeds for our deep learning models, assisting them in identifying complex patterns during the learning stage. This leads to a more successful classification of rock facies compared to traditional methods as well as approaches devised by human engineers. The following outlines the key discoveries from our study:

(i) The SI and DI methods were used to convert NWLs into images that capture the complex relationships between well-log features, improving our understanding of subsurface geology.

(ii) Incorporating residual and bottleneck structures significantly improved the performance of our lithology classification model. We have demonstrated the potential of the DIs-ResCNN 2D architecture with balanced data in geological applications by outperforming the other models (SIs-ResCNN 2D and NWLs-ResCNN 1D), achieving an accuracy of 93%. Furthermore, the model excelled in precision and recall, resulting in an F1-score of 80%, highlighting its robustness and suitability for challenging geological classification tasks.

(iii) The SHAP method provided valuable insight into feature importance and model interpretability by identifying RHOB and PEF as the most influential NWL input features in the CNN predictions.

(iv) The DIs-ResCNN 2D demonstrates a good level of stability and robustness against Gaussian noise at different noise levels.

(v) As part of future research, we will explore generative adversarial networks (GANs) to generate realistic well-logs and use transfer learning methods to enhance model training and generalization. These efforts are aimed at improving the capabilities of DL in addressing complex geological challenges.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Seyed Hamid Reza Mousavi was responsible for conceptualization, methodology, software, and writing the original draft preparation. Seyed Mojtaba Hosseini-Nasab was responsible for supervision, technical advice, technical approaches, writing, reviewing, and editing.

Acknowledgments

We express our gratitude to Iran University of Science and Technology (IUST) for providing access to their cloud server infrastructure. Additionally, we appreciate the National Iranian Oil Company (NIOC) for providing the well-log data of the gas field in Iran. Their valuable contributions greatly improved our research. This study has been financially supported by IUST, with no involvement from any other funding institution; no international funding was received.