Abstract

We propose a neural architectural search model that examines histopathological images to detect the presence of cancer in both lung and colon tissues. In recent times, deep artificial neural networks have had a tremendous impact on healthcare. However, obtaining an optimal artificial neural network model that performs well during training, evaluation, and inference has been a bottleneck for researchers. Our method uses a Bayesian convolutional neural architectural search algorithm together with Gaussian processes to provide an efficient neural network architecture for colon and lung cancer classification and recognition. The proposed model uses the Gaussian process to estimate the optimal architectural values: it chooses each set of model parameters by exploiting the expected improvement (EI) values, thereby minimizing the number of sampled trials and suggesting the best model architecture. Several experiments were conducted, and strong performance was obtained on both validation and test data when the proposed model was evaluated, using convergence and F1-score metrics, on a dataset of 25,000 images from five classes.

1. Introduction

At present, lung and colon cancers are among the most prevalent and deadliest cancers leading to cancer-related deaths globally [1]. Lung and colon cancers cause more deaths per annum than breast, ovarian, and prostate cancers combined. According to recent reports, vaping and smoking have sharply increased the risk of lung cancer; although nonvapers may also develop the disease, their risk is minimal. Dietary habits, advancing age, obesity, and sedentary lifestyles [2] also contribute substantially to the rising incidence of colon cancer. People with an established family history of colorectal cancer (CRC), inflammatory bowel disease, adenomatous polyposis, or hereditary nonpolyposis colon cancer are highly prone to CRC. According to studies, 20–53% of U.S. citizens above 50 years of age are projected to develop adenomas, and the aged have about a 5% lifetime risk of developing adenocarcinomas [3]. Prompt detection of cancer is essential to its cure, but manual recognition and the stages involved in identification are cumbersome and dangerous, especially in the early stages. Biopsies and imaging such as CT scans are the two major diagnostic methods [4].

The microscopic investigation of unhealthy tissues, or histopathology, is critical to the early diagnosis and treatment of cancers [5]. In recent times, advances in digital microscopy have enabled the extraction of relevant information from these diseased tissues via whole-slide images (WSIs) of cancer tissues, using artificial neural network algorithms based on convolutional neural networks [6]. Deep artificial neural networks can independently perform pathological examinations, which are usually conducted manually, by pinpointing interpretable features that carry prognostic characteristics. In the past, deep learning networks required large amounts of data for model training and inference, which were hard to gather in the healthcare field. However, recent advancements such as single-shot and few-shot learning, data augmentation, and architectural search methods have reduced the demand for large datasets when training and deploying deep learning models in healthcare.

Numerous factors impede the automatic detection and classification of these cancerous tissues. For instance, images may be of low quality because of poor fixation and staining during tissue preparation or autofocus failures during slide digitisation. Complex tissue architecture, nuclei clutter, and variations in nuclear morphology also constitute a great challenge. Specifically, lung and colorectal adenocarcinomas and lung squamous cell carcinomas often exhibit asymmetrical chromatin textures and nuclei that are heavily cluttered together with unclear boundaries, which makes the detection of distinct nuclei a perplexing task [7]. In addition, inconsistencies in the appearance of similar nuclei within and across data samples make the classification of specific nuclei difficult. These challenges were addressed during data preprocessing.

Because of these complex patterns in images of cancer-affected tissue, the classical design of automated cancer recognizers needs domain expert knowledge to decide which specific features to extract for training an artificial neural network algorithm. This process, also referred to as feature engineering [8], is time-consuming, labor-intensive, and error-prone. Our proposed method is capable of learning the feature representations in colon and lung cancer-affected tissues, thereby eliminating feature engineering. Our hypothesis is that using a Bayesian neural architectural search to estimate the classifier architecture, with the patients’ endpoint as the outcome, has the potential to reveal recognized prognostic morphologies and also to identify previously unfamiliar prognostic structures.

In recent times, researchers have applied different artificial intelligence and deep learning strategies to classify images showing different types of health problems, including cancer, for early identification and treatment. In a related work, a collection of classifiers such as KNN (k-nearest neighbor), ANN (artificial neural network), and SVM (support vector machine) was used in conjunction with a Bayesian model to select and learn features from a leukemia dataset [9]. Hosny et al. [10] proposed an automated framework to classify skin lesions for early cancer detection, using transfer learning on a pretrained deep learning network to achieve a reliable result. In another closely related investigation, feed-forward neural networks with a deep belief network and H2O were deployed to perform cancer classification from a cancer data repository [11].

Furthermore, a deep learning pipeline for fully automated cervical cancer classification was proposed in [12]. In the proposed pipeline, two pretrained deep learning frameworks were integrated to automatically conduct the cervical tumour classification and cervix detection tasks. Also, a deep learning model was adopted to explore the possibility of classifying cancer cells from gene expression data [13]. In a similar study, a supervised cancer classification model was proposed for the molecular subtyping of cancer cells, in particular breast and colorectal cancers [14]. In another study, a convolutional neural network-driven deep learning method was deployed to perform a multiclass breast cancer classification task [15].

In addition, one study used a three-way decision-based Bayesian deep learning approach to quantify uncertainty in skin cancer classification [16]. Diverse convolutional neural network-powered deep learning models were used to perform dermatologist-level dermoscopy skin cancer classification tasks [17]. Another investigation utilised a weakly supervised 3D deep learning model to classify and localise breast cancer lesions found in MR images [18]. An optimal feature fusion from ultrasound images for breast cancer classification using a probability-driven optimal deep learning framework was introduced in [19]. A patch-based deep learning framework was introduced to perform breast cancer classification from histopathological images [20]. In this work, a rapid, deep learning-inspired framework using a Bayesian–Gaussian neural architectural search strategy is proposed. Our work is motivated by the recent drive for efficient models that are capable of rapid cancer data processing and recognition.

3. Theoretical Background

3.1. Bayesian Neural Architectural Search

Neural architectural search [21] automates the neural network training cycle by eliminating the manual selection of neural network architecture values (see Figure 1). Random and grid search [22], Bayesian optimisation [23], evolutionary search [24], reinforcement learning [25], and gradient descent [26] are some of the methods that have been proposed. Each of these algorithms has merits and demerits. For instance, grid search suffers from the “curse of dimensionality”: it requires enormous training time because the number of parameter combinations increases drastically as more parameters are added to the model. Random search tries parameter combinations at random instead of evaluating every combination as grid search does. Consequently, as the number of parameters increases, the probability of obtaining an ideal combination of parameters via random sampling tends to zero.
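The growth described above can be made concrete with a small sketch. The figures used here (10 candidate values per parameter, a "good" region covering 10% of each parameter's range, 30 random trials) are hypothetical and chosen only for illustration; they are not taken from the experiments in this work.

```python
def grid_size(values_per_parameter):
    """Number of configurations an exhaustive grid search must train."""
    total = 1
    for n in values_per_parameter:
        total *= n
    return total


def random_hit_probability(good_fraction, n_trials):
    """Chance that at least one of n_trials uniform random draws lands in a
    region that covers good_fraction of the whole search space."""
    return 1.0 - (1.0 - good_fraction) ** n_trials


# Hypothetical setting: 10 values per parameter, good region = 10% per axis.
for d in (1, 2, 3, 4, 5):
    trials_needed = grid_size([10] * d)
    hit = random_hit_probability(0.1 ** d, n_trials=30)
    print(f"{d} parameters: grid = {trials_needed} trials, "
          f"random-search hit probability with 30 trials = {hit:.4f}")
```

The grid grows by a factor of ten with each added parameter, while the chance that a fixed budget of random trials hits the good region shrinks towards zero.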

Bayesian optimisation using the Gaussian process algorithm provides a better alternative for the zeroth-order optimisation of the expensive function evaluations needed for artificial neural network architecture selection. In each Bayesian optimisation iteration, we train and observe a subset of the candidate neural networks to gauge the accuracy of unknown model architectures in the search domain. This method addresses the aforementioned problems of the other search algorithms and eliminates the need to manually construct distance functions between neural networks. Bayesian optimisation works by assuming that the unknown function was sampled from a Gaussian process (GP) and maintaining a posterior distribution for this function as observations are made. In this work, the observations are the degrees of convergence for the different choices of hyper-parameters that we intend to optimise. To choose the hyper-parameters of the next iteration, either the expected improvement (EI) [27] over the current best result or the upper confidence bound (UCB) [28] of the Gaussian process is optimised. The efficiency of UCB and EI has been confirmed in terms of the number of function evaluations necessary to attain the global optimum of numerous black-box functions [29].

3.2. The Gaussian Process (GP)

The Gaussian process (GP) is a convenient method for modelling the loss function in models that require optimisation. It is a prior over functions that is closed under sampling [29]; that is, if the prior distribution of a function f is assumed to be a GP with kernel k and mean 0, then the conditional distribution of f, given a sample of its values, is also a GP whose mean and covariance functions can be derived analytically. Gaussian processes with generic mean functions can also be used in principle, but it is efficient and easy to use only 0-mean processes for this work. We achieved this by centring the function values on the datasets being processed.
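For reference, the closure property can be written out with the standard GP conditioning identities. The notation is assumed here for illustration: $X = \{x_1, \dots, x_N\}$ are the evaluated points with observed values $y$, $K$ is the $N \times N$ kernel matrix, $k_*$ is the vector of kernel values between the evaluated points and a new point $x_*$, and $\nu$ is the observation noise variance.

```latex
% Posterior of a zero-mean GP with kernel k, conditioned on noisy observations y
K_{ij} = k(x_i, x_j), \qquad (k_*)_i = k(x_i, x_*)

\mu(x_*) = k_*^{\top} \left( K + \nu I \right)^{-1} y

\sigma^{2}(x_*) = k(x_*, x_*) - k_*^{\top} \left( K + \nu I \right)^{-1} k_*
```

The predictive mean $\mu$ and variance $\sigma^2$ are the only quantities the acquisition functions need from the model.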

3.3. Acquisition Functions for Bayesian Optimisation

We assume that the function $f(x)$ is drawn from a Gaussian process prior and that the observations are of the form $\{x_n, y_n\}_{n=1}^{N}$, where $y_n \sim \mathcal{N}(f(x_n), \nu)$ and $\nu$ is the variance of the noise introduced into the observed function. The prior and the data induce a posterior over functions. The acquisition function, denoted $a : \mathcal{X} \to \mathbb{R}^{+}$, determines which point in $\mathcal{X}$ should be evaluated next through the proxy optimisation $x_{\text{next}} = \arg\max_{x} a(x)$; many different acquisition functions have been proposed. Acquisition functions depend on the previous observations as well as on the Gaussian process hyperparameters, and this dependence is denoted $a(x; \{x_n, y_n\}, \theta)$. Many popular acquisition functions are available, but under the Gaussian process prior they depend on the model solely through its predictive mean function $\mu(x; \{x_n, y_n\}, \theta)$ and its predictive variance function $\sigma^{2}(x; \{x_n, y_n\}, \theta)$. In what follows, the current best value is denoted $x_{\text{best}} = \arg\min_{x_n} f(x_n)$, $\Phi(\cdot)$ is the cumulative distribution function of the standard normal, and $\phi(\cdot)$ is the standard normal density function [29]. Intuitively, one approach is to maximise the probability of improving on the current best result; this is known as the probability of improvement (PI) [25]. Analytically, it can be computed as follows:

$$a_{\text{PI}}(x; \{x_n, y_n\}, \theta) = \Phi(\gamma(x)),$$

where

$$\gamma(x) = \frac{f(x_{\text{best}}) - \mu(x; \{x_n, y_n\}, \theta)}{\sigma(x; \{x_n, y_n\}, \theta)}.$$

Alternatively, in this work we chose to maximise the expected improvement (EI) over the current best result, which also has a closed form under the Gaussian process:

$$a_{\text{EI}}(x; \{x_n, y_n\}, \theta) = \sigma(x; \{x_n, y_n\}, \theta) \left( \gamma(x)\, \Phi(\gamma(x)) + \phi(\gamma(x)) \right).$$
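As a minimal sketch of how these acquisition values could be evaluated, the code below uses the standard closed forms for a function being minimised: PI = Φ(γ) and EI = σ(γΦ(γ) + φ(γ)), with γ = (f_best − μ)/σ. The function names, the three candidates, and their predicted means and standard deviations are hypothetical and are not results from this work.

```python
import math


def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, f_best):
    """PI for minimisation; assumes sigma > 0."""
    gamma = (f_best - mu) / sigma
    return normal_cdf(gamma)


def expected_improvement(mu, sigma, f_best):
    """EI for minimisation; assumes sigma > 0."""
    gamma = (f_best - mu) / sigma
    return sigma * (gamma * normal_cdf(gamma) + normal_pdf(gamma))


# Hypothetical GP predictions (mean, standard deviation) of the loss
# at three candidate architectures, and the best loss observed so far.
candidates = {"A": (0.30, 0.02), "B": (0.28, 0.10), "C": (0.35, 0.20)}
f_best = 0.29

scores = {name: expected_improvement(mu, sigma, f_best)
          for name, (mu, sigma) in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy setting, candidate C obtains the largest EI even though its predicted mean is the worst, because its large predictive uncertainty leaves more room for improvement; this is how EI trades off exploration against exploitation.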

3.4. Convolutional Neural Network

The advent of the convolutional neural network (CNN) has ushered in a new dimension in healthcare image classification, segmentation, disease detection, and recognition. Digitalised healthcare systems have enabled the collection of the large-scale visual data required for CNN model training and inference [30]. The layers of a convolutional neural network extract, from healthy and unhealthy (cancerous) tissues, the complex features required for critical diagnostic decision-making. The CNN consists of blocks of convolutional layers and subsampling layers, each performing a distinct function needed to extract adequate information for detecting cancer in a given image.

The convolutional layer is the core block where the convolution operations are performed and important features are extracted, as shown in Figure 2. It extracts similar features from different image regions and matches them together for probabilistic decision-making. A batch of images (x1, x2, x3) is taken from the image repository and fed into the convolutional blocks (Block1, Block2, Block3, …). Filters in the convolution layers, whose weights are learned with the backpropagation algorithm, convolve over the input images to pick out key features. The pooling (subsampling) layers downsample the feature maps produced by the convolution operations. A max-pooling operation picks the largest value from each pooling window of a feature map, thereby reducing the number of parameters to be computed and making the extracted features more robust to small translations and to variations in scale, size, and shape [31]. The last layer is a fully connected layer, which accepts the outputs of all neurons in the previous layer and operates on them to produce the output (y1).
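The max-pooling step can be illustrated with a minimal pure-Python sketch. The 4 × 4 feature map and the 2 × 2 non-overlapping window are hypothetical values chosen for readability, not the configuration used in this work.

```python
def max_pool_2d(feature_map, pool=2):
    """Non-overlapping max pooling over a 2-D list (stride equals pool size)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool + 1, pool):
        pooled_row = []
        for c in range(0, cols - pool + 1, pool):
            # Collect the values inside the current pooling window.
            window = [feature_map[r + i][c + j]
                      for i in range(pool) for j in range(pool)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 feature map produced by a convolution.
feature_map = [[1, 3, 2, 0],
               [4, 6, 1, 1],
               [0, 2, 9, 5],
               [1, 1, 3, 7]]

pooled = max_pool_2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

Each 2 × 2 window is reduced to its largest value, so the 16 inputs become 4 outputs and small shifts of a strong activation inside a window do not change the result.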

4. Materials and Methods

4.1. Materials

This section details the materials and resources used in training and evaluating the proposed model. We employed the open-source Keras deep learning framework with a TensorFlow backend [32] to construct, train, and evaluate the Bayesian–Gaussian driven convolutional neural architectural search model for cancer identification. All experiments were performed on a high-end PC with an 8 GB GPU card, 16 GB of internal memory, the cuDNN library, and the CUDA Toolkit.

4.2. Dataset

The dataset used in this work consists of 25,000 colon and lung histopathological images in five classes [33]. Each class contains 5000 images placed in a separate folder, where 0, 1, …, n denote the classes of the images. The colon histopathological classes are colon adenocarcinomas and benign colonic tissues, and the lung histopathological classes are lung adenocarcinomas, lung squamous cell carcinomas, and benign lung tissues. All patient identities have been removed, and the data are freely available to AI researchers. The original size of all the images is 768 × 768 pixels. During preprocessing, however, we resized all the images to 150 × 150 pixels to minimise the computational demand and allow the dataset to fit into our computational model. The dataset was randomly split into three subsets, with 70% of the samples assigned to the training set, 20% to validation, and 10% to testing the model.
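A random 70/20/10 split of this kind can be sketched as follows. The function name and the fixed seed are assumptions made for reproducibility of the example; the source does not state how the shuffle was seeded or whether the split was stratified by class.

```python
import random


def split_indices(n_samples, train=0.7, val=0.2, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.
    The remainder after train and validation goes to the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an assumption
    n_train = int(round(n_samples * train))
    n_val = int(round(n_samples * val))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


train_idx, val_idx, test_idx = split_indices(25000)
print(len(train_idx), len(val_idx), len(test_idx))  # 17500 5000 2500
```

With 25,000 images this yields 17,500 training, 5000 validation, and 2500 test samples.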

4.3. Methods

A baseline convolutional neural network with three convolutional layers was used to train the proposed Bayesian–Gaussian inspired convolutional neural architectural search. The first layer has a kernel size of 9, a stride of 1, and 16 filters, followed by max-pooling. The second layer has the same parameters as the first, with an additional dropout layer of 0.15. The third layer has a kernel size of 9, a stride of 1, and 36 filters, followed by max-pooling and a dropout layer of 0.15. The fourth and final layer is the dynamic layer, where the neural architectural search is performed. We used categorical cross-entropy as the loss function and Adam as the optimiser. The best model score was initialised at zero before training with 30 epochs and a batch size of 128. Expected improvement was used as the acquisition function, with the number of calls set to 11. Initially, we bounded the learning rate between 1e-6 and 1e-1 with a uniform prior, the number of dense layers between 1 and 10, and the number of dense nodes between 2 and 512. We set the default parameters P to a learning rate of 1e-3, one dense layer with 16 nodes, and the rectified linear unit (ReLU) as the activation function.
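The search space described above can be written down as a small configuration sketch. All names (`SEARCH_SPACE`, `DEFAULT_PARAMETERS`, `sample_candidate`, `in_search_space`) are hypothetical, and reading the default "1–16 dense layer/node" as one dense layer with 16 nodes is an assumption. The random draw shown here only illustrates the bounds; in the proposed method the next candidate is chosen by maximising expected improvement, not by random sampling.

```python
import random

# Bounds of the three searched dimensions, as stated in the text.
SEARCH_SPACE = {
    "learning_rate": (1e-6, 1e-1),   # continuous, uniform prior
    "num_dense_layers": (1, 10),     # integer
    "num_dense_nodes": (2, 512),     # integer
}

# Default starting point (interpretation of the stated defaults is assumed).
DEFAULT_PARAMETERS = {
    "learning_rate": 1e-3,
    "num_dense_layers": 1,
    "num_dense_nodes": 16,
    "activation": "relu",
}

N_CALLS = 11  # number of architectures evaluated by the optimiser


def sample_candidate(rng):
    """Draw one candidate uniformly from the search space (illustration only)."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    return {
        "learning_rate": rng.uniform(lo, hi),
        "num_dense_layers": rng.randint(*SEARCH_SPACE["num_dense_layers"]),
        "num_dense_nodes": rng.randint(*SEARCH_SPACE["num_dense_nodes"]),
        "activation": "relu",
    }


def in_search_space(candidate):
    """Check that every searched dimension lies within its bounds."""
    return all(lo <= candidate[name] <= hi
               for name, (lo, hi) in SEARCH_SPACE.items())


rng = random.Random(0)
trials = [DEFAULT_PARAMETERS] + [sample_candidate(rng) for _ in range(N_CALLS - 1)]
print(len(trials), all(in_search_space(t) for t in trials))
```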

5. Results and Discussion

We used several evaluation metrics to determine the cancer identification performance of the proposed model.

5.1. Convergence and Matrix Plots

The convergence plot in Figure 3 shows the learning progression during training with respect to the number of calls. As the number of calls increases, the model convergence improves and attains its peak between 4 and 11 calls. Figure 4 is a matrix plot illustrating the combination of the three key training dimensions.

The first and second plots on the first row of Figure 4 show the partial dependence of the approximate change in fitness value on two dimensions when those dimensions are altered simultaneously. They represent estimates from the modelled fitness function, which is itself an approximation of the real fitness function. The partial dependence (PD) is computed by fixing an individual value for the learning rate, randomly selecting a large number of examples for the remaining dimensions of the search space, and then averaging the predicted fitness values over all these points. To show the influence of the learning rate on the average fitness, this process is repeated for other learning rates. The same procedure is followed for the partial dependence plots of the remaining dimensions.
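The partial dependence procedure described above can be sketched in a few lines. The function `toy_fitness` is a hypothetical stand-in for the Gaussian process surrogate, and all dimensions are sampled as continuous values for simplicity; neither reflects the fitted model of this work.

```python
import random


def partial_dependence(fitness, fixed_name, fixed_value, space,
                       n_samples=1000, seed=0):
    """Average the modelled fitness over random draws of all other dimensions
    while one dimension is held at fixed_value."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        point = {name: rng.uniform(lo, hi)
                 for name, (lo, hi) in space.items() if name != fixed_name}
        point[fixed_name] = fixed_value
        total += fitness(point)
    return total / n_samples


# Hypothetical surrogate: lowest fitness value near a learning rate of 0.01.
def toy_fitness(p):
    return (p["learning_rate"] - 0.01) ** 2 + 0.001 * p["num_dense_layers"]


space = {"learning_rate": (1e-6, 1e-1),
         "num_dense_layers": (1, 10),
         "num_dense_nodes": (2, 512)}

# Repeat the averaging for several learning rates to trace the PD curve.
for lr in (1e-4, 1e-2, 1e-1):
    pd_value = partial_dependence(toy_fitness, "learning_rate", lr, space)
    print(f"learning_rate={lr:g}: partial dependence = {pd_value:.5f}")
```

Because the same seed is used for every learning rate, the other dimensions are sampled identically and the differences between the printed values reflect the learning rate alone.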

The sample distribution of each hyperparameter during Bayesian optimisation is shown in the histograms on the diagonal of Figure 5. The plots below the diagonal show the positions of the samples in the search space, with colour coding indicating the order in which the samples were selected. A high concentration of samples is most likely to be observed in the regions of the search space from which more samples are chosen. The top ten accuracies of the model architectural search process, drawn from 30 generated architectures, are shown in Table 1. From the table, the model with 1.85 learning rate, nine layers, and 142 dense nodes yielded the best result overall.

5.2. The Confusion Matrix

We further measure the performance of our proposed method by examining the precision, recall, and F1-score on randomly selected test samples of each class of colon and lung tissue. The recall is the ability of the proposed model to discover all the relevant cases of cancer in a given set of samples. In other words, it is the number of true positives (TP) divided by the sum of the number of true positives (TP) and the number of false negatives (FN) [34], i.e.,

$$\text{Recall} = \frac{TP}{TP + FN}.$$

The colon and lung tissue data points correctly classified as positive are the true positives (TP), and those classified as negative when they are actually positive are the false negatives (FN). The precision is the ability of the proposed method to detect only the relevant colon and lung tissue data points, that is, the number of true positives (TP) divided by the sum of the number of true positives (TP) and the number of false positives (FP) [35], i.e.,

$$\text{Precision} = \frac{TP}{TP + FP}.$$

False positives (FP) are instances where the model classifies data points as positive when they are actually negative. Furthermore, the F1-score is the harmonic mean of the precision and recall, expressed as follows:

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}.$$

Finally, the macro average is the arithmetic mean of the per-class F1-scores of the colon-lung tissue test samples, and the weighted average weights the F1-score of each class by the number of test samples from that class.
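The metrics defined in this section can be computed from a confusion matrix as in the following sketch. The 3-class confusion matrix is hypothetical and serves only to show the arithmetic; it is not the confusion matrix of the proposed model.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp   # others predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "support": tp + fn})
    return results


def macro_f1(results):
    """Unweighted mean of the per-class F1-scores."""
    return sum(r["f1"] for r in results) / len(results)


def weighted_f1(results):
    """Per-class F1-scores weighted by the number of samples in each class."""
    total = sum(r["support"] for r in results)
    return sum(r["f1"] * r["support"] for r in results) / total


# Hypothetical confusion matrix with class supports 10, 10, and 20.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 20]]

metrics = per_class_metrics(cm)
for label, m in enumerate(metrics):
    print(label, round(m["precision"], 3), round(m["recall"], 3), round(m["f1"], 3))
print("macro F1:", round(macro_f1(metrics), 3))
print("weighted F1:", round(weighted_f1(metrics), 3))
```

In this example the weighted average exceeds the macro average because the perfectly classified class has the most samples.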

Analysing the performance of the proposed model on each class of the lung-colon tissue test samples, our model recorded a 98% F1-score on 489 randomly selected lung adenocarcinoma (LA) test samples, as shown in Table 2. A 99% F1-score was achieved on 511 test samples of lung squamous (LS) tissue and 94% on 534 benign lung (BL) test samples. Likewise, the model yielded a 93% F1-score on the 512 randomly picked colon adenocarcinoma (CA) test samples and 99% on 454 benign colonic (BC) test samples. An overall test accuracy of 97% was achieved on the 2500 reserved test samples, with both the macro average and the weighted average at 97%.

We compared our result with the one obtained using our baseline conventional convolutional neural network (CNN) model, which has 3,474,501 trainable parameters and was trained for 50 epochs, as shown in Table 3. All parameters of the baseline model are the same as in the proposed model, but without the Bayesian–Gaussian architectural search process. A closer look at Table 3 indicates that the baseline model suffers from overfitting, which can be mitigated only with cumbersome additional measures, whereas our method does not require additional measures to obtain an optimal model. Our approach achieved an overall accuracy of approximately 97% (Table 2), thereby outperforming the conventional CNN, which yielded an overall accuracy of 62%, as shown in Table 3.

Furthermore, this section compares the proposed model with related works on deep learning-based lung and colon cancer classification. Since our work is based on a new dataset, some of the related results cited in this article are not completely comparable, as the datasets used in those works differ from ours. Even so, the objective of the works is the same, and they are therefore compared in Table 4.

As shown in Table 4, our method outperformed the other referenced methods in classifying and recognising cancer types. The first three works used SVMs, SC-CNN, and RF, respectively, for the cancer classification tasks and recorded accuracies between 72% and 86%, far below that of our proposed method. These are followed by the model proposed in [38], which used the ResNet-50 deep learning architecture and obtained an accuracy of 93.91%. Then [39], Hatuwal and Thapa [40], and Masud et al. [41] used conventional CNN architectures on histopathological cancer image datasets and obtained classification accuracies ranging from 97.89% to 97.92%. In contrast, our proposed method used a novel variant of the Bayesian–Gaussian architectural search process to obtain a better CNN architecture that yields superior performance in terms of accuracy and efficiency.

6. Conclusions

In this work, we proposed a neural architectural search model that examines a histopathological image to recognize the presence of some classes of cancer in both lung and colon digital images by learning and distinguishing their critical features. The method works by using key points in a given batch of data occupying a search space to suggest a suitable and efficient neural network architecture. The results of this work show that, given a sizeable histopathological image dataset, one can successfully construct an effective and efficient neural network model capable of recognizing cancer without subjecting the patient to painful and rigorous diagnostic processes. Unlike conventional artificial neural network models, this technique works without manually setting the network architecture. In the future, we plan to increase the robustness of the model by adding more cancer classes and cases and to further improve the model’s efficiency and accuracy.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by Dongseo University, “Dongseo Cluster Project” Research Fund of 2022 (DSU-20220006).