Abstract

Breast cancer is characterized by abnormal discontinuities in the lining cells of a woman's milk duct. Large numbers of women die from breast cancer as a result of developing symptoms in the milk ducts; if the diagnosis is made early, the death rate can be decreased. For radiologists and physicians, manually analyzing mammography images for breast cancer is time-consuming. To avoid manual analysis and simplify the classification task, this paper introduces a novel hybrid DenseNet121-based Extreme Learning Machine (ELM) model for classifying breast cancer from mammogram images. The mammogram images were processed through preprocessing and data augmentation phases. In the first stage of classification, the features were collected after the pooling and flatten layers. These features are then fed as input to the fully connected layer of the proposed DenseNet121+ELM model, in which an extreme learning machine replaces the conventional fully connected layer. The weights of the extreme learning machine were updated by the AdaGrad optimization algorithm to increase the model's robustness and performance; AdaGrad was chosen for its faster convergence compared with other optimization techniques. In this research, mammogram images from the Digital Database for Screening Mammography (DDSM) dataset were utilized, and the results are presented. We considered batch sizes of 32, 64, and 128 and report accuracy, sensitivity, specificity, and computational time as performance measures. The proposed DenseNet121+ELM model achieves 99.47% training accuracy and 99.14% testing accuracy for batch size 128, with a specificity of 99.37%, a sensitivity of 99.94%, and a computational time of 159.7731 minutes. Further, a comparison of the performance measures for batch sizes 32, 64, and 128 is presented to show the robustness of the proposed DenseNet121+ELM model. The automatic classification performance of the DenseNet121+ELM model has much potential to be applied to the clinical diagnosis of breast cancer.

1. Introduction

Breast cancer is the uncontrolled growth of breast tissue. Breast cancer accounts for 12.5% of all new cancer cases each year, making it the most prevalent cancer in the world. About 30% of newly diagnosed malignancies in women were predicted to be breast cancers in 2022. Being a woman and getting older are the two most significant risk factors for breast cancer. Cancer is currently the biggest global public health issue. According to the WHO (World Health Organization), the IARC (International Agency for Research on Cancer), and the GBD (Global Burden of Disease) Cancer Collaboration, cancer cases rose 28 percent from 2006 to 2010, and 2.7 million new cases are expected in 2030. Breast cancer affects more women than any other type of cancer (1.7 million incident cases, 535,000 deaths, and 14.9 million disability-adjusted life years). The incidence and mortality rates of breast cancer have shown enduring inequities, according to the American Cancer Society. American Cancer Society (ACS) research states that in 2022, breast cancer will be a major cause of high mortality rates [1]; the report estimates a daily death toll of 1,670. Invasive breast cancer will affect roughly 13% (or 1 in 8) of American women during their lifetime. Treating breast cancer at an early stage is therefore an essential part of the diagnosis, and early diagnosis is highly significant for improving the survival rate. Diagnosing breast cancer by manual intervention consumes a great deal of time to interpret and classify mammogram images. In recent years, deep learning algorithms such as CNNs have proved their capability to detect breast cancer from pathological images, although some models were found to fail due to overfitting. We believe this complex diagnostic process can be made more accessible by designing an automated deep learning classification system; motivated by the advancement of deep learning, we have developed a novel model for the classification of breast cancer.

The development of deep learning as an image classification technique is crucial in the current research era. Several studies have proposed classifying mammography images, demonstrating the effectiveness of classifiers for binary, multiclass, and dual classification. By randomly deleting layers from CNN models during training, deep learning significantly enhances the training of deep networks [2]. MobileNets are built on an efficient architecture that constructs deep neural networks using depth-wise convolutions [3]. ResNet was suggested by Xie et al. [4] for image classification. With a training accuracy of 98%, Falconi et al. [5] recommended VGG, Xception, and ResNet for the classification of breast cancer. DenseNet and SENet were suggested by Li et al. [6] for the classification of breast cancer from histological images. With five-fold cross-validation, Wang et al. [7] proposed a modified InceptionV3 architecture and achieved an area under the curve (AUC) of 0.9468, with sensitivity and specificity of 0.886 and 0.876, respectively. With the MIAS dataset, Shin et al. [8] proposed a Multiscale All Convolutional Neural Network (MA-CNN) and achieved a sensitivity of 96% and an AUC of 0.99. A Squeeze-Excitation-Pruning (SEP) block for histopathology images for breast cancer classification was suggested by Zhu et al. [9] in their hybrid CNN architecture.
After a thorough review of the literature, it was found that there has been no research on the use of the DenseNet121+ELM model for mammography images. The lack of fast conventional CNN automatic classifiers inspired us to propose the DenseNet121+ELM model. The proposed model performs well when compared to existing CNN models for classification.

The contributions are as follows:
(i) We developed a DenseNet121 model hybridized with an extreme learning machine (ELM) at the fully connected layer.
(ii) We proposed an extreme learning machine (ELM) model after the flattened layer for classification.
(iii) We utilized AdaGrad optimization for the weight optimization of the model with batch sizes of 32, 64, and 128.

The remaining sections are organized as follows: related research is presented in Section 2; the methodology, the research diagram, and the architecture of the proposed DenseNet121+ELM model are presented in Section 3; the results are presented in Section 4; the discussion is presented in Section 5; and the conclusion is presented in Section 6, followed by the references.

2. Related Work

On the basis of medical imaging, researchers from all around the world are developing techniques for automatic breast cancer identification and classification. Owing to the rapid growth of deep learning in the medical imaging field, researchers have been drawn to this new line of research. The essential element of any diagnostic system is that it aids radiologists and medical professionals in the early identification of cancer. Various deep learning-based classification algorithms have been developed for breast cancer. Zhou et al. [10] proposed a segmentation-free radiomics technique for breast tumor classification that combines shear-wave elastography (SWE) data with a CNN; this approach produced elastic morphology data for feature extraction. For the experiment, 318 malignant and 222 benign breast tumors were used, and the proposed SWE-CNN model achieved a specificity of 95.7%, a sensitivity of 96.2%, and an accuracy of 95.8%. Mask regions with CNN (Mask R-CNN) were suggested by Chiao et al. [11] for the classification of breast cancer from ultrasound images. The data was collected from China Medical University Hospital using ultrasound images together with biopsy, histology, and diagnostic reports; a total of 307 ultrasound images from 80 cases were gathered. They achieved an overall accuracy of 85% in diagnosing benign and malignant breast tumors using the proposed Mask R-CNN. A compact SE-ResNet module for the CNN classifier was suggested by Jiang et al. [12] to enhance CNN performance with fewer parameters; the SE-ResNet combines the squeeze-and-excitation block and the residual module. The research employed the BreakHis dataset and attained an accuracy of 98.87% for binary classification and 93.81% for multiclass classification. A deep learning architecture with transfer learning was presented by Khan et al. [13] for the classification of breast cancer in breast cytology images, using a histopathology dataset provided by the LRH hospital in Peshawar, Pakistan. GoogLeNet, the Visual Geometry Group Network (VGGNet), and Residual Networks (ResNet) were utilized for feature extraction, and the combined features were then given to a fully connected layer with average pooling for the classification of cancerous and benign cells; the proposed framework reached an accuracy of 97.525%. The deep learning Xception model for breast cancer classification was proposed by Abunasser et al. [14], using 7,909 microscopic images from the BreakHis breast cancer dataset (2,480 benign and 5,429 malignant samples); the proposed Xception model attained a training accuracy of 99.78%, with precision and recall of 97.60% and 97.58%, respectively. A hybrid improved marine predators algorithm- (IMPA-) ResNet50 model with transfer learning was proposed by Houssein et al. in 2022 [15], in which the IMPA identifies the optimum hyperparameters of the CNN architecture. The DDSM and MIAS datasets were employed for experimentation: on the MIAS dataset the model achieved 98.88% accuracy, 97.61% sensitivity, and 98.40% specificity, while on the DDSM dataset it earned an accuracy of 98.32%, a sensitivity of 98.56%, and a specificity of 98.68%. The DenseNet CNN model was proposed by Nawaz et al. [16] for the multiclass classification of breast cancer, predicting cancer subclasses such as fibroadenoma and lobular carcinoma.
The DenseNet CNN model produced outstanding processing results with 95.4% accuracy on the histopathology BreakHis image dataset. Khan et al. [17] proposed a deep CNN ResNet50 model to segment and classify breast abnormalities into benign and malignant cases; the proposed model reached 88% accuracy in classifying abnormalities such as masses, calcifications, carcinomas, and asymmetry in mammograms. To distinguish between normal tissue and benign lesions in hematoxylin-eosin-stained breast cancer microscopy images, Hameed et al. [18] proposed Xception with six intermediate layers, exploiting the model's own layering structure for classification. The performance of the model was investigated on four normalized datasets resulting from Reinhard, Ruifrok, Macenko, and Vahadane stain normalization; using cross-validation, they achieved 97.79% accuracy with a kappa value of 0.965. Hu moments, Haralick textures, color histogram feature extraction techniques, and a DNN classifier were proposed by Joseph et al. [19] for the multiclassification of histopathology images. To prevent overfitting, the DNN employs four dense layers, a softmax function, and data augmentation; when classifying magnification-dependent histopathological images from the BreakHis dataset, 97.87% accuracy was attained. To enhance color separation and contrast, Alkassar et al. [20] suggested a magnification-specific binary (MSB) and magnification-specific multicategory (MSM) classification approach that normalizes the hematoxylin and eosin stains. On the BreakHis histopathology dataset, two distinct feature types, deep and shallow features, were extracted using DenseNet and Xception networks, achieving accuracies of 99% and 92% for MSB and MSM classification, respectively. Altameem et al. [21] proposed an ensemble approach in which the Gompertz function forms fuzzy rankings of the base classifiers; InceptionV4, ResNet-164, VGG-11, and DenseNet121 were considered as base classifiers on four datasets (DDSM, BCDR, Mini-MIAS, and INbreast). An accuracy of 99.32% was achieved by the InceptionV4 model with the fuzzy rank-based Gompertz function, higher than the other ResNet-164, VGG-11, and DenseNet121 base models. Alqhtani [22] proposed a novel layer-based Convolutional Neural Network (BreastCNN) method for breast cancer, which works in five different layers and uses different types of filters; breast cancer was classified with an accuracy of 99.7% using the Database for Mastology Research (DMR), which contains 745 healthy and 261 sick images. Hosni Mahmoud et al. [23] proposed a deep CNN method for feature extraction in which the deep features are coupled with texture features for the classification of breast cancer; a support vector machine was trained on the deep CNN features along with the scale-invariant feature transform (SIFT) algorithm, achieving an accuracy of 97.8% with a TP rate of 98.45% and a TN rate of 96%. A hybrid model based on "Pulse-Coupled Neural Networks (PCNNs) and Deep Convolutional Neural Networks (CNNs)" was developed by Altaf [24] using three publicly accessible datasets (DDSM, INbreast, and BCDR), from which 900, 300, and 450 images, respectively, were used. The hybrid PCNN-CNN model attained an overall accuracy of 98.72%.
For the DDSM, INbreast, and BCDR datasets, the accuracy values were 97.5%, 96.94%, and 96.94%, respectively. Using a combination of deep neural networks (ResNet18, ShuffleNet, and InceptionV3Net, with 18, 48, and 50 hidden layers) and transfer learning on the BreakHis dataset, Aljuaid et al. [25] proposed a novel CAD method for breast cancer classification and achieved average accuracies of 99.7%, 97.66%, and 96.94% for ResNet, InceptionV3Net, and ShuffleNet, respectively, for binary classification, and average accuracies of 97.81%, 96.07%, and 95.79%, respectively, for multiclassification.

3. Materials and Methods

3.1. Research Implementation Diagram

The research flow diagram in Figure 1(a) presents the step-by-step execution of the research work.

3.2. Proposed DenseNet121+ELM Model and Its Architecture

The proposed DenseNet121+ELM model architecture is shown in Figure 1(b). After the input mammography images were resized to the required input size, the data was divided into training and testing sets. The dataset was normalized to a uniform scale and fed to the models VGG19 [26], MobileNet [2], MobileNetV2 [27], Xception [28], ResNet50V2 [28], InceptionV3 [6], InceptionResNetV2 [6], DenseNet201 [29], and DenseNet121 [27]. All the data pass through the convolution, average pooling, and flatten phases. The models VGG19 [26], MobileNet [2], MobileNetV2 [27], Xception [28], ResNet50V2 [28], InceptionV3 [6], InceptionResNetV2 [6], and DenseNet201 [29] perform classification through the conventional pipeline, i.e., a simple neural network at the fully connected layer with AdaGrad weight optimization. The DenseNet121 architecture is presented in Figure 2(b).
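To make this feature-extraction stage concrete, the following is a minimal sketch assuming a TensorFlow/Keras implementation; the 224×224 input size, ImageNet weights, and variable names are illustrative assumptions, not details taken from the paper.

```python
import tensorflow as tf

# DenseNet121 backbone without its classification head (assumption: Keras
# application with ImageNet weights and 224x224 RGB inputs).
base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))

# Global average pooling collapses each feature map to one value, yielding
# a single 1024-dimensional feature vector per mammogram.
pooled = tf.keras.layers.GlobalAveragePooling2D()(base.output)
feature_extractor = tf.keras.Model(inputs=base.input, outputs=pooled)

# Example: extract features for a dummy batch of preprocessed images.
images = tf.random.uniform((8, 224, 224, 3))
features = feature_extractor(images)  # shape: (8, 1024)
```

These pooled-and-flattened vectors are what the fully connected (here: ELM) stage consumes.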

3.2.1. ELM Model

The output function of the ELM [29] with $L$ hidden neurons is represented by
$$f_L(x) = \sum_{i=1}^{L} \beta_i\, g(w_i \cdot x + b_i) = h(x)\beta, \tag{1}$$
where $h(x)$ is the hidden-layer output vector and $\beta = [\beta_1, \ldots, \beta_L]^T$ is the output weight vector. For $N$ training samples with target matrix $T$, Equation (1) can be written as
$$H\beta = T, \tag{2}$$
where $H$ is the hidden-layer output matrix, whose elements are given by
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}.$$

Equation (2) is a linear system whose minimum-norm least-squares solution is given by
$$\beta = H^{\dagger} T, \tag{3}$$
where $H^{\dagger}$ is the Moore–Penrose generalized inverse of matrix $H$. The input $x$ is the feature vector collected from the flattened layer and fed as input to the ELM model.
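As a compact illustration of this closed-form training, here is a NumPy sketch; the function names, the tanh activation standing in for $g$, and the hidden-layer size are assumptions for illustration.

```python
import numpy as np

def elm_fit(X, T, n_hidden=512, seed=0):
    """Solve the ELM output weights beta = pinv(H) @ T in closed form.

    X: (N, d) feature matrix from the flattened layer.
    T: (N, c) one-hot target matrix.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ T                     # Moore-Penrose solution of H beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Usage on stand-in data: 100 samples, 1024 features, 2 classes.
X = np.random.rand(100, 1024)
T = np.eye(2)[np.random.randint(0, 2, 100)]
W, b, beta = elm_fit(X, T)
predictions = elm_predict(X, W, b, beta).argmax(axis=1)
```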

The DenseNet121 model has one 7×7 convolution, 58 (fifty-eight) 3×3 convolutions, 61 (sixty-one) 1×1 convolutions, four average pooling layers, and one fully connected layer. Features are extracted from the dense blocks and pass through transition layers; each transition layer contains one convolutional layer and one average pooling layer with a stride of 2. In this study, the classification performance on mammography images was improved specifically by combining the DenseNet121 model with the extreme learning machine. The extreme learning machine, a feed-forward neural network, serves as the classifier at the fully connected layer: we replaced the conventional neural network with the ELM at the fully connected layer of the DenseNet121 model. The weights of the DenseNet121+ELM model were optimized using the AdaGrad optimization technique. The architecture of the ELM is presented in Figure 2(a). For batch size 128, the classification performance results are shown in Table 6.
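Putting the two stages together, a hedged sketch of the hybrid pipeline might look as follows; in the paper the ELM weights are additionally refined with AdaGrad (Section 3.2.2), which is omitted here, and all names and sizes are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# DenseNet121 backbone with global average pooling (as sketched above).
base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
extractor = tf.keras.Model(
    base.input, tf.keras.layers.GlobalAveragePooling2D()(base.output))

def densenet_elm_train(images, targets_onehot, n_hidden=512, seed=0):
    """ELM head standing in for the fully connected layer of DenseNet121."""
    X = extractor.predict(images, verbose=0)      # flattened feature vectors
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer outputs
    beta = np.linalg.pinv(H) @ targets_onehot     # closed-form output weights
    return W, b, beta
```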

3.2.2. Steps of the Algorithm

In practice, CNN weights are optimized with the backpropagation algorithm, and the weights of the DenseNet121 model are likewise optimized through backpropagation. We considered the AdaGrad optimizer for our research.

Step 1. The resized input images are considered for this research. The images undergo data augmentation, and the augmented images are fed as input to the models. Convolution then takes place between the image and the filter. Let $I$ of size $m \times n$ be the real input image, and let the filter $K$ of size $p \times q$ be chosen randomly. The convolution is given by
$$C(i, j) = (I * K)(i, j) = \sum_{a} \sum_{b} I(i - a,\, j - b)\, K(a, b),$$
where $0 \le i < m + p - 1$ and $0 \le j < n + q - 1$.
By rotating the filter by 180 degrees (flipping it along both axes), convolution reduces to a cross-correlation, and we have
$$(I * K)(i, j) = \sum_{a} \sum_{b} I(i + a,\, j + b)\, \operatorname{rot180}(K)(a, b).$$
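The 180-degree rotation identity can be checked numerically; the SciPy helpers below are used only for illustration (the paper does not name a library).

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

image = np.arange(25.0).reshape(5, 5)            # toy 5x5 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])     # toy 2x2 filter

# True 2-D convolution ...
conv = convolve2d(image, kernel, mode="valid")
# ... equals cross-correlation with the kernel rotated by 180 degrees.
corr = correlate2d(image, np.rot90(kernel, 2), mode="valid")

assert np.allclose(conv, corr)
```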

Step 2. After convolution, the data passes through the dense layers, and average pooling is performed with a pool size of (2, 2) and a stride of 2.
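For concreteness, a minimal NumPy version of this (2, 2), stride-2 average pooling; in the actual model this is handled by the network's pooling layers.

```python
import numpy as np

def avg_pool2d(x, pool=2, stride=2):
    """Average pooling over non-overlapping pool x pool windows."""
    h_out, w_out = x.shape[0] // stride, x.shape[1] // stride
    out = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            out[i, j] = x[i*stride:i*stride+pool, j*stride:j*stride+pool].mean()
    return out

print(avg_pool2d(np.arange(16.0).reshape(4, 4)))  # 4x4 map -> 2x2 map
```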

Step 3. After average pooling, the features are flattened by the flatten layer, converting the feature maps into a single feature vector, which is fed as input to the fully connected ELM layer. The weights are updated with AdaGrad, which adapts the learning rate per parameter.
The update formulae are as follows: with gradient $g_t = \nabla_w \mathcal{L}(w_t)$ at iteration $t$, the squared gradients are accumulated as
$$G_t = G_{t-1} + g_t \odot g_t.$$
The new weight optimization equation is then given by
$$w_{t+1} = w_t - \frac{\eta}{\sqrt{G_t} + \epsilon}\, g_t,$$
where the learning parameter value of $\eta = 0.001$ is chosen and $\epsilon$ is a small constant that prevents division by zero. The flattened feature vector is given by $v = [v_1, v_2, \ldots, v_n]^T$, where $n$ is the total number of pooled feature values and $v$ is the input to the ELM hidden layer.
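A minimal sketch of the AdaGrad update above, with $\eta = 0.001$ as in the paper; the function name and the value of $\epsilon$ are assumptions.

```python
import numpy as np

def adagrad_step(w, grad, G, lr=0.001, eps=1e-8):
    """One AdaGrad update: accumulate squared gradients, then scale the step."""
    G = G + grad * grad                     # G_t = G_{t-1} + g_t (element-wise square)
    w = w - lr * grad / (np.sqrt(G) + eps)  # per-parameter adaptive learning rate
    return w, G

# Usage: minimize f(w) = ||w||^2 from a random start.
w = np.random.rand(5)
G = np.zeros_like(w)
for _ in range(1000):
    grad = 2 * w                            # gradient of ||w||^2
    w, G = adagrad_step(w, grad, G)
```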

Step 4. The output of the fully connected layer is passed through the softmax layer to classify the images into cancerous and noncancerous.
The output of the hidden layer is given by
$$H = g(W^T v + b),$$
where $v$ is the feature vector from the flattened layer. The output $z = H\beta$ is then passed through the softmax layer, whose equation is given by
$$P(y = c \mid z) = \frac{e^{z_c}}{\sum_{j=1}^{2} e^{z_j}},$$
where $c$ indexes the output classes, cancerous and noncancerous.
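The softmax of Step 4 in a short, numerically stable form (a sketch, with illustrative logits):

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis, shifted by the row max for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.1, -0.4]])   # ELM outputs for [cancerous, noncancerous]
print(softmax(logits))             # class probabilities summing to 1
```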

3.3. Dataset

This study uses the Digital Database for Screening Mammography (DDSM) dataset [17], collected from an open-source website; sample images are shown in Figure 3. Classification models are compared on this dataset in terms of accuracy. The dataset is available at https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset.

3.4. Data Augmentation

The data augmentation process comprised shifting, rotating, and flipping. A total of 3,672 images went through the data augmentation phase, generating about 11,000 images. Table 2 shows that, of the total data, 70% of the images are used for training and 30% for testing. Table 3 lists the hyperparameters applied during the experiment.
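A hedged Keras sketch of such an augmentation pipeline is given below; the paper names only the operation types, so the numeric shift/rotation ranges and the directory path are illustrative assumptions.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Shifting, rotating, and flipping, as named in the paper; the numeric
# ranges are assumptions chosen for illustration only.
augmenter = ImageDataGenerator(
    width_shift_range=0.1,    # horizontal shifting
    height_shift_range=0.1,   # vertical shifting
    rotation_range=15,        # rotation in degrees
    horizontal_flip=True,     # flipping
    vertical_flip=True,
)

# Stream augmented mammograms batch by batch; "data/train" is hypothetical.
train_gen = augmenter.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="binary")
```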

4. Results

4.1. Classification Results

For training and testing, 11,000 images were used during classification, and the details of the image distribution are provided in Table 2. The training accuracies of DenseNet121, DenseNet201, InceptionResNetV2, and DenseNet121+ELM are compared in Figure 4. Table 4 compares the training accuracy of DenseNet121+ELM with the other listed models and shows that it performs better.

For batch size 32, the performance results are shown in Table 4. Figures 4 and 5 show the training accuracy and training loss for a batch size of 32. The proposed DenseNet121+ELM model incurs less training loss than the DenseNet121, DenseNet201, and InceptionResNetV2 models, and it converged in about 20 iterations, compared to almost 50, 75, and 84 iterations for DenseNet121, DenseNet201, and InceptionResNetV2, respectively.

The training accuracy and loss for batch size 64 are shown in Figures 6 and 7. While DenseNet121, DenseNet201, and InceptionResNetV2 required roughly 40, 60, and 72 iterations, respectively, for convergence, the proposed DenseNet121+ELM required just about 15 iterations. Compared to the other models, the proposed DenseNet121+ELM model again has the lowest training loss. For batch size 64, the performance results are shown in Table 5.

The training accuracy and loss for a batch size of 128 are shown in Figures 8 and 9. The proposed DenseNet121+ELM required just about 15 iterations to reach convergence, compared to almost 20, 25, and 45 iterations for DenseNet121, DenseNet201, and InceptionResNetV2, respectively. DenseNet201 shows the highest training loss, whereas the proposed DenseNet121+ELM model shows the lowest. The proposed DenseNet121+ELM model also took less computational time, 159.7731 minutes, compared to 160.4033 minutes for DenseNet121, 164.3344 minutes for DenseNet201, and 167.1242 minutes for InceptionResNetV2. For batch size 128, the performance results are shown in Table 6. The system's performance parameters, such as sensitivity, specificity, and accuracy, are also crucial [12] for classification:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$
where "true positive," "true negative," "false positive," and "false negative" are denoted by TP, TN, FP, and FN. We also took computational time into account, which is a crucial performance measure.
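These measures follow directly from the confusion-matrix counts, for example:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall), and specificity from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity

# Example with made-up counts:
print(classification_metrics(tp=95, tn=90, fp=5, fn=10))
```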

From Table 6, it can be seen that the 128 batch size leads to improved training and testing accuracy as well as faster computation: batch size 128 resulted in 99.34% training accuracy, 98.84% testing accuracy, and a computation time of 159.7731 minutes, whereas the computation took 167.3545 minutes for batch size 32 and 163.8975 minutes for batch size 64, as shown in Tables 4 and 5, respectively. The testing accuracy is a crucial performance indicator that lends the models credibility, regardless of how well they were trained. Compared to the other models, the proposed DenseNet121+ELM model suffers less loss, and it converged in substantially fewer epochs than the total number of training epochs. Table 1 compares the proposed model's accuracy with earlier research. It can be concluded that the proposed DenseNet121+ELM model shows better performance with batch size 128 than with batch sizes 32 and 64. Figure 10 presents the comparison of all the models considered in this research.

5. Discussion

The classification procedure took into account a total of 11,000 images, of which 7,700 were used for training and 3,300 for testing; Table 2 gives more information about the augmented images. The VGG19, MobileNet, MobileNetV2, Xception, ResNet50V2, InceptionV3, InceptionResNetV2, DenseNet201, DenseNet121, and DenseNet121+ELM models were employed to classify the augmented images, and the fully connected layer weights of the CNN models were tuned using the AdaGrad algorithm. The training performance of the cancerous and noncancerous classification of breast cancer is shown in Figures 4, 6, and 8. With batch sizes of 32, 64, and 128, all the mentioned models were considered for the classification along with the proposed DenseNet121+ELM model. The training losses for InceptionResNetV2, DenseNet201, DenseNet121, and DenseNet121+ELM are shown in Figures 5, 7, and 9.

With a batch size of 128, the VGG19 model achieved training and testing accuracies of 97.76% and 97.98%, respectively, along with a specificity of 97.73%, a sensitivity of 97.88%, and a computational time of 211.2344 minutes. The MobileNet model achieved training and testing accuracies of 98.23% and 98.42%, with 98.53% specificity, 98.41% sensitivity, and 198.1212 minutes of computational time. The Xception model achieved 98.85% and 98.73% training and testing accuracies, with 98.65% specificity, 98.82% sensitivity, and 194.2076 minutes of processing time. The ResNet50V2 model achieved training and testing accuracies of 98.56% and 98.87%, respectively, and took 193.1878 minutes of computational time for convergence. The InceptionV3 model reached training and testing accuracies of 98.63% and 98.52%, respectively, with 100% specificity, 99.52% sensitivity, and 183.1881 minutes of processing time. The InceptionResNetV2 model achieved 98.58% training and 98.47% testing accuracy, with 98.59% specificity, 100% sensitivity, and a computational time of 167.1242 minutes. The DenseNet201 model achieved training and testing accuracies of 98.98% and 98.85%, with 98.84% specificity, 99.28% sensitivity, and a computational time of 164.3344 minutes. The DenseNet121 model achieved training and testing accuracies of 99.27% and 98.84%, with 99.18% specificity, 100% sensitivity, and a computation time of 160.4033 minutes. Finally, the proposed DenseNet121+ELM model achieved training and testing accuracies of 99.47% and 99.14%, respectively, with 99.37% specificity, 99.94% sensitivity, and 159.7731 minutes of computational time. The training and testing data for all models with batch sizes of 32 and 64 are presented in Tables 4 and 5. One hundred epochs were taken into account for all classification performance studies. The DenseNet121+ELM model has thus been shown to be suitable and worthy for classifying breast cancer.

6. Conclusion

In this study, a novel DenseNet121+ELM model was proposed for classifying breast cancer from mammography images. The ELM model took the place of the conventional neural network in the fully connected layer of the proposed DenseNet121+ELM model. The preprocessed images underwent data augmentation, and the augmented images served as the classification input to the DenseNet121+ELM model. AdaGrad optimization was used for weight optimization. The DDSM high-resolution breast-imaging dataset was considered for the classification, and the InceptionResNetV2, DenseNet201, DenseNet121, and DenseNet121+ELM models were considered for the figure illustrations. Batch sizes of 32, 64, and 128 and a learning rate of 0.001 were used in this study. In comparison to the other models, the DenseNet121+ELM model converges faster during training and testing, and it proved to be a reliable classifier for separating cancerous from noncancerous breast images. Compared to conventional CNN algorithms, the proposed DenseNet121+ELM model will aid medical professionals and radiologists in recognizing breast cancer without manual intervention. The long simulation time of the proposed DenseNet121+ELM model is its drawback; however, with faster processing units, the model can still provide a superior solution for medical diagnosis. To the best of our knowledge, this model is new and has not been utilized elsewhere. In future research, the proposed DenseNet121+ELM model can be employed for brain tumor datasets, liver disease datasets, etc.

Data Availability

The dataset is collected from the publicly available website https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset.

Conflicts of Interest

The authors of the paper, Satyasis Mishra, Raj Kumar Pattnaik, Mohammed Siddique, Sunita Satapathy, and Tiruveedula Gopikrishna, declare that they have no conflicts of interest, financial or otherwise.

Authors’ Contributions

Satyasis Mishra prepared the documentation and the methodology part of the research. Raj Kumar Pattnaik formatted the document per the journal requirements and helped collect and preprocess the data. Mohammed Siddique compiled the research diagrams and the Python programs. Sunita Satapathy collected data from different parts of Ethiopia. Tiruveedula Gopikrishna prepared all the simulation work and all figures with the GPU system. All authors reviewed the manuscript.

Acknowledgments

The authors thank Dr. Mohammed Naimuddin for the critical reading of the manuscript and language editing.