Abstract

Breast cancer microscopy images carry information about a patient’s disease, and automated mitotic cell detection is widely used to reduce pathologists’ heavy workload and help them make clinical decisions quickly. Several methods have been introduced to solve the automated mitotic cell counting problem. However, they fail to differentiate reliably between mitotic and nonmitotic cells and suffer from a class imbalance problem, which degrades performance. This paper proposes a Representation Differential Learning Method (RDLM), a deep learning approach that detects the mitotic cell area on pathological images. The proposed method consists of two parts: the Global bank Feature Pyramid Network (GLB-FPN) and focal loss (FL). The GLB feature fusion method with FPN makes the encoder-decoder pay more attention to extracting the regions of interest (ROIs) for mitotic cells. On this basis, we extend the GLB-FPN with a focal loss to mitigate the data imbalance problem during the training stage. Extensive experiments on the MITOS-ATYPIA-14 contest dataset show that RDLM outperforms other approaches both in visualization and in quantitative metrics. Our framework reaches an F1-score of 0.692. Additionally, RDLM improves the F1-score by 5% over GLB with FPN on the mitosis detection task.

1. Introduction

According to the World Health Organization (WHO), over 2 million new cases of breast cancer were reported in 2018; worldwide, the incidence and mortality ratios were 11.6% and 6.6%, respectively [1]. Breast cancer is the second most common cancer globally and occurs mostly in women, in both developed and underdeveloped countries. Breast cancer occurrences are expected to increase by more than 50% from 2011 to 2030 [2]. The histologic tumour grade, also named the Nottingham Grading System (NGS), is used by various expert global institutions, for instance, WHO, the American Joint Committee on Cancer (AJCC), the Royal College of Pathologists (UK RCPath), the College of American Pathologists (CAP), and the European Union (EU) [3]. NGS consists of three morphological features, namely nuclear pleomorphism, tubule formation, and mitotic index, which are the most challenging part of breast cancer analysis. Each feature is given a score from 1 to 3, so the total NGS score ranges from 3 to 9. NGS is divided into three grades: Grade 1 (well differentiated) is assigned for a total score of 3–5, Grade 2 (moderately differentiated) for a total score of 6–7, and Grade 3 (poorly differentiated) for a total score of 8–9 [4].
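The grading rule above is simple arithmetic and can be written down directly. The following is a minimal sketch; the function name `nottingham_grade` and its parameter names are illustrative and do not come from the paper.

```python
def nottingham_grade(pleomorphism: int, tubule_formation: int, mitotic_index: int) -> int:
    """Map the three NGS component scores (each 1-3) to the overall grade (1-3)."""
    scores = (pleomorphism, tubule_formation, mitotic_index)
    if any(s not in (1, 2, 3) for s in scores):
        raise ValueError("each component score must be 1, 2, or 3")
    total = sum(scores)  # ranges from 3 to 9
    if total <= 5:
        return 1  # well differentiated
    if total <= 7:
        return 2  # moderately differentiated
    return 3      # poorly differentiated


print(nottingham_grade(2, 2, 3))  # total score 7 -> grade 2
```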

Automatic mitosis detection in microscopy images of breast cancer is a challenging task, especially in clinical practice. Detection in stained high-power fields (HPFs) must cope with the fact that a mitotic cell can be in any of the four main phases of mitosis (prophase, metaphase, anaphase, and telophase), which increases the complexity of the detection task, as shown in Figure 1. Moreover, each phase has a very different shape and texture configuration. For instance, a nucleus in telophase is split into two distinct regions, even though they still form one connected mitotic cell. Nonmitotic cells on the slides, such as apoptotic cells, dense nuclei, and lymphocytes, have a morphological appearance similar to that of mitotic cells. Slide preparation and acquisition can introduce artefacts and undesirable objects, which make cells difficult to distinguish and lead to many false positives in the detection process [5]. In the last few years, various efficient automatic systems have been developed to detect mitotic cells on pathological images. Automatic mitosis detection has been a very active research area because of recent advances in digital medicine, especially in helping pathologists make clinical decisions. Some machine learning approaches based on handcrafted features [6–12] have been used to extract features. However, these traditional approaches suffer from the large shape variations of mitosis, require considerable time and effort to validate, and cannot adequately represent the characteristics of mitotic regions, resulting in low performance on contest datasets such as the ICPR 2012 [13], AMIDA 2013 [14], and ICPR 2014 [15] contest datasets.

Computer vision is one of the most rapidly advancing areas [16]. Deep learning (DL) for computer vision has helped to reduce pathologists’ heavy workload and allows them to make clinical decisions faster. DL has also made mitosis detection much easier than before, for example in mitotic cell classification [17–20], mitotic cell detection [21, 22], and mitotic cell segmentation [1, 23–25]. Deep learning techniques outperform earlier techniques, for instance, handcrafted features that capture specific mitosis characteristics for automatic detection. Some deep learning techniques combine a CNN with a deep classifier, such as the Deep Cascade Neural Network (CasNN): a CNN is first used to extract candidate regions, and a deep classifier then differentiates cancer from noncancer cells [17]. The Deep Mitosis (DM) technique, which consists of segmentation, detection, and classification tasks [21], still fails to differentiate between mitotic and nonmitotic cell shapes. Recently, the object detection and instance segmentation framework Feature Pyramid Network (FPN) has been used to tackle different kinds of imaging problems with outstanding results, such as detecting porosity and cracks in concrete CT images [26], automatic segmentation of cervical nuclei [27], thyroid nodule detection in medical ultrasound images [28], detection of teeth and their components in X-ray images [29], and lung nodule detection in chest X-ray images [30]. The focal loss (FL) has been used to solve the data imbalance problem in various kinds of biomedical datasets and has performed well for the minority class, for example in classification of red blood cell morphology [31], colon gland instance segmentation [32], and localization of cell organelles [33]. Mitoses are embedded in complex backgrounds and are influenced by various factors, including staining, lighting conditions, tissue acquisition, and the similar appearance of nonmitotic cells. In 2019, the global bank (GLB) feature fusion method was proposed to tackle the complex background problem on the 2015 Gland dataset. This feature fusion method achieved outstanding results on both segmentation and cancer embolus detection tasks compared to other proposed approaches [34].

Inspired by the above object detection framework and the GLB feature fusion method, we treat mitosis detection as an object detection problem. We propose a new deep learning method, called RDLM, that detects the mitotic cell area on a pathological image and is fast and accurate on images from both the Aperio-type and Hamamatsu-type slide scanners, as shown in Figure 2. Our proposed framework is divided into two parts: GLB and FL. The GLB feature fusion method with FPN calibrates the encoder-decoder and makes it pay more attention to extracting the regions of interest (ROIs) for mitotic cells. GLB has three phases. In the first phase, the encoder layers are connected at multiple scales by adding their features. The second phase reconnects the features and removes background noise to obtain a suitable feature map that attends more closely to the regions of interest for mitotic cells. In the final phase, the feature map is transferred to the decoder by convolutions. GLB performs well on the MITOS-ATYPIA 2014 contest dataset, whose HPFs contain large shape variations of mitosis and nonmitotic cells of similar appearance, both of which cause severe problems during the training phase. However, the training process cannot fully learn from the positive samples because most samples are easily distinguishable negatives, which creates an imbalance problem. We adopted the FL to alleviate this problem and obtained good mitosis detection results on both slide scanners. To validate the proposed RDLM in terms of visualization and quantitative metrics, we conducted experiments on the MITOS-ATYPIA 2014 contest dataset. Extensive experiments show that RDLM attains outstanding results compared to other state-of-the-art methods on both slide scanners.

The main contributions of this research can be summarized as follows. First, we demonstrate the feasibility and superiority of a global bank and focal loss in mitosis detection tasks. Second, we propose a new deep learning method called RDLM that detects the mitotic cell area on pathological images. The GLB makes the encoder-decoder pay more attention to extracting the ROIs for mitotic cells, while the FL alleviates the imbalance problem that affects the predicted mitotic cell detections. Third, we conducted extensive experiments to validate the effectiveness of the proposed method. The results show that our proposed method attains the best performance in mitosis detection on both slide scanners in terms of visualization and quantitative metrics compared to other state-of-the-art approaches.

2. Methodology Motivation

This section describes our proposed deep learning method for detecting the mitotic cell area on a pathological image, as shown in Figure 3. The FPN framework builds feature pyramids inside the convolutional neural network (CNN) and provides a top-down pathway that produces higher-resolution layers from a semantically rich layer. The GLB feature fusion method calibrates the encoder-decoder and makes it pay more attention to extracting the ROIs for the mitotic cell detection task. We adopted the FL to alleviate the imbalance problem during the training stage. Each component is detailed in the following sections.

2.1. FPN

Feature Pyramid Network (FPN) is a clean and simple framework for building feature pyramids inside a CNN. It provides a top-down pathway that produces higher-resolution layers from a semantically rich layer. The region proposal network (RPN) slides a subnetwork over every location of the multilevel feature maps to generate object proposals. Every object proposal is defined by an anchor, and each anchor scale has a corresponding level in the feature pyramid [35]. A CNN backbone is employed to extract comprehensive features of each patch from the input image. We used ResNet-50, which is efficient and powerful in image classification. The patches sampled from the WSIs have a fixed pixel size, so we resized them before feeding them into the FPN.
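The top-down pathway can be illustrated with a toy sketch on single-channel feature maps stored as nested lists. This is only an illustration under simplifying assumptions: real FPN implementations also apply lateral and smoothing convolutions, which are omitted here, and the function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add_maps(a, b):
    """Elementwise sum of two equally sized 2-D maps."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def top_down_pathway(laterals):
    """Merge bottom-up maps ordered fine -> coarse (each half the previous size).

    The coarsest map is kept as is; every finer map is summed with the
    upsampled result of the level above it.
    """
    merged = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        merged.append(add_maps(lateral, upsample2x(merged[-1])))
    return merged[::-1]  # back to fine -> coarse order


c2 = [[1.0] * 4 for _ in range(4)]   # fine, semantically weak
c3 = [[10.0] * 2 for _ in range(2)]
c4 = [[100.0]]                       # coarse, semantically rich
pyramid = top_down_pathway([c2, c3, c4])
print(pyramid[0][0][0])  # 111.0
```

The finest output receives contributions from every coarser level, which is how higher-resolution layers inherit semantically rich information.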

2.2. GLB

2.2.1. The Vault Layer

In short, the vault layer stores well-chosen features from the layers of each phase. In the encoding phase, several consecutive layers produce output maps of the same dimension and are regarded as belonging to the same network phase. For each encoder phase, we denote the last layer by $E_i$, where $i \in \{1, \dots, s\}$ and $s$ is the number of encoder phases. $E_i$ extracts better-quality features than the other layers of the same phase. By accumulating all $E_i$, we can link the phases of the encoder. However, every $E_i$ has its own dimension and depth, so before accumulating them we upsample each $E_i$ to a common dimension and then apply convolutions with $k$ filters, where $k$ is a hyperparameter. Finally, we sum the results elementwise and apply convolutions of depth $m$ to store the accumulated features in the vault layer, where $m$ is a hyperparameter. The vault layer $V$ can be described as

$$V = C_m\left(\sum_{i=1}^{s} C_k^{i}\big(\mathrm{up}(E_i)\big)\right), \tag{1}$$

where $C_m$ represents the convolutions of depth $m$, $C_k^{i}$ for $i = 1, \dots, s$ represents the convolutions with $k$ filters corresponding to $E_i$, $s$ denotes the number of encoder phases, and $\mathrm{up}$ describes the upsampling of $E_i$ to the common dimension.
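The vault-layer fusion described above (upsample each phase output, project it, and accumulate) can be sketched on toy data. This sketch makes strong simplifying assumptions that are not from the paper: feature maps are square, single-channel nested lists, the common dimension is taken to be that of the first (largest) map, and scalar weights stand in for the convolutions. All names are hypothetical.

```python
def upsample_to(fmap, size):
    """Nearest-neighbour upsampling of a square 2-D map to size x size."""
    factor = size // len(fmap)
    return [[fmap[r // factor][c // factor] for c in range(size)] for r in range(size)]


def vault_layer(encoder_maps, k_weights, m_weight):
    """Toy vault layer on single-channel maps.

    encoder_maps: last layer of each encoder phase, ordered fine -> coarse
    k_weights:    one scalar per phase, standing in for the per-phase convolution
    m_weight:     scalar standing in for the final convolution
    """
    size = len(encoder_maps[0])  # assumed common (largest) spatial dimension
    total = [[0.0] * size for _ in range(size)]
    for fmap, weight in zip(encoder_maps, k_weights):
        up = upsample_to(fmap, size)
        for r in range(size):
            for c in range(size):
                total[r][c] += weight * up[r][c]  # project, then accumulate
    return [[m_weight * v for v in row] for row in total]


e1 = [[1.0] * 4 for _ in range(4)]
e2 = [[2.0] * 2 for _ in range(2)]
e3 = [[3.0]]
vault = vault_layer([e1, e2, e3], k_weights=[1.0, 0.5, 2.0], m_weight=1.0)
print(vault[0][0])  # 1*1 + 0.5*2 + 2*3 = 8.0
```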

2.2.2. Calibration Layers and Gain Layers

In detection tasks, a calibration layer is denoted by $D_j$, where $j \in \{1, \dots, t\}$ and $t$ is the number of decoder phases. Every calibration layer has a different resolution and depth. The feature maps extracted from the vault layer are referred to as gain layers and are denoted by $G_j$, with one gain layer for each $D_j$, so that $D_j$ can be enhanced and calibrated by the corresponding $G_j$. We equalize the depth by applying convolutions with a stride of 2, which downsample the vault layer to the dimension of $D_j$, as illustrated in equation (2). The downsampling operation can be skipped when the calibration layer’s dimension already equals that of the vault layer:

$$G_j = \mathrm{down}_j(V), \tag{2}$$

where $V$ is the vault layer and $\mathrm{down}_j$ denotes the convolutions with a stride of 2 and the downsampling operations used to match the dimension of the features to that of $D_j$. We then add the calibration and gain layers elementwise, as illustrated in the following equation:

$$P_j = D_j \oplus G_j, \tag{3}$$

where $\oplus$ denotes elementwise addition and $P_j$ is the enhanced predicting layer.

The design of the proposed GLB with FPN is shown in Figure 4, where we first add the calibration layers ($D_j$) and gain layers ($G_j$) elementwise, as illustrated in equation (3). Using a gain layer to enhance a predicting layer gives full attention to local features, powerful rich semantic features, and the gain layer’s global features, resulting in improved predictions.
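The interaction of gain and calibration layers can be sketched in the same toy setting. The assumptions here are mine, not the paper's: maps are square, single-channel nested lists whose sizes differ by powers of two, plain stride-2 subsampling stands in for the stride-2 convolutions, and all names are hypothetical.

```python
def downsample2x(fmap):
    """Stride-2 subsampling of a 2-D map (stand-in for a stride-2 convolution)."""
    return [row[::2] for row in fmap[::2]]


def gain_layer(vault, target_size):
    """Downsample the vault map until it matches the calibration layer's size.

    No downsampling happens when the sizes already agree.
    """
    gain = vault
    while len(gain) > target_size:
        gain = downsample2x(gain)
    return gain


def calibrate(calibration, vault):
    """Elementwise sum of a calibration (decoder) map and its gain map."""
    gain = gain_layer(vault, len(calibration))
    return [[d + g for d, g in zip(d_row, g_row)] for d_row, g_row in zip(calibration, gain)]


vault = [[8.0] * 4 for _ in range(4)]
decoder_map = [[1.0, 2.0], [3.0, 4.0]]
print(calibrate(decoder_map, vault))  # [[9.0, 10.0], [11.0, 12.0]]
```

Because every decoder phase draws its gain map from the same vault, each prediction layer sees both its local features and the globally accumulated encoder features.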

2.3. An Objective of Focal Loss

In 2017, the focal loss was proposed by Lin et al. [36]. The authors use a piecewise function to represent the cross-entropy (CE) of the binary classification problem between foreground and background classes, and a piecewise function $p_t$ to represent the probability assigned to the true class. The label $y = 1$ marks the foreground category in a one-hot encoding [0 1] or [1 0]:

$$\mathrm{CE}(p, y) = \begin{cases} -\log(p), & y = 1, \\ -\log(1 - p), & \text{otherwise}, \end{cases} \qquad p_t = \begin{cases} p, & y = 1, \\ 1 - p, & \text{otherwise}, \end{cases} \tag{4}$$

where $p$ denotes the model prediction probability and $y$ indicates the ground truth label. Hence, $\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$. Adding a balancing factor $\alpha_t$, which weights the separate categories, gives

$$\mathrm{CE}(p_t) = -\alpha_t \log(p_t). \tag{5}$$

The factor $\gamma$ is then introduced to focus more on the hard negative samples during training. Thus, the focal loss is defined as

$$\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t). \tag{6}$$

In our case, we integrated the focal loss into our proposed method to solve the data imbalance problem in the mitosis detection task. The multilabel focal loss is defined as

$$\mathrm{FL} = -\alpha (z - p)^{\gamma} \log(p) - (1 - \alpha)\, p^{\gamma} \log(1 - p), \tag{7}$$

where $\alpha = 0.5$, $\gamma = 0.5$, $p = \mathrm{sigmoid}(x)$, and $z$ is the target in equation (7). The prediction $x$ represents the predicted logits for each class, and the target $z$ represents the one-hot encoded classification targets. $\alpha$ and $\gamma$ are hyperparameters. For a positive prediction, only the first term of the loss needs to be considered, and the second term is 0: a target greater than zero is equivalent to $z = 1$, so the positive coefficient is $z - p$. For a negative prediction, only the second term of the loss needs to be considered, and the first term is 0: wherever the target is greater than zero ($z = 1$), the negative coefficient is 0.
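The multilabel focal loss described above can be sketched for a single logit in plain Python. This is an illustrative sketch rather than the training code used in the experiments: it assumes the loss has the form FL = -α(z - p)^γ log(p) - (1 - α)p^γ log(1 - p) with p = sigmoid(x), and the function names and the clipping constant `eps` are my own choices.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_focal_loss(logit: float, target: float,
                          alpha: float = 0.5, gamma: float = 0.5) -> float:
    """Per-element focal loss with p = sigmoid(logit) and z = target.

    Positives (z > 0) keep only the first term, with coefficient z - p;
    negatives keep only the second term, with coefficient p.
    """
    p = sigmoid(logit)
    eps = 1e-8  # assumed clipping constant that avoids log(0)
    if target > 0:
        return -alpha * (target - p) ** gamma * math.log(max(p, eps))
    return -(1.0 - alpha) * p ** gamma * math.log(max(1.0 - p, eps))


# A confidently correct positive contributes far less than a badly missed one.
easy = multilabel_focal_loss(4.0, 1.0)
hard = multilabel_focal_loss(-4.0, 1.0)
print(easy < hard)  # True
```

With γ = 0 the modulating factor disappears and the loss reduces to α-weighted cross-entropy, which gives a convenient sanity check when sweeping γ.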

3. Experiments

We conducted experiments with the RDLM method to examine its impact on accurate mitotic cell detection. For comparison, we also evaluated the FPN baseline, FPN-FL, and GLB-FPN on both slide scanners. The proposed RDLM obtained improved results, which supports the effectiveness of this work.

3.1. Implementations Details

Our approach is implemented in Python using the TensorFlow library and tested on a machine with an NVIDIA GeForce GTX 1080 Ti GPU. Training takes 4.5 hours, and testing takes 2.3 minutes. An ImageNet-pretrained ResNet-50 [18] model is used as the base network for FPN. The input image is resized such that its shorter side has 600 pixels. We adopt SGD for optimization, with a momentum of 0.9 and a weight decay of 0.0001. The learning rate is set to 0.001 for 20k iteration steps.
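The settings above can be collected in a small configuration, together with the shorter-side resizing rule. This is a sketch only: the dictionary keys and function name are hypothetical, the resize is assumed to preserve the aspect ratio, and the 1539 × 1376 frame size (read here as width × height) is the HPF size reported for the dataset.

```python
TRAIN_CONFIG = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "learning_rate": 1e-3,
    "iterations": 20_000,
    "shorter_side": 600,
}


def resize_shorter_side(width: int, height: int, target: int = 600):
    """Return (new_width, new_height) with the shorter side scaled to target.

    Assumes the aspect ratio is preserved.
    """
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


print(resize_shorter_side(1539, 1376, TRAIN_CONFIG["shorter_side"]))  # (671, 600)
```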

3.2. Dataset

We conducted experiments on the MITOS-ATYPIA-14 contest dataset. The contest samples were scanned by two slide scanners, the Aperio Scanscope XT and the Hamamatsu Nanozoomer 2.0-HT, and the whole-slide histological images (WSIs) were stained with standard hematoxylin and eosin (H&E) dyes. The training set from the Aperio Scanscope XT scanner at ×40 magnification provides 1,136 labelled HPF frames of 1539 × 1376 pixels and 749 labelled mitotic cells; a further 496 are unlabeled. The centroid pixels of the mitoses were manually annotated by two senior pathologists. When the two pathologists disagreed, a third pathologist made the final decision. We divided the Aperio-type training dataset into two parts: 300 images were used for training and 92 for testing in our experiments.

3.3. Evaluation Metrics

We adopt the same evaluation criteria as the MITOSIS ICPR challenges: a detection is considered correct if its distance to a ground truth mitosis is less than 8 μm. Precision, recall, and F1-score are used to evaluate the performance of our method:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{8}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{9}$$

$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{10}$$

TP, FP, and FN are the numbers of true-positive detections, false-positive detections, and false negatives (mitoses that remain undetected), respectively.
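The evaluation criterion can be sketched as follows. The matching strategy is an assumption on my part: the contest rules quoted above only give the 8 μm threshold, so this sketch uses a simple greedy one-to-one assignment and takes centroids that are already expressed in micrometres. The function name is hypothetical.

```python
import math


def evaluate_detections(predictions, ground_truth, max_dist_um=8.0):
    """Greedy one-to-one matching of predicted and annotated centroids.

    predictions, ground_truth: lists of (x, y) centroids in micrometres.
    Returns (precision, recall, f1).
    """
    unmatched = list(ground_truth)
    tp = 0
    for px, py in predictions:
        # Find the closest still-unmatched annotation within the threshold.
        best, best_dist = None, max_dist_um
        for gt in unmatched:
            dist = math.hypot(px - gt[0], py - gt[1])
            if dist < best_dist:
                best, best_dist = gt, dist
        if best is not None:
            unmatched.remove(best)
            tp += 1
    fp = len(predictions) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


preds = [(10, 10), (50, 50), (200, 200)]
truth = [(12, 11), (52, 49), (100, 100), (300, 300)]
precision, recall, f1 = evaluate_detections(preds, truth)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.5 0.571
```

Here two of the three predictions fall within 8 μm of an annotation (TP = 2, FP = 1) and two annotations are missed (FN = 2).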

3.4. Results

Table 1 shows the quantitative mitosis detection results of the proposed RDLM on the Aperio-type slide scanner. The first part of the table lists the methods: the FPN baseline, FPN-FL, GLB-FPN, and RDLM. The next four parts give the hyperparameters α, γ, k, and m, where k is the number of convolutional filters, m is the vault layer depth, α is the imbalance factor, and γ focuses training on the hard negative examples of the training set. The last three parts list the evaluation metrics. As Table 1 shows, when α is 0.5, γ is 0.5, k is 256, and m is 256, our proposed method achieves the best F1-score, 0.692, compared to the other methods. RDLM has more parameters than GLB-FPN and FPN-FL. Although adding more parameters makes training more complicated, the overall performance of RDLM is better than that of the FPN baseline, FPN-FL, and GLB-FPN.

3.5. Discussion

Mitotic cell detection on the MITOS-ATYPIA-14 contest dataset is among the most challenging compared to other datasets such as MITOSIS 2012, AMIDA 2013, and TUPAC 2016, owing to the more complicated appearance of the background tissue and the variation in shape. In addition, the samples of this dataset are weakly annotated. Several methods have been proposed, but because of the challenges discussed above, they do not reach the best F1-score on this dataset. According to the experimental evaluation in Table 1, our proposed method achieves the best performance, with an F1-score of 0.692, compared with the current mainstream deep learning approaches listed in Table 2. The results of the first four methods are taken from the MITOS-ATYPIA-14 contest, and ‘-’ denotes results that were not released. Later, several other CNN-based methods were published. DeepMitosis [21] and MaskMitosis [37] yield good performance on the MITOSIS 2012 dataset but inferior performance on the MITOS-ATYPIA-14 contest dataset. The segmentation modules of these two methods were not reliable enough during the detection phase, which resulted in inferior performance on the weakly annotated dataset. The CasNN [17] approach requires two different networks, one to retrieve the mitosis candidates and the other to classify them, which leads to less accurate detection. The LRCNN [38] and SegMitos [5] approaches reported F1-scores of 0.659 and 0.562, respectively. In contrast, the proposed method attains better performance than the existing methods on the testing set, with an F1-score of 0.692.

4. Visualization and Discussion

In this section, we discuss the qualitative results of mitosis detection. Furthermore, we analyze visually how FL makes the predicted results more reasonable.

4.1. Visualization of Mitotic Cells

Mitotic cell detections are visualized in RDLM by converting the GLB outputs into a heatmap. The vault layer focuses more on the tissue regions of the HPFs in the Aperio-type and Hamamatsu-type scanner images, and the gain layer focuses more on the inner areas of these images. Moreover, FL makes the predicted results more sensible, as shown in Figure 5. The heatmap, denoted HM, is computed from the input images (denoted image) using an activation function A and a weight coefficient.

Figure 5 shows examples of mitosis detection results on the ICPR 2014 dataset, whose samples were scanned with both slide scanners: the left side, (a) and (b), shows Aperio-type images, and the right side, (c) and (d), shows Hamamatsu-type images. (a) and (c) are the predictions of RDLM (k = 256, m = 256, α = 0.5, and γ = 0.5) on both slide scanners, where red labels denote the ground truth and blue labels denote the model prediction. (b) and (d) are the visualization results on both slide scanners, where the manual labels are red.

5. Discussion

This section explains the selection of the focal loss parameters from two aspects: quantitative evaluation and visual perception. First, Table 1 shows that FL reduces the number of false-positive and false-negative predictions, which makes the predicted results more reasonable, when α is 0.5 and γ is 0.5. We set α to 0.5 and varied γ over 0, 0.5, 1, and 2. With α = 0.5 and γ = 0.5, the RDLM model performs well compared to other methods in terms of F1-score and achieves the highest score on the Aperio-type test images. Second, to illustrate the visual effects of FL, we selected three examples from the testing set of the Aperio-type slide scanner. Figure 6 shows that RDLM obtains good results when α is 0.5 and γ is 0.5. In contrast, RDLM performs poorly on all the examples when γ is 0, 1, or 2, and the detection results contain noise and misclassified regions of mitotic cells. We therefore conclude that the proposed method attains high performance in terms of both quantitative evaluation and visual perception when α is 0.5 and γ is 0.5.

6. Conclusion and Future Direction

In this paper, we proposed a new deep learning method for mitosis detection called RDLM. The proposed framework is divided into two parts: GLB-FPN and FL. The GLB feature fusion method with FPN calibrates the encoder-decoder and makes it pay more attention to extracting the ROIs for mitotic cells. FL is adopted to alleviate the data imbalance problem in mitotic cell detection. Extensive experiments were carried out on the MITOS-ATYPIA-14 contest dataset to verify the effectiveness of the proposed method. The results show that the proposed approach is superior to most current mainstream approaches in visualization and quantitative metrics. Compared with state-of-the-art techniques, our framework achieves a higher score on the testing set, with an F1-score of 0.692. In future work, we intend to design a new pipeline for both detection and segmentation of mitotic cells.

Data Availability

In our experiments, we have used the publicly available dataset MITOS-ATYPIA-14 contest (https://mitos-atypia-14.grand-challenge.org/Donwload/). Moreover, we have also cited this dataset in our references.

Conflicts of Interest

The authors would like to confirm that they have no conflicts of interest, financial or others.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 62073260), the key problems in Medical Image Analysis with the Fusion of Vision Perception and Cognition.