RLMD-PA: A Reinforcement Learning-Based Myocarditis Diagnosis Combined with a Population-Based Algorithm for Pretraining Weights

Moravvej, Seyed Vahid; Alizadehsani, Roohallah; Khanam, Sadia; Sobhaninia, Zahra; Shoeibi, Afshin; Khozeimeh, Fahime; Sani, Zahra Alizadeh; Tan, Ru-San; Khosravi, Abbas; Nahavandi, Saeid; Kadri, Nahrizul Adib; Azizan, Muhammad Mokhzaini; Arunkumar, N.; Acharya, U.Rajendra

doi:https://doi.org/10.1155/2022/8733632

Contrast Media & Molecular Imaging

Special Issue

Developments in High Content Cellular Imaging Systems for Molecular Devices

Research Article | Open Access

Volume 2022 | Article ID 8733632 | https://doi.org/10.1155/2022/8733632

Seyed Vahid Moravvej,¹,² Roohallah Alizadehsani,³ Sadia Khanam,⁴ Zahra Sobhaninia,¹ Afshin Shoeibi,⁵ Fahime Khozeimeh,³ Zahra Alizadeh Sani,⁶ Ru-San Tan,⁷,⁸ Abbas Khosravi,³ Saeid Nahavandi,³,⁹ Nahrizul Adib Kadri,¹⁰ and Muhammad Mokhzaini Azizan¹¹ et al.

Academic Editor: Mohammad Farukh Hashmi

Received 16 Mar 2022

Revised 07 Apr 2022

Accepted 13 Apr 2022

Published 30 Jun 2022

Abstract

Myocarditis is an inflammation of the heart muscle that is becoming more prevalent, especially with the spread of COVID-19. Noninvasive cardiac magnetic resonance (CMR) imaging can be used to diagnose myocarditis, but the interpretation is time-consuming and requires expert physicians. Computer-aided diagnostic systems can facilitate the automatic screening of CMR images for triage. This paper presents an automatic model for myocarditis classification based on deep reinforcement learning, called reinforcement learning-based myocarditis diagnosis combined with a population-based algorithm (RLMD-PA), which we evaluated using the Z-Alizadeh Sani myocarditis dataset of CMR images prospectively acquired at Omid Hospital, Tehran. The model addresses the imbalanced classification problem inherent to the CMR dataset and formulates classification as a sequential decision-making process. The policy is implemented as a convolutional neural network (CNN). To implement the model, we first apply the artificial bee colony (ABC) algorithm to obtain initial values for the RLMD-PA weights. Next, the agent receives a sample at each step and classifies it. For each classification action, the agent gets a reward from the environment, in which the reward for the minority class is greater than that for the majority class. Eventually, the agent finds an optimal policy under the guidance of a particular reward function and a helpful learning environment. Experimental results based on standard performance metrics show that RLMD-PA achieves high accuracy for myocarditis classification, indicating that the proposed model is suitable for myocarditis diagnosis.

1. Introduction

Myocarditis is a condition that causes inflammation of the heart muscle [1]. It can affect heart pump function as well as electrical activation and conduction, resulting in heart failure and arrhythmia, respectively. The etiology is diverse, including infection (e.g., viral infections such as COVID-19 and parvovirus) [2], systemic inflammatory and autoimmune diseases, and drug reactions. Symptoms of myocarditis include chest pain, fatigue, and shortness of breath [3]. Patients with suspected myocarditis should seek cardiology advice for early diagnosis and treatment. Endomyocardial biopsy, an invasive procedure, is recommended in severe cases to confirm the diagnosis and to guide treatment [4]. Management comprises supportive measures, symptomatic heart failure therapy, antimicrobials for identified infective agents, and immunosuppression for severe inflammation. Early diagnosis and prompt institution of treatment can significantly reduce morbidity and mortality. Noninvasive cardiac imaging with cardiovascular magnetic resonance imaging (MRI) [5] can help clinch the diagnosis. However, MRI requires expert interpretation, which is manually intensive and subject to operator bias. In this regard, automated diagnostic systems can be developed that employ various machine learning and data mining algorithms to solve medical image classification problems efficiently [6]. They can be applied to reporting workflows to screen images automatically, saving physicians time, reducing errors, and enhancing diagnostic accuracy.

Deep models have demonstrated excellent performance in diverse applications, including natural language processing [7–9], computer vision, and medical image analysis [10, 11]. Deep learning algorithms seek weights that minimize the error between the real and predicted outputs. Typically, deep models use gradient-based algorithms such as backpropagation to learn the weights. However, such optimization methods are sensitive to the initial weights and may become trapped in local minima [12]. This issue is mainly encountered during classification [13]. A few researchers have shown that population-based meta-heuristic (PBMH) algorithms [14, 15] may help to overcome this problem [16]. Among PBMH algorithms, the artificial bee colony (ABC) algorithm is one of the most effective optimizers [17, 18]. It emulates the behavior of bees in nature and, unlike traditional optimization algorithms, dispenses with the need to calculate gradients, thereby reducing the probability of getting stuck in local optima [19].

Classification performance in many machine learning algorithms may be adversely affected by class imbalance [20], which occurs when one class contains disproportionately more data than the others [21]. While models trained on imbalanced data may still attain reasonable detection rates for majority samples, their performance on minority samples is weak, as minority class specimens can be difficult to identify due to their rarity and randomness. In addition, misclassification of minority class samples can result in high costs. Methods have been proposed to address the problem at two levels [22]: the data level and the algorithmic level. In the former [23–25], training data are manipulated to balance the class distribution by oversampling the minority class and/or undersampling the majority class [26]. For instance, the synthetic minority oversampling technique (SMOTE) generates new samples by linear interpolation between adjoining minority samples [24], whereas NearMiss undersamples majority samples using the nearest neighbor algorithm [25]. Of note, oversampling and undersampling risk overfitting and loss of valuable information, respectively [27]. At the algorithmic level, the importance of the minority class can be raised using techniques [28–32] that include cost-sensitive learning, ensemble learning, and decision threshold adjustment. In cost-sensitive learning, different misclassification costs are assigned to each class in the loss function, with a higher cost allocated to minority class misclassification. Ensemble learning systems train several subclassifiers and then apply voting or combination to obtain better results. Threshold adjustment techniques train the classifier on the imbalanced dataset and modify the decision threshold at test time. Deep learning-based methods have also been suggested for imbalanced data classification [33–35]. The authors in Reference [36] introduced a new loss function for deep networks that could capture classification errors from both minority and majority classes. Reference [37] introduced a method that could learn the unique features of an imbalanced dataset while maintaining intercluster and interclass margins.
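As an illustration of the data-level approach, the interpolation step of SMOTE mentioned above can be sketched as follows. This is a minimal sketch rather than the reference implementation: the function name `smote_sample`, the neighbour count `k`, and the toy points are assumptions made for the example.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    # Linear interpolation: the new point lies on the segment base -> neighbour.
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

# Toy minority class (hypothetical 2-D feature vectors).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(minority, rng=random.Random(42))
```

Because the synthetic point is a convex combination of two existing minority samples, it always stays inside the region already spanned by the minority class, which is also why heavy oversampling can encourage overfitting.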

To the best of our knowledge, only one work [3] based on deep learning models has been proposed for the diagnosis of myocarditis. The authors developed an image classification algorithm based on a CNN and the k-means algorithm [38], with the following workflow: after the data preprocessing stage, the images were placed in several clusters, and each cluster was treated as a class to be classified by the CNN. The algorithm was repeated for different numbers of clusters, and all the results were combined for the final decision. The main problem with the method was that k-means treated the image matrix as a vector, so the spatial neighborhood around each pixel was lost.

This paper presents a method based on the ABC algorithm and reinforcement learning, called RLMD-PA, that we believe addresses the above-mentioned problems. The RLMD-PA model poses the classification problem as a guessing game embodied in a sequential decision-making process. At each step, the agent receives an environmental state represented by a training instance and then executes a classification under the direction of a policy. If the agent classifies the instance correctly, it is given a positive reward; otherwise, it receives a negative one. The minority class is rewarded more than the majority class. The agent’s goal is to accumulate as much reward as possible during the sequential decision-making process, that is, to classify the samples as correctly as possible.

The main contributions of this article are as follows: (1) we formulate the classification of medical images as a sequential decision-making process and present a reinforcement learning-based algorithm for imbalanced classification; (2) instead of random weight initialization, we develop an encoding strategy and compute initial weight values using the ABC algorithm; and (3) this work is based on a new, well-annotated MRI dataset acquired from Tehran’s Omid Hospital, which we have named the Z-Alizadeh Sani myocarditis dataset and made publicly downloadable.

The rest of the article is structured as follows: the second section is a brief overview of the ABC algorithm and its working. The third section introduces the proposed model. The fourth section presents the evaluation criteria, dataset, and analysis of the results. The last section states the conclusions and future works.

2. Background

2.1. Artificial Bee Colony Algorithm

Artificial bee colony (ABC), introduced by Karaboga and Basturk [39], is one of the most efficient algorithms for numerical optimization. It is straightforward, robust, and population-based [19]. The algorithm emulates the intelligent foraging behavior of bees to arrive at the optimal solution. There is a list of food sources that bees seek out over time to get to the best positions. The algorithm involves three groups of bees: employed bees, onlooker bees, and scout bees. Employed bees discover the positions of food sources, whereas onlooker bees wait in the hive for the nectar information from food positions to be sent by employed bees. Onlooker bees use this information to select food source positions. Once an employed bee has exhausted its food source, it becomes a scout bee and searches for new positions randomly. The number of employed bees equals the number of unemployed (onlooker and scout) bees. The steps of optimization with the ABC algorithm are as follows:

(1) Initialization: in the first step, an initial population of $SN$ positions (solutions) is formed, as in

$$x_{i,j} = x_j^{\min} + \mathrm{rand}(0,1)\,\bigl(x_j^{\max} - x_j^{\min}\bigr), \quad i = 1, \dots, SN,\; j = 1, \dots, D, \tag{1}$$

where $x_i$ represents the $i$-th position, each solution has $D$ dimensions, and $D$ is the number of parameters that must be optimized. $x_j^{\min}$ and $x_j^{\max}$ are the smallest and largest values in dimension $j$, respectively.

(2) Employed bee phase: at this point, new solutions are generated by searching the neighborhood of the current solutions. To keep the population size constant, the quality of each new solution is evaluated. If it is better than the previous one, it replaces it; otherwise, the previous solution is retained. This step can be written as follows:

$$v_{i,j} = x_{i,j} + \varphi_{i,j}\,\bigl(x_{i,j} - x_{k,j}\bigr), \tag{2}$$

where $x_k$ is a random solution such that $k \neq i$, and $\varphi_{i,j}$ is a random number picked from the interval [0, 1]. The potentially new solution $v_i$ is obtained by changing only one element of $x_i$.

(3) Onlooker bee phase: for the onlooker bees update, one solution is stochastically elected from the potential solutions, that is, one of the open facility solutions, according to the following probability:

$$p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n}, \tag{3}$$

where $fit_i$ is the fitness of solution $x_i$. The fitter a solution is, the higher the chance that it will be selected. If the new solution generated around the chosen employed bee’s position scores higher than the current one, it replaces the current solution. This process is repeated for all onlooker bees in the population.

(4) Scout bee phase: a solution whose fitness does not improve after some repetitions can trap the algorithm in a local optimum [40]. To prevent this, once a solution’s fitness has not improved after t iterations, the algorithm discards it, and a new solution is supplied according to equation (2).

(5) Algorithm end condition: although different conditions can be defined for the end of the algorithm, an iteration-count criterion is used in this study, which means that the algorithm ends after a fixed maximum number of iterations.

The complete ABC algorithm is given in Algorithm 1.

	Input: D: dimensions of every solution, SN: population size, c: number of cycles, C_max: maximum number of iterations;
	Initialize a population of SN solutions x_1, ..., x_SN using equation (1);
	c = 0;
	while c < C_max do
		// Employed Bee Phase
		for i = 1 to SN do
			Produce new solution v_i using equation (2);
			Calculate the fitness for v_i;
			Replace x_i with v_i if better;
		Calculate the probability p_i for every solution x_i using equation (3);
		// Onlooker Bee Phase
		for i = 1 to SN do
			if rand(0, 1) < p_i then
				Produce new solution v_i by using equation (2);
				Calculate the fitness for v_i;
				Replace x_i with v_i if better;
		// Scout Bee Phase
		If an abandoned solution is found, replace it with the solution produced by equation (2);
		Put the best solution in x_best;
		c = c + 1;
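The three phases can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `abc_minimize`, the abandonment threshold `limit`, the clipping to the bounds, and the sphere test function are hypothetical choices. The fitness is taken as 1/(1 + cost), onlookers are chosen by fitness-proportional roulette-wheel selection, and scouts re-draw a position uniformly at random within the bounds, following the description of scout bees searching randomly.

```python
import random

def abc_minimize(cost, dim, lower, upper, pop_size=20, max_iter=200, limit=20, seed=0):
    """Artificial bee colony sketch; `cost` must be non-negative."""
    rng = random.Random(seed)

    def fitness(x):
        return 1.0 / (1.0 + cost(x))  # larger is better

    def random_solution():
        # Equation (1) style initialisation: uniform within the bounds.
        return [lower + rng.random() * (upper - lower) for _ in range(dim)]

    def try_improve(i):
        # Equation (2) style move: perturb ONE coordinate relative to a random peer k != i.
        k = rng.choice([n for n in range(pop_size) if n != i])
        j = rng.randrange(dim)
        phi = rng.random()  # interval [0, 1], as stated in the text
        cand = list(pop[i])
        cand[j] = min(upper, max(lower, cand[j] + phi * (cand[j] - pop[k][j])))
        cand_fit = fitness(cand)
        if cand_fit > fits[i]:  # greedy replacement keeps the population size fixed
            pop[i], fits[i], trials[i] = cand, cand_fit, 0
        else:
            trials[i] += 1

    pop = [random_solution() for _ in range(pop_size)]
    fits = [fitness(x) for x in pop]
    trials = [0] * pop_size
    best_fit = max(fits)
    best = list(pop[fits.index(best_fit)])

    for _ in range(max_iter):
        for i in range(pop_size):  # employed bee phase
            try_improve(i)
        for _ in range(pop_size):  # onlooker bee phase (roulette wheel on fitness)
            i = rng.choices(range(pop_size), weights=fits)[0]
            try_improve(i)
        for i in range(pop_size):  # scout bee phase: abandon stagnant solutions
            if trials[i] > limit:
                pop[i] = random_solution()
                fits[i] = fitness(pop[i])
                trials[i] = 0
        i_best = max(range(pop_size), key=fits.__getitem__)
        if fits[i_best] > best_fit:  # remember the best solution seen so far
            best_fit, best = fits[i_best], list(pop[i_best])
    return best, best_fit

def sphere(x):
    return sum(v * v for v in x)  # toy cost with its minimum at the origin

best, best_fit = abc_minimize(sphere, dim=2, lower=-5.0, upper=5.0)
```

No gradient of `cost` is ever evaluated, which is the property that makes the method usable for choosing initial network weights before backpropagation takes over.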

2.2. Reinforcement Learning

Reinforcement learning [41] is an important branch of machine learning that encompasses many domains. Reinforcement learning can achieve relatively good classification results because it can effectively learn discriminative features from noisy data. In Reference [42], the authors defined classification as a sequential decision problem in which several agents interacted with the environment in order to learn an optimal policy function. Due to the complex simulation between the agents and the environment, the run time was inordinately prolonged. The model presented in [43] is a reinforcement learning-based classifier for noisy text data. The proposed structure comprises two components: a sample selector and a relational classifier. The former selects high-quality sentences from the noisy data by following the agent, whereas the latter learns from the cleaned data and gives a delayed reward to the sample selector as feedback. Finally, the model yields a superior classifier and a high-quality dataset. The authors in Reference [44] proposed a solution for time series data in which the reward function and Markov process are explicitly defined. In various specific applications [45–48], reinforcement learning has been applied to learn efficient features. These models promote features that are valuable for classification, which leads to higher rewards that guide the agent to select more useful features. To date, limited work has been done on deep learning for the classification of imbalanced data. In Reference [44], an ensemble pruning technique that adopted reinforcement learning to select subclassifiers was proposed. However, the model underperformed when the amount of data was increased, because it is difficult to choose classifiers when there are too many subclassifiers.

3. The Proposed Solution

The overall structure of the proposed model is shown in Figure 1. The model rests on two key components. First, we formed a vector that includes all the learnable weights of the model, obtained initial values for these weights with ABC, and then applied backpropagation for the rest of training. Second, as mentioned, most classifiers, including ours, suffer from imbalanced data. To address this, we employed reinforcement learning [49]. These concepts are detailed in the following sections.

3.1. Pretraining Phase

Weight initialization of deep networks is an essential part of deep models. Sometimes, incorrect initial values can lead to a failure of convergence in the model. The proposed model has a deep network with weights that need to be optimized. In this section, we present an encoding strategy and fitness function for the ABC algorithm.

3.2. Encoding Strategy

In our work, the encoding strategy arranges the CNN and feed-forward weights in a vector that is treated as the position of a bee in the ABC algorithm. Choosing how to arrange the weights is a challenge. Nevertheless, after a few experiments, we designed an encoding strategy that is as appropriate as possible. Figure 2 illustrates an example encoding of a three-layer CNN with three filters in each layer and a feed-forward network with three hidden layers. Note that all weight matrices are stored in the vector row by row.
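A row-wise encoding of this kind can be sketched with plain nested lists. The helper names `encode` and `decode` and the toy matrices are assumptions made for illustration; the paper's exact layout is the one shown in Figure 2.

```python
def encode(weight_matrices):
    """Concatenate weight matrices, row by row, into one flat position vector."""
    vector, shapes = [], []
    for matrix in weight_matrices:
        shapes.append((len(matrix), len(matrix[0])))
        for row in matrix:
            vector.extend(row)
    return vector, shapes

def decode(vector, shapes):
    """Rebuild the matrices from a flat vector using the recorded shapes."""
    matrices, pos = [], 0
    for rows, cols in shapes:
        matrices.append([vector[pos + r * cols: pos + (r + 1) * cols] for r in range(rows)])
        pos += rows * cols
    return matrices

# Hypothetical weights: one 2x2 convolution filter and one 2x3 dense layer.
conv_filter = [[1.0, 2.0], [3.0, 4.0]]
dense = [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
position, shapes = encode([conv_filter, dense])
restored = decode(position, shapes)
```

Keeping the shapes alongside the vector lets each bee's position be written back into the network before its fitness is evaluated.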

3.3. Fitness Function

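The fitness function is defined as follows to measure the effectiveness of a solution in the ABC algorithm [12]:

$$\mathrm{Fitness} = \frac{1}{1 + \sum_{i=1}^{N} \bigl(y_i - \tilde{y}_i\bigr)^2}, \tag{4}$$

where $N$ is the total number of samples, and $y_i$ and $\tilde{y}_i$ are the target and predicted labels for the $i$-th data, respectively.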
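A small numeric sketch may help. It assumes the inverse squared-error form, fitness = 1 / (1 + sum of squared differences between targets and predictions); the function name and the toy labels are hypothetical.

```python
def fitness(targets, predictions):
    """Inverse squared-error fitness: 1.0 for a perfect fit, tending to 0 as errors grow."""
    squared_error = sum((y - p) ** 2 for y, p in zip(targets, predictions))
    return 1.0 / (1.0 + squared_error)

targets = [1, 0, 1, 1]
perfect = fitness(targets, [1, 0, 1, 1])    # no errors
one_wrong = fitness(targets, [1, 0, 0, 1])  # one misclassified sample
```

With hard 0/1 labels the sum of squared errors equals the number of misclassified samples, so a single mistake halves the fitness relative to a perfect solution.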

4. Classification

Due to the difference in the amount of data between our two classes, we face the problem of imbalanced classification. To address this, we used the imbalanced classification Markov decision process (ICMDP) to construct a sequential decision problem. In reinforcement learning, an agent tries to obtain an optimal policy by performing a series of actions in the environment while maximizing its score. In our model, a sample of the dataset is provided to the agent at each time step and classified. The environment then returns an immediate reward to the agent. A correct classification yields a positive reward, whereas a wrong classification yields a negative one. By maximizing cumulative rewards, the agent can arrive at the optimal policy. Let $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$ be the imbalanced set of existing images with $N$ samples, where $x_i$ corresponds to the $i$-th image, and $y_i$ is its corresponding label. The intended settings are as follows:

(i) Policy $\pi_\theta$: the policy $\pi$ is a mapping function $\pi: S \rightarrow A$, where $S$ and $A$ are the sets of states and actions, respectively. In other words, every $\pi(s_t) = a_t$ means performing the action $a_t$ in the state $s_t$. $\pi_\theta$ is acknowledged as the classifier model with weights $\theta$.

(ii) State $s_t$: each state $s_t$ is mapped to the sample $x_t$ from the dataset $D$. The first sample $x_1$ is deemed the initial state $s_1$. So that the model does not learn a particular order, $D$ is shuffled in each episode.

(iii) Action $a_t$: the action $a_t$ is performed to predict the label of $x_t$. Since the classification is binary, $a_t \in \{0, 1\}$, where zero represents the minority class and one represents the majority class.

(iv) Reward $r_t$: the reward reflects the performance of an action. An agent that classifies correctly gets a positive reward; otherwise, it gets a negative reward. The magnitude of the reward should not be the same for both classes. A carefully calibrated reward can significantly improve model performance. In this work, the reward for action $a_t$ is defined according to the following equation [27]:

$$R(s_t, a_t, y_t) = \begin{cases} +1, & a_t = y_t \text{ and } s_t \in D_H, \\ -1, & a_t \neq y_t \text{ and } s_t \in D_H, \\ \lambda, & a_t = y_t \text{ and } s_t \in D_S, \\ -\lambda, & a_t \neq y_t \text{ and } s_t \in D_S, \end{cases} \tag{5}$$

where $D_H$ and $D_S$ represent the minority and majority classes, that is, healthy and sick, respectively, and $\lambda$ is a value in the interval [0, 1]. The majority-class reward $\pm\lambda$ is smaller in magnitude than the minority-class reward $1/-1$ because the minority class is more critical due to its fewer data. In effect, we ascribe more importance to the minority class so that its influence approximates that of the majority class. In the results section, we will see the importance of the value $\lambda$.

(v) Terminal: the training process is completed at terminal states, which occur in every training episode. An episode is the transition trajectory from an initial state to a final state, namely, $\{(s_1, a_1, r_1), (s_2, a_2, r_2), \dots, (s_t, a_t, r_t)\}$. In our case, an episode stops when all the training data have been classified or when a sample of the minority class is misclassified.

(vi) Transition probability: the agent goes from state $s_t$ to the next state $s_{t+1}$ based on the order of the read data. The transition probability is determined as $p(s_{t+1} \mid s_t, a_t)$.

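In ICMDP, the policy function reports the probability of all labels upon receiving a sample:

$$\pi_\theta(a \mid s) = P(a_t = a \mid s_t = s). \tag{6}$$

In reinforcement learning, the intention is to maximize the discounted cumulative reward, or in mathematical terms, to maximize the following expression:

$$g_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}. \tag{7}$$

Equation (7) is termed the return function, which accumulates all the rewards the agent collects while searching the space. The discount factor $\gamma \in (0, 1]$ [50] is the coefficient that weights the effect of each reward. The function $Q$ measures the quality of a state-action combination:

$$Q^{\pi}(s, a) = E_{\pi}\bigl[g_t \mid s_t = s, a_t = a\bigr]. \tag{8}$$

Equation (8) is expanded according to Bellman’s formula [51]:

$$Q^{\pi}(s, a) = E_{\pi}\bigl[r_t + \gamma\, Q^{\pi}(s_{t+1}, a_{t+1}) \mid s_t = s, a_t = a\bigr]. \tag{9}$$

By maximizing the $Q$ function under the policy $\pi$, more cumulative reward can be achieved. The optimal policy $\pi^*$ is obtained from the optimal function $Q^*$ as follows:

$$\pi^*(a \mid s) = \begin{cases} 1, & a = \arg\max_{a} Q^*(s, a), \\ 0, & \text{otherwise}. \end{cases} \tag{10}$$

By combining the two equations (9) and (10), the $Q^*$ function is expressed as follows [27]:

$$Q^*(s, a) = E_{\pi}\bigl[r_t + \gamma \max_{a'} Q^*(s_{t+1}, a') \mid s_t = s, a_t = a\bigr]. \tag{11}$$

In a low-dimensional state space, the $Q$ function can be easily solved with a table. However, the table technique is inadequate when the state space is continuous. To solve this problem, deep $Q$-learning algorithms are used. In these algorithms, the tuple $(s, a, r, s')$ obtained from equation (11) is saved in the experience replay memory $M$. The agent draws a mini-batch $B$ from $M$ and executes gradient descent on these data according to the following equation:

$$L(\theta_k) = \sum_{(s, a, r, s') \in B} \bigl(y - Q(s, a; \theta_k)\bigr)^2, \tag{12}$$

where $y$ is an estimate of the $Q$ function expressed as follows [27]:

$$y = \begin{cases} r, & terminal = \text{True}, \\ r + \gamma \max_{a'} Q(s', a'; \theta_{k-1}), & terminal = \text{False}, \end{cases} \tag{13}$$

where $s'$ is the state following $s$, and $a'$ is the action performed in $s'$; $terminal$ indicates whether the agent has made a wrong classification for the minority class or not. Finally, the policy weights can be updated as follows:

$$\theta_{k+1} = \theta_k - \alpha \frac{\partial L(\theta_k)}{\partial \theta_k}, \tag{14}$$

where $\alpha$ is the learning rate.

In conclusion, the optimal $Q^*$ function can be achieved by minimizing the loss function presented in equation (12). Notably, the optimal policy $\pi^*$ is obtained from $Q^*$, which is the optimal model for the proposed classifier.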
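The target and loss computation can be traced on a toy mini-batch. The sketch below is hedged: `batch_loss`, the constant Q-function `toy_q`, the discount factor 0.9, and the two transitions are all hypothetical, and a real model would replace `toy_q` with the CNN's two output values.

```python
def batch_loss(batch, q_fn, target_q_fn, gamma=0.9):
    """Sum of squared differences between targets y and predicted Q(s, a) over a mini-batch."""
    total = 0.0
    for state, action, reward, next_state, terminal in batch:
        if terminal:
            y = reward  # no future reward after a terminal transition
        else:
            y = reward + gamma * max(target_q_fn(next_state))  # bootstrap from the best next action
        total += (y - q_fn(state)[action]) ** 2
    return total

def toy_q(state):
    # Hypothetical Q-function returning the same two action values for every state.
    return [0.2, 0.5]

batch = [
    ("s1", 1, 1.0, "s2", False),  # target: 1.0 + 0.9 * 0.5 = 1.45, predicted 0.5
    ("s2", 0, -1.0, None, True),  # target: -1.0 (terminal), predicted 0.2
]
loss = batch_loss(batch, toy_q, toy_q)
```

Here the two squared errors are 0.95² = 0.9025 and 1.2² = 1.44, so the loss is 2.3425. In training, the gradient of this quantity with respect to the network weights drives the weight update.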

4.1. Overall Algorithm

We devised the simulation environment according to the above. The structure of the policy network depends on the complexity and number of training samples. Given the structure of the training samples and the output, the number of network outputs equals the number of data classes, which is 2. The general training algorithm of the RLMD-PA model is displayed in Algorithm 2. In this algorithm, the policy weights are first initialized using the ABC algorithm, and then the agent continues the training process until an optimal policy is reached. Actions are selected with an ε-greedy policy, and the reward for each action is evaluated by Algorithm 3. The algorithm is repeated for K episodes, where K is taken as 18,000 in this paper. At each step, the policy network weights are stored.

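	Data: training set D = {(x_1, y_1), ..., (x_N, y_N)}, number of episodes K
	Initialize the weights θ of policy π_θ using Algorithm 1;
	Initialize environment ε;
	Initialize replay memory M;
	for Episode k = 1 to K do
		Shuffle the data D;
		s_1 = x_1;
		for t = 1 to N do
			a_t = π_θ(s_t); // select an action based on ε-greedy
			r_t, terminal_t = Reward(a_t, y_t);
			s_{t+1} = x_{t+1};
			Save (s_t, a_t, r_t, s_{t+1}, terminal_t) to M;
			Sample randomly a mini-batch of transitions (s_j, a_j, r_j, s_{j+1}, terminal_j) from M;
			Set y_j using equation (13), accumulate gradients w.r.t. θ, and update θ using equation (14);
			if terminal_t = True then
				break;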
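The control flow of one training episode (shuffling, ε-greedy action selection, storing transitions, early termination) can be sketched as below. This is an illustration only: the weight update is deliberately omitted, and `run_episode`, `toy_reward`, `perfect_q`, and the scalar "images" are hypothetical stand-ins for the CNN policy and the CMR data.

```python
import random

def run_episode(samples, labels, q_values, reward_fn, memory, epsilon=0.1, rng=None):
    """Classify shuffled samples one by one; stop early when a terminal flag is raised."""
    rng = rng or random.Random(0)
    order = list(range(len(samples)))
    rng.shuffle(order)  # a new sample order in every episode
    total_reward = 0.0
    for step, idx in enumerate(order):
        state = samples[idx]
        if rng.random() < epsilon:  # explore: pick a random class
            action = rng.randrange(2)
        else:  # exploit: pick the class with the largest Q-value
            q = q_values(state)
            action = max(range(len(q)), key=q.__getitem__)
        reward, terminal = reward_fn(action, labels[idx])
        is_last = step + 1 == len(order)
        next_state = None if is_last else samples[order[step + 1]]
        terminal = terminal or is_last  # running out of data also ends the episode
        memory.append((state, action, reward, next_state, terminal))
        total_reward += reward
        if terminal:
            break
    return total_reward

def toy_reward(action, label):
    # Simplified stand-in: +1/-1, terminal when the minority class (label 0) is missed.
    if action == label:
        return 1.0, False
    return -1.0, label == 0

samples = [0, 1, 1, 0, 1]  # toy "images": the feature equals the label
labels = [0, 1, 1, 0, 1]

def perfect_q(state):
    return [1.0, 0.0] if state == 0 else [0.0, 1.0]

memory = []
total = run_episode(samples, labels, perfect_q, toy_reward, memory, epsilon=0.0)
```

In a full implementation, a mini-batch would be drawn from `memory` after each step (for example with `random.sample`) and used for one gradient step on the policy network.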

	Function Reward(a_t, y_t):
		terminal_t = False;
		if s_t ∈ D_H then
			if a_t = y_t then
				r_t = 1;
			else
				r_t = −1;
				terminal_t = True;
			end
		else
			if a_t = y_t then
				r_t = λ;
			else
				r_t = −λ;
			end
		end
		return r_t, terminal_t
	End Function
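A direct Python transcription of this reward logic might look as follows. It is a sketch under assumptions: label 0 is taken to be the minority class, as stated in the action definition, and the value 0.3 passed for `lam` in the examples is only illustrative.

```python
def reward(action, label, lam, minority_label=0):
    """Return (reward, terminal) for one classification step."""
    terminal = False
    if label == minority_label:
        if action == label:
            r = 1.0
        else:
            r = -1.0
            terminal = True  # misclassifying a minority sample ends the episode
    else:
        r = lam if action == label else -lam  # majority samples carry the smaller reward
    return r, terminal

minority_hit = reward(0, 0, lam=0.3)   # (1.0, False)
minority_miss = reward(1, 0, lam=0.3)  # (-1.0, True)
majority_hit = reward(1, 1, lam=0.3)   # (0.3, False)
majority_miss = reward(0, 1, lam=0.3)  # (-0.3, False)
```

Only the minority-class mistake sets the terminal flag, which is what pushes the agent to avoid missing minority samples.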

5. Empirical Evaluation

5.1. Dataset

Cardiac magnetic resonance imaging (CMR) [52] allows for comprehensive anatomical and functional evaluation of the heart as well as detailed tissue characterization [53]. It is the preeminent imaging modality for the noninvasive diagnosis of myocarditis without biopsy. The Lake Louise criteria (LLC) [54] are benchmark criteria for diagnosing myocarditis using CMR [55], based on the presence of myocardial necrosis, edema, and hyperemia. The presence of late gadolinium enhancement confirms myocardial necrotic damage. T2-weighted images uncover areas of interstitial edema, which indicates an inflammatory response. T1-weighted images before and after contrast can depict hyperemia in the myocardial tissue. Fulfilling two of the three LLC criteria confers 80% accuracy for diagnosing myocarditis [56]. This article presents a model for identifying myocarditis by considering the three LLC criteria.

A one-year CMR research project on myocarditis was conducted from September 2016 at Omid Hospital in Tehran, Iran, where we performed CMR on patients who were clinically suspected to have myocarditis (e.g., chest pain, elevated troponin, negative functional imaging and/or coronary angiographic findings, and suspected viral etiology) and the treating physician assessed that CMR would likely affect clinical management (e.g., ongoing symptoms, ongoing myocardial injury evidenced by persistent ECG abnormalities, and presence of ventricular dysfunction). The protocol had been approved by the local ethics committee. CMR examination was performed on a 1.5-Tesla system [57]. All cases were scanned with body coils in standard supine position. T1-weighted images were acquired in the axial views. Shortly after gadolinium injection, the T1-weighted sequences were repeated. After approximately 10–15 minutes, late gadolinium enhancement [58] sequences were performed in standard left ventricular short- and long-axis views. Table 1 summarizes the CMR sequence parameters [3].

A total of 586 patients were identified who had positive evidence of myocarditis on the CMR images, which might show one or more areas of disease. A total of 307 healthy subjects were included as controls. We chose eight CMR images from each patient or control subject for the analysis, which were one long-axis image and one short-axis image acquired using each of the following four CMR sequences: late gadolinium enhancement, perfusion, T2-weighted, and steady-state free precession. The final CMR dataset comprises 4,686 and 2,449 samples from sick (i.e., myocarditis) and healthy subjects, respectively. Figure 3 shows example images obtained from this dataset. It may be noted that in this study, analysis is performed at the image level, and not at the patient level. In other words, prediction is based on a single image regardless of how many images are available for each patient.

Institutional approval was granted to use the patient datasets in research studies for diagnostic and therapeutic purposes. Approval was granted on the grounds of existing datasets. Informed consent was obtained from all of the patients in this study. All methods were carried out in accordance with relevant guidelines and regulations. Ethical approval for using these data was obtained from the Tehran Omid Hospital.

5.2. Metrics

To evaluate the classification performance of the proposed model, we used six standard performance metrics, namely, accuracy, recall, precision, F-measure, specificity, and G-means [59], which are defined as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP},$$

$$\mathrm{F\text{-}measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad \mathrm{G\text{-}means} = \sqrt{\mathrm{Recall} \times \mathrm{Specificity}},$$

where TP, TN, FN, and FP are true positive, true negative, false negative, and false positive, respectively. The F-measure and G-means are commonly applied to evaluate imbalanced classification [27], which matches our dataset sample distribution and the motivation for our proposed method. In addition, it is noteworthy that our prediction is per image. In this way, the intelligent myocarditis classification system can effectively screen entire CMR studies and flag individual images for scrutiny by physician readers. For this purpose, low FP and high recall would be desired.
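The six metrics can be computed from the four confusion-matrix counts using their standard definitions. The function name and the counts in this sketch are made up for illustration and are not results from the paper.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": recall,
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
        "specificity": specificity,
        "g_means": math.sqrt(recall * specificity),
    }

# Hypothetical counts: 100 positive and 100 negative images.
scores = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

For these counts, accuracy is 0.85, recall 0.80, and specificity 0.90. G-means is the geometric mean of recall and specificity, so it drops sharply if either class is neglected, which is why it suits imbalanced data.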

5.3. Details of Model

This work used Python and the PyTorch framework. The code was written in a Jupyter notebook. We used five two-dimensional convolution layers for the CNN, with 128, 64, 32, 16, and 8 filters. The kernel size, stride, and padding in each layer are 3, 2, and 1 for both dimensions, respectively. Each convolution layer is followed by a max-pooling layer with dimensions of 2 × 2. The three fully connected layers have 128, 64, and 32 hidden units, respectively. To prevent overfitting, dropout with a probability of 0.4 and early stopping are employed. In every experiment, the batch size is set to 64. The images in the dataset are gray-scale, and the pixel intensities are mapped to the range [0, 1]. The images in the dataset come in different sizes and are all resized to 100 × 100 for analysis.

5.4. Experimental Results

While standard techniques such as data augmentation and weighted loss functions [60] can sometimes be used to correct imbalanced data distributions, they are not applicable in all situations. In our experiments, data augmentation and a weighted loss function did not improve our model, which is not unexpected.

We used k-fold cross-validation (k = 5, or 5-CV) in all our implementations. The entire dataset is divided into k subsets; k − 1 subsets are used for training and the remaining one for testing. This procedure is iterated k times, until every subset has been used exactly four times for training and once for testing. All parameters are expressed as means, standard deviations, medians, minimums, and maximums. First, we compared our proposed method with the only published work in this field, CNN-KCL [3]. Next, to investigate the contributions of the two distinct components, ABC and RL, in our model, we compared the performance of a basic model without ABC and RL, that is, CNN + random weight, versus the models CNN + ABC and CNN + RL, which used ABC and RL for training, respectively. The evaluation results of our RLMD-PA model as well as the aforementioned comparisons on the Z-Alizadeh Sani myocarditis dataset are presented in Tables 2 and 3. In general, the RLMD-PA model reduces the error by more than 43%. From the means of all the performance metrics, the RLMD-PA model outperforms the CNN-KCL method as well as the CNN + random weight, CNN + ABC, and CNN + RL combinations of its components. Both ABC and RL individually improve on the basic CNN across all assessed performance metrics, which supports combining weight initialization with reinforcement learning. For better visualization, the results are illustrated in Figure 4. In terms of time, the best model was obtained after 100 iterations in 2 hours, while CNN-KCL reached its best after 350 iterations in 5 hours.
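The fold bookkeeping of 5-fold cross-validation can be sketched with the standard library alone. The generator name `k_fold_indices`, the seed, and the toy dataset size are assumptions; the sketch does not stratify by class, and the source does not state whether the folds were stratified.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint subsets
    for i in range(k):
        test = folds[i]
        train = [j for f in range(k) if f != i for j in folds[f]]
        yield train, test

# Toy dataset of 20 samples split into 5 folds.
splits = list(k_fold_indices(20, k=5))
train_counts = [sum(j in train for train, _ in splits) for j in range(20)]
test_counts = [sum(j in test for _, test in splits) for j in range(20)]
```

Each index lands in the test set exactly once and in the training set exactly four times, matching the procedure described above.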

Standard machine learning classifiers have not been successful in classifying medical images, because they typically treat images as one-dimensional vectors, which causes the neighboring pixels of a specific pixel to be spaced apart. For comparison with our deep model, we used five algorithms, namely, support vector machine (SVM) [61], k-nearest neighbor [62], naïve Bayes [63], logistic regression [64], and random forests [65], to classify the CMR images of the study dataset. SVM performed the best among these methods but is still inferior to the deep models. The results are summarized in Tables 4 and 5, and the mean performance metrics are shown in Figure 5.

5.5. Investigation of Other Metaheuristic Algorithms on the Algorithm

The proposed model employs the ABC algorithm in conjunction with backpropagation to obtain the initial weight values. To compare the performance of ABC against alternative training algorithms, we replaced ABC in our model with five conventional algorithms, namely, gradient descent with momentum backpropagation (GDM) [66], gradient descent with adaptive learning rate backpropagation (GDA) [67], gradient descent with momentum and adaptive learning rate backpropagation (GDMA) [68], one-step secant backpropagation (OSS) [69], and Bayesian regularization backpropagation (BR) [70], and with four metaheuristic algorithms, namely, gray wolf optimization (GWO) [71], the Bat algorithm (BA) [72], the Cuckoo optimization algorithm (COA) [73], and the whale optimization algorithm (WOA) [74]. The population size and number of function evaluations are 100 and 25,000, respectively, for all metaheuristic algorithms. Other parameter settings can be seen in Table 6. The performance metrics of these comparisons are summarized in Tables 7 and 8 and illustrated in Figure 6. In general, metaheuristic algorithms are better than conventional algorithms, with the exception of GDMA, in terms of accuracy, recall, and F-measure scores. Importantly, the ABC algorithm outperformed all conventional and metaheuristic algorithms, reducing the error in the recall and F-measure criteria by more than 25% and 22%, respectively.

5.6. Explore the Reward Function

The reward function is a practical device that helps the agent achieve its goal. In this work, the minority class reward is +1/−1, while the majority class reward is +λ/−λ. To examine the effect of λ on the classification model, we tested 10 values of λ. Detailed results for all criteria in these experiments are given in Table 9, and the trends are plotted in Figure 7 for better visualization. For the accuracy criterion, the curve rises as λ increases over [0, 0.3] and falls over [0.3, 1]. The same pattern holds for all criteria. If λ = 0, the importance of the majority class is disregarded, and if λ = 1, both classes are equally important. Although the minority class is more important to us, the majority class cannot be ignored.
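A class-dependent reward of this kind can be sketched as follows. The sketch assumes a reward of magnitude 1 for minority-class samples and magnitude λ (here `lam`, in [0, 1]) for majority-class samples, with the sign set by whether the prediction is correct, and it flags a step as terminal when a minority-class sample is misclassified. The function name `reward` and its signature are hypothetical.

```python
def reward(action: int, label: int, minority_label: int, lam: float):
    """Return (reward, terminal) for one classification step.

    Assumed scheme: +1/-1 for minority-class samples and +lam/-lam for
    majority-class samples; misclassifying a minority sample is terminal.
    """
    correct = action == label
    if label == minority_label:
        return (1.0 if correct else -1.0), (not correct)
    return (lam if correct else -lam), False


# Example with the minority class encoded as 1 and lam = 0.3 (illustrative).
print(reward(action=1, label=1, minority_label=1, lam=0.3))  # minority, correct
print(reward(action=0, label=1, minority_label=1, lam=0.3))  # minority, wrong
print(reward(action=0, label=0, minority_label=1, lam=0.3))  # majority, correct
print(reward(action=1, label=0, minority_label=1, lam=0.3))  # majority, wrong
```

With `lam = 0` the majority class contributes no learning signal, and with `lam = 1` both classes are weighted equally, which matches the two extremes discussed above.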

6. Conclusion and Future Directions

This article presents a new model for classifying myocarditis images. The proposed model consists of two steps. First, the model weights are initialized using the ABC algorithm. Second, the classification task is formulated as an ICMDP, in which the environment assigns a high reward to the minority class and a low reward to the majority class. The algorithm terminates when the agent misclassifies a minority-class sample or the number of episodes runs out. We performed several experiments to examine various factors that affect the performance of the proposed model. The experiments confirmed that the RLMD-PA model with ABC and RL is an effective classifier for myocarditis images.

In the future, we plan to employ an ensemble convolutional neural network (ECNN), which combines a set of CNNs to yield higher performance. In addition, we may work with the generative adversarial network (GAN), which is widely used in many applications. It may also be worth exploring the use of the developed model in other medical applications, such as stroke detection, cancer detection, and plaque detection.

Data Availability

The dataset used to support the findings of this study is available on GitHub: https://github.com/vahid-moravvej/Z-Alizadeh-Sani-myocarditis-dataset.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Seyed Vahid Moravvej, Roohallah Alizadehsani, and Ru-San Tan contributed to preparing the first draft. Nahrizul Adib Kadri, Muhammad Mokhzaini Azizan, and U. Rajendra Acharya contributed to editing the final draft. Sadia Khanam and Zahra Sobhaninia performed all data analysis and produced the results accordingly. Afshin Shoeibi and Fahime Khozeimeh searched for papers and extracted data. Zahra Alizadeh Sani, N. Arunkumar, Abbas Khosravi, and Saeid Nahavandi provided overall guidance and managed the project.

Acknowledgments

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References

L. T. Cooper Jr, “Myocarditis,” New England Journal Of Medicine, vol. 360, no. 15, pp. 1526–1538, 2009.
V. Kytö, P. Saukko, E. Lignitz et al., “Diagnosis and presentation of fatal myocarditis,” Human Pathology, vol. 36, no. 9, pp. 1003–1007, 2005.
D. Sharifrazi, R. Alizadehsani, J. Hassannataj Joloudari et al., “Cnn-kcl: automatic myocarditis diagnosis using convolutional neural network combined with k-means clustering,” Mathematical Biosciences and Engineering, vol. 19, no. 3, pp. 2381–2402, 2020.
A. Asher, “A review of endomyocardial biopsy and current practice in england: out of date or underutilised,” British Journal of Cardiology, vol. 24, pp. 108–112, 2017.
G. Katti, S. Arshiya Ara, and A. Shireen, “Magnetic resonance imaging (mri)–a review,” International Journal of Dental Clinics, vol. 3, no. 1, pp. 65–70, 2011.
M. Abdar, E. Nasarian, X. Zhou et al., “Performance improvement of decision trees for diagnosis of coronary artery disease using multi filtering approach,” in Proceedings of the 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), pp. 26–30, IEEE, Singapore, February 2019.
S. Vahid Moravvej, A. Mirzaei, and M. Safayani, “Biomedical text summarization using conditional generative adversarial network (cgan),” 2021, http://arxiv.org/abs/2110.11870.
M. Salimi Sartakhti, M. J. Maleki Kahaki, S. Vahid Moravvej, M. javadi Joortani, and A. Bagheri, “Persian language model based on bilstm model on covid-19 corpus,” in Proceedings of the 2021 5th International Conference on Pattern Recognition and Image Analysis (IPRIA), pp. 1–5, IEEE, Kashan, Iran, April 2021.
S. V. Moravvej, M. J. M. Kahaki, M. S. Sartakhti, and M. Joodaki, “Efficient gan-based method for extractive summarization,” Journal of Electrical and Computer Engineering Innovations, vol. 10, pp. 287–298, 2021.
S. Zahra, H. Danesh, R. Kafieh, J. Jothi Balaji, and V. Lakshminarayanan, “Determination of foveal avascular zone parameters using a new location-aware deep-learning method,” Applications of Machine Learning 2021, International Society for Optics and Photonics, San Diego, CA, USA, vol. 11843, Article ID 1184311, 2021.
S. Zahra, N. Karimi, P. Khadivi, R. Roshandel, and S. Samavi, “Brain tumor classification using medial residual encoder layers,” 2020, https://arxiv.org/abs/2011.00628.
S. V. Moravvej, S. J. Mousavirad, M. H. Moghadam, and M. Saadatmand, “An lstm-based plagiarism detection via attention mechanism and a population-based approach for pre-training parameters with imbalanced classes,” in Neural Information Processing, T. Mantoro, M. Lee, M. A. Ayu, K. W. Wong, and A. N. Hidayanto, Eds., Springer International Publishing, Cham, pp. 690–701, 2021.
M. Mavrovouniotis and S. Yang, “Training neural networks with ant colony optimization algorithms for pattern classification,” Soft Computing, vol. 19, no. 6, pp. 1511–1522, 2015.
B. Liu, L. Wang, Y. Liu, and S. Wang, “A unified framework for population-based metaheuristics,” Annals of Operations Research, vol. 186, no. 1, pp. 231–262, 2011.
S. Vakilian, S. Vahid Moravvej, and F. Ali, “Using the cuckoo algorithm to optimizing the response time and energy consumption cost of fog nodes by considering collaboration in the fog layer,” in Proceedings of the 2021 5th International Conference on Internet of Things and Applications (IoT), pp. 1–5, IEEE, Isfahan, Iran, May 2021.
S. Jalaleddin Mousavirad, G. Schaefer, I. Korovin, and D. Oliva, “Rde-op: a region-based differential evolution algorithm incorporation opposition-based learning for optimising the learning process of multi-layer neural networks,” in Proceedings of the International Conference on the Applications of Evolutionary Computation (Part of EvoStar), pp. 407–420, Springer, April 2021.
S. Vakilian, S. Vahid Moravvej, and F. Ali, “Using the artificial bee colony (abc) algorithm in collaboration with the fog nodes in the internet of things three-layer architecture,” in Proceedings of the 2021 29th Iranian Conference on Electrical Engineering (ICEE), pp. 509–513, IEEE, Tehran, Iran, Islamic Republic, May 2021.
J. M. Sanchez-Gomez, M. A. Vega-Rodríguez, and C. J. Pérez, “Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach,” Knowledge-Based Systems, vol. 159, pp. 1–8, 2018.
D. Karaboga, B. Akay, and C. Ozturk, “Artificial bee colony (abc) optimization algorithm for training feed-forward neural networks,” in Proceedings of the International conference on modeling decisions for artificial intelligence, pp. 318–329, Springer, Kitakyushu, Japan, August 2007.
S. Vahid Moravvej, M. Joodaki, and M. Salimi Sartakhti, “A method based on an attention mechanism to measure the similarity of two sentences,” in Proceedings of the 2021 7th International Conference on Web Research (ICWR), pp. 238–242, IEEE, Tehran, Iran, May 2021.
S. Vahid Moravvej, M. J. Maleki Kahaki, M. Salimi Sartakhti, and A. Mirzaei, “A method based on attention mechanism using bidirectional long-short term memory (blstm) for question answering,” in Proceedings of the 2021 29th Iranian Conference on Electrical Engineering (ICEE), pp. 460–464, IEEE, Tehran, Iran, May 2021.
H. Guo, Y. Li, and J. Shang, “Learning from class-imbalanced data: review of methods and applications,” Expert Systems with Applications, vol. 73, pp. 220–239, 2017.
C. Drummond, R. C. Holte et al., “C4. 5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling,” Workshop on learning from imbalanced datasets II, Citeseer, vol. 11, pp. 1–8, 2003.
H. Han, W.-Y. Wang, and B.-H. Mao, “Borderline-smote: a new over-sampling method in imbalanced data sets learning,” in Proceedings of the International Conference on Intelligent Computing, pp. 878–887, Springer, Hefei, China, August 2005.
I. Mani and I. Zhang, “Knn approach to unbalanced data distributions: a case study involving information extraction,” in Proceedings of the workshop on learning from imbalanced datasets, vol. 126, ICML, United States, June 2003.
G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, “A study of the behavior of several methods for balancing machine learning training data,” ACM SIGKDD explorations newsletter, vol. 6, no. 1, pp. 20–29, 2004.
E. Lin, Q. Chen, and X. Qi, “Deep reinforcement learning for imbalanced classification,” Applied Intelligence, vol. 50, no. 8, pp. 2488–2502, 2020.
K. Veropoulos, C. Campbell, N. Cristianini et al., “Controlling the sensitivity of support vector machines,” in Proceedings of the international joint conference on AI, pp. 55–60, ACM, Stockholm, July 1999.
G. Wu and E. Y. Chang, “Kba: kernel boundary alignment considering imbalanced data distribution,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 786–795, 2005.
Y. Tang, Y.-Q. Zhang, V. Nitesh, and S. Krasser, “Svms modeling for highly imbalanced classification,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 1, pp. 281–288, 2008.
B. Krawczyk and M. Woźniak, “Cost-sensitive neural network with roc-based moving threshold for imbalanced classification,” in Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning, pp. 45–52, Springer, Guimaraes, Portugal, January 2015.
H. Yu, C. Sun, X. Yang, W. Yang, J. Shen, and Y. Qi, “Odoc-elm: optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data,” Knowledge-Based Systems, vol. 92, pp. 55–70, 2016.
Y. Yan, M. Chen, M.-L. Shyu, and S.-C. Chen, “Deep learning for imbalanced multimedia data classification,” in Proceedings of the 2015 IEEE international symposium on multimedia (ISM), pp. 483–488, IEEE, Miami, FL, USA, December 2015.
S. H. Khan, M. Hayat, M. Bennamoun, F. A. Sohel, and R. Togneri, “Cost-sensitive learning of deep feature representations from imbalanced data,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 8, pp. 3573–3587, 2017.
D. Qi, S. Gong, and X. Zhu, “Imbalanced deep learning by minority class incremental rectification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 6, pp. 1367–1381, 2018.
S. Wang, W. Liu, J. Wu et al., “Training deep neural networks on imbalanced data sets,” in Proceedings of the 2016 international joint conference on neural networks (IJCNN), pp. 4368–4374, IEEE, Vancouver, BC, Canada, July 2016.
C. Huang, Y. Li, C. Change Loy, and X. Tang, “Learning deep representation for imbalanced classification,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5375–5384, IEEE, Las Vegas, NV, USA, June 2016.
K. Krishna and M. Narasimha Murty, “Genetic k-means algorithm,” IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol. 29, no. 3, pp. 433–439, 1999.
D. Karaboga and B. Basturk, “On the performance of artificial bee colony (abc) algorithm,” Applied Soft Computing, vol. 8, no. 1, pp. 687–697, 2008.
Y. Watanabe, M. Takaya, and A. Yamamura, “Fitness function in abc algorithm for uncapacitated facility location problem,” in Proceedings of the Information and Communication Technology-EurAsia Conference, pp. 129–138, Springer, Daejeon, Korea, October 2015.
L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: a survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.
M. A. Wiering, H. Van Hasselt, A.-D. Pietersma, and S. Lambert, “Reinforcement learning algorithms for solving classification problems,” in Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), pp. 91–96, IEEE, Paris, France, April 2011.
J. Feng, M. Huang, Li Zhao, Y. Yang, and X. Zhu, “Reinforcement learning for relation classification from noisy data,” in Proceedings of the aaai conference on artificial intelligence, vol. 32, AAAI Press, August 2018.
C. Martinez, G. Perrin, E. Ramasso, and M. Rombaut, “A deep reinforcement learning approach for early classification of time series,” in Proceedings of the 2018 26th European Signal Processing Conference (EUSIPCO), pp. 2030–2034, IEEE, Rome, Italy, September 2018.
T. Zhang, M. Huang, and Li Zhao, “Learning structured representation for text classification via reinforcement learning,” in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, AAAI Press, New Orleans, Louisiana, USA, February 2018.
D. Liu and T. Jiang, “Deep reinforcement learning for surgical gesture segmentation and classification,” in Proceedings of the International conference on medical image computing and computer-assisted intervention, pp. 247–255, Springer, Granada, Spain, September 2018.
D. Zhao, Y. Chen, and L Le, “Deep reinforcement learning with visual attention for vehicle classification,” IEEE Transactions on Cognitive and Developmental Systems, vol. 9, no. 4, pp. 356–367, 2016.
J. Janisch, T. Pevný, and V. Lisý, “Classification with costly features using deep reinforcement learning,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 3959–3966, 2019.
M. A. Wiering and M. Van Otterlo, “Reinforcement learning,” Adaptation, learning, and optimization, vol. 12, no. 3, 2012.
R. Amit, R. Meir, and K. Ciosek, “Discount factor as a regularizer in reinforcement learning,” in Proceedings of the International conference on machine learning, pp. 269–278, PMLR, Vienna, Austria, July 2020.
K. D. Avinash, J. J. F. Sherrerd et al., Optimization in Economic Theory, Oxford University Press on Demand, Oxford, England, 1990.
I. S. Syed, J. F. Glockner, D. Feng et al., “Role of cardiac magnetic resonance imaging in the detection of cardiac amyloidosis,” Journal of the American College of Cardiology: Cardiovascular Imaging, vol. 3, no. 2, pp. 155–164, 2010.
M. Chetrit and M. G. Friedrich, “The unique role of cardiovascular magnetic resonance imaging in acute myocarditis,” F1000Research, vol. 7, 2018.
M. D. Cornicelli, C. K. Rigsby, K. Rychlik, E. Pahl, and J. D. Robinson, “Diagnostic performance of cardiovascular magnetic resonance native t1 and t2 mapping in pediatric patients with acute myocarditis,” Journal of Cardiovascular Magnetic Resonance: Official Journal of the Society for Cardiovascular Magnetic Resonance, vol. 21, no. 1, pp. 40–49, 2019.
J. A. Pan, Y. J. Lee, and M. Salerno, “Diagnostic performance of extracellular volume, native t1, and t2 mapping versus lake louise criteria by cardiac magnetic resonance for detection of acute myocarditis: a meta-analysis,” Circulation: Cardiovascular Imaging, vol. 11, no. 7, Article ID e007598, 2018.
M. A. G. M. Olimulder, J. Van Es, and M. A. Galjee, “The importance of cardiac mri as a diagnostic tool in viral myocarditis-induced cardiomyopathy,” Netherlands Heart Journal, vol. 17, no. 12, pp. 481–486, 2009.
C. Moenninghoff, L. Umutlu, C. Kloeters et al., “Workflow efficiency of two 1.5 t mr scanners with and without an automated user interface for head examinations,” Academic Radiology, vol. 20, no. 6, pp. 721–730, 2013.
L. Lin, X. Li, J. Feng et al., “The prognostic value of t1 mapping and late gadolinium enhancement cardiovascular magnetic resonance imaging in patients with light chain amyloidosis,” Journal of Cardiovascular Magnetic Resonance: Official Journal of the Society for Cardiovascular Magnetic Resonance, vol. 20, no. 1, pp. 2–11, 2018.
X. Xiao, D. Lo, X. Xia, and T. Yuan, “Evaluating defect prediction approaches using a massive set of metrics: an empirical study,” in Proceedings of the 30th Annual ACM Symposium on Applied Computing, pp. 1644–1647, ACM, April 2015.
Q. Qiu and Z. Song, “A nonuniform weighted loss function for imbalanced image classification,” in Proceedings of the 2018 international conference on image and graphics processing, pp. 78–82, ICIGP, Hong Kong, February 2018.
C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, “Knn model-based approach in classification,” in Proceedings of the OTM Confederated International Conferences CoopIS, DOA, and ODBASE 2003 Catania, pp. 986–996, Springer, Catania, Sicily, Italy, November 2003.
I. Rish et al., “An empirical study of the naive bayes classifier,” IJCAI 2001 workshop on empirical methods in artificial intelligence, vol. 3, pp. 41–46, 2001.
J. Tolles and W. J. Meurer, “Logistic regression,” JAMA, vol. 316, no. 5, pp. 533-534, 2016.
A. Cutler, D. R. Cutler, and J. R. Stevens, “Random forests,” Ensemble Machine Learning, Springer, pp. 157–175, 2012.
V. V. Phansalkar and P. S. Sastry, “Analysis of the back-propagation algorithm with momentum,” IEEE Transactions on Neural Networks, vol. 5, no. 3, pp. 505-506, 1994.
M. T. Hagan, H. B. Demuth, and M. H. Beale, Neural Network Design, PWS Publishing, Boston, MA, USA, 1996.
C.-C. Yu and B.-Da Liu, “A backpropagation algorithm with adaptive learning rate and momentum coefficient,” in Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat. No. 02CH37290), vol. 2, pp. 1218–1223, IEEE, Honolulu, HI, USA, May 2002.
R. Battiti, “First-and second-order methods for learning: between steepest descent and Newton’s method,” Neural Computation, vol. 4, no. 2, pp. 141–166, 1992.
F. D. Foresee and M. T. Hagan, “Gauss-Newton approximation to bayesian learning,” in Proceedings of the international conference on neural networks (ICNN’97), vol. 3, pp. 1930–1935, IEEE, Houston, TX, USA, June 1997.
S. Mirjalili, S. M. Mirjalili, and A. Lewis, “Grey wolf optimizer,” Advances in Engineering Software, vol. 69, pp. 46–61, 2014.
X.-S. Yang, “A new metaheuristic bat-inspired algorithm,” Nature Inspired Cooperative Strategies for Optimization (NICSO 2010), Springer, New York, US, pp. 65–74, 2010.
X.-S. Yang and S. Deb, “Cuckoo search via lévy flights,” in Proceedings of the 2009 World congress on nature & biologically inspired computing (NaBIC), pp. 210–214, IEEE, Coimbatore, India, December 2009.
S. Mirjalili and A. Lewis, “The whale optimization algorithm,” Advances in Engineering Software, vol. 95, pp. 51–67, 2016.

Copyright

Copyright © 2022 Seyed Vahid Moravvej et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
