Abstract

The task of designing an Artificial Neural Network (ANN) can be viewed as an optimization problem involving many parameters whose optimal values must be determined in order to improve the classification accuracy of the network. Two of the major parameters that need to be determined during the design of an ANN are its weights and biases. Various gradient-based optimization algorithms have been proposed in the past to generate an optimal set of weights and biases. However, because gradient-based algorithms tend to get trapped in local minima, researchers have started exploring metaheuristic algorithms as an alternative to these conventional techniques. In this paper, we propose GGA-MLP (Greedy Genetic Algorithm-Multilayer Perceptron), a learning algorithm that generates an optimal set of weights and biases for a multilayer perceptron (MLP) using a greedy genetic algorithm. The proposed approach improves on the traditional genetic algorithm (GA) by using a greedy approach to generate the initial population as well as to perform crossover and mutation. To evaluate the performance of GGA-MLP in classifying nonlinear input patterns, we perform experiments on datasets of varying complexities taken from the University of California, Irvine (UCI) repository. The experimental results of GGA-MLP are compared with existing state-of-the-art techniques in terms of classification accuracy. The results show that the performance of GGA-MLP is better than or comparable to that of the existing state-of-the-art techniques.

1. Introduction

Artificial neural networks (ANNs) are computing models inspired by the biological nervous system. An ANN consists of an interconnected network of nodes called artificial neurons, which are organized in the form of layers, namely, an input layer, hidden layers, and an output layer [1]. A set of synaptic weights is used to interconnect the nodes that form these layers. ANNs have been applied to a broad range of problems such as classification, regression, prediction, pattern recognition, and disease diagnosis [2–6]. Classification is one of the important areas of research in the field of data science. Many classification models exist, and ANNs are among the most widely used.

In this paper, our focus is on the multilayer perceptron (MLP), which is a multilayer feedforward neural network. Classification using MLP is basically a two-step process. The first step is the learning (training) phase, in which a classifier is built to describe a predetermined set of data classes for a given dataset (training data). In the second step, the model built in the training phase is used to classify unclassified data (test data) in order to estimate the accuracy of the classifier. During the learning phase, MLP learns by adjusting synaptic weights and biases iteratively in an attempt to correctly predict the class labels of the input data. The process of weight and bias update continues until the acquired knowledge is sufficient and the network reaches a specified level of accuracy, i.e., a predefined error measure is minimized or the maximum number of epochs is reached [7]. After the completion of the learning phase, it is essential to assess the performance of MLP, i.e., its generalization and predictive capabilities, using samples of data (test data) different from those used during the training phase. To achieve generalization, MLPs need to avoid both underfitting and overfitting during the training phase. To achieve the best results, the number of training patterns should therefore be sufficiently larger than the total number of connections in the neural network. The performance of MLP is highly dependent on the learning method used to train it during the training phase. Several learning algorithms exist in the literature with the aim of finding an optimal MLP. These learning algorithms can be broadly classified into three categories, namely, conventional methods [8–12], metaheuristic-based methods [13–37], and hybrid methods [20, 38–44].

Despite the existence of a large number of learning algorithms, researchers continue to apply new optimization techniques such as multimean particle swarm optimization (MMPSO) [28], the whale optimization algorithm (WOA) [23], the multiverse optimizer (MVO) [34], the grasshopper optimization algorithm (GOA) [35], and the firefly algorithm [36] to generate an optimal set of synaptic weights in an attempt to further improve the accuracy and performance of MLP. As stated by the No-Free-Lunch (NFL) theorem [45], no single optimization technique solves all optimization problems. It is quite possible that an existing learning algorithm trains an MLP well for some datasets while failing to do the same for others. This makes the generation of optimal connection weights a dynamic research area and is the main motivation behind the work presented in this paper, in which we propose a hybrid learning algorithm to train MLP.

GA is an evolutionary algorithm (EA) and is one of the most widely investigated metaheuristic algorithms for designing neural networks. Over the years, GA and its variants have been successfully applied in several domains for ANN weight [13–20], topology [46–48], and feature set optimization [49, 50], as well as parameter tuning [51, 52]. A comprehensive review of the optimization of neural networks using GA can be found in [53]. The efficiency, effectiveness, and ease of use of GA motivated us to further improve its performance in optimizing the weights of MLP by integrating greedy techniques with GA. The proposed algorithm, Greedy Genetic Algorithm–Multilayer Perceptron (GGA-MLP), improves the performance of traditional GA by using a greedy approach to generate the initial population as well as to perform crossover and mutation. Some of the application areas of the proposed work are disease identification, e-mail spam identification, stock market prediction, and fruit classification. The main challenge with the proposed approach is that it may not work well on some datasets, as stated by the No-Free-Lunch (NFL) theorem [45] mentioned above. Finally, the performance of GGA-MLP is compared with various classifiers as well as existing state-of-the-art metaheuristic algorithms for training MLP. The key contributions of this paper are as follows:
(1) A hybrid learning algorithm, GGA-MLP, that integrates greedy techniques with GA is proposed to train MLP.
(2) GGA-MLP is evaluated and compared with existing state-of-the-art algorithms on 10 datasets of different complexities.

The paper is organized as follows. Related work is presented in Section 2. A brief overview of GA is given in Section 3. In Section 4, the proposed GGA-MLP for optimization of MLP weights and biases is presented. In Section 5, experiments conducted to evaluate the effectiveness of GGA-MLP are presented, and results are discussed. Finally, the conclusion and future work are discussed in Section 6.

2. Related Work

Among conventional methods, backpropagation (BP) is the most widely used algorithm to train multilayer feedforward networks (MLFFNs). BP uses a gradient descent rule that tries to minimize the error of the network by moving in the direction opposite to the gradient of the error function. However, BP has certain limitations. It tends to converge toward local optima, as it is good only at exploiting the current solution, which may result in unsatisfactory classification accuracies. It also suffers from slow convergence and scaling problems [54]. To overcome these problems, many improvements of BP, such as quickprop [8], RPROP [9], and improved BP [10], have been proposed in the past. In addition, conjugate gradient methods [11] and other derivative-based conventional methods such as the Levenberg–Marquardt method [12] are also used for weight optimization, but these methods can sometimes be expensive. Conventional methods are computationally faster than their metaheuristic counterparts because they operate on a single solution; however, they have the limitations discussed above.

Due to their global search capabilities, metaheuristic algorithms are widely used to generate optimal weights and biases in MLP. In [13–21], GA was applied to train MLP, and its performance was compared to BP. Valian et al. [22] proposed an improved cuckoo search (ICS) to train MLFFN; unlike cuckoo search (CS), the proposed ICS tunes the CS parameters. The performances of ICS and CS are compared on two datasets. A number of approaches have been proposed to train MLFFN using differential evolution (DE) and evolution strategies (ES) [23–26]. Apart from EAs, bioinspired algorithms and their variants have been used to generate optimal connection weights in MLFFN. Karaboga et al. [27] applied the artificial bee colony (ABC) algorithm to train MLFFN and compared the performance of ABC with that of GA. In [28], multimean particle swarm optimization (MMPSO) is proposed to generate optimal connection weights of MLFFN. MMPSO is derived from PSO but, unlike PSO, uses multiple swarms. The performance of MMPSO is compared with PSO on 10 datasets, and the results demonstrate the effectiveness of MMPSO. In [29], the krill herd algorithm (KHA) is applied to train ANN and is compared with BP, GA, and harmony search (HS). Bolaji et al. [30] and Kattan et al. [31] used the fireworks algorithm (FWA) and HS, respectively, to train ANN. Mirjalili [32] applied the gray wolf optimizer (GWO) to train MLP, and the comparison results on 8 datasets show the GWO algorithm's capability of avoiding local optima. Aljarah et al. [33] applied a whale optimization algorithm (WOA) to generate an optimal set of connection weights in MLP. The performance of the proposed WOA-based trainer is evaluated on 20 datasets by comparing it with trainers obtained using ant colony optimization (ACO), GA, PSO, DE, ES, population-based incremental learning (PBIL), and BP. The results indicate that the WOA-based trainer avoids premature convergence and generates the best weights in most cases for binary pattern classification. In [34], the nature-inspired multiverse optimizer is used to train MLP. Heidari et al. [35] proposed GOAMLP, which uses GOA to train a single-hidden-layer MLP and is applied on five datasets. When compared with state-of-the-art algorithms, MLP trained using GOAMLP resulted in improved classification accuracy. Elakkiya and Selvakumar [36] used an enhanced step size firefly algorithm to generate optimal weights of a feedforward neural network for spam detection. In [37], an adaptive GA is proposed to optimize the weights of a BP neural network (BPNN) used in capacitive accelerometers.

Sometimes, metaheuristic algorithms suffer from premature convergence. To overcome the problems faced by conventional methods and metaheuristic algorithms, hybrid approaches have been proposed. In [38, 39], GA and PSO, respectively, were combined with BP, which helped in fast convergence and in avoiding getting trapped in local optima. In [40], a hybrid approach that combines PSO and the gravitational search algorithm is presented to train feedforward networks. In [41], a hybrid training algorithm, LPSONS, is proposed to train feedforward neural networks; it combines the velocity operator of PSO with the Mantegna Lévy distribution to increase the diversity of the population. To avoid local optima and premature convergence, the Mantegna Lévy distribution is further combined with neighborhood search. In [20], an improved GA coupled with a BP neural network (IGA-BPNN) is proposed to improve the forecasting performance of ANN. This model uses improved genetic adaptive strategies to avoid getting stuck in local optima, and the experimental results show that IGA-BPNN performs better than the traditional GA-BPNN. In [42], a hybrid algorithm, namely, constriction coefficient-based particle swarm optimization and gravitational search algorithm (CPSOGSA), is proposed to train MLP; it helps avoid the premature convergence and local optima problems of MLP. In [43], an optimized adaptive GA in the backpropagation neural network (OAGA-BPNN) is proposed to optimize BPNN for traffic flow prediction. In [44], a hybrid grasshopper and new cat swarm optimization algorithm is proposed for feature selection and for weight and architecture optimization of MLP. In a similar way, other optimization approaches, such as MLP-LOA [55], improved teaching-learning-based optimization, and cat swarm optimization, have been discussed by various researchers to obtain better results for similar applications [56, 57].

3. Genetic Algorithm

Genetic algorithm (GA) is a metaheuristic algorithm proposed by Holland [58]. The algorithm imitates the process of natural selection, in which fitter individuals have a better chance of survival than others in a competing environment. It is a global search technique characterized by evolution in every generation. GA starts with a randomly generated initial population of chromosomes, where each chromosome represents a possible solution to the given problem. Each chromosome is associated with a fitness value, which is a measure of how good a solution is for the given problem. In each generation, the population evolves toward better fitness using evolutionary operators such as selection, crossover, and mutation. This process continues until a solution is found or the maximum number of iterations is reached.
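
To make this generational loop concrete, the sketch below shows a generic GA in Python for a minimization problem. The selection, crossover, and mutation operators are passed in as placeholders rather than being the specific operators used by GGA-MLP, which are described in Section 4.

```python
import numpy as np

def genetic_algorithm(fitness_fn, init_population, select, crossover, mutate,
                      max_generations=100):
    """Generic GA loop: evolve a population toward lower fitness (minimization)."""
    population = init_population()                      # list of candidate solutions
    for _ in range(max_generations):
        scores = np.array([fitness_fn(c) for c in population])
        parents = select(population, scores)             # e.g., tournament or roulette selection
        children = [mutate(child)                        # perturb genes with a small probability
                    for p1, p2 in zip(parents[::2], parents[1::2])
                    for child in crossover(p1, p2)]      # recombine pairs of parents
        population = children
    scores = np.array([fitness_fn(c) for c in population])
    return population[int(np.argmin(scores))]            # best (lowest-error) solution found
```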

4. Proposed Model: GGA-MLP

In this section, we present our proposed approach GGA-MLP which applies a greedy GA to generate an optimal set of synaptic weights and biases of MLP, keeping the architecture and activation function fixed. The various steps of GGA-MLP are explained below.

4.1. Representation of Candidate Solutions and Fitness Function

An important aspect that needs to be considered during the design of GGA-MLP is the representation of the possible solutions in the search space in the form of chromosomes and the encoding scheme used to encode them. In GGA-MLP, each chromosome represents a candidate MLP. A chromosome is divided into different segments, where each segment contains the encoded weights between two layers (input-hidden, hidden-hidden (if any), hidden-output) and the last segment contains the encoded bias values for the MLP. Chromosome encoding for an MLP having two hidden layers is shown in Figure 1. However, the length of the chromosome can easily be changed to train an MLP having one or more hidden layers. A real-valued encoding scheme is used to encode the chromosomes.

As is clear from Figure 1, if there are n input nodes, h hidden layers with m hidden nodes in each hidden layer, and o output nodes, then the length of the chromosome is calculated using

length = n × m + (h − 1) × m × m + m × o + (h × m + o). (1)
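
As an illustration of this segmented representation, the following sketch (a minimal example, not code from the paper) flattens an MLP's weight matrices and bias vectors into a single real-valued chromosome and computes the length given by equation (1); the helper names are ours.

```python
import numpy as np

def chromosome_length(n_inputs, n_hidden_layers, n_hidden, n_outputs):
    # weights: input->hidden, hidden->hidden, hidden->output; one bias per hidden/output node
    weights = (n_inputs * n_hidden
               + (n_hidden_layers - 1) * n_hidden * n_hidden
               + n_hidden * n_outputs)
    biases = n_hidden_layers * n_hidden + n_outputs
    return weights + biases

def encode(weight_matrices, bias_vectors):
    """Concatenate all weight matrices and bias vectors into one chromosome."""
    segments = [w.ravel() for w in weight_matrices] + [b.ravel() for b in bias_vectors]
    return np.concatenate(segments)
```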

Each chromosome MLP_i (i = 1, 2, ..., PS) in the population is a real-valued vector whose length is given by equation (1), where PS is the population size. Each MLP_i in the population is associated with a fitness value, which is a measure of its quality. In our case, the mean square error (MSE) is chosen as the fitness function. To calculate the fitness of an MLP, the training data samples are run through it and the mean square error is calculated using

MSE = (1/N) Σ_{k=1}^{N} (y_k − ŷ_k)^2, (2)

where y_k is the actual output, ŷ_k is the predicted output, and N is the number of samples in the training dataset. This process is repeated for each MLP_i. The goal of GGA is to find an MLP_i that minimizes the objective function f: MLP_i → R+, where R+ represents the set of positive real numbers. The objective function can be calculated using (3), and it tells us about the quality of the solution:

f(MLP_i) = MSE(MLP_i). (3)

Now, GGA tries to find the best MLP, denoted MLP*, that minimizes the objective function, as shown in

MLP* = argmin_{1 ≤ i ≤ PS} f(MLP_i). (4)
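
A minimal sketch of the fitness evaluation in equations (2) and (3) is given below; it assumes a single hidden layer, a sigmoid activation, and one output node, and decodes the chromosome in the segment order described above. These assumptions are for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse_fitness(chromosome, X, y, n_inputs, n_hidden):
    """Decode a chromosome (single hidden layer assumed) and return its MSE on (X, y)."""
    i = 0
    W1 = chromosome[i:i + n_inputs * n_hidden].reshape(n_inputs, n_hidden); i += n_inputs * n_hidden
    W2 = chromosome[i:i + n_hidden].reshape(n_hidden, 1);                   i += n_hidden
    b1 = chromosome[i:i + n_hidden];                                        i += n_hidden
    b2 = chromosome[i:i + 1]
    hidden = sigmoid(X @ W1 + b1)            # forward pass through the hidden layer
    y_hat = sigmoid(hidden @ W2 + b2).ravel()
    return np.mean((y - y_hat) ** 2)         # equation (2)
```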

4.2. Generation of Initial Population

In evolutionary algorithms (EAs), the initial population plays a major role in determining the quality of the final solution as well as the convergence speed [59]. Several population initialization methods exist in the literature, but in most cases the initial population is generated randomly. However, due to the dependence of the final solution's quality on the initial population, GGA uses a greedy population initialization method that exploits domain-specific knowledge to generate good-quality MLPs (chromosomes). Initially, the synaptic weights and biases are chosen randomly in the interval [−2, 2]. After this, GGA analyzes the features of the dataset on which the MLP needs to be trained. In most cases, certain features contribute more than others to determining the correct class of an input pattern. GGA exploits this property of the dataset and identifies important features using domain-specific knowledge. The weights of these identified features are increased by a random number in the interval [0.0, 1.0) in the entire initial population, thereby giving them a higher weightage than other features from the very beginning.
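
The sketch below illustrates this greedy initialization step under our reading of the description: all weights and biases are drawn uniformly from [−2, 2], and the input-to-hidden weights of a supplied list of important feature indices are then increased by a random value in [0.0, 1.0). How the important features are identified is dataset specific, so they are passed in as an argument here.

```python
import numpy as np

def greedy_init_population(pop_size, chrom_len, n_inputs, n_hidden, important_features, rng=None):
    """Random weights/biases in [-2, 2]; important input features get an extra boost in [0, 1)."""
    if rng is None:
        rng = np.random.default_rng()
    population = rng.uniform(-2.0, 2.0, size=(pop_size, chrom_len))
    for f in important_features:              # indices of domain-relevant input features
        assert 0 <= f < n_inputs
        # the first n_inputs * n_hidden genes hold the input-to-hidden weights (row-major)
        cols = np.arange(f * n_hidden, (f + 1) * n_hidden)
        population[:, cols] += rng.uniform(0.0, 1.0, size=(pop_size, n_hidden))
    return population
```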

4.3. Mean-Based Crossover (MBC)

After population initialization, the next step is the repeated application of operators such as selection, crossover, and mutation to obtain an MLP with optimal weights and biases. Maintaining diversity is important, but it is also vital to retain the best individuals of one generation into the next. GGA-MLP uses elitism to transfer the best chromosome(s) from one generation to the next. Crossover and mutation are performed to generate offspring by selecting chromosomes from the current generation. The crossover operator takes two chromosomes and combines them to produce new offspring; it is based on the idea that the exchange of information between good chromosomes will generate even better offspring. Care must be taken while performing the selection and crossover operations, as they may reduce genetic diversity, which can ultimately lead to premature convergence. To avoid premature convergence, we present a crossover technique, known as mean-based crossover (MBC), that aims at improving the fitness of the top individuals of the population with the help of the worst members of the population. The proposed crossover technique involves the calculation of the mean of the fittest chromosomes in the population, thereby generating offspring that are closer to the solution having minimum loss. Before applying MBC, GGA-MLP sorts the chromosomes in ascending order of their fitness values. MBC starts by selecting the top 30% of the chromosomes and calculating their gene-wise mean. The mean chromosome is an indicator of the ideal gene values that minimize the MSE. In order to move toward a global optimum, this mean chromosome is used as a comparison parameter against individuals having low fitness values in the population. From the top 30% of chromosomes, a chromosome P1 is selected randomly for crossover. The other parent, P2, is chosen from the worst 30% of individuals in such a way that it can contribute the most toward the fitness of chromosome P1. The method of selection of P2 is shown in Figure 2. After selecting P1 and P2, MBC is performed by exchanging the genes of P1 and P2, as shown in Figure 2. Out of the two children obtained from MBC, the offspring having higher fitness improves the quality of the population, while the other offspring adds randomness to the population, thereby decreasing the probability of the population converging to a local optimum.

After crossover, P1 is inserted into a set S to prevent it from being selected again for MBC in the current iteration. This ensures that a unique chromosome is selected from the population each time MBC is performed, thereby preventing the generation of duplicate children. This process continues until the desired number of offspring is generated. The steps of MBC are shown in Figure 2.
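
A sketch of one MBC step is given below. Since Figure 2 is not reproduced here, the rule for choosing P2 ("the worst-30% parent that can contribute the most to the fitness of P1") is approximated by picking the worst-group member closest to the mean chromosome, and the gene exchange is shown as a single-point crossover; both are assumptions, not the paper's exact procedure.

```python
import numpy as np

def mean_based_crossover(population, fitness, used, rng=None):
    """One MBC step. population: (PS, L) array; fitness: (PS,) MSE values (lower is better)."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(fitness)                      # ascending MSE: best chromosomes first
    n = max(1, int(0.3 * len(population)))
    top, worst = order[:n], order[-n:]
    mean_chrom = population[top].mean(axis=0)        # gene-wise mean of the top 30%

    candidates = [i for i in top if i not in used]
    p1 = rng.choice(candidates)                      # random, not yet used, top-30% parent (P1)
    used.add(int(p1))
    # Assumption: pick the worst-30% parent (P2) whose genes are closest to the mean chromosome.
    p2 = worst[np.argmin(np.abs(population[worst] - mean_chrom).sum(axis=1))]

    point = rng.integers(1, population.shape[1])     # single-point gene exchange (assumed)
    c1 = np.concatenate([population[p1][:point], population[p2][point:]])
    c2 = np.concatenate([population[p2][:point], population[p1][point:]])
    return c1, c2
```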

4.4. Greedy Mutation

In GA, the mutation operator is vital for maintaining diversity in the evolving population. It randomly modifies one or more genes of a chromosome, depending on the mutation probability, which helps avoid getting stuck in local minima. In traditional GA, every chromosome has an equal probability of being mutated irrespective of its fitness [60]; that is, both the best and the worst chromosomes have an equal probability of being disrupted by mutation. In this paper, we propose a greedy mutation that aims to (i) avoid the disruption of good-quality chromosomes and (ii) at the same time maintain diversity in the population by mutating low-quality chromosomes, thereby improving the quality of the overall population.

Greedy mutation starts by calculating the gene-wise mean of the top 30% (N) chromosomes to generate a mean chromosome C_mean. It then selects a chromosome C_w randomly from the worst 30% (M) chromosomes in the population for mutation. A random number p is generated for every gene of C_w and is compared with the mutation probability p_m. If p < p_m, the difference d between the value of the selected gene of C_w and that of the corresponding gene of the mean chromosome C_mean is calculated, and a random number r is generated. The product of r and d is then subtracted from the corresponding gene value in C_w. This helps the chromosome approach good gene values, thereby increasing its overall fitness.
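
A sketch of greedy mutation as described above; the variable names mirror the text (C_mean, C_w, p_m), and the default mutation probability is a placeholder.

```python
import numpy as np

def greedy_mutation(population, fitness, p_m=0.1, rng=None):
    """Mutate one worst-30% chromosome by pulling its mutated genes toward the top-30% mean."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(fitness)                       # ascending MSE: best chromosomes first
    n = max(1, int(0.3 * len(population)))
    c_mean = population[order[:n]].mean(axis=0)       # gene-wise mean of the top 30% (C_mean)
    w = rng.choice(order[-n:])                        # random worst-30% chromosome (C_w)
    mutant = population[w].copy()
    for g in range(mutant.size):
        if rng.random() < p_m:                        # gene selected for mutation
            d = mutant[g] - c_mean[g]                 # distance from the "ideal" gene value
            r = rng.random()
            mutant[g] -= r * d                        # move the gene toward the mean chromosome
    return mutant
```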

Due to the use of greedy approaches at each step, the diversity of the population may decrease, leading to premature convergence. To avoid this, GGA-MLP introduces diversity into the population in each iteration by generating 30% of the population using elitism, 50% of the population using MBC and greedy mutation, and the remaining 20% randomly by choosing synaptic weights and biases within the range [−2, 2].
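
Putting the pieces together, one generation of GGA-MLP can be assembled as sketched below, reusing the crossover and mutation sketches above: roughly 30% elites, 50% offspring produced by MBC and greedy mutation, and 20% fresh random chromosomes in [−2, 2]. The exact rounding of the fractions and the mix of MBC and mutation within the 50% are our assumptions.

```python
import numpy as np

def next_generation(population, fitness, rng=None):
    """Assemble the next generation: ~30% elites, ~50% MBC/greedy-mutation offspring, ~20% random."""
    if rng is None:
        rng = np.random.default_rng()
    ps, length = population.shape
    order = np.argsort(fitness)                              # ascending MSE: best first
    n_elite, n_rand = int(0.3 * ps), int(0.2 * ps)

    new_pop = [population[i].copy() for i in order[:n_elite]]     # elitism: carry over the best 30%
    used = set()
    while len(new_pop) < ps - n_rand:                        # ~50%: alternate MBC and greedy mutation
        c1, c2 = mean_based_crossover(population, fitness, used, rng)
        new_pop.extend([c1, c2])
        new_pop.append(greedy_mutation(population, fitness, rng=rng))
    new_pop = new_pop[: ps - n_rand]
    while len(new_pop) < ps:                                 # ~20%: fresh random chromosomes
        new_pop.append(rng.uniform(-2.0, 2.0, size=length))
    return np.array(new_pop)
```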

5. Results and Discussion

Section 5.1 presents the datasets selected to evaluate the effectiveness of GGA-MLP in terms of the accuracy achieved in classifying the input data. The implementation details, the experimental setup, and the results are presented in Section 5.2.

5.1. Datasets

To evaluate the effectiveness of the proposed approach GGA-MLP, ten standard binary classification datasets are selected from the UCI Machine Learning Repository [61]: Parkinson, Indian Liver Patient Dataset (ILPD), Diabetes, Vertebral Column, Spambase, QSAR Biodegradation, Blood Transfusion, HTRU2, Drug Consumption: Amyl Nitrite, and Drug Consumption: Ketamine. The description of the selected datasets is shown in Table 1. In each dataset, 80% of the instances are used for training (of which 20% is used for validation), and the remaining 20% are used for testing. As can be seen from Table 1, the selected datasets have numbers of features ranging from 4 to 57 and numbers of instances ranging from 197 to 17,898, which allows us to evaluate the proposed approach on datasets of varying complexities and makes the task of evaluating GGA-MLP even more challenging.
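
The 80/20 train-test split, with a further 20% of the training portion held out for validation, can be reproduced as sketched below; the file name, stratification, and fixed seed are our additions for illustration and reproducibility, not details stated in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical loading step: features in all but the last column, class label in the last column.
data = np.loadtxt("dataset.csv", delimiter=",")
X, y = data[:, :-1], data[:, -1]

# 80% training / 20% testing; 20% of the training portion is then held out for validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.20, stratify=y_train, random_state=42)
```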

5.2. Experimental Design and Results

To evaluate the effectiveness of GGA-MLP, the classification accuracy of MLP trained using GGA-MLP is compared with that of MLP trained using existing algorithms, namely, GA [21], ABC [27], MMPSO [28], WOA [33], MVO [34], and GOA [35], on each dataset given in Table 1. All the algorithms are implemented in Python 3.6.4 using the Anaconda framework. As these are randomized algorithms, 30 runs of each algorithm are performed on every dataset. After each run, the best MLP is selected, and its classification accuracy on the test dataset is calculated using

accuracy = (NC / N) × 100, (5)

where NC is the number of correctly classified testing data samples and N is the total number of samples in the testing dataset.
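
Equation (5) and the 30-run protocol amount to the following few lines; train_fn and predict_fn are placeholders for a trainer (e.g., GGA-MLP) and the decoded MLP's forward pass, neither of which is specified here.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Equation (5): percentage of correctly classified test samples."""
    return 100.0 * np.sum(np.asarray(y_true) == np.asarray(y_pred)) / len(y_true)

def evaluate_over_runs(train_fn, predict_fn, splits, n_runs=30):
    """Run a randomized trainer n_runs times and summarize its test accuracy."""
    X_tr, y_tr, X_val, y_val, X_test, y_test = splits
    accs = [accuracy(y_test, predict_fn(train_fn(X_tr, y_tr, X_val, y_val), X_test))
            for _ in range(n_runs)]
    return np.mean(accs), np.max(accs), np.std(accs)      # average, best, standard deviation
```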

Before the start of the training phase, the architecture of MLP must be decided for each dataset. To perform a fair comparison, the same MLP architecture is used for each algorithm. Only one hidden layer is used, as one hidden layer is sufficient to classify the datasets shown in Table 1. The number of neurons in the hidden layer is decided using the method proposed in [25], which determines it as a function of the number of relevant features of the dataset; for a few datasets, a slightly different number of hidden neurons is used. The architecture of MLP used for each dataset is shown in Table 2.

The values of the controlling parameters of ABC, WOA, MMPSO, MVO, GOA, GA, and GGA-MLP are listed in Table 3. Various performance metrics, such as classification accuracy, specificity, and sensitivity, are used to assess the performance of GGA-MLP with respect to the existing state-of-the-art algorithms. The average, best, and standard deviation of the classification accuracy, specificity, and sensitivity of the best MLP trained using these metaheuristic algorithms over 30 runs for the given datasets are shown in Tables 4–6, respectively. All results are collected under Windows 10 on an Intel Core i5-7200U 3.1 GHz processor with 8.00 GB of DDR4 RAM and an Nvidia GT 940MX GPU with 2 GB of VRAM.

It is evident from Tables 4 and 5 that GGA-MLP gives the highest average and best accuracy as well as specificity for all datasets except Parkinson, QSAR Biodegradation, Drug Consumption: Amyl Nitrite, and Drug Consumption: Ketamine. Despite having lower accuracy and specificity on these four datasets, GGA-MLP achieves higher sensitivity than the existing algorithms, as evident from Table 6, which shows the superiority of GGA-MLP in classifying the positive samples correctly. GGA-MLP also has a lower standard deviation than the existing state-of-the-art algorithms, which shows the robustness of the proposed approach.

In Figure 3, the MSE values of MLP trained using ABC, WOA, MMPSO, MVO, GOA, GA, and GGA-MLP on the given datasets are recorded at intervals of 10 iterations and plotted to visualize the convergence rate. The convergence curves show that although GGA-MLP takes more time to converge than the other metaheuristic algorithms, it avoids getting trapped in local minima, and in most cases its performance is better than that of the existing algorithms. To assess the efficacy of MLP trained using GGA-MLP as a classifier, we compare the classification accuracy of GGA-MLP with that of classifiers built using other machine learning algorithms, namely, logistic regression, Naïve Bayes, and decision tree, as well as MLP trained using BP. Like GGA-MLP, the decision tree and BP algorithms are also randomized; every dataset is therefore run 30 times on each of them, and the average, best, and standard deviation of classification accuracy are reported in Table 7. To prevent overfitting, the validation set is used for early stopping during the training of logistic regression, Naïve Bayes, and decision tree, as well as MLP trained using BP. It is clear from Table 7 that GGA-MLP gives the best result in all cases, although the standard deviation over 30 runs is lowest in the case of the decision tree. From Tables 4–7, it is clear that the performance of GGA-MLP is better than or comparable to that of the existing algorithms in classifying input patterns correctly.

6. Conclusion and Future Work

In this paper, a greedy genetic algorithm, GGA-MLP, is presented to train MLP. The use of domain-specific knowledge enables the generation of a good-quality initial population. Mean-based crossover and greedy mutation help the algorithm move toward the global optimum by exploring the search space thoroughly. Datasets of varying complexities are used to evaluate the performance of GGA-MLP and to compare it with existing state-of-the-art algorithms as well as existing classifiers such as Naïve Bayes, decision tree, logistic regression, and MLP trained using BP. The results show that although GGA-MLP takes more time to converge than other metaheuristic algorithms, its performance is better than or comparable to that of the existing techniques in classifying datasets, especially large datasets, as GGA-MLP searches the solution space thoroughly by maintaining a balance between exploration and exploitation.

In the future, we plan to extend our work to train other types of ANNs and to incorporate architecture optimization.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.