Research Article | Open Access
Waqas Haider Bangyal, Abdul Hameed, Wael Alosaimi, Hashem Alyami, "A New Initialization Approach in Particle Swarm Optimization for Global Optimization Problems", Computational Intelligence and Neuroscience, vol. 2021, Article ID 6628889, 17 pages, 2021. https://doi.org/10.1155/2021/6628889
A New Initialization Approach in Particle Swarm Optimization for Global Optimization Problems
Particle swarm optimization (PSO) is a population-based, intelligent stochastic search technique inspired by the intrinsic manner in which bee swarms search for food. PSO is widely used to solve diverse optimization problems. Initialization of the population is a critical factor in the PSO algorithm, as it considerably influences diversity and convergence during the search. Quasirandom sequences are useful for initializing the population to improve diversity and convergence, rather than applying the random distribution for initialization. In this paper, the performance of PSO is extended to make it appropriate for the optimization problem by introducing a new initialization technique named WELL with the help of a low-discrepancy sequence. The proposed solution, intended for optimization problems in large-dimensional search spaces, is termed WE-PSO. It has been verified on fifteen well-known unimodal and multimodal benchmark test problems extensively used in the literature. Moreover, the performance of WE-PSO is compared with the standard PSO and two other initialization approaches, Sobol-based PSO (SO-PSO) and Halton-based PSO (H-PSO). The findings indicate that WE-PSO is better than the standard multimodal problem-solving techniques, and the results validate the efficacy and effectiveness of our approach. In addition, the proposed approach is used for artificial neural network (ANN) learning and contrasted with the standard backpropagation algorithm, standard PSO, H-PSO, and SO-PSO. Our technique achieves a higher accuracy score and outperforms the traditional methods. The outcome of our work also offers insight into how strongly the proposed initialization technique affects the quality of the cost function, convergence, and diversity.
1. Introduction
Optimization has been considered one of the most productive fields of research for many decades. Advanced optimization algorithms are required as real-world problems evolve towards greater complexity over time. The key purpose is to obtain the optimum value of the fitness function. Classification is an attempt to identify groups of certain categories of data. Moreover, the training data have many features that play a significant role in segregating the knowledge according to the prearranged categories of the classes. Globally, massive growth has been recognized in various data classification applications, such as organic compound analysis, television audience share prediction, automatic abstraction, credit card fraud detection, financial projection, targeted marketing, and medical diagnosis. In evolutionary computation, data classification builds its model based on the genetic process and natural evolution. These techniques are adaptive and robust, and they perform a global exploration of the candidate solutions to extract information from large datasets.
Swarm intelligence (SI) is a fundamental domain of artificial intelligence. It deals with the developmental methods that govern multiagent mechanisms through a systemic architecture and is influenced by the behaviour of social insects such as ants, wasps, bees, and termites. It is also inspired by other social animal colonies, such as bird flocks and fish schools. The term SI was first defined by Beni and Wang in their research on cellular robotic systems. Researchers have studied social insect communities for decades, but for a long time the structure of their collective behaviour was not established. Moreover, each autonomous agent of the society is regarded as an unsophisticated individual, yet the society as a whole can deal with complicated issues. Complex tasks are accomplished effectively through cooperation among the individual members of the society, which strengthens its capacity to perform actions. In the field of optimization, different swarm intelligence techniques are used.
Particle swarm optimization (PSO), suggested by Kennedy and Eberhart in 1995, is considered one of the most efficient population-based stochastic algorithms for global optimization problems. Owing to its simplicity and effectiveness, it has become one of the most successful techniques for solving optimization problems across diverse engineering domains. PSO maintains a population of candidate solutions, known as the swarm, which explores new regions of the search space in a manner that imitates a flock of birds seeking food. The individuals, known as particles, communicate information with one another, and every individual benefits from the findings of the rest of the swarm. Each individual follows two essential rules while searching: return to its own previous best point and follow the best location found by its swarm. With the advent of PSO, new methods were also encouraged to address global optimization problems in fuzzy systems, artificial neural network (ANN) design, and evolutionary computing. ANN design and function minimization are the most promising applications of evolutionary computing for solving complex optimization problems. PSO and evolutionary algorithms (EAs) have been efficiently used to determine the learning parameters, weight factors, and design of artificial neural networks [9, 10].
In the field of swarm evolutionary computing, the performance of PSO and other EAs is affected by the generation of random numbers during the initialization of the population in the multidimensional search space. PSO tends to achieve its best performance in low-dimensional search spaces. Performance is therefore expected to drop when the dimensionality of the problem is too high, which causes the particles to become stuck in local solutions [1, 11, 12]. The persistence of this behaviour becomes intolerable for a variety of real-life applications that contain many local and global minima. One explanation for such premature performance is an inadequate distribution of the swarm population: if the particles do not properly cover the entire search space, optimum solutions are more difficult to find and the global optimum may be missed [13–15]. This issue can be resolved by introducing a well-organized random distribution to initialize the swarm. These distributions vary in structural design depending upon the family; examples include pseudorandom sequences, probability sequences, and quasirandom sequences.
One of the classical ways of generating random numbers is an inbuilt library (implemented in most programming languages, e.g., C or C++). The numbers are allocated uniformly by this inbuilt library. Research has shown that this technique is not effective for the uniform generation of random numbers and does not attain the lowest discrepancy. Also, pseudorandom sequences of normal distributions reported better results than randomly distributed sequences. The output of probability sequences, quasirandom sequences, and pseudorandom sequences varies with the design of the problem. Due to variance in the generation of random numbers, pseudorandom sequences are better than quasirandom sequences for globally optimal solutions.
At this point, after a brief analysis of genetic algorithms, evolutionary algorithms, and PSO, we can infer that an insufficient amount of research has been performed on pseudorandom sequences for population initialization. We have therefore proposed a novel pseudorandom initialization strategy, the WELL (Well Equi-distributed Long-period Linear) generator, to initialize the particles in the search space. We have compared the novel technique with the basic random distribution and with low-discrepancy sequence families, such as the Sobol and Halton sequences, on several complex unimodal and multimodal benchmark functions. The experimental findings show that PSO with WELL-based initialization (WE-PSO) outperforms the traditional PSO, PSO with Sobol-based initialization (SO-PSO), and PSO with Halton-based initialization (H-PSO). Moreover, we have conducted ANN training on real-world classification problems with quasirandom sequences. To compare the classifier’s output, nine datasets were taken from the well-known UCI repository. The results demonstrate that WE-PSO offers better results on real-world classification problems than PSO, SO-PSO, and H-PSO.
The remainder of the paper is structured as follows: related work is discussed in Section 2. The standard PSO is presented in Section 3, and a general overview of artificial neural network training is given in Section 4. Sections 5 to 8 describe the random number generator and the Sobol, Halton, and WELL sequences, followed by the proposed technique. The findings are explained in Section 10, and the data classification experiments are reported in Section 11. Finally, the discussion, conclusion, and future work are presented.
2. Related Work
2.1. Modified Initialization Approaches
Researchers have adopted various random number generators, i.e., pseudorandom, quasirandom, and probability sequences, to refine the efficiency of population-based evolutionary algorithms. The concept of using a random number generator to initialize a swarm in a multidimensional search space is not new. A comparison of low-discrepancy sequences with the simple uniform distribution was carried out by the authors in  to assign the initial positions to particles in the search region. The study in  covers only benchmark minimization functions to verify the performance of different low-discrepancy sequence versions. Similarly, Kimura and Matsumura  optimized a genetic algorithm using the improved PSO variant to initialize the swarm based on the Halton sequence, which belongs to the family of low-discrepancy sequences. The authors of  carried out a comprehensive comparison of the Faure, Sobol, and Halton sequences and, after evaluating the competitive outcomes, declared the Sobol sequence the winner.
The van der Corput sequence, which belongs to the quasirandom family, was first applied in . The van der Corput sequences were generated for the initial parameters d = 1 and b = 2, where d represents the problem dimension and b is the base. The experimental results showed that, for difficult multidimensional optimization problems, the van der Corput sequence-based PSO outperforms PSO based on other quasirandom sequences, such as the Faure, Sobol, and Halton sequences. However, Halton-based PSO and Faure-based PSO gave better performance when the optimization problem was low in dimensionality. Moreover, many researchers have used probability distributions to tune the different parameters of evolutionary algorithms. The family of probability sequences includes the Gaussian, Cauchy, beta, and exponential distributions. The authors in  tuned the PSO parameters using random sequences that follow an exponential distribution. A detailed comparison of probability distributions is also presented in . The experimental results revealed that the PSO based on the exponential distribution performed well compared to the PSO based on the Gaussian distribution and the PSO based on the beta distribution.
Similarly, the researchers applied a torus distribution  to initialize the improved Bat algorithm (I-BA). Torus-based initialization enhanced the diversity of swarm and showed better performance. In , the readers can find the source for applying several variations of probabilistic, quasirandom, and the uniform distribution in BA.
Apart from the probability, pseudorandom, and quasirandom distributions, there are also other independent statistical methods for producing random numbers, which various researchers have used to select the initial locations of particles in a multidimensional search space. The nonlinear simplex method (NSM) is an initialization method proposed by Parsopoulos and Vrahatis in . Initialization based on centroidal Voronoi tessellations (CVTs) was suggested by Richards and Ventura in . In the CVT process, the search region is divided into several blocks. In the first division of blocks, each particle gets a spot. The remaining particles, which have not yet been allocated a block, are further separated into subblocks. The CVT generator uses different permutations to allocate a block to a particle each time. A distance function is used to disperse the particles into blocks, and the less distant particles are the first to reserve a block in the swarm. The CVT-based initialization approach was compared with the simple random distribution, and the numerical results illustrated that PSO based on CVT was much better for population initialization.
A technique called opposition-based initialization (O-PSO), inspired by opposition-based learning, was suggested by the authors in . Certain particles took their positions in the opposite direction of the search space, which increased the probability of finding the global optimum at the beginning. By exploring the search field in the opposite direction in parallel with the original direction, O-PSO enhanced the diversity of the particles. The underlying idea is that both good and poor behaviour are experienced in the human world, and an entity cannot be entirely good and entirely bad at the same time. O-PSO follows this natural phenomenon to choose the initial positions of the particles in the opposite direction as well as in the same direction, so that the entire swarm is represented by the same and opposite particles. The experimental results revealed that O-PSO performed better on several multidimensional complex benchmark functions than the simple PSO that uses the uniform distribution to initialize the particles. Gutiérrez et al.  studied three distinct PSO initialization methods: opposition-based initialization, orthogonal array initialization, and chaotic initialization.
2.2. Artificial Neural Network Training Using PSO
Processing real-world problems with the ANN classifier under various initialization strategies has a strong effect on the performance of evolutionary algorithms. Classifiers with the prearranged initialization techniques were shown to be more precise than those using the random distribution.
In [4, 5], optimization of the hidden layer in the neural network was performed. For the optimization process, the authors used uniform distribution-based initialization of feedforward neural networks. Subasi in  classified EMG signals using the uniform random distribution-based PSO along with SVM to diagnose neuromuscular disorders. Similarly, the improved swarm optimized functional link artificial neural network (ISO-FLANN) was proposed by Dehuri in  using random number initialization following the uniform distribution. The Optimal Latin Hypercube Design (OLHD) initialization approach was proposed by the authors in  and evaluated on several data mining problems against other quasirandom sequences, such as the Faure, Halton, and Sobol sequences. The proposed OLHD was better than the quasirandom sequences in terms of efficiency measures.
In , the authors introduced the training of NN with particle swarm optimization (NN-PSO) for anticipating structural failure in reinforced concrete (RC) buildings. The weight vectors of the NN were calculated by PSO on the basis of the minimum root mean square error. The NN-PSO classifier was sufficient to handle structural failure in RC buildings. Xue et al.  presented a new strategy for the feedforward neural network (FNN) classifier, in which a self-adaptive parameter and strategy-based PSO (SPS-PSO) was integrated to reduce the dimensions of large-scale optimization problems. A PSO-based algorithm termed psoCNN was proposed in ; it can automatically determine the most appropriate architecture of deep convolutional neural networks (CNNs) for image classification. A novel NN training algorithm incorporating PSO, called LPSONS, was proposed in . In the LPSONS algorithm, the velocity parameter of PSO was embedded with the Mantegna Levy flight distribution for improved diversity, and the algorithm was used to train feedforward multilayer perceptron ANNs. In , PSO was used for feature engineering of diabetic retinopathy, after which the NN classifier was applied to classify diabetic retinopathy disease.
After conducting a thorough literature review, we can infer that the particle efficiency and convergence velocity are highly dependent on the swarm initialization process. If all the particles with a proper pattern cover the entire search space, there are more chances that the global optimum will be found at an early stage of PSO.
3. Particle Swarm Optimization
PSO is a global optimization technique that plays an important role in applied technology and has been widely deployed in numerous engineering applications, such as the planning of heating systems, data mining, power allocation in cooperative communication networks, pattern recognition, machine learning, route selection optimization, and information security, to name a few. PSO works on a population of candidates. Each candidate, designated as a particle, represents a potential solution of the problem to be optimized. The current location of a particle in the n-dimensional search space is represented by the solution vector x, and each solution is translated into a fitness score carried by the particle. At the kth iteration, the position vector x of each particle p is recalculated in the n-dimensional search space. In addition to the position vector, each particle has a velocity vector, which defines the motion of the particle and the step size of the swarm in the search space.
PSO begins with a population of n particles that fly through the d-dimensional search space at each iteration k to look for the optimal solution. Swarm mutation can transform the objective feature into the desired candidate solution. For updating the velocity and position of the particles, the following two equations are used:

$$v_z^{k+1} = w\, v_z^{k} + c_1 r_1 \left(p_z^{best} - x_z^{k}\right) + c_2 r_2 \left(g^{best} - x_z^{k}\right) \tag{1}$$

$$x_z^{k+1} = x_z^{k} + v_z^{k+1} \tag{2}$$

In the above equations, the velocity and position vectors are $v_z$ and $x_z$, respectively. $p_z^{best}$ is the local best solution of the particle, acquired using its own previous experience, and $g^{best}$ reflects the global best solution acquired using the d-dimensional experience of its neighbours. $c_1$ and $c_2$ are the acceleration factors that influence the acceleration weights, $w$ is the inertia weight, and $r_1$ and $r_2$ are two random numbers produced by the random number generator. $x_z^{k+1}$ is the updated position vector that guides the current particle to its new point at the kth iteration, where $v_z^{k+1}$ is the newly updated velocity. Three different factors can be derived from equation (1). The “momentum factor” $w\, v_z^{k}$ represents the old velocity. The “cognitive factor” $c_1 r_1 (p_z^{best} - x_z^{k})$ gives the local best fitness taken from all the previous fitness values. The “social factor” $c_2 r_2 (g^{best} - x_z^{k})$ provides the best global solution amplified by the neighbouring particles. The pseudocode of the fundamental PSO is presented in Algorithm 1.
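The velocity and position updates described above can be sketched for a single particle as follows. This is a minimal illustration rather than the authors' implementation: the function name `pso_step` and the default inertia weight `w = 0.7` are assumptions made for the example, while c1 = c2 = 1.45 follows the simulation settings reported later in the paper.

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.45, c2=1.45, rng=random):
    """Return (new_x, new_v) for one particle after one velocity/position update.

    x, v, pbest and gbest are equal-length lists of floats.
    w is an assumed inertia weight; c1 and c2 are the acceleration factors.
    """
    new_v = []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()   # fresh random numbers per dimension
        momentum = w * vi                     # old velocity
        cognitive = c1 * r1 * (pi - xi)       # pull towards the particle's own best
        social = c2 * r2 * (gi - xi)          # pull towards the swarm's best
        new_v.append(momentum + cognitive + social)
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v
```

A particle that already sits on both its personal best and the global best is moved only by its momentum term, which is a quick sanity check for any implementation of the update.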
4. Training of the Neural Networks
The artificial neural network (ANN) is perceived as one of the most effective approximation techniques and is used to approximate nonlinear functions and their relationships. The ANN model is capable of generalizing, learning, organizing, and adapting data. The ANN architecture is based on an interlinked series of synchronized neurons, whereas the multiprocessing layer is used to compute the encoding of information. ANN is a computational mathematical model that regulates the relationship between the input and output layers of different nonlinear functions. In this study, we have used the feedforward neural network shown in Figure 1, which is the most frequently used and popular ANN architecture. The feedforward neural network is defined by three layers, i.e., the input layer, the hidden layer, and the output layer. The input layer serves as the gateway of the NN, where the information frame is inserted. The intermediate task of the hidden layer is to process the data frame received from the input layer. The outcomes are derived from the output layer. The units of each layer are connected to the nodes of the subsequent layer, and the links between the nodes are structured in a feedforward manner. Bias is a component of each unit and has a value of −1, as presented in .
For weight optimization of the NN, the position of each particle in the swarm represents a set of weights for the current epoch or iteration. The dimensionality of each particle is the number of weights associated with the network. The particle moves within the weight space, attempting to minimize the learning error (mean squared error (MSE) or sum of squared errors (SSE)). A change in position changes the weights of the neural network so as to reduce the error in the current epoch. There is no backpropagation concept in PSONN; instead, the feedforward NN produces the learning error (particle fitness) based on the set of weights and biases (PSO positions).
The challenge of premature convergence is addressed in the problem of weight optimization of ANN [40, 41]. The primary objective of the ANN model is to achieve a set of optimum parameters and weights. The two major classification approaches used to segregate the positive entities from the negative entities are gradient descent and error correction. Gradient descent-based techniques perform poorly when the problem is high dimensional and the parameters depend strongly on the network structure, and for this reason they tend to become stuck in local minima. Backpropagation is one of the gradient descent techniques and is the one most commonly used to train neural network models and solve complex multimodal problems in the real world, as mentioned in .
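To make the mapping from a particle position to a learning error concrete, the following sketch decodes a flat position vector into the weights of a network with one hidden layer and returns the MSE as the particle fitness. The layout of the weights inside the vector, the sigmoid activation, and the names `n_weights`, `forward`, and `mse_fitness` are assumptions made for illustration; only the bias input of −1 and the use of MSE come from the text.

```python
import math

def n_weights(n_in, n_hid, n_out):
    """Particle dimensionality: every unit has one weight per input plus one bias weight."""
    return n_hid * (n_in + 1) + n_out * (n_hid + 1)

def _sigmoid(s):
    # Numerically safe logistic function for large positive or negative s.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def forward(position, sample, n_in, n_hid, n_out):
    """Feedforward pass using the particle position as the weight vector."""
    inputs = list(sample) + [-1.0]            # bias input fixed at -1
    k, hidden = 0, []
    for _ in range(n_hid):
        w = position[k:k + n_in + 1]
        hidden.append(_sigmoid(sum(wi * a for wi, a in zip(w, inputs))))
        k += n_in + 1
    hidden.append(-1.0)                       # bias input of the output layer
    outputs = []
    for _ in range(n_out):
        w = position[k:k + n_hid + 1]
        outputs.append(_sigmoid(sum(wi * a for wi, a in zip(w, hidden))))
        k += n_hid + 1
    return outputs

def mse_fitness(position, X, Y, n_in, n_hid, n_out):
    """Mean squared error over all samples and outputs (lower is better)."""
    total, count = 0.0, 0
    for sample, target in zip(X, Y):
        pred = forward(position, sample, n_in, n_hid, n_out)
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
        count += len(target)
    return total / count
```

With this fitness function, training the network reduces to minimizing `mse_fitness` over positions of length `n_weights(n_in, n_hid, n_out)`, with no gradient computation involved.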
5. Random Number Generator
A built-in library function, Rand(x_min, x_max), is used to construct the mesh of numbers randomly at uniform locations in . The probability density function of a continuous uniform distribution describes the effect of uniformity on any sequence. The probability density function can be characterized as given in the following equation:

$$f(t) = \begin{cases} \dfrac{1}{q - p}, & p < t < q, \\ 0, & t < p \ \text{or} \ t > q, \end{cases}$$

where p and q represent the maximum likelihood parameters. Because it has zero impact on the integrals of f(t) dt over any length, the value of f(t) at the boundaries p and q is immaterial. The maximum likelihood parameters are determined from the estimated likelihood function, which for n observations lying in [p, q] takes the standard form

$$\ell(p, q \mid t) = -n \log (q - p).$$
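As a point of reference for the other initializers, a uniform initialization of the swarm can be sketched as below. The function name `uniform_init` and its parameters are hypothetical; Python's standard `random.Random.uniform` plays the role of the built-in Rand(x_min, x_max) call mentioned in the text.

```python
import random

def uniform_init(n_particles, dim, x_min, x_max, seed=None):
    """Draw every coordinate of every particle independently from U(x_min, x_max)."""
    rng = random.Random(seed)
    return [[rng.uniform(x_min, x_max) for _ in range(dim)]
            for _ in range(n_particles)]

# Example: a swarm of 40 particles in a 10-dimensional space bounded by [-100, 100].
swarm = uniform_init(40, 10, -100.0, 100.0, seed=1)
```

Fixing the seed makes runs reproducible, which matters when several initializers are compared over repeated runs.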
6. The Sobol Sequence
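The Sobol distribution was used for the reconstruction of coordinates in . For each dimension $d_z$, the coordinates are produced by a linear recurrence over binary digits. In the standard formulation, a nonnegative instance index i is first written through its binary expansion:

$$i = i_1 2^{0} + i_2 2^{1} + i_3 2^{2} + \cdots, \qquad i_k \in \{0, 1\}.$$

For dimension $d_z$, the instance i can then be generated using

$$x_i^{d_z} = i_1 v_1^{d_z} \oplus i_2 v_2^{d_z} \oplus i_3 v_3^{d_z} \oplus \cdots,$$

where $\oplus$ is the bitwise exclusive-or. $v_k^{d_z}$ denotes the kth direction number, a binary fraction, of an instance at the dimension $d_z$, and can be computed using

$$v_k^{d_z} = c_1 v_{k-1}^{d_z} \oplus c_2 v_{k-2}^{d_z} \oplus \cdots \oplus c_{z-1} v_{k-z+1}^{d_z} \oplus v_{k-z}^{d_z} \oplus \frac{v_{k-z}^{d_z}}{2^{z}},$$

where the $c_j$ are the coefficients of the primitive polynomial of degree z associated with the dimension, and k > z.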
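The Sobol construction can be illustrated with a small two-dimensional generator. This is a sketch under stated assumptions: it uses the textbook integer form of the direction-number recurrence (direction integers m_k scaled to binary fractions), natural index order rather than Gray-code order, the first dimension with all m_k = 1, and the second dimension built from the degree-1 polynomial x + 1 with m_1 = 1. The helper names are hypothetical, and production code would normally rely on a published table of direction numbers.

```python
BITS = 30  # resolution of the binary fractions

def direction_integers(degree, coeffs, m_init, bits=BITS):
    """Extend m_1..m_degree with the recurrence and scale to 'bits'-bit fractions.

    coeffs holds c_1..c_{degree-1} of the primitive polynomial,
    m_init holds the initial odd integers m_1..m_degree.
    """
    m = list(m_init)
    for k in range(degree, bits):
        term = m[k - degree] ^ (m[k - degree] << degree)
        for j in range(1, degree):
            if coeffs[j - 1]:
                term ^= m[k - j] << j
        m.append(term)
    # v_k = m_k / 2^k, stored as the integer m_k * 2^(bits - k)
    return [m[k] << (bits - 1 - k) for k in range(bits)]

def sobol_coordinate(i, dirs):
    """XOR together the direction numbers selected by the binary digits of i."""
    x, k = 0, 0
    while i:
        if i & 1:
            x ^= dirs[k]
        i >>= 1
        k += 1
    return x

def sobol_points_2d(n, bits=BITS):
    """First n points (indices 0..n-1) of a two-dimensional Sobol-type sequence."""
    dim1 = [1 << (bits - 1 - k) for k in range(bits)]   # all m_k = 1
    dim2 = direction_integers(1, [], [1], bits)         # polynomial x + 1
    scale = float(1 << bits)
    return [(sobol_coordinate(i, dim1) / scale,
             sobol_coordinate(i, dim2) / scale) for i in range(n)]
```

To initialize a swarm, each coordinate u in [0, 1) would be mapped to x_min + (x_max − x_min) · u; higher dimensions need further primitive polynomials and initial values.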
7. The Halton Sequence
In , the authors proposed the Halton sequence as an improved variant of the van der Corput sequence. For generating random points, Halton sequences use coprime bases, one for each dimension. Algorithm 2 shows the pseudocode for generating the Halton sequences.
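A Halton point is obtained by applying the van der Corput radical inverse with a different prime base in every dimension. The sketch below shows this construction; the function names and the choice to start at index 1 (skipping the all-zero point) are assumptions for the example, not details taken from Algorithm 2.

```python
def radical_inverse(index, base):
    """van der Corput radical inverse: mirror the base-b digits of index about the point."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * f
        index //= base
        f /= base
    return result

PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

def halton_init(n_particles, dim, x_min, x_max):
    """Place n_particles in [x_min, x_max]^dim using one prime base per dimension."""
    if dim > len(PRIMES):
        raise ValueError("extend PRIMES to cover the requested dimension")
    span = x_max - x_min
    return [[x_min + span * radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n_particles + 1)]
```

For base 2 the first values are 0.5, 0.25, 0.75, which shows how each new point falls into the largest remaining gap.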
8. The WELL Sequence
Panneton et al.  suggested the Well Equi-distributed Long-period Linear (WELL) sequence. It was designed as a modified variant of the Mersenne Twister algorithm. The WELL distribution algorithm is given in Algorithm 3.
The algorithm mentioned above describes the general recurrence for the WELL distribution. The algorithm definition is as follows: x and r are two integers with r > 0 and 0 < x < k, where k = r · w − x and w is the weight factor of the distribution. A0 to A7 denote binary matrices of size w × w that operate on the r blocks of w bits. m_x describes the bitmask that holds the first w − x bits. t0 to t7 are temporary vector variables.
Figures 2–5 show the random points of the uniform, Sobol, Halton, and WELL distributions as bubble plots, in which the y-axis represents the random values and the x-axis represents the index of the corresponding point.
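For illustration, the sketch below transcribes the compact WELL512a routine that circulates widely (16 state words of 32 bits) into Python and uses it to place particles. It is not necessarily the WELL variant used in the paper, and the shift amounts and the mask constant are reproduced from the commonly published routine, so they should be checked against the reference code of Panneton et al. before serious use. The class name, the seeding through Python's `random` module, and `well_init` are assumptions made for the example.

```python
import random

class Well512a:
    """Sketch of the WELL512a recurrence (r = 16 words, w = 32 bits)."""
    MASK = 0xFFFFFFFF

    def __init__(self, seed=12345):
        seeder = random.Random(seed)          # assumption: any non-zero 512-bit seed works
        self.state = [seeder.getrandbits(32) for _ in range(16)]
        self.index = 0

    def next_u32(self):
        s, i, m = self.state, self.index, self.MASK
        a = s[i]
        c = s[(i + 13) & 15]
        b = (a ^ c ^ (a << 16) ^ (c << 15)) & m
        c = s[(i + 9) & 15]
        c ^= c >> 11
        a = s[i] = b ^ c
        d = a ^ ((a << 5) & 0xDA442D24)
        i = (i + 15) & 15
        a = s[i]
        s[i] = (a ^ b ^ d ^ (a << 2) ^ (b << 18) ^ (c << 28)) & m
        self.index = i
        return s[i]

    def random(self):
        """Uniform float in [0, 1)."""
        return self.next_u32() / 4294967296.0

def well_init(n_particles, dim, x_min, x_max, seed=12345):
    """Initialize a swarm with WELL-generated coordinates scaled to [x_min, x_max]."""
    gen = Well512a(seed)
    span = x_max - x_min
    return [[x_min + span * gen.random() for _ in range(dim)]
            for _ in range(n_particles)]
```

The generator is only consulted once per coordinate at start-up, so its cost is negligible compared with the fitness evaluations of the subsequent PSO iterations.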
The objective of this paper is to examine the quality of one of the proposed pseudorandom sequences. Pseudorandom sequences are much more random than quasirandom sequences. PSO is random in nature, so it does not have a specific pattern that guarantees the global optimum solution. Therefore, we have suggested the WELL distribution-based PSO (WE-PSO), which takes advantage of the randomness in PSO. To ensure the integrity of the proposed approach, we have compared WE-PSO with the uniform distribution-based PSO and with PSO based on other quasirandom distributions, i.e., the Sobol distribution (SO-PSO) and the Halton distribution (H-PSO). Moreover, we have tested the proposed technique on NN classifiers by training on nine real-world NN problems. The experimental outcomes reflect a remarkable improvement over the standard PSO with the uniform distribution, and the WE-PSO approach also outperforms the SO-PSO and H-PSO approaches, as is evident in the results. The numerical results show that using the WELL distribution to initialize the swarm enhances the efficiency of population-based algorithms in evolutionary computing. The pseudocode for the proposed technique is presented in Algorithm 4.
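Because the proposed technique changes only how the swarm is initialized, it can be pictured as a standard PSO loop that accepts the initial positions as an argument. The sketch below is a hedged illustration, not a reproduction of Algorithm 4: the linearly decreasing inertia weight, the clamping of positions to the bounds, the zero initial velocities, and the `sphere` test function are all assumptions, while c1 = c2 = 1.45 and the inertia range 0.9 to 0.4 follow the reported settings.

```python
import random

def sphere(x):
    """Simple unimodal test function with minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def pso(fitness, init_positions, x_min, x_max, iters=200,
        c1=1.45, c2=1.45, w_start=0.9, w_end=0.4, seed=0):
    """Minimize fitness starting from init_positions; returns (best, best_f, history)."""
    rng = random.Random(seed)
    xs = [list(p) for p in init_positions]
    dim = len(xs[0])
    vs = [[0.0] * dim for _ in xs]                     # assumed zero initial velocity
    pbest = [list(p) for p in xs]
    pbest_f = [fitness(p) for p in xs]
    g = min(range(len(xs)), key=pbest_f.__getitem__)
    gbest, gbest_f = list(pbest[g]), pbest_f[g]
    history = [gbest_f]
    for k in range(iters):
        # Assumed schedule: inertia decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * k / max(1, iters - 1)
        for i, x in enumerate(xs):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - x[d])
                            + c2 * r2 * (gbest[d] - x[d]))
                x[d] = min(x_max, max(x_min, x[d] + vs[i][d]))   # keep inside bounds
            f = fitness(x)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = list(x), f
                if f < gbest_f:
                    gbest, gbest_f = list(x), f
        history.append(gbest_f)
    return gbest, gbest_f, history
```

Passing positions produced by a uniform, Sobol, Halton, or WELL initializer to the same `pso` call, with all other settings fixed, isolates the effect of the initialization strategy on the convergence history.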
10. Results and Discussion
The WELL-PSO (WE-PSO) technique was implemented in C++ and run on a computer with a 2.3 GHz Core (M) 2 Duo CPU. To measure the performance of the WE-PSO algorithm, a group of fifteen nonlinear benchmark test functions is used to compare it with the standard PSO, SO-PSO, and H-PSO. These functions are commonly applied to investigate the performance of optimization techniques, so we used them to examine the optimization results of WE-PSO in our study. The list of functions can be found in Table 1, where D denotes the dimensionality of the problem, S represents the interval of the variables, and fmin denotes the global minimum value. For the simulation parameters, c1 = c2 = 1.45, the inertia weight w is used in the interval [0.9, 0.4], and the swarm size is 40. The function dimensions are D = 10, 20, and 30, and the maximum number of epochs is 3000. All techniques were run with the same parameters for a fair comparison. To check the performance of each technique, all algorithms were tested for 30 runs.
The purpose of this study is to observe the unique characteristics of the standard benchmark functions based on the dimensions of the experimental results. Three simulation tests were performed in the experiments, where the following WE-PSO characteristics were observed:
(i) Effect of using different initializing PSO approaches
(ii) Effect of using different dimensions for problems
(iii) A comparative analysis
The objective of this study was to find the most suitable initialization approach for the PSO and to explore WE-PSO with other approaches, such as SO-PSO, H-PSO, and standard PSO during the first experiment. The purpose of the second simulation is to define the essence of the dimension concerning the standard function optimization. Finally, the simulation results of WE-PSO were compared with the standard PSO, SO-PSO, and H-PSO, respectively. Simulation effects have been addressed in depth in the remainder of the article.
The graphical comparison of WE-PSO with PSO, H-PSO, and SO-PSO is shown in Figures 6 to 20. For the majority of the functions, WE-PSO has a better convergence curve. The x-axis shows the problem dimensions 10, 20, and 30, while the y-axis represents the mean best value for each dimension of the problem.
10.1.1. Effect of Using Different Initializing PSO Approaches
In this simulation, PSO is initialized with the WELL sequence (WE-PSO) instead of the uniform distribution. The WE-PSO variant is compared with the other initialization approaches, including the Sobol sequence (SO-PSO), the Halton sequence (H-PSO), and the standard PSO. The experimental findings indicate that WE-PSO performs better in the higher dimensions.
10.1.2. Effect of Using Different Dimensions for Problems
The core objective of this simulation setup is to examine the quality of the outcomes with respect to the dimension of the optimization functions. Three dimensions, D = 10, D = 20, and D = 30, were used for the benchmark functions in the experiments. The simulation results are presented in Table 2. From these results, it was observed that the optimization of higher-dimensional functions is more complex, as can be seen in Table 2 for the dimension sizes D = 20 and D = 30.
Note: “Mean” shows the mean value and “Std. dev” indicates the standard deviation. The best results among the four PSO algorithms are presented in bold.
10.1.3. A Comparative Analysis
WE-PSO is compared with the other approaches, namely, SO-PSO, H-PSO, and the standard PSO, where the true value of each technique on the same problem is provided for comparison purposes. Table 1 shows the standard benchmark functions and their parameter settings. Table 2 reveals that WE-PSO is better than the standard PSO, SO-PSO, and H-PSO for dimension D = 30 and is superior in convergence. The comparative analysis in Table 2 also shows that the standard PSO performs well for the smaller dimension sizes (D = 10, 20), while the proposed WE-PSO performs considerably better in convergence as the dimension size increases. Hence, WE-PSO is appropriate for higher dimensions. Simulation runs were carried out on an HP Compaq with an Intel Core i7-3200 configuration, a speed of 3.8 GHz, and 6 GB of RAM.
In contrast with the findings of SO-PSO, H-PSO, and traditional PSO, the experimental results in Table 2 reveal that WE-PSO surpasses the aforementioned variants of PSO. WE-PSO outperforms the other techniques overall, while the other approaches perform as follows: H-PSO performs better on functions F4, F1, and F2 for D = 20 but gives poor results overall in higher dimensions, and SO-PSO gives slightly better results on functions F8, F9, and F15 for D = 10 but gives the worst results in larger dimensions. Figures 7 to 15 show that WE-PSO outperforms the other approaches in the simulation results for the dimension sizes D = 10, D = 20, and D = 30 on the standard benchmark test functions.
10.1.4. Statistical Test
To objectively verify the consistency of the findings, Student’s T-test is performed. To compare the success of the competing algorithms, the T value is computed using

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}}$$

The T value can be positive or negative in the above equation, where $\bar{X}_1$ and $\bar{X}_2$ reflect the mean values of the first and second samples. The sample sizes are referred to as n1 and n2. The standard deviations of the two samples are $SD_1$ and $SD_2$. Positive and negative values indicate that WE-PSO outperforms other approaches. Student’s T-test results are presented in Table 3.
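The T value for two algorithms can be computed directly from the summary statistics of their runs. The sketch below assumes the common unequal-variance form, in which each squared standard deviation is divided by its sample size; if the original equation divides by n − 1 instead, only the denominator changes. The function name `t_value` is hypothetical.

```python
import math

def t_value(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from means, standard deviations and sample sizes."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Example with made-up summary statistics for two sets of 50 runs.
t = t_value(2.0, 1.0, 50, 1.0, 1.0, 50)
```

Swapping the two samples flips the sign of the statistic, so the order in which the algorithms are compared must be stated when the sign is interpreted.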
11. Experiments for Data Classification
A comparative analysis on the real-world benchmark dataset problem is evaluated for the training of neural networks to validate the efficiency of the WE-PSO. Using nine benchmark datasets (Iris, Diabetes, Heart, Wine, Seed, Vertebral, Blood Tissue, Horse, and Mammography) from the world-famous UCI machine-learning repository, we conducted experiments. Training weights are initialized randomly within the interval [−50, 50]. Feedforward neural network accuracy is tested in the form of root mean squared error (RMSE). The features of the datasets that are used can be seen in Table 4.
The multilayer feedforward neural network is trained with backpropagation, the standard PSO, SO-PSO, H-PSO, and WE-PSO. These training approaches are compared on real classification datasets taken from the UCI repository. k-fold cross-validation with k = 10 is used to assess the training of the neural networks with the standard PSO, SO-PSO, H-PSO, and the proposed WE-PSO. Each dataset is split into 10 chunks, and each chunk contains the same proportion of every class in the dataset. One chunk is used for the testing phase, while the remaining nine chunks are used for the training phase. The simulation results show that training neural networks with WE-PSO gives better accuracy, and its efficiency is much higher than that of the traditional approaches. WE-PSO can also be applied to data classification and statistical problems in the future. The classification accuracy results are summarized in Table 5.
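The 10-fold procedure described above, with every chunk keeping the class proportions of the full dataset, can be sketched with a simple round-robin assignment per class. The function names, the seeded shuffle, and the toy label list are assumptions for this example; the experiments may have used a different splitting routine.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class across the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


if __name__ == "__main__":
    toy_labels = ["a"] * 50 + ["b"] * 30 + ["c"] * 20  # hypothetical dataset
    for train, test in train_test_splits(toy_labels):
        print(len(train), len(test))
```

Each of the ten iterations trains on nine folds and tests on the remaining one, so every sample is used for testing exactly once.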
12. Conclusion
The performance of PSO depends on the initialization of the population. In our work, we initialized the particles of PSO using a novel quasirandom sequence called the WELL sequence, while the velocity and position vectors of the particles are still updated in a random fashion. This study highlights the importance of initializing the particles with a quasirandom sequence. The experimental results indicate that the WELL sequence is well suited to population initialization because of its random nature. Moreover, the simulation results show that WE-PSO outperforms the standard PSO, SO-PSO, and H-PSO approaches. The techniques were also applied to neural network training, where WE-PSO provides significantly better results than conventional training algorithms, including the standard PSO, SO-PSO, and H-PSO. The solution provides higher diversity and increases the potential to search locally. The experimental results show that our approach converges accurately and avoids local optima. Our technique performs much better than the traditional PSO and the other initialization approaches for PSO, as is evident in Figure 21. In the future, the use of mutation operators together with the initialization technique may be evaluated on large-scale search spaces. The core idea of this research is general and also applies to other stochastic metaheuristic algorithms, which will guide our future work.
The data used to support the findings of this study are available from the corresponding author upon reasonable request.
This work is part of the PhD thesis of the student.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Copyright © 2021 Waqas Haider Bangyal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.