Computational Intelligence and Neuroscience

Volume 2015, Article ID 369298, 20 pages

http://dx.doi.org/10.1155/2015/369298

## Designing Artificial Neural Networks Using Particle Swarm Optimization Algorithms

^{1}Instituto en Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, 04510 Mexico City, DF, Mexico

^{2}Intelligent Systems Group, Faculty of Engineering, La Salle University, Benjamín Franklin 47, Colonia Condesa, 06140 Mexico City, DF, Mexico

Received 4 March 2015; Revised 1 June 2015; Accepted 2 June 2015

Academic Editor: Cheng-Jian Lin

Copyright © 2015 Beatriz A. Garro and Roberto A. Vázquez. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Artificial Neural Network (ANN) design is a complex task because its performance depends on the architecture, the selected transfer function, and the learning algorithm used to train the set of synaptic weights. In this paper we present a methodology that automatically designs an ANN using particle swarm optimization algorithms such as Basic Particle Swarm Optimization (PSO), Second Generation of Particle Swarm Optimization (SGPSO), and a New Model of PSO called NMPSO. The aim of these algorithms is to evolve, at the same time, the three principal components of an ANN: the set of synaptic weights, the connections or architecture, and the transfer functions for each neuron. Eight different fitness functions were proposed to evaluate the fitness of each solution and find the best design. These functions are based on the mean square error (MSE) and the classification error (CER) and implement a strategy to avoid overtraining and to reduce the number of connections in the ANN. In addition, the ANN designed with the proposed methodology is compared with those designed manually using the well-known Back-Propagation and Levenberg-Marquardt Learning Algorithms. Finally, the accuracy of the method is tested with different nonlinear pattern classification problems.

#### 1. Introduction

Artificial Neural Networks (ANNs) are systems composed of neurons organized in input, output, and hidden layers. The neurons are connected to each other by a set of synaptic weights. An ANN is a powerful tool that has been applied to a broad range of problems such as pattern recognition, forecasting, and regression. During the learning process, the ANN continuously changes its synaptic values until the acquired knowledge is sufficient (that is, until a specific number of iterations is reached or a goal error value is achieved). When the learning process or training stage has finished, it is mandatory to evaluate the generalization capabilities of the ANN using samples of the problem different from those used during the training stage. Finally, it is expected that the ANN can classify the patterns of a particular problem with acceptable accuracy during both the training and testing stages.

Several classic algorithms to train an ANN have been proposed and developed in recent years. However, many of them can become trapped in undesirable solutions; that is, they end up far from the optimum or best solution. Moreover, most of these algorithms cannot explore multimodal and noncontinuous surfaces. Therefore, other kinds of techniques, such as bioinspired algorithms (BIAs), are necessary for training an ANN.

BIAs are well accepted by the Artificial Intelligence community because they are powerful optimization tools and can solve very complex optimization problems. For a given problem, BIAs can explore large multimodal and noncontinuous search spaces and can find a solution near the optimum value. BIAs are based on behavior observed in nature and described as *swarm intelligence*. This concept is defined in [1] as a property of systems composed of unintelligent agents with limited individual capabilities but with an intelligent collective behavior.

Several works use evolutionary and bioinspired algorithms to train ANNs as another fundamental form of learning [2]. Metaheuristic methods for training neural networks are based on local search, population methods, and others such as cooperative coevolutionary models [3].

An extensive literature review of evolutionary algorithms used to evolve ANNs is given in [2]. However, most of the reported research focuses only on the evolution of the synaptic weights and parameters [4], or involves the evolution of the number of neurons in the hidden layers while the number of hidden layers is fixed beforehand by the designer. Moreover, that research does not involve the evolution of the transfer functions, which are an important element of an ANN because they determine the output of each neuron.

For example, in [5], the authors proposed a method that combines Ant Colony Optimization (ACO) to find a particular architecture (the connections) for an ANN and Particle Swarm Optimization (PSO) to adjust the synaptic weights. Other studies, such as [6], implemented a modification of PSO mixed with Simulated Annealing (SA) to obtain a set of synaptic weights and ANN thresholds. In [7], the authors use Evolutionary Programming to obtain the architecture and the set of weights with the aim of solving classification and prediction problems. Another example is [8], where Genetic Programming is used to obtain graphs that represent different topologies. In [9], the Differential Evolution (DE) algorithm was applied to design an ANN to solve a weather forecasting problem. In [10], the authors use a PSO algorithm to adjust the synaptic weights to model the daily rainfall-runoff relationship in Malaysia. In [11], the authors compare the back-propagation method with basic PSO to adjust only the synaptic weights of an ANN for solving classification problems. In [12], the set of weights is evolved using Differential Evolution and basic PSO.

In other works, such as [13], the three principal elements of an ANN are evolved at the same time: architecture, transfer functions, and synaptic weights. The authors proposed a New Model of PSO (NMPSO) algorithm, while, in [14], the authors solve the same problem by means of a Differential Evolution (DE) algorithm. Another example is [15], where the authors used an Artificial Bee Colony (ABC) algorithm to evolve the design of an ANN with two different fitness functions.

This research makes significant contributions in comparison with these last three works. First of all, eight fitness functions are proposed to deal with three common problems that emerge during the design of the ANN: accuracy, overfitting, and reduction of the ANN. To handle these problems better, the fitness functions take into account the classification error, mean square error, validation error, reduction of architectures, and a combination of them. Furthermore, this research explores the behavior of three bioinspired algorithms using different values for their parameters. During the experimentation phase, the best parameter values for these algorithms are determined to obtain the best results. In addition, the best configuration is used to generate a set of statistically valid experiments for each selected classification problem. Moreover, the results obtained with the proposed methodology in terms of the number of connections, the number of neurons, and the transfer functions selected for each ANN are presented and discussed. Another contribution of this research is a new metric that allows the results provided by an ANN generated with the proposed methodology to be compared efficiently. This metric takes into account the recognition rate obtained during the training and testing stages, with testing accuracy weighted more heavily than training accuracy. Finally, the results achieved by the three bioinspired algorithms are compared against those achieved with two classic learning algorithms. The three bioinspired algorithms were selected because NMPSO is a relatively new algorithm (proposed in 2009) based on the metaphor of the basic PSO technique, so it is important to compare its performance with others inspired by the same phenomenon.

In general, the problem to be solved can be defined as follows: given a set of input patterns $X = \{x_1, \ldots, x_p\}$, $x_i \in \mathbb{R}^n$, and a set of desired patterns $D = \{d_1, \ldots, d_p\}$, $d_i \in \mathbb{R}^m$, find the ANN represented by $W$ such that a function $F(D, X, W)$ is minimized, where $q$ defines the maximum number of neurons. It is important to remark that the search space involves three different domains (architecture, synaptic weights, and transfer functions).

This research provides a complete study of how an ANN can be automatically designed by applying bioinspired algorithms, particularly the Basic Particle Swarm Optimization (PSO), Second Generation PSO (SGPSO), and New Model of PSO (NMPSO). The proposed methodology evolves at the same time the architecture, the synaptic weights, and the kind of transfer functions in order to design the ANNs that provide the best accuracy for a particular problem. Moreover, a comparison of the performance of the Particle Swarm algorithms versus classic learning methods (back-propagation and Levenberg-Marquardt) is presented. In addition, this research presents a new way to select the maximum number of neurons (MNN). The accuracy of the proposed methodology is tested by solving some real and synthetic pattern recognition problems. In this paper, we show the results obtained with ten classification problems of different complexities.

The basic concepts concerning the three PSO algorithms and ANN are presented in Sections 2 and 3, respectively. In Section 4 the methodology and the strategy used to design the ANN automatically are described. In Section 5 the eight fitness functions used in this research are described. In Section 6, the experimental results about tuning the parameters for PSO algorithms are described. Moreover, the experimental results are outlined in Section 7. Finally, in Sections 8 and 9 the general discussion and conclusions of this research are given.

#### 2. Particle Swarm Optimization Algorithms

In this section, three different algorithms based on the PSO metaphor are described. The first one is the original PSO algorithm. Then, two algorithms that improve on the original PSO are presented: the Second Generation of PSO and a New Model of PSO.

##### 2.1. Original Particle Swarm Optimization Algorithm

The Particle Swarm Optimization (PSO) algorithm is a method for the optimization of continuous nonlinear functions proposed by Eberhart et al. [16]. This algorithm is inspired by observations of social and collective behavior in the movements of bird flocks in search of food or survival, as well as in fish schooling. In PSO, each member of the population is guided both by the movements of the best member of the population and by its own experience. The metaphor indicates that a set of solutions moves in a search space with the aim of reaching the best position or solution.

The population is considered as a cumulus of particles $i$, where each represents a position $x_i \in \mathbb{R}^D$, $i = 1, \ldots, M$, in a multidimensional space. These particles are evaluated with a particular optimization function to determine their fitness value and save the best solution. All the particles change their position in the search space according to a velocity function that takes into account the best position found by any particle in the population (i.e., the social component) as well as their own best position (i.e., the cognitive component). The particles move in each iteration to a different position until they reach an optimum position. At each time $t$, the velocity of particle $i$ is updated using

$$v_i(t+1) = \omega v_i(t) + c_1 r_1 \left(p_i(t) - x_i(t)\right) + c_2 r_2 \left(p_g(t) - x_i(t)\right), \tag{1}$$

where $\omega$ is the inertia weight, typically set to vary linearly between two bounds during the course of a run; $c_1$ and $c_2$ are acceleration coefficients; and $r_1$ and $r_2$ are uniformly distributed random numbers in $(0, 1)$. The velocity $v_i$ is limited to the range $[v_{\min}, v_{\max}]$. Updating the velocity in this way enables particle $i$ to search around its best individual position $p_i(t)$ and the best global particle position $p_g(t)$. The new position of each particle is then computed as in

$$x_i(t+1) = x_i(t) + v_i(t+1). \tag{2}$$
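The update rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `pso_step` and `minimize`, the default values of `w`, `c1`, `c2`, and `v_max`, the symmetric clamp `[-v_max, v_max]`, and the initialization range `[-5, 5]` are all assumptions chosen for the example.

```python
import random


def pso_step(positions, velocities, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, v_max=1.0, rng=random):
    """Apply one velocity and position update to every particle, in place."""
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            cognitive = c1 * r1 * (personal_best[i][d] - x[d])  # own experience
            social = c2 * r2 * (global_best[d] - x[d])          # best of the swarm
            v[d] = w * v[d] + cognitive + social
            v[d] = max(-v_max, min(v_max, v[d]))                # clamp the velocity
            x[d] += v[d]                                        # move the particle


def minimize(f, dim, n_particles=20, iters=100, seed=0):
    """Minimize f with a basic PSO loop; returns (best position, best fitness)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_fit = [f(p) for p in positions]
    g = min(range(n_particles), key=personal_fit.__getitem__)
    global_best, global_fit = personal_best[g][:], personal_fit[g]
    for _ in range(iters):
        pso_step(positions, velocities, personal_best, global_best, rng=rng)
        for i, x in enumerate(positions):
            fx = f(x)
            if fx < personal_fit[i]:
                personal_best[i], personal_fit[i] = x[:], fx
                if fx < global_fit:
                    global_best, global_fit = x[:], fx
    return global_best, global_fit
```

Calling `minimize(lambda x: sum(v * v for v in x), dim=2)` runs the loop on a simple sphere function; in the setting of this paper, `f` would instead be one of the fitness functions evaluated on a candidate ANN.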

##### 2.2. Second Generation of PSO Algorithm

The SGPSO algorithm [17] is an improvement of the original PSO algorithm that considers three aspects: the local optimum solution of each particle, the global best solution, and a new concept, the geometric center of the optimum swarm. The authors explain that the birds keep a certain distance from the swarm center (food). On the other hand, no bird accurately calculates the position of the swarm center every time. A flock always stays in the same area for a specified time, during which the swarm center is kept fixed in every bird's eyes. Afterward, the swarm moves to a new area, and all birds must then keep a certain distance from the new swarm center. This fact is the basis of the SGPSO.

The position $\bar{P}$ of the geometric centre of the optimum swarm is updated according to

$$\bar{P} = \frac{1}{M} \sum_{i=1}^{M} p_i, \quad \text{if } \mathrm{CI} \bmod T = 0, \tag{3}$$

where $M$ is the number of particles in the swarm, CI is the current iteration number, and $T$ is the updating time of the geometric centre of the optimum swarm.

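In SGPSO the velocity is updated by (4) and the position of each particle by (5):

$$v_i(t+1) = \omega v_i(t) + c_1 r_1 \left(p_i(t) - x_i(t)\right) + c_2 r_2 \left(p_g(t) - x_i(t)\right) + c_3 r_3 \left(\bar{P} - x_i(t)\right), \tag{4}$$

$$x_i(t+1) = x_i(t) + v_i(t+1), \tag{5}$$

where $c_1$, $c_2$, and $c_3$ are constants called acceleration coefficients, $r_1$, $r_2$, and $r_3$ are random numbers in the range $(0, 1)$, and $\omega$ is the velocity inertia.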
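A short Python sketch may help to see how the third attraction term enters the update. It assumes that the geometric centre is the arithmetic mean of the particles' best-known positions and that it is refreshed only when the iteration counter is a multiple of the updating time; the function names and the parameter `period` are hypothetical and chosen only for this illustration.

```python
import random


def geometric_centre(personal_best):
    """Mean of the particles' best positions (assumed form of the centre)."""
    m = len(personal_best)
    dim = len(personal_best[0])
    return [sum(p[d] for p in personal_best) / m for d in range(dim)]


def maybe_update_centre(centre, personal_best, current_iter, period):
    """Refresh the centre only every `period` iterations; otherwise keep it."""
    if current_iter % period == 0:
        return geometric_centre(personal_best)
    return centre


def sgpso_velocity(v, x, p_best, g_best, centre, w, c1, c2, c3, rng=random):
    """Velocity update with cognitive, social, and swarm-centre terms."""
    new_v = []
    for d in range(len(x)):
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        new_v.append(w * v[d]
                     + c1 * r1 * (p_best[d] - x[d])    # particle's own best
                     + c2 * r2 * (g_best[d] - x[d])    # global best
                     + c3 * r3 * (centre[d] - x[d]))   # geometric centre
    return new_v
```

The position update is the same as in basic PSO: each component of the new velocity is added to the corresponding component of the position.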

##### 2.3. New Model of Particle Swarm Optimization

This algorithm was proposed by Garro et al. [13] and is based on some ideas that other authors proposed to improve the basic PSO algorithm [4]. These ideas are described in the following paragraphs.

Shi and Eberhart [18] proposed a linearly varying inertia weight over the course of generations, which significantly improves the performance of Basic PSO. The following equation shows how to compute the inertia:

$$\omega = (\omega_1 - \omega_2)\,\frac{\mathrm{iter}_{\max} - \mathrm{iter}}{\mathrm{iter}_{\max}} + \omega_2, \tag{6}$$

where $\omega_1$ and $\omega_2$ are the initial and final values of the inertia weight, respectively, iter is the current iteration number, and $\mathrm{iter}_{\max}$ is the maximum number of allowable iterations. The empirical studies in [18] indicated that the optimal solution could be improved by varying the value of $\omega$ from 0.9 at the beginning of the evolutionary process to 0.4 at the end of the evolutionary process.
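As a small illustration, the linear schedule can be written as a one-line function. The name `linear_inertia` and its argument names are assumptions made for this sketch; the default values 0.9 and 0.4 are the ones reported in [18].

```python
def linear_inertia(iteration, max_iterations, w_start=0.9, w_end=0.4):
    """Inertia weight that decreases linearly from w_start to w_end."""
    fraction_done = iteration / max_iterations
    return w_start - (w_start - w_end) * fraction_done
```

With these defaults the weight starts at 0.9, passes through 0.65 halfway through the run, and ends at 0.4, so the swarm explores broadly at first and refines its search toward the end.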

Yu et al. [4] developed a strategy in which, when the global best position is not improving with the increasing number of generations, each particle $i$ is selected with a predefined probability from the population, and then a random perturbation is added to each velocity vector dimension $v_{id}$ of the selected particle $i$. The velocity resetting is computed as in

$$v_{id} = v_{id} + (2r - 1)\, v_{\max}, \tag{7}$$

where $r$ is a uniformly distributed random number in the range $(0, 1)$ and $v_{\max}$ is the maximum random perturbation magnitude for each selected particle dimension.
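The velocity resetting step can be sketched as follows. The sketch assumes that the perturbation added to each dimension is a uniform sample in `[-max_perturbation, max_perturbation]`, obtained by rescaling a uniform number from $(0, 1)$; the function name `reset_velocities` and its parameters are hypothetical. Detecting that the global best has stopped improving is left to the caller.

```python
import random


def reset_velocities(velocities, select_prob, max_perturbation, rng=random):
    """Perturb the velocity of randomly selected particles, in place."""
    for v in velocities:
        if rng.random() < select_prob:          # particle chosen for resetting
            for d in range(len(v)):
                # uniform perturbation in [-max_perturbation, max_perturbation]
                v[d] += (2.0 * rng.random() - 1.0) * max_perturbation
    return velocities
```

A typical use is to call this function only after a fixed number of iterations without improvement of the global best, so that a stagnated swarm is pushed away from its current region.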

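Based on some evolutionary schemes of Genetic Algorithms (GA), several effective mutation and crossover operators have been proposed for PSO. Løvberg et al. [19] proposed a crossover operator in terms of a certain crossover rate defined in

$$\mathrm{ch}(x_i) = r\,\mathrm{par}_1(x_i) + (1 - r)\,\mathrm{par}_2(x_i), \tag{8}$$

where $r$ is a uniformly distributed random number in the range $(0, 1)$, $\mathrm{ch}$ is the offspring, and $\mathrm{par}_1$ and $\mathrm{par}_2$ are the two parents randomly selected from the population.

The offspring velocity is calculated in the following equation as the sum of the two parents' velocity vectors, normalized to the original length of each parent velocity vector:

$$\mathrm{ch}(v) = \frac{\mathrm{par}_1(v) + \mathrm{par}_2(v)}{\left|\mathrm{par}_1(v) + \mathrm{par}_2(v)\right|}\,\left|\mathrm{par}_1(v)\right|. \tag{9}$$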
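The crossover of two particles can be sketched as below. The sketch makes two assumptions that the text does not fix: a single random number is drawn per crossover (rather than one per dimension), and the offspring velocity is rescaled to the length of the first parent's velocity. The function name `crossover` is hypothetical.

```python
import math
import random


def crossover(x1, x2, v1, v2, rng=random):
    """Return the position and velocity of one offspring of two parents."""
    r = rng.random()
    # Offspring position: random convex combination of the parents' positions.
    child_x = [r * a + (1.0 - r) * b for a, b in zip(x1, x2)]
    # Offspring velocity: direction of v1 + v2, length of the first parent's v.
    summed = [a + b for a, b in zip(v1, v2)]
    norm_sum = math.sqrt(sum(s * s for s in summed))
    norm_parent = math.sqrt(sum(a * a for a in v1))
    if norm_sum == 0.0:
        child_v = [0.0] * len(v1)   # parents cancel out: no preferred direction
    else:
        child_v = [s / norm_sum * norm_parent for s in summed]
    return child_x, child_v
```

A second offspring can be obtained by swapping the roles of the two parents, which yields a velocity with the length of the second parent's velocity.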

Higashi and Iba [20] proposed a Gaussian mutation operator to improve the performance of PSO in terms of a certain mutation rate defined in

$$\mathrm{ch}(x_i) = \mathrm{par}(x_i) + \frac{\mathrm{iter}_{\max} - \mathrm{iter}}{\mathrm{iter}_{\max}}\, N(0, 1), \tag{10}$$

where $\mathrm{ch}$ is the offspring, $\mathrm{par}$ is the parent randomly selected from the population, $\mathrm{iter}$ is the current iteration number, $\mathrm{iter}_{\max}$ is the maximum number of allowable iterations, and $N(0, 1)$ is a Gaussian distribution. Utilization of these operators in PSO has the potential to achieve faster convergence and find better solutions.
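A hedged sketch of the mutation operator follows. It assumes that the perturbation is a standard normal sample scaled by the fraction of iterations that remain, so that mutations shrink as the run progresses; because the normal distribution is symmetric, the sign of the perturbation term does not matter. The function name and parameters are hypothetical.

```python
import random


def gaussian_mutation(parent, iteration, max_iterations, rng=random):
    """Return a mutated copy of `parent`; the noise decays to zero over time."""
    scale = (max_iterations - iteration) / max_iterations
    return [x + scale * rng.gauss(0.0, 1.0) for x in parent]
```

At the last iteration the scale is zero and the offspring equals its parent, which matches the intuition that late in the run the swarm should refine rather than disrupt good solutions.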

Mohais et al. [6, 21] used random neighborhoods in PSO, together with dynamism operators.

In the NMPSO, the use of dynamic random neighborhoods that change in terms of certain rates is proposed. First of all, a maximum number of neighborhoods $K_{\max}$ is defined as the population size divided by 4. With this condition, each neighborhood $k$, $k = 1, \ldots, K_{\max}$, will have at least 4 members. Then, the members of each neighborhood $k$ are randomly selected, and the best particle $p_{g_k}$ of the neighborhood is computed. Finally, the velocity of each particle $i$ is updated as in

$$v_i(t+1) = \omega v_i(t) + c_1 r_1 \left(p_i(t) - x_i(t)\right) + c_2 r_2 \left(p_{g_k}(t) - x_i(t)\right) \tag{11}$$

for all $i \in k$, $k = 1, \ldots, K_{\max}$.

The NMPSO combines the varying schemes of inertia weight $\omega$ and acceleration coefficients $c_1$ and $c_2$, velocity resetting, crossover and mutation operators, and dynamic random neighbourhoods [13]. The NMPSO algorithm is described in Algorithm 1.
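The construction of the random neighborhoods can be sketched as follows. The sketch assumes that particles are shuffled and dealt out in round-robin fashion to `n_particles // 4` groups, which guarantees at least 4 members per group when the population has 4 or more particles; the function names are hypothetical, and lower fitness is taken to be better.

```python
import random


def random_neighbourhoods(n_particles, rng=random):
    """Split particle indices into n_particles // 4 random groups."""
    n_groups = max(1, n_particles // 4)
    indices = list(range(n_particles))
    rng.shuffle(indices)
    groups = [[] for _ in range(n_groups)]
    for pos, idx in enumerate(indices):
        groups[pos % n_groups].append(idx)   # round-robin keeps sizes balanced
    return groups


def neighbourhood_bests(groups, fitness):
    """Map each particle index to the index of the best particle in its group."""
    best_of = {}
    for group in groups:
        leader = min(group, key=lambda i: fitness[i])
        for i in group:
            best_of[i] = leader
    return best_of
```

In the velocity update, the best position of the leader returned for particle `i` takes the place of the global best position; calling `random_neighbourhoods` again during the run gives the dynamic regrouping.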