Abstract

Quantum-behaved particle swarm optimization (QPSO) is an improved version of particle swarm optimization (PSO) and has shown superior performance on many optimization problems. However, it does not always meet the demands of current applications. Problems are becoming larger and more complex, and most serial optimization algorithms either cannot solve them or require excessive computing cost. Fortunately, as an effective model for big-data problems that demand heavy computation, MapReduce has been widely used in many areas. In this paper, we implement QPSO on the MapReduce model and propose MapReduce quantum-behaved particle swarm optimization (MRQPSO), which achieves a parallel and distributed QPSO. Comparisons are made between MRQPSO and QPSO on several test problems and nonlinear equation systems. The results show that MRQPSO completes the same computing task in less time. Meanwhile, from the view of optimization performance, MRQPSO outperforms QPSO in many cases.

1. Introduction

With the development of information science, more and more data is stored, such as web content and bioinformatics data. For this reason, many basic problems have become more and more complex, which creates great difficulties for current intelligent algorithms. As one of the most important issues in artificial intelligence, optimization problems in real-world applications are also becoming harder and harder to solve.

In the past 30 years, evolutionary algorithms (EAs) have become one of the most effective intelligent optimization methods. In order to face the new challenges, distributed evolutionary algorithms (dEAs) have blossomed rapidly. The paper [1] provides a comprehensive survey of distributed EAs and summarizes several models: master-slave, island, cellular, hierarchical, pool, coevolution, and multiagent models are listed and introduced, and the different models are analyzed in terms of parallelism level, communication cost, scalability, and fault tolerance. Some hotspots of dEAs, such as cloud- and MapReduce-based implementations and GPU- and CUDA-based implementations, are also listed, but no results of dEAs on distributed computing devices are reported. Cloud computing can be applied in many areas, and [2–8] have realized various specific applications of the cloud. The paper [9] gives a review of parallel and distributed genetic algorithms on graphics processing units (GPUs), and some works along this line are reported in [10–12]. MapReduce, proposed by Google in 2004 [13], is a new and effective technology for dealing with big data. In response to the requirements of parallelization and distribution, this platform makes it very convenient to parallelize an algorithm: programmers only need to consider the map function and the reduce function, and the other details are provided by the model itself. Many practical problems have been solved with the MapReduce model and clusters of servers, such as the path problem in large-scale networks [14], seismic signal analysis [15], image segmentation [16], and location recommendation [17]. But the study of MapReduce-based EAs is still at an initial stage. Although some genetic algorithms [18–23] and a particle swarm optimization realized with MapReduce [24] have been proposed, there are still many kinds of EAs that have not been implemented on distributed models, and the parallel potential of these algorithms remains unreleased. Based on these considerations, in our previous work [25], MapReduce was combined with coevolutionary particle swarm optimization, which showed that MapReduce-based CPSO obtains much better performance than CPSO. In another work [26], quantum-behaved particle swarm optimization was transplanted onto MapReduce successfully. The idea of this paper is based on and extends that work: the background is introduced and a practical application is added.

Quantum mechanics and trajectory analysis have recently gained extensive attention from scholars and have sparkled in many areas, such as image segmentation [27], neural networks [28], and population-based algorithms [29, 30]. In [31], Zhang presents a systematic review of quantum-inspired evolutionary algorithms. Quantum-behaved particle swarm optimization is a variant of PSO proposed by Sun et al. in 2004 [32]. Inspired by the movement of a particle in quantum space, a new reproduction operator for solutions is proposed in this algorithm. Because a particle can arrive at any location in quantum space with a certain probability, a new solution at any location in the feasible space can likewise be generated with a certain probability in QPSO. This mechanism helps particles avoid falling into local optima; some more detailed analysis has been reported in [33]. Unfortunately, when the algorithm faces large-scale and complex problems, the increasing computational cost becomes its bottleneck, and without enough computing resources premature convergence cannot be avoided, which urges the original algorithm to be parallelized.

In order to follow this trend and enhance the capabilities of the standard QPSO, the MapReduce quantum-behaved particle swarm optimization is developed. MRQPSO transplants QPSO onto the MapReduce model and makes QPSO parallel and distributed by partitioning the search space. Through comparisons of MRQPSO and the standard QPSO, it can be found that the proposed MRQPSO decreases the time needed for the same number of function evaluations. Moreover, on some test problems MRQPSO improves the quality of solutions and is more robust than QPSO.

The rest of this paper is organized as follows. Section 2 introduces PSO and QPSO. Section 3 gives a brief presentation of the MapReduce model. Section 4 describes the details of implementing QPSO on MapReduce. In Section 5, we present and analyze the experimental results, including the comparison with QPSO. Finally, Section 6 concludes the paper.

2. PSO and QPSO

2.1. The Particle Swarm Optimization Algorithm

Inspired by bird and fish flocks, Kennedy and Eberhart proposed the PSO algorithm in 1995 [34]. This algorithm is a population-based intelligent search algorithm. In order to find food as quickly as possible, the birds in a flock first trace the companions that are nearest to the food and then determine the precise area of the food. An individual in PSO searches for the optimum like a bird in a flock. Each particle has a velocity and a position, and these are updated according to the particle's own best value and the global best value of the swarm. The velocity and position of particle $i$ at the $j$th dimension are denoted by $v_{ij}$ and $x_{ij}$, respectively. The updating equations can be described as

$$v_{ij}(t+1) = \omega v_{ij}(t) + c_1 r_1 \left( pbest_{ij} - x_{ij}(t) \right) + c_2 r_2 \left( gbest_{j} - x_{ij}(t) \right),$$
$$x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1), \quad (1)$$

where $v_{ij}$ and $x_{ij}$ are the velocity and position, $t$ represents the $t$th iteration, $pbest_{ij}$ and $gbest_{j}$ are the particle's best value and the global best value, respectively, and $r_1$ and $r_2$ are random numbers uniformly distributed in $(0,1)$. $\omega$, $c_1$, and $c_2$ are the three parameters of the algorithm: $\omega$ is the inertia weight proposed by Shi and Eberhart in 1998 [35] to control the balance between local and global search, and $c_1$ and $c_2$ are the acceleration coefficients or learning factors. Usually, $c_1 = c_2 = 2$.
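To make the update concrete, the following minimal Java sketch applies (1) to one particle; the class and variable names are illustrative and not taken from the authors' implementation.

import java.util.Random;

/** Minimal sketch of the PSO update in (1); all names are illustrative. */
final class PsoUpdate {
    static final Random RAND = new Random();

    static void step(double[] x, double[] v, double[] pbest, double[] gbest,
                     double w, double c1, double c2) {
        for (int j = 0; j < x.length; j++) {
            double r1 = RAND.nextDouble(), r2 = RAND.nextDouble();
            v[j] = w * v[j]
                 + c1 * r1 * (pbest[j] - x[j])   // cognitive pull toward own best
                 + c2 * r2 * (gbest[j] - x[j]);  // social pull toward global best
            x[j] += v[j];                        // position update
        }
    }
}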

From the above equations, it can be found that few parameters are used in PSO, which makes PSO easy to control and use. Meanwhile, it has good convergence performance and a fast convergence speed. These advantages have brought the PSO algorithm a lot of research attention. However, PSO is not a global optimization algorithm [36]: the limited velocity constrains the search to a limited area, so PSO cannot always find the global optimum. In other words, premature convergence is the most serious drawback of PSO.

2.2. The Quantum-Behaved Particle Swarm Optimization Algorithm

To overcome this shortcoming of the original PSO algorithm, Sun et al. proposed quantum-behaved particle swarm optimization (QPSO) in 2004 [32]. This algorithm performs better than PSO. QPSO transfers the search space from classical space to quantum space, where particles can appear at any position, which enables a full search of the solution space.

According to the uncertainty principle, the velocity and position of a particle cannot be determined simultaneously. In quantum space, a probability function for the position where a particle appears can be obtained from the Schrödinger equation, and the actual position of a particle can be measured by the Monte Carlo method. Based on these ideas, in QPSO a local attractor is constructed for each particle from the particle best solution and the global best solution, as in (2):

$$p_{ij}(t) = \varphi_j \cdot pbest_{ij}(t) + (1 - \varphi_j) \cdot gbest_{j}(t), \quad (2)$$

where $p_{ij}$ is the local attractor of particle $i$ at the $j$th dimension, $\varphi_j$ is a random number uniformly distributed in $(0,1)$, $pbest_{ij}$ is the particle best solution, and $gbest_{j}$ is the current global best solution.

The position of the particle is updated by

$$x_{ij}(t+1) = p_{ij}(t) \pm \alpha \cdot \left| mbest_{j}(t) - x_{ij}(t) \right| \cdot \ln\frac{1}{u}, \quad (3)$$

where $\alpha$ is the only parameter in the algorithm, called the creativity coefficient; it is a positive real number that adjusts the balance between local and global search. The definition of $\alpha$ is given in (4):

$$\alpha = 0.5 + 0.5 \times \frac{T_{\max} - t}{T_{\max}}, \quad (4)$$

where $T_{\max}$ is the maximum number of iterations. $u$ is a random number uniformly distributed in $(0,1)$, the sign $\pm$ is taken with equal probability, and $mbest$ is the mean best position, defined as follows:

$$mbest_{j}(t) = \frac{1}{M} \sum_{i=1}^{M} pbest_{ij}(t), \quad (5)$$

where $M$ is the size of the population and $pbest_{i}$ is the personal extremum of particle $i$.

In QPSO, the first step is initializing the population randomly, which includes the position of each particle, the particle best values, and the global best value. Next, the mean best position of each dimension is calculated according to (5). Then each particle is evaluated again, and the particle best and global best solutions are updated according to the fitness values. After that, each particle is updated as in (2) and (3). When the number of iterations or the accuracy requirement is satisfied, the algorithm stops and outputs the optimum.
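As a hedged illustration of the procedure above, the Java sketch below performs one QPSO update following (2)–(5); the names are invented, and the linear decrease of $\alpha$ from 1.0 to 0.5 follows the common setting assumed for (4).

import java.util.Random;

/** Sketch of one QPSO update following (2)-(5); all names are illustrative. */
final class QpsoUpdate {
    static final Random RAND = new Random();

    /** Mean best position over all personal bests, as in (5). */
    static double[] meanBest(double[][] pbest) {
        double[] m = new double[pbest[0].length];
        for (double[] p : pbest)
            for (int j = 0; j < m.length; j++) m[j] += p[j] / pbest.length;
        return m;
    }

    /** Update one particle's position; alpha comes from (4). */
    static void step(double[] x, double[] pbest, double[] gbest,
                     double[] mbest, double alpha) {
        for (int j = 0; j < x.length; j++) {
            double phi = RAND.nextDouble();
            double p = phi * pbest[j] + (1 - phi) * gbest[j]; // local attractor, (2)
            double u = 1.0 - RAND.nextDouble();               // uniform in (0, 1]
            double delta = alpha * Math.abs(mbest[j] - x[j]) * Math.log(1.0 / u);
            x[j] = RAND.nextDouble() < 0.5 ? p + delta : p - delta; // update, (3)
        }
    }
}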

Although the QPSO algorithm is superior to PSO, it still has some disadvantages. Because the particles in QPSO fly discretely, the narrow area where the optimum lies may be missed. And when a problem requires too much computation, QPSO may spend too much time.

3. MapReduce

MapReduce [13] is a programming model proposed by Dean and Ghemawat. Inspired by the map and reduce primitives present in Lisp and many other functional languages, the model was created for processing large-scale data in parallel. The MapReduce infrastructure provides detailed implementations of communication, load balancing, fault tolerance, resource allocation, file distribution, and so forth [1]. Programmers do not need much knowledge of or experience in parallel and distributed programming; they only need to pay attention to the map and reduce functions of which the model consists, and can then parallelize an algorithm easily.

In this model, the computation takes a set of key/value pairs. The map function processes the input key/value pairs and emits new lists of key/value pairs, called intermediate key/value pairs; the two lists may be drawn from different domains. Each map invocation is independent, and parallelization is achieved in this way. After all map invocations are completed, the reduce function is called. The intermediate key/value pairs are grouped by key and passed to the reduce function, which merges and integrates them and finally emits the output key/value pairs; the intermediate and output values must be of the same type. The types of the map and reduce functions can be written as [13]

map: $(k_1, v_1) \to \mathrm{list}(k_2, v_2)$,
reduce: $(k_2, \mathrm{list}(v_2)) \to \mathrm{list}(v_2)$.
Because Google has not released its system to the public, Hadoop, developed by the Apache Lucene project, is generally used instead. This Java-based open-source platform is a clone of the MapReduce infrastructure, and we use it to design and implement our distributed computation.
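For orientation, here is a minimal mapper/reducer pair in the Hadoop new API (available in Hadoop 1.x), showing the shape of the model; the key/value types and the toy counting logic are illustrative only.

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/** Emits one (record, 1.0) pair per input line. */
class SketchMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key: byte offset of the record; value: one line of input.
        context.write(new Text(value.toString()), new DoubleWritable(1.0));
    }
}

/** Merges all values grouped under one intermediate key. */
class SketchReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        double sum = 0.0;
        for (DoubleWritable v : values) sum += v.get();
        context.write(key, new DoubleWritable(sum));
    }
}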

4. The MRQPSO Algorithm

The particle swarm optimization algorithm [34] is one of the most popular evolutionary algorithms. It has attracted much attention because of its simple concept, rapid convergence, and good solution quality. However, the algorithm suffers from some weaknesses, such as premature convergence. Focusing on this shortcoming of the original PSO, Sun et al. proposed an uncertain and globally random algorithm named quantum-behaved particle swarm optimization (QPSO) in 2004 [32]. The new algorithm moves the search into quantum space, letting a particle move to any location with a certain probability. Through this strategy, premature convergence can be alleviated to a certain degree.

Although QPSO makes satisfying progress against premature convergence, it is not prepared for the challenge of problems with complex landscapes or problems that need huge computation to be solved. Because the particles of QPSO fly discretely, they may miss the narrow area where the global optimum lies, and as a problem grows more complex, the computational cost increases. We therefore make QPSO parallel and distributed by transplanting the algorithm onto the MapReduce model, and we name this algorithm MRQPSO. The framework of MRQPSO is described in Algorithm 1, and the flowchart is shown in Figure 1.

Step 1. Divide the feasible space into several subspaces;
Step 2. Construct the Mapper, which performs QPSO on one subspace and outputs the optimal solution obtained on this subspace;
Step 3. Construct the Reducer, which selects the best of the optimal solutions emitted by the mappers for the different subspaces;
Step 4. Output the best optimal solution and its function value.

The proposed MRQPSO partitions the search space into many subspaces. For a $D$-dimensional search space, the range of each dimension is cut into $m$ parts; the space partition is then complete, and $m^D$ subspaces are obtained [25]. Then, using several servers, several mappers perform QPSO on different subspaces in parallel and independently. After all the mappers have finished their calculations, the reducer merges and integrates the intermediate values and outputs the best solution. The space partition helps the particles spread out uniformly, which ensures that every area contains particles at the initialization phase. This is effective in preventing the particles from overflying the narrow zone where the optimum may lie. And the parallel mappers help MRQPSO save time cost.
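The partition can be sketched as follows, assuming each of the $D$ dimension ranges is cut into $m$ equal parts so that $m^D$ boxes result (with $m = 2$ and $D = 11$ this yields the $2^{11}$ blocks used in Section 5); the box representation is an assumption of the sketch.

import java.util.ArrayList;
import java.util.List;

/** Cuts each dimension range into m equal parts and enumerates all m^D boxes. */
final class SpacePartition {
    /** lo/hi: per-dimension bounds; m: parts per dimension. Returns [lo, hi] per box. */
    static List<double[][]> partition(double[] lo, double[] hi, int m) {
        int d = lo.length;
        int total = (int) Math.pow(m, d);        // m^D subspaces in all
        List<double[][]> boxes = new ArrayList<>();
        for (int idx = 0; idx < total; idx++) {
            double[][] box = new double[2][d];
            int rest = idx;
            for (int j = 0; j < d; j++) {
                int part = rest % m;             // which slice along dimension j
                rest /= m;
                double step = (hi[j] - lo[j]) / m;
                box[0][j] = lo[j] + part * step;
                box[1][j] = lo[j] + (part + 1) * step;
            }
            boxes.add(box);
        }
        return boxes;
    }
}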

4.1. MRQPSO Map Function

Algorithm 2 shows the pseudocode of the map function of the proposed MRQPSO. $pbest$ is the position with the particle's best value and $gbest$ is the position of the solution with the global best value. Several subspaces are saved as records and form a data block. A mapper is invoked when a block starts a QPSO procedure. The input key/value pairs denote the messages of the data block: the key is the ID of one record and the value is the string describing the search space. The mappers then run QPSO on every block independently; once a block has been explored, another block follows immediately. Under ideal conditions, the more mappers there are, the fewer records a single mapper processes and the fuller the parallelization is. In practice, however, starting a mapper takes time. If the data is big enough, the starting time can be neglected, but in our experiments it influences the results to some extent.

After being processed by the mappers, the intermediate key/value pairs come to denote the $gbest$ and the global optimum of the current data block, as shown in Algorithm 2. The intermediate key/value pairs are then ready to be transported to the reduce phase.

function mapper (key, value)
initialize the positions of all particles
evaluate the function values of the positions, then select the pbest and gbest
// update the particles
while the termination condition is not met
    calculate the mbest and α
    for each particle
      update the pbest and gbest
      for each dimension
        update the position
      end for
    end for
    iteration = iteration + 1
end while
emit a message (ID of gbest, string of gbest and its fitness)
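A hedged Hadoop rendering of Algorithm 2 could look as follows. The record format ("lo,hi" pairs separated by semicolons), the shared output key, and the helpers runQpso and evaluate are hypothetical stand-ins; the shared key is assumed so that a single reducer can compare all subspace optima.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** One mapper call = one QPSO run on the subspace encoded in its input record. */
class QpsoMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assumed record format: "lo1,hi1;lo2,hi2;..." (one pair per dimension).
        String[] dims = value.toString().split(";");
        double[] lo = new double[dims.length], hi = new double[dims.length];
        for (int j = 0; j < dims.length; j++) {
            String[] b = dims[j].split(",");
            lo[j] = Double.parseDouble(b[0]);
            hi[j] = Double.parseDouble(b[1]);
        }
        double[] best = runQpso(lo, hi);   // hypothetical: the QPSO loop on this box
        double fit = evaluate(best);       // hypothetical: the objective function
        // A single shared key routes every subspace optimum to one reduce call.
        context.write(new Text("best"),
                      new Text(java.util.Arrays.toString(best) + "|" + fit));
    }

    private double[] runQpso(double[] lo, double[] hi) { /* QPSO loop */ return lo; }
    private double evaluate(double[] x) { /* objective */ return 0.0; }
}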
4.2. MRQPSO Reduce Function

The reduce function is in charge of merging and integrating the information emitted by the mappers. As Algorithm 3 shows, the reducer of MRQPSO selects the minimum over all subspaces. The intermediate key/value pairs produced and transported by the mappers are received by the reducer after all mappers have completed their work. At the reduce phase, the $gbest$ and corresponding fitness of every block are compared with one another, and the minimum among them is selected and finally output.

function reducer (key, value)
  combine the messages emitted from the mappers
  // get the gbest and global optimum
  for each data block
    if fitness < global best
      global best = fitness
    end if
  end for
  emit the gbest and global optimum
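A matching Hadoop sketch of Algorithm 3, under the same assumptions as the mapper sketch above (one shared key, "position|fitness" values), keeps the minimum-fitness solution.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/** Selects the minimum-fitness solution among all subspace optima. */
class QpsoReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String bestSolution = null;
        double bestFitness = Double.POSITIVE_INFINITY;
        for (Text v : values) {
            // Assumed value format from the mapper: "position|fitness".
            String[] parts = v.toString().split("\\|");
            double fitness = Double.parseDouble(parts[1]);
            if (fitness < bestFitness) {   // minimization: keep the smaller value
                bestFitness = fitness;
                bestSolution = parts[0];
            }
        }
        context.write(new Text(bestSolution), new Text(Double.toString(bestFitness)));
    }
}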

5. Experiment Result and Analysis

5.1. Performance of MRQPSO on Test Problems

To validate the proposed MRQPSO algorithm, we first selected 8 functions to evaluate its ability to solve complex problems. These scalable optimization problems were proposed in the CEC 2013 Special Session on Real-Parameter Optimization [37] and are listed in Table 1. All the test composition functions have the same search range, $[-100, 100]^D$, and all are minimization problems with a global optimum of zero. $n$ is the number of kinds of basic benchmark functions.

Some parameter settings and the experimental environment are as follows.

We compared the proposed MRQPSO with the original QPSO algorithm to test the optimization performance. Each function was run for 20 independent times, and all the results are recorded in Tables 2–5. All experiments were run for $2^{13} \times 900$ function evaluations.

(1) QPSO: this algorithm transforms the search space from classical space to quantum space. In quantum space, particles can appear at any position, which avoids premature convergence to some degree. The population size of QPSO is 10.

(2) MRQPSO: this algorithm is a QPSO implementation on the MapReduce model, which achieves a parallel and distributed QPSO. The population sizes of MRQPSO were 10, 20, and 30, respectively, denoted by $s$ in Table 2. The search space is partitioned evenly into $2^{11}$ blocks.

All experiments were run on VMware Workstation version 12.0.0 virtual machines, each with one processor and 1.0 GB of RAM. Hadoop version 1.1.2 and Java 1.7 were used in the MapReduce experiments; we used three virtual machines, while the serial algorithm used one. The CPU is a Core i7, and the programming language is Java.
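A driver of roughly this shape wires the two functions into a Hadoop 1.x job; all names are illustrative and reuse the hypothetical mapper/reducer sketches from Section 4.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MrqpsoDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "MRQPSO");        // Hadoop 1.x style constructor
        job.setJarByClass(MrqpsoDriver.class);
        job.setMapperClass(QpsoMapper.class);
        job.setReducerClass(QpsoReducer.class);
        job.setNumReduceTasks(1);                 // one reducer picks the global best
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // subspace records
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // best solution
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}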

In Table 2, the best, worst, and mean values, the standard deviation, and the running time of MRQPSO with different population sizes are listed. According to the results, as the population size becomes larger, the solutions become worse. This may be because, when the number of particles increases, the number of iterations per particle decreases for the same number of function evaluations, which does not help improve accuracy.

5.2. Comparison with QPSO on Test Problems

The results of MRQPSO are compared with the QPSO algorithm in Tables 3 and 4. The population size is set to 10, and we show two columns for each item to compare the two algorithms clearly. From Table 3, MRQPSO obtains a better solution on almost all items. For F2 and F3, although the best value of QPSO is lower, MRQPSO nearly meets this value; since these two values are close, one can consider that the two algorithms are trapped in the same local optimum, and owing to the smaller number of iterations per particle, MRQPSO cannot converge to as low a point as QPSO. In general, MRQPSO performs better on the mean value and standard deviation, which suggests that MRQPSO is more capable of searching for the optimum, better at overcoming premature convergence, and more robust and steady than the original QPSO.

The notable advantage in time is presented in Table 4. From this table, MRQPSO is more effective at saving time cost, and it seems that the more time QPSO spends, the greater the advantage MRQPSO has. Normally, it takes some time to start a mapper. When a problem is so simple that the serial algorithm processes it quickly, the benefit of parallel execution may weaken, as for F1–F3. But when the search time gets longer, the mapper starting time becomes negligible, as for F4–F7, where the running time of MRQPSO is reduced to half that of QPSO.

To summarize, MRQPSO achieves better solution performance at a lower running time. The proposed MRQPSO is more suitable and effective for dealing with complex problems.

5.3. Comparisons on Nonlinear Equation Systems

Nonlinear equation systems arise widely in many areas, such as economics [38], engineering [39], chemistry [40], mechanics [41], medicine [42], and robotics [43].

Generally, a nonlinear equation system can be described as [44]

$$e_i(\mathbf{x}) = 0, \quad i = 1, 2, \ldots, m, \quad \mathbf{x} = (x_1, x_2, \ldots, x_D), \quad (7)$$

where $m$ is the number of equations, $D$ is the dimension of the variable $\mathbf{x}$, and $e_i$ is the $i$th equation in the system. Usually, at least one equation is nonlinear. If a solution makes all the equations in the system true statements, it is an optimal solution of the equation system.

In order to obtain the optimal solutions of a nonlinear equation system, an optimization problem like (8) or (9) can be constructed; the optimal solutions of (8) or (9) are the optimal solutions of the nonlinear equation system (7):

$$\min f(\mathbf{x}) = \sum_{i=1}^{m} e_i^2(\mathbf{x}) \quad (8)$$

or

$$\min f(\mathbf{x}) = \sum_{i=1}^{m} \left| e_i(\mathbf{x}) \right|. \quad (9)$$

In this paper, an optimization problem like (8) is used to deal with each nonlinear equation system, and MRQPSO is compared with QPSO on Fun 1–Fun 3. The details of these three problems are listed as follows; a code sketch of transformation (8) follows the list.

Fun 1: where is 20. The feasible area is . Both equations are nonlinear, and 2 theoretical optimal solutions exist.

Fun 2: where is 6. The feasible area is . All six equations are nonlinear, and infinitely many theoretical optimal solutions exist.

Fun 3: where is 20. The feasible area is . In this system, one equation is linear and the other nineteen equations are nonlinear. Infinitely many theoretical optimal solutions exist.
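As a concrete instance of transformation (8), the sketch below builds the sum-of-squares objective for a toy two-equation system (invented for illustration, not one of Fun 1–Fun 3); MRQPSO then minimizes $f$ exactly as it does any benchmark function.

/** Sum-of-squares objective (8) for a toy 2-equation system (illustrative only). */
final class NesObjective {
    // Example system: e1(x) = x1^2 + x2^2 - 1 = 0 and e2(x) = x1 - x2 = 0.
    static double[] residuals(double[] x) {
        return new double[] { x[0] * x[0] + x[1] * x[1] - 1.0, x[0] - x[1] };
    }

    /** f(x) = sum_i e_i(x)^2; f(x*) = 0 exactly at solutions of the system. */
    static double f(double[] x) {
        double sum = 0.0;
        for (double e : residuals(x)) sum += e * e;
        return sum;
    }
}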

Some parameters and the environment used in solving the nonlinear equation systems are listed as follows. Each algorithm was performed 20 times on each problem independently. All experiments were run for $2^{10} \times 1000$ function evaluations. The population size is 10, and the search space of MRQPSO is partitioned into $2^{10}$ blocks. The results of MRQPSO and QPSO are compared and reported in Table 5.

Two aspects are considered in the comparisons: one is the running time of the two algorithms, and the other is the minimized objective function value obtained. The results are reported in Table 5, and the better results are marked in boldface.

From Table 5, it can be found that both algorithms perform well on the objective function value. On Fun 1 and Fun 2, MRQPSO has a slight advantage over QPSO on the mean and max values; on Fun 3 and on the min value, QPSO has the advantage over MRQPSO. It may seem that QPSO obtains a much better result on the min value of Fun 1, but in fact the solutions obtained by both MRQPSO and QPSO are very close to the theoretical optimal solutions. In MRQPSO, the computing resources are assigned to different areas, so during the later stage of the search MRQPSO cannot devote as many computing resources as QPSO to improving accuracy. This may be the reason for the worse performance of MRQPSO on the min value.

From the view of time cost, however, it is clear that MRQPSO outperforms QPSO in all cases, and the advantage is significant. Because three virtual devices evaluate solutions in the feasible space at the same time, the computing task is completed in less time.

6. Conclusion

This paper developed the MRQPSO algorithm, which implements the serial QPSO on the MapReduce model and thereby parallelizes and distributes QPSO. The proposed method was applied to composition benchmark functions and nonlinear equation systems and obtained satisfactory solutions. Moreover, the comparisons between MRQPSO and QPSO showed that the parallel algorithm outperforms the serial one in both solution quality and time cost. MRQPSO can be considered a suitable algorithm for solving large-scale and complex problems. In order to solve more complex practical problems, a cluster with more servers needs to be constructed and used to test the performance of MRQPSO, which will be a subject of further work.


Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272279, 61272282, 61371201, and 61203303), the Program for New Century Excellent Talents in University (no. NCET-12-0920), the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45), the National Basic Research Program (973 Program) of China (no. 2013CB329402), the Program for Cheung Kong Scholars and Innovative Research Team in University (no. IRT_15R53), and the Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project) (no. B07048).