Special Issue: Computational Intelligence and Metaheuristic Algorithms with Applications
Towards the Novel Reasoning among Particles in PSO by the Use of RDF and SPARQL
The rapid growth of the Internet has posed new challenges, and many new programming tools have been developed to address them. Today, the semantic web is a modern paradigm for representing and accessing knowledge on the Internet. This paper applies semantic web tools, namely the Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL), to optimization. These tools are combined with particle swarm optimization (PSO), in which the selection of the best solutions depends on their fitness. Instead of the local best solution, a neighborhood of solutions is defined for each particle and used to calculate its new position, based on key ideas from the semantic web domain. Preliminary results on ten benchmark functions are promising, so the method merits further investigation.
1. Introduction
Searching for optimal solutions to the hardest real-world problems is an active field, especially in computer science. A long-standing ambition of computer scientists is to develop a general problem solver able to cope with all classes of real-world problems. Unfortunately, most of the so-called clever algorithms are subject to the No Free Lunch Theorem. According to this theorem, an algorithm that performs well on one class of problems will not necessarily perform well on other classes. Three domains of algorithms in particular have recently appeared in the role of general problem solver: Artificial Intelligence (AI), evolutionary algorithms (EA), and Swarm Intelligence (SI). While the first mimics the operation of the human brain, the latter two are inspired by nature. Evolutionary algorithms are inspired by the Darwinian principles of natural evolution, according to which the fittest individuals have the greatest chance of surviving and passing on their characteristics to their offspring during reproduction.
Nowadays, evolutionary computation (EC) covers the algorithms of the evolutionary domain and comprises genetic algorithms (GA), genetic programming, evolution strategies (ES), evolutionary programming, and differential evolution (DE) [11–13]. These algorithms differ from one another mainly in how an individual is represented. They have been applied to various optimization, modeling, and simulation problems.
However, this paper concentrates on the SI domain, which is concerned with the design of multiagent systems with applications in, for example, optimization and robotics. Inspiration for the design of these systems is taken from the collective behavior of social insects, like ants, termites, and bees, as well as from the behavior of other animal societies, like flocks of birds or schools of fish. Many algorithms from this domain exist, and the domain is still developing. The most important SI algorithms include particle swarm optimization (PSO), the firefly algorithm (FA) [15, 16], cuckoo search, and the bat algorithm (BA) [18, 19].
PSO is a population-based algorithm that mimics the movement of a swarm of particles (e.g., birds) flying across a landscape in search of food. Each particle in PSO represents a candidate solution of the problem to be solved. The position of a particle consists of the problem parameters, which are modified as the virtual particle moves through the search space. The motion depends on the current particle position and on the current positions of the local best and global best solutions. The local best solution of a particle is the best solution found so far at that particle's location in the population, while the global best is the best solution found so far in the whole population. The global best has the main impact on the direction in which the swarm moves towards the optimal solution. When it no longer improves, the population becomes stuck in a local optimum.
We focus mainly on a new definition of the neighborhood within the PSO algorithm. In place of the local best solutions, the neighborhood is defined by a predefined radius of fitness values around each candidate solution, thus capturing all candidate solutions whose fitness lies within this virtual radius. The size of the neighborhood is variable: at least one and at most three candidate solutions are permitted to form it. Although this is not the first attempt to define a variable neighborhood within the PSO algorithm [20–22], in this paper the neighborhood is defined using the Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL), tools taken from the semantic web domain. The result is the modified RDF-PSO algorithm. Both web tools are appropriate for describing and manipulating decentralized and distributed data, and the original PSO algorithm maintains a population of particles that are likewise decentralized in nature. The aim of using these web tools was to simulate a distributed population of particles, in which each particle is placed at a different location on the Internet.
The remainder of this paper is structured as follows. Section 2 gives a short description of the PSO algorithm and then introduces the semantic web tools, that is, RDF and SPARQL. Section 3 concentrates on the development of the modified RDF-PSO algorithm. Section 4 presents the experiments conducted and the results obtained with the RDF-PSO algorithms. Finally, Section 5 summarizes our work and outlines potential directions for future work.
2.1. Particle Swarm Optimization
Particle swarm optimization (PSO) was one of the first SI algorithms; it was presented at the International Conference on Neural Networks by Kennedy and Eberhart in 1995. PSO is inspired by the social foraging behavior of some animals, such as the flocking behavior of birds (Figure 1) and the schooling behavior of fish. In nature, some individuals have a better developed instinct for finding food. Guided by these individuals, the whole swarm is directed into more promising regions of the landscape.
PSO is a population-based algorithm that consists of particles $\mathbf{x}_i$ representing their position in a $D$-dimensional search space. These particles move across this space with velocity $\mathbf{v}_i$ according to the position of the best particle $\mathbf{g}$ towards the more promising regions of the search space. However, this movement also depends on the local best position $\mathbf{p}_i$ of each particle and is mathematically expressed as follows:

$$\mathbf{v}_i^{(t+1)} = \mathbf{v}_i^{(t)} + c_1 r_1 \left(\mathbf{p}_i^{(t)} - \mathbf{x}_i^{(t)}\right) + c_2 r_2 \left(\mathbf{g}^{(t)} - \mathbf{x}_i^{(t)}\right), \tag{1}$$

where $r_1$, $r_2$ denote the random numbers drawn from the interval $[0, 1]$, and $c_1$, $c_2$ are constriction coefficients that determine the proportion with which the local and global best solutions influence the current solution. Then, the new particle position is calculated according to the following expression:

$$\mathbf{x}_i^{(t+1)} = \mathbf{x}_i^{(t)} + \mathbf{v}_i^{(t+1)}. \tag{2}$$

Pseudocode of the PSO algorithm is illustrated in Algorithm 1.
After the initialization in the function “init_particles” (Algorithm 1), the PSO algorithm optimizes a problem by iteratively improving the candidate solutions [24, 25]. Two functions are applied for this purpose. The function “evaluate_the_new_solution” calculates the fitness value of the particles obtained after initialization or movement. The movement according to (1) and (2) is implemented in the function “generate_new_solution.”
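The loop described above can be sketched in plain Python as follows. This is a minimal illustration and not the authors' Algorithm 1: the function names `init_particles` and `generate_new_solution` are borrowed from the text, while the `sphere` objective, the default parameter values, the fixed seed, and the clipping of positions to the bounds are assumptions made only for this example.

```python
import random


def sphere(x):
    """Illustrative objective (an assumption, not from the paper): sum of squares."""
    return sum(xi * xi for xi in x)


def init_particles(rng, n_particles, dim, lower, upper):
    """Random positions inside the bounds and zero initial velocities."""
    positions = [[rng.uniform(lower, upper) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    return positions, velocities


def generate_new_solution(rng, x, v, local_best, global_best, c1, c2, lower, upper):
    """Standard PSO velocity and position update for one particle, in place."""
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        v[d] += c1 * r1 * (local_best[d] - x[d]) + c2 * r2 * (global_best[d] - x[d])
        # Assumption: positions are clipped to the bounds of the search space.
        x[d] = min(upper, max(lower, x[d] + v[d]))


def pso(f, dim, lower, upper, n_particles=20, n_iter=100, c1=1.0, c2=1.0, seed=1):
    rng = random.Random(seed)
    xs, vs = init_particles(rng, n_particles, dim, lower, upper)
    local_best = [x[:] for x in xs]
    local_fit = [f(x) for x in xs]  # evaluate the initial solutions
    g = min(range(n_particles), key=lambda i: local_fit[i])
    global_best, global_fit = local_best[g][:], local_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            generate_new_solution(rng, xs[i], vs[i], local_best[i], global_best,
                                  c1, c2, lower, upper)
            fit = f(xs[i])  # evaluate the new solution
            if fit < local_fit[i]:
                local_best[i], local_fit[i] = xs[i][:], fit
            if fit < global_fit:
                global_best, global_fit = xs[i][:], fit
    return global_best, global_fit


best, best_fit = pso(sphere, dim=3, lower=-5.0, upper=5.0)
```

Because the global best is replaced only when a strictly better solution is found, the returned fitness can never be worse than the best fitness in the initial population.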
RDF is an XML application devoted to encoding, exchanging, and reusing structured metadata. It enables knowledge to be represented in a symbolic form that is both human readable and machine understandable. The main characteristic of this framework is that RDF data can be manipulated in a decentralized manner and distributed among various servers on the Internet. Resources, identified by Uniform Resource Identifiers (URIs), are described in RDF graphs, where each resource is a node with properties that are associated with it through property-type relationships. Each such relationship is an edge in the RDF graph. Attributes may be atomic in nature (e.g., numbers or text strings) or may be other resources with their own properties.
The resource, property-type relation, and attribute form a triple suitable for presentation in an RDF graph. A sample graph is presented in Figure 2, from which two triples can be read. These triples are written into an RDF database. In general, the format of this RDF data is a serialization of N-Triples obtained from the RDF graphs. For instance, the RDF database obtained from the RDF graph in Figure 2 is presented in Algorithm 2.
RDF enables data to be decentralized and distributed across the Internet. The SPARQL Protocol, in turn, has been developed for accessing and discovering RDF data. SPARQL is an RDF query language whose syntax is very similar to that of SQL queries. A SPARQL query consists of two parts. The SELECT clause identifies the variables that appear in the query results, while the WHERE clause provides the basic patterns that are matched against the RDF graph. Usually, these query patterns consist of three parts denoting the resource name, property-type relation, and attribute. The query returns the matched patterns. A sample SPARQL query is illustrated in Algorithm 3.
The query presented in Algorithm 3 returns the name “John” and the surname “Smith.”
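To make the SELECT/WHERE mechanism concrete, the sketch below imitates basic triple-pattern matching in plain Python. It is not SPARQL and does not use rdflib; the triples, the `http://example.org/` URIs, and the helper names `match_pattern` and `select` are hypothetical and are not the contents of Figure 2 or Algorithms 2 and 3.

```python
# Hypothetical (resource, property, attribute) triples, for illustration only.
TRIPLES = [
    ("http://example.org/person1", "name", "John"),
    ("http://example.org/person1", "surname", "Smith"),
    ("http://example.org/person2", "name", "Ann"),
]


def is_var(term):
    """Variables are written like SPARQL variables, with a leading '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant term does not match
    return new


def select(variables, where, triples):
    """Return one tuple of variable values per solution of all WHERE patterns."""
    bindings = [{}]
    for pattern in where:
        extended = []
        for b in bindings:
            for t in triples:
                nb = match_pattern(pattern, t, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return [tuple(b[v] for v in variables) for b in bindings]


rows = select(
    ["?name", "?surname"],
    [("?p", "name", "?name"), ("?p", "surname", "?surname")],
    TRIPLES,
)
print(rows)
```

Only the first resource has both a name and a surname, so it is the only one that satisfies both patterns with the same value of `?p`.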
3. The Modified RDF-PSO Algorithm
The modified RDF-PSO algorithm implements two features:
(i) using the variable neighborhood of candidate solutions in place of the local best solutions,
(ii) using RDF for describing and SPARQL for manipulating this neighborhood.
The main reason for applying these well-known tools from the semantic web domain was to develop a distributed population model that could later be used in other SI algorithms. At the same time, we try to use the semantic web tools for optimization purposes. RDF is, in general, a suitable tool for describing distributed population models. In the PSO algorithm, it is applied to describe the relations between particles in the population. For our purposes, the important relation is “is_neighbor_of,” by which each particle determines its neighborhood. SPARQL is then used to retrieve the particles in that neighborhood. The result is the RDF-PSO algorithm, whose pseudocode is presented in Algorithm 4.
Three main differences distinguish the proposed RDF-PSO from the original PSO algorithm:
(i) no local best solutions are maintained by RDF-PSO (lines 10–12 of Algorithm 1 are omitted),
(ii) the neighborhood of a candidate solution is defined (line 10 in Algorithm 4),
(iii) the new solution is generated according to the defined variable neighborhood relation (line 11 in Algorithm 4).
The relation “is_neighbor_of” (line 10 in Algorithm 4) is defined as follows:

$$\text{is\_neighbor\_of}(\mathbf{x}_i, \mathbf{x}_j) \iff \left| f(\mathbf{x}_i) - f(\mathbf{x}_j) \right| < R,$$

where the radius $R$ bounds the fitness distance between two candidate solutions that can be in the same neighborhood. In fact, this parameter regulates the number of candidate solutions in the neighborhood.
Here, the radius is expressed as . Indeed, the neighborhood captures all solutions whose fitness differs by less than the radius. Typically, when the radius is small, the neighborhood is also small. However, this holds only if the population diversity is high enough. When the particles are scattered across the search space, no particles are located in the vicinity of each other, and the size of the neighborhood becomes zero. On the other hand, when the particles are crowded around some fitter individuals, the number of neighbors can increase enormously. In order to prevent this undersizing and oversizing, the neighborhood size is defined in such a manner that it cannot exceed three and cannot be zero; in other words, it always lies between one and three.
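A fitness-radius neighborhood with the size limits described above might be sketched as follows. The text does not say how an empty neighborhood is filled or which neighbors are kept when there are too many, so this sketch assumes that candidates are ordered by fitness distance and the nearest ones are taken; the function name `neighborhood` and its parameters are hypothetical.

```python
def neighborhood(i, fitness, radius, min_size=1, max_size=3):
    """Return indices of the neighbors of particle i.

    A particle j is a candidate neighbor when |f_j - f_i| < radius. The result
    is clamped to between min_size and max_size particles. Assumption: the
    particles closest in fitness are preferred, and if nobody lies inside the
    radius, the closest particle is used anyway.
    """
    others = sorted(
        (j for j in range(len(fitness)) if j != i),
        key=lambda j: (abs(fitness[j] - fitness[i]), j),
    )
    inside = [j for j in others if abs(fitness[j] - fitness[i]) < radius]
    size = max(min_size, min(max_size, len(inside)))
    return others[:size]


# Hypothetical fitness values of six particles.
fitness = [10.0, 10.5, 12.0, 50.0, 10.1, 9.8]

crowded = neighborhood(0, fitness, radius=1.0)    # three particles inside the radius
isolated = neighborhood(3, fitness, radius=1.0)   # nobody inside: falls back to nearest
capped = neighborhood(0, fitness, radius=100.0)   # five inside: capped at three
```

Sorting by fitness distance means that the particles inside the radius always come first, so truncating the sorted list implements both the lower and the upper limit with one expression.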
For each observed particle , the new solution is generated according to the number of neighbors in the “generate_new_solution” function. The following modified equation is used in RDF-PSO for calculating the velocity: where , , and are real numbers randomly drawn from the interval , and denote the constriction coefficients, , , and . Thus, the movement of a particle in a more crowded neighborhood depends on more neighbors. Furthermore, the term in square brackets ensures that the proportion contributed by each neighbor, as determined by the constriction coefficients, never exceeds one.
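The sketch below shows one way a velocity update could draw on a variable number of neighbors instead of a local best. It is only an assumed form, not the authors' equation: here the global best keeps its own term and each of the one to three neighbors contributes a random pull whose coefficient is divided by the number of neighbors. The parameter names `c_global` and `c_neighbor` are hypothetical.

```python
import random


def generate_new_solution(x, v, global_best, neighbors,
                          c_global=1.0, c_neighbor=1.0, rng=random):
    """Return (new_position, new_velocity) for one particle.

    Assumed form: v <- v + c_global * r * (g - x)
                         + sum over neighbors of (c_neighbor / n) * r_k * (n_k - x),
    where n is the number of neighbors (1 to 3) and r, r_k are uniform in [0, 1).
    """
    n = len(neighbors)
    if not 1 <= n <= 3:
        raise ValueError("the neighborhood must contain one to three particles")
    new_x, new_v = [], []
    for d in range(len(x)):
        vd = v[d] + c_global * rng.random() * (global_best[d] - x[d])
        for nb in neighbors:
            # Dividing by n keeps the summed neighbor weight at most c_neighbor.
            vd += (c_neighbor / n) * rng.random() * (nb[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v


new_x, new_v = generate_new_solution(
    x=[0.0, 0.0], v=[0.0, 0.0], global_best=[1.0, 1.0],
    neighbors=[[2.0, 2.0], [3.0, 3.0]],
)
```

With this weighting, a particle with three neighbors is pulled towards all three, but their combined influence is no larger than that of a single neighbor in a neighborhood of size one.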
3.1. Representation of a Distributed Population
The rapid growth of the Internet means that new kinds of application architectures have emerged. Internet applications are well suited to exploiting the enormous power of the computers connected to this huge network. Typically, these applications search for data distributed on many servers. These data need to be accessed easily, securely, and efficiently.
This paper proposes the first steps towards a distributed population model within the PSO algorithm. To this end, RDF is applied to describe the relations between particles in the population. These relations make it possible to manipulate population members on a higher abstraction level. At the moment, only the relation “is_neighbor_of” is implemented, which determines the neighborhood of a specific particle in the population.
For this purpose, RDF is used to define the various resources on different Internet servers. In our case, each particle in the population is a resource that is defined with a corresponding property-type relation (e.g., “is_neighbor_of”) and attributes. The RDF graph of the distributed population is illustrated in Figure 3.
The definition of a distributed population in RDF is presented in Algorithm 5. Two kinds of attributes are encountered in this definition: the references to the neighbors of a specific particle and its sequence number. Some details are omitted in this algorithm because of the space limitation of this paper. The missing parts of the code are denoted by punctuation marks.
3.2. Accessing the Distributed Population
The distributed population in RDF can be accessed using the SPARQL query language, whose syntax is similar to standard SQL syntax. An example of a SPARQL query returning the neighborhood of the fourth particle is presented in Algorithm 6. Note that this query returns all attributes that are related to “resource4” by the relation “is_neighbor_of.”
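As a rough illustration of storing and reading the “is_neighbor_of” relation, the sketch below serializes a small population as N-Triples lines and then looks up the neighbors of the fourth particle. It uses only the standard library rather than rdflib and SPARQL, handles only URI objects (the sequence-number attribute is left out), and the `http://example.org/population/` namespace and the sample population are hypothetical; only the names “is_neighbor_of” and “resource4” come from the text.

```python
import re

BASE = "http://example.org/population/"   # hypothetical namespace
IS_NEIGHBOR_OF = BASE + "is_neighbor_of"  # relation name taken from the text

# One N-Triples statement with URI subject, predicate, and object.
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s+\.$")


def to_ntriples(neighbors):
    """Serialize {particle_number: [neighbor_numbers]} as N-Triples lines."""
    lines = []
    for i in sorted(neighbors):
        for j in neighbors[i]:
            lines.append(
                f"<{BASE}resource{i}> <{IS_NEIGHBOR_OF}> <{BASE}resource{j}> ."
            )
    return "\n".join(lines)


def neighbors_of(ntriples, particle):
    """Return the objects related to the given particle by is_neighbor_of."""
    subject = f"{BASE}resource{particle}"
    found = []
    for line in ntriples.splitlines():
        m = TRIPLE_RE.match(line.strip())
        if m and m.group(1) == subject and m.group(2) == IS_NEIGHBOR_OF:
            found.append(m.group(3))
    return found


# Hypothetical population: particle 4 has three neighbors, particle 1 has one.
population = {1: [2], 4: [1, 3, 5]}
data = to_ntriples(population)
print(neighbors_of(data, 4))
```

The lookup corresponds to a query with a fixed subject and predicate and a free object, which is the pattern the text describes for “resource4.”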
3.3. Implementation Details
The proposed RDF-PSO algorithm was implemented in the Python programming language and executed on the Linux operating system. Additionally, the following libraries were used:
(i) rdflib, a Python library for working with RDF,
(ii) NumPy, the fundamental package for scientific computing with Python,
(iii) matplotlib, a Python 2D plotting library.
Python was chosen because many PSO implementations already exist in it, the RDF and SPARQL semantic tools are supported in this language, and programming in Python is easy.
4. Experiments and Results
The goal of our experimental work was to show that the semantic web tools, that is, RDF and SPARQL, can also be useful for optimization. Moreover, we want to show that using the variable neighborhood in RDF-PSO can improve on the results of the original PSO.
In line with this, the RDF-PSO algorithm was applied to the optimization of ten benchmark functions taken from the literature. Function optimization belongs to the class of continuous optimization problems, where the objective function $f(\mathbf{x})$ is given and $\mathbf{x}$ is a vector of design variables in a decision space. Each design variable is limited by its lower and upper bounds. The task of optimization is to find the minimum of the objective function.
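For illustration, two widely used continuous benchmark functions and a bound-handling helper are sketched below. Whether these particular functions appear in the paper's Table 1 is not confirmed by this text; they are chosen only because their definitions are standard, and the helper `clip_to_bounds` is a hypothetical name.

```python
import math


def sphere(x):
    """Unimodal: f(x) = sum(x_i^2), global minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def rastrigin(x):
    """Multimodal: f(x) = 10 D + sum(x_i^2 - 10 cos(2 pi x_i)), minimum 0 at the origin."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def clip_to_bounds(x, lower, upper):
    """Force every design variable into the interval [lower, upper]."""
    return [min(upper, max(lower, xi)) for xi in x]


point = clip_to_bounds([-7.0, 0.5, 9.0], lower=-5.0, upper=5.0)
print(point, sphere(point), rastrigin(point))
```

Both functions accept a vector of any dimension, which is what allows the same definition to be reused across the different problem dimensions of an experiment.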
In the remainder of this section, the benchmark suite is described; then, the experimental setup is presented and finally, the results of experiments are illustrated in detail.
4.1. Test Suite
The test suite consisted of ten functions selected from the literature. The primary reference is the paper by Yang, which proposed a set of optimization functions suitable for testing newly developed algorithms. The definitions of the benchmark functions are given in Table 1, while their properties are listed in Table 2.
Table 2 consists of five columns that contain the function identifications (tag ), their global optimum (tag ), the values of optimal design variables (tag ), the lower and upper bounds of the design variables (tag Bound), and their characteristics (tag Characteristics). The lower and upper bounds of the design variables determine intervals that limit the size of the search space. The wider the interval, the wider the search space. Note that the intervals were selected so that the search space was wider than those proposed in the standard literature. The functions within the benchmark suite can be divided into unimodal and multimodal. Multimodal functions have two or more local optima and are typically more difficult to solve. The most complex functions are those that have an exponential number of local optima randomly distributed within the search space.
4.2. Experimental Setup
This experimental study compares the results of the RDF-PSO using different kinds of distributed populations with those of the original PSO algorithm. All PSO algorithms used the following setup. The parameter was randomly drawn from the interval , while the constriction coefficients were set as . The termination condition was the number of fitness function evaluations. It was set to , where denotes the dimension of the problem. In this study, three different dimensions of functions were applied; that is, , , and . The population size is a crucial parameter for all population-based algorithms and has a great influence on their performance. Accordingly, extensive experiments were run to determine the most appropriate setting of this parameter for all algorithms in the test, and this setting was used in the experiments. The same termination condition, function dimensions, and population size were used by all algorithms in the experiments.
The PSO algorithms are stochastic in nature. Therefore, statistical measures, like minimum, maximum, average, standard deviation, and median, were collected over 25 runs of the algorithms in order to fairly estimate the quality of solutions.
The comparative study was conducted to show, firstly, that the semantic web tools can be successfully applied to optimization and, secondly, that using the distributed population affects the results of the original PSO algorithm. In the remainder of this section, a detailed analysis of the RDF-PSO algorithms is presented.
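Collecting the five measures over 25 independent runs can be sketched as follows. The function `run_once` is a hypothetical stand-in for a single optimizer run and simply returns a random value; the use of the sample standard deviation is also an assumption, since the text does not say which variant was used.

```python
import random
import statistics


def run_once(seed):
    """Hypothetical placeholder for one optimizer run; returns a final fitness."""
    rng = random.Random(seed)
    return abs(rng.gauss(0.0, 1.0))


def summarize(values):
    """Minimum, maximum, average, standard deviation, and median of the runs."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumption)
        "median": statistics.median(values),
    }


results = [run_once(seed) for seed in range(25)]  # one seed per independent run
summary = summarize(results)
print(summary)
```

Using a distinct seed for each run keeps the runs independent while making the whole experiment reproducible.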
4.3.1. Analysis of the RDF-PSO Algorithms
In this experiment, the characteristics of the RDF-PSO algorithm were analyzed. The RDF-PSO with neighborhood size of one (RDF-1), the RDF-PSO with neighborhood size of two (RDF-2), and the RDF-PSO with neighborhood size of three (RDF-3) were compared with the original PSO algorithm (PSO) by optimizing ten benchmark functions with dimensions , , and . The results obtained by the optimization of functions with dimension are aggregated in Table 3. Note that the best average value for each function is shown in bold in the table.
From Table 3, it can be seen that the best average values were obtained by the RDF-1 algorithm eight times, that is, by , , , and . The original PSO algorithm also achieved the best results twice, that is, and . On average, the results of the other two RDF-PSO algorithms, that is, RDF-2 and RDF-3, were better than the results of the original PSO algorithm.
In order to statistically estimate the quality of the solutions, the Friedman nonparametric test was conducted. Each algorithm enters this test with five statistical measures for each of the observed functions. As a result, each statistical classifier (i.e., each algorithm) consists of different variables. The Friedman test [33, 34] compares the average ranks of the algorithms. The closer the rank is to one, the better the algorithm is in this application. The null hypothesis states that two algorithms are equivalent and, therefore, their ranks should be equal. If the null hypothesis is rejected, that is, the performance of the algorithms is statistically different, the Bonferroni-Dunn test is performed, which calculates the critical difference between the average ranks of those two algorithms. When the difference between the average ranks is higher than the critical difference, the algorithms are significantly different. The equation for the calculation of the critical difference can be found in .
Friedman tests were performed using the significance level . The results of the Friedman nonparametric test are presented in Figure 4 where the three diagrams show the ranks and confidence intervals (critical differences) for the algorithms under consideration. The diagrams are organized according to the dimensions of functions. Two algorithms are significantly different if their intervals do not overlap.
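The ranking step and the critical difference can be sketched as follows. The critical difference uses the form given by Demšar (2006), $CD = q_\alpha \sqrt{k(k+1)/(6N)}$, for $k$ algorithms and $N$ measurements. The significance level of 0.05 and the reading that five measures on ten functions give $N = 50$ are assumptions made for this example, and the score matrix is invented.

```python
import math

# Two-tailed Bonferroni-Dunn critical values at alpha = 0.05, indexed by the
# number of algorithms k (tabulated in Demsar, 2006). alpha = 0.05 is an
# assumption; the significance level is not stated in this text.
Q_ALPHA_005 = {2: 1.960, 3: 2.241, 4: 2.394, 5: 2.498}


def average_ranks(scores):
    """scores[n][j] is the score of algorithm j on measurement n (lower is better).

    Returns the average rank of each algorithm; tied scores share the mean rank.
    """
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            rank = (pos + end) / 2.0 + 1.0  # mean of the tied 1-based positions
            for j in order[pos:end + 1]:
                totals[j] += rank
            pos = end + 1
    return [t / len(scores) for t in totals]


def critical_difference(k, n, q_alpha):
    """Bonferroni-Dunn critical difference for k algorithms and n measurements."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Invented scores for three algorithms on two measurements.
ranks = average_ranks([[0.1, 0.3, 0.2], [0.2, 0.4, 0.3]])
cd = critical_difference(k=4, n=50, q_alpha=Q_ALPHA_005[4])
print(ranks, round(cd, 3))
```

Two algorithms would then be reported as significantly different when the gap between their average ranks exceeds `cd`.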
The first diagram in Figure 4 shows that the RDF-1 algorithm significantly outperforms the RDF-3 algorithm. Interestingly, the results of the original PSO are also better than the results of the RDF-2 and RDF-3 algorithms. The situation changes in the second (by ) and third diagrams (by ), where RDF-1 improves on the results of RDF-3 and the original PSO, but not the RDF-2 algorithm. Additionally, RDF-2 is significantly better than the original PSO also by .
In summary, RDF-1 achieves the best results among all the algorithms in the tests for all observed dimensions of functions. On the other hand, the original PSO algorithm is comparable with the modified PSO algorithms only when optimizing the low dimensional functions (). The question of why the RDF-PSO with neighborhood size of one outperformed the other RDF-PSO algorithms remains open for future work. At this moment, it seems that the primary role is played by the constriction coefficients, which determine the influence of specific neighbors.
5. Conclusion
The aim of this paper was twofold: first, to demonstrate that semantic web tools such as RDF and SPARQL can also be used for optimization and, second, to show that the results of the modified RDF-PSO using the variable neighborhood are comparable with the results of the original PSO algorithm.
In line with the first hypothesis, a distributed population model was developed within the PSO algorithm that is suitable for describing the variable neighborhood of particles in the population. Furthermore, moving particles across the search space depends on all the particles in the neighborhood in place of the local best solutions as proposed in the original PSO algorithm.
In order to confirm the second hypothesis, a benchmark suite of ten well-known functions from the literature was defined. The results of extensive experiments on these benchmark functions showed that the optimal neighborhood size within the RDF-PSO algorithm is one (RDF-1). This variant of the RDF-PSO also outperformed the original PSO algorithm.
The distributed population model extends the concept of population in SI. The population is no longer a passive data structure for storing particles. Not only can the particles now be distributed, but relations can also be placed between the population members. In this proof of concept, only one relation was defined, that is, “is_neighbor_of.” Additionally, the whole definition of the distributed population has not yet been put onto the Internet. Although we are at the beginning of the path towards an intelligent particle in swarm intelligence algorithms, the preliminary results are encouraging, and future research should investigate this idea of distributed population models in greater detail.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, New York, NY, USA, 2009.
A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Springer, Berlin, Germany, 2003.
C. Blum and D. Merkle, Swarm Intelligence: Introduction and Applications, Springer, Berlin, Germany, 2008.
C. Darwin, The Origin of Species, John Murray, London, UK, 1859.
D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Massachusetts, Mass, USA, 1996.
J. Koza, Genetic Programming 2—Automatic Discovery of Reusable Programs, The MIT Press, Cambridge, Mass, USA, 1994.
T. Bäck, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms, Oxford University Press, Oxford, UK, 1996.
L. Fogel, A. Owens, and M. Walsh, Artificial Intelligence through Simulated Evolution, John Wiley & Sons, New York, NY, USA, 1996.
R. Storn and K. Price, “Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces,” Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.
X.-S. Yang, “A new metaheuristic bat-inspired algorithm,” in Nature Inspired Cooperative Strategies for Optimization (NICSO 2010), pp. 65–74, Springer, New York, NY, USA, 2010.
I. Fister Jr, D. Fister, and I. Fister, “Differential evolution strategies with random forest regression in the bat algorithm,” in Proceeding of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, pp. 1703–1706, ACM, 2013.
H. Liu, A. Abraham, O. Choi, and S. H. Moon, “Variable neighborhood particle swarm optimization for multi-objective flexible job-shop scheduling problems,” in Simulated Evolution and Learning, vol. 4247 of Lecture Notes in Computer Science, pp. 197–204, Springer, New York, NY, USA, 2006.
J. Kennedy, “Particle swarm optimization,” in Encyclopedia of Machine Learning, pp. 760–766, Springer, New York, NY, USA, 2010.
N. Chakraborti, R. Jayakanth, S. Das, E. D. Çalişir, and Ş. Erkoç, “Evolutionary and genetic algorithms applied to Li+-C system: calculations using differential evolution and particle swarm algorithm,” Journal of Phase Equilibria and Diffusion, vol. 28, no. 2, pp. 140–149, 2007.
Y. Shi and R. Eberhart, “Modified particle swarm optimizer,” in Proceedings of the IEEE International Conference on Evolutionary Computation (ICEC '98), pp. 69–73, IEEE, May 1998.
E. Miller, “An introduction to the resource description framework,” D-Lib Magazine, vol. 4, no. 5, pp. 14–25, 1998.
D. Allemang and J. Hendler, Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL, Morgan Kaufmann, Amsterdam, The Netherlands, 2nd edition, 2011.
S. Harris and A. Seaborne, Sparql 1.1 Query Language, 2013.
rdflib: A python library for working with RDF, 2013, http://code.google.com/p/rdflib/.
Numpy, 2013, http://www.numpy.org/.
Matplotlib, 2013, http://matplotlib.org/.
X.-S. Yang, “Appendix A: test problems in optimization,” in Engineering Optimization, X.-S. Yang, Ed., pp. 261–266, John Wiley & Sons, Hoboken, NJ, USA, 2010.
J. Demšar, “Statistical comparisons of classifiers over multiple data sets,” Journal of Machine Learning Research, vol. 7, pp. 1–30, 2006.