Centre for Informatics
and Systems of the University of Coimbra, Polo II - University of Coimbra, Coimbra 3030-290, Portugal
Department of Computing and Electronic Systems, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
Dalle Molle Institute for Artificial Intelligence (IDSIA), Galleria 2, Manno-Lugano 6928, Switzerland
Recommended by T. Blackwell
Abstract
Using a geometric framework for the interpretation of crossover of recent introduction, we show an intimate connection between particle swarm optimisation (PSO) and evolutionary algorithms. This connection enables us to generalise PSO to virtually any solution representation in a natural and straightforward way. The new Geometric
PSO (GPSO) applies naturally to both continuous and combinatorial spaces. We demonstrate this for the cases of Euclidean, Manhattan and Hamming spaces and report extensive experimental results. We also demonstrate the applicability of GPSO to more challenging combinatorial spaces. The Sudoku puzzle is a perfect candidate to test new
algorithmic ideas because it is entertaining and instructive as well as being a nontrivial constrained combinatorial problem. We apply GPSO to solve the Sudoku puzzle.
1. Introduction
Particle swarm optimization (PSO) is a relatively
recently devised population-based stochastic global optimization algorithm
[1]. PSO has many similarities with evolutionary algorithms, and has also proven to have
robust performance over a variety of difficult optimization problems. However,
the original formulation of PSO requires the search space to be continuous and
the individuals to be represented as vectors of real numbers.
There is a number of extensions of PSO to combinatorial spaces with
various degrees of success [2, 3]. (Notice that applications of traditional PSO
to combinatorial optimization problems cast as continuous optimization problems
are not extensions of the PSO algorithm.) However, every time a new solution representation is
considered, the PSO algorithm needs to be rethought and adapted to the new
representation. In this article, we extend PSO to richer spaces by making use
of a rigorous mathematical generalization of the notion (and motion) of
particles to a general class of spaces. This approach has the advantage that a
PSO can be derived in a principled way for any search space belonging to the
given class.
In particular, we show formally how a general
form of PSO (without the inertia term) can be obtained by using theoretical
tools developed for evolutionary algorithms with geometric crossover and
geometric mutation. These are representation-independent operators that
generalize many pre-existing search operators for the major representations,
such as binary strings [4], real vectors [4], permutations [5], syntactic trees
[6], and sequences [7]. (The inertia weight was not part of the original
proposal of PSO, it was later introduced by Shi and Eberhart [8].)
Firstly, we formally derive geometric PSOs (GPSOs) for
Euclidean, Manhattan, and Hamming spaces and discuss how to derive GPSOs for
virtually any representation in a similar way. Then, we test the GPSO theory
experimentally: we implement the specific GPSO for Euclidean, Manhattan, and
Hamming spaces and report extensive experimental results showing that GPSOs
perform very well.
Finally, we also demonstrate that GPSO can be
specialized easily to nontrivial combinatorial spaces. In previous work [9],
we have used the geometric framework to design an evolutionary algorithm to
solve the Sudoku puzzle and obtained very good experimental results. Here, we
apply GPSO to solve the Sudoku puzzle.
In Section 2, we introduce the geometric framework and
introduce the notion of multiparental geometric crossover. In Section 3, we
recast PSO in geometric terms and generalize it to generic metric spaces. In
Section 4, we apply these notions to the Euclidean, Manhattan, and Hamming
spaces. In Section 5, we discuss how to specialize the general PSO
automatically to virtually any solution representation using geometric
crossover. Then, in Section 6, we report experimental results with the GPSOs
for Euclidean, Manhattan, and Hamming spaces, and we compare them with a
traditional PSO. In Section 7, we apply GPSO to Sudoku, and we describe the
results in Section 8. Finally, in Section 9, we present conclusions and future
work.
2. Geometric Framework
Geometric operators are defined in geometric terms
using the notions of line segment and ball. These notions and the corresponding
genetic operators are well defined once a notion of distance in the search
space is defined. Defining search operators as functions of the search space is
opposite to the standard way [10] in which the search space is seen as a
function of the search operators employed.
2.1. Geometric Preliminaries
In the following, we give necessary preliminary
geometric definitions and extend those introduced in [4]. For more details on
these definitions, see [11].
The terms distance and metric denote any
real valued function that conforms to the axioms of identity, symmetry, and
triangular inequality. A simple connected graph is naturally associated to a
metric space via its path metric: the distance between two nodes in the
graph is the length of a shortest path between
the nodes. Distances arising from graphs via their path metric are called graphic
distances. Similarly, an edge-weighted graph with strictly positive weights
is naturally associated to a metric space via a weighted path metric.
In a metric space
, a closed ball is a set of the form
, where
and
is a positive
real number called the radius of the ball. A line segment is a set of
the form
, where
are called
extremes of the segment. Metric ball and metric segment generalize the familiar
notions of ball and segment in the Euclidean space to any metric space through
distance redefinition. In general, there may be more than one shortest path (geodesic)
connecting the extremes of a metric segment: the metric segment is the union of
all geodesics.
We assign a structure to the solution set by endowing
it with a notion of distance
.
is, therefore,
a solution space and
, where
is the fitness
function, is the corresponding fitness landscape.
2.2. Geometric Crossover
Definition 1 (geometric crossover).
A binary operator is a geometric
crossover under the metric
if all
offsprings are in the segment between its parents.
The definition is representation-independent and, therefore, crossover is well defined for any representation. Being based
on the notion of metric segment, crossover is a function only of the metric
associated with the search space.
The class of geometric crossover operators is very
broad. For vectors of reals, various types of blend or line crossovers, box
recombinations, and discrete recombinations are geometric crossovers [4]. For
binary and multary strings, all homologous crossovers are geometric [4, 12].
For permutations, PMX, Cycle crossover, merge crossover, and others are
geometric crossovers [5]. We describe this in more detail in Section 2.3 since
we will use the permutation representation in this paper. For syntactic trees,
the family of homologous crossovers are geometric [6]. Recombinations for
several more complex representations are also geometric [4, 5, 7, 13].
2.3. Geometric Crossover for Permutations
In previous work, we have studied various crossovers
for permutations, revealing that PMX [14], a well-known crossover for
permutations, is geometric under swap distance. Also, we found that Cycle
crossover [14], another traditional crossover for permutations, is geometric
under swap distance and under Hamming distance (geometricity under Hamming
distance for permutations implies geometricity under swap distance, but not vice
versa). Finally, we showed that geometric crossovers for permutations based
on edit moves are naturally associated with sorting algorithms: picking
offspring on a minimum path between two parents corresponds to picking
partially sorted permutations on the minimal sorting trajectory between the
parents.
2.4. Geometric Crossover Landscape
Geometric operators are defined as functions of the
distance associated with the search space. However, the search space does not
come with the problem itself. The problem consists of a fitness function to
optimize and a solution set, but not a neighbourhood relationship. The act of
putting a structure over the solution set is part of the search algorithm
design and it is a designer's choice. A fitness landscape is the fitness
function plus a structure over the solution space. So for each problem, there
is one fitness function but as many fitness landscapes as the number of
possible different structures over the solution set. In principle, the designer
could choose the structure to assign to the solution set completely
independently from the problem at hand. However, because the search operators
are defined over such a structure, doing so would make them decoupled from the
problem at hand, hence turning the search into something very close to random
search.
In order to avoid this, one can exploit problem
knowledge in the search. This can be achieved by carefully designing the
connectivity structure of the fitness landscape. For example, one can study the
objective function of the problem and select a neighbourhood structure that
couples the distance between solutions and their fitness values. Once this is
done, the problem knowledge can be exploited by search operators to perform
better than random search, even if the search operators are problem independent
(as in the case of geometric crossover and mutation).
Under which conditions is a landscape well searchable
by geometric operators? As a rule of thumb, geometric mutation and geometric
crossover work well on landscapes where the closer pairs of solutions are, the
more correlated their fitness values. Of course this is no surprise: the
importance of landscape smoothness has been advocated in many different
contexts and has been confirmed in uncountable empirical studies with many
neighborhood search metaheuristics [15, 16]. To summarize, consider
the following.(i) Rule of thumb
1: If we have a good distance for the problem at hand, then we have good
geometric mutation and good geometric crossover.(ii) Rule of thumb
2: A good distance for the problem at hand is a distance that makes the
landscape “smooth.”
2.5. Product Geometric Crossover
In recent work [12], we have introduced the notion of
product geometric crossover.
Theorem 1.
Cartesian
product of geometric crossover is geometric under the sum of distances.
This theorem is very useful because it allows one to
build new geometric crossovers by combining crossovers that are known to be
geometric. In particular, this applies to crossovers for mixed representations.
The elementary geometric crossovers do not need to be independent, to form a valid product geometric crossover.
2.6. Multiparental Geometric Crossover
To extend geometric crossover to the case of multiple
parents, we need the following definitions [17].
Definition 2.
A family
of subsets of a
set
is called convexity on
if(C1)the empty
set
and the
universal set
are in
,(C2)
is nonempty,
then
, and(C3)
is nonempty and
totally ordered by inclusion, then
.
The pair
is called convex
structure. The members of
are called convex
sets. By the axiom (C1), a subset
of
of the convex
structure is included in at least one convex set, namely,
.
From axiom (C2),
is included in
a smallest convex set, the convex hull of
:
(1) The convex hull of a finite set is called a polytope.
The axiom (C3) requires domain finiteness of
the convex hull operator: a set
is convex if it
includes
for each finite
subset
of
.
The convex hull operator applied to
a set of cardinality two is called segment
operator. Given a metric space
, the segment between
and
is the set
. The abstract geodetic convexity
on
induced by
is obtained as
follows: a subset
of
is geodetically
convex, provided
for all
,
in
. If
denotes the convex
hull operator of
, then for all
. The two operators need not to be equal: there are
metric spaces in which metric segments are not all convex.
We can now provide the following extension.
Definition 3 (multiparental geometric crossover).
In a multiparental geometric crossover,
given
parents
, their offspring are contained in the metric convex
hull of the parents
for some metric
.
Theorem 2 (decomposable three-parent recombination).
Every multiparental recombination
that can be
decomposed as a sequence of 2-parental geometric crossovers under the same
metric
and
, so that
, is a three-parental geometric crossover.
Proof.
Let
be the set of
parents and
their metric
convex hull. By definition of metric convex hull, for any two points
, their offspring are in the convex hull
. Since
, any two parents
have offspring
. Then, any other parent
, when recombined with
, produces offspring
in the convex
hull
. So the three-parental recombination equivalent to
the sequence of geometric crossover
and
is a
multiparental geometric crossover.
3. Geometric PSO
3.1. Canonical PSO Algorithm and Geometric Crossover
Consider the canonical PSO in Algorithm 1. It is well
known [18] that one can write the equation of motion of the particle without
making explicit use of its velocity.
Algorithm 1: Standard PSO algorithm.
Let
be the position
of a particle and
its velocity.
Let
be the current
best position of the particle and let
be the global
best. Let
and
be the velocity
of the particle and
and
its position at
the next two time ticks. The equation of velocity update is the linear
combination:
, where
,
, and
are scalar
coefficients. To eliminate velocities, we substitute the identities
and
in the equation
of velocity update and rearrange it to obtain an equation that expresses
as function of
and
as follows:
(2)
If we set
(i.e., the
particle has no inertia),
becomes
independent on its position
two time ticks
earlier. If we call
,
, and
, the equation of motion becomes
(3)
In these conditions, the main feature that allows the
motion of particles is the ability to perform linear combinations of points in
the search space. As we will see in the next section, we can achieve this same
ability by using multiple (geometric) crossover operations. This makes it
possible to obtain a generalization of PSO to generic search spaces.
In the following, we illustrate the parallel between
an evolutionary algorithm with geometric crossover and the motion of a particle
in PSO (see Figure 1). Geometric crossover picks offspring
on a line
segment between parents
and
. Geometric crossover can be interpreted as a motion
of a particle: consider a particle
that moves in
the direction of a point
reaching, in
the next time step, position
. If one equates parent
with the
particle
and parent
with the
direction point
, the offspring
is, therefore,
the particle at the next time step
. The distance between parent
and offspring
is the
magnitude of the velocity of the particle
. Notice that the particle moves from
to
: this means that the particle
is replaced by
the particle
in the next
time step. In other words, the new position of the particle replaces the
previous position. Coming back to the evolutionary algorithm with geometric
crossover, this means that the offspring
replaces its
parent
in the new
population. The fact that at a given time all particles move is equivalent to
say that each particle is selected for “mating.” Mating is a weighted
multirecombination involving the memory of the particle and the best in the
current population.
Figure 1: Geometric crossover and particle motion.
In the standard PSO, weights represent the propensity
of a particle towards memory, sociality, and cognition. In the GPSO, they
represent the attractions towards the particle's previous position, the
particle's best position, and the swarm's best position.
Naturally, particle motion based on geometric
crossover leads to a form of search that cannot extend beyond the convex hull
of the initial population. Mutation can be used to allow nonconvex search. We
explain these ideas in detail in the following sections.
3.2. Geometric Interpretation of Linear Combinations
Definition 4.
A convex
combination of vectors
is a linear
combination
, where all coefficients
are nonnegative
and add up to 1.
It is called “convex combination” because when
vectors represent points in space, the set of all convex combinations
constitutes the convex hull.
A special case is
, where a point formed by the convex combination will
lie on a straight line between two points. For three points, their convex hull
is the triangle with the points as vertices.
Theorem 3.
In a PSO with
no inertia (
) and where
acceleration coefficients are such that
, the next position
of a particle
is within the convex hull formed by its current position
, its local best
, and the swarm best
.
Proof.
As we
have seen in Section 3.1, when
, a particle's update equation becomes the linear
combination in (3). Notice that this is an affine combination since the
coefficients of
,
, and
add up to 1.
Interestingly, this means that the new position of the particle is coplanar
with
,
, and
. If we restrict
and
to be positive
and their sum to be less than 1, (3) becomes a convex combination. Geometrically,
this means that the new position of the particle is in the convex hull formed
by (or more informally, is between) its previous position, its local best, and
the swarm best.
In the next section, we generalize this simplified
form of PSO from real vectors to generic metric spaces. As mentioned above,
mutation will be required to extend the search beyond the convex hull.
3.3. Convex Combinations in Metric Spaces
Linear combinations are well defined for vector
spaces: algebraic structures endowed with scalar product and vectorial sum. A
metric space is a set endowed with a notion of distance. The set underlying a
metric space does not normally come with well-defined notions of scalar product
and sum among its elements. Therefore, a linear combination of its elements is
not defined. How can we then define a convex combination in a metric space?
Vectors in a vector space can easily be understood as points in a metric space.
However, the interpretation of scalars is not as straightforward: what do the
scalar weights in a convex combination mean in a metric space?
As seen in Section 3.2, a convex combination is an
algebraic description of a convex hull. However, even if the notion of convex
combination is not defined for metric spaces, convexity in metric spaces is
still well defined through the notion of metric convex set that is a
straightforward generalization of traditional convex set. Since convexity is well
defined for metric spaces, we still have hope to generalize the scalar weights
of a convex combination trying to make sense of them in terms of distance.
The weight of a point in a convex combination can be
seen as a measure of relative linear “attraction” toward its corresponding
point, versus attractions toward the other points of the combination. The
closer a weight to 1, the stronger the attraction to the corresponding point.
The point resulting from a convex combination can be seen as the equilibrium
point of all the attraction forces. The distance between the equilibrium point
and a point of the convex combination is, therefore, a decreasing function of
the level of attraction (weight) of the point: the stronger the attraction, the
smaller its distance to the equilibrium point. This observation can be used to
reinterpret the weights of a convex combination in a metric space as follows:
with
,
, and
greater than
zero and
is generalized
to mean that
is a point such
that
,
, and
, where
is a decreasing
function.
This definition is formal and valid for all metric
spaces, but it is nonconstructive. In contrast, a convex combination not only
defines a convex hull, but also tells how to reach all its points. So how can
we actually pick a point in the convex hull respecting the above distance
requirements? Geometric crossover will help us with this, as we show in the
next section.
To summarize, the requirements for a convex
combination in a metric space are as follows.(1)Convex
weights: the weights respect the form of a convex combination:
and
.(2)Convexity:
the convex combination operator combines
,
, and
and returns a
point in their metric convex
hull (or simply triangle) under the metric of the space considered.(3)Coherence
between weights and distances: the distances to the equilibrium point are
decreasing functions of their weights.(4)Symmetry:
the same value assigned to
,
, or
has the same
effect (e.g., in a equilateral triangle, if the coefficients have all the same
value, the distances to the equilibrium point are the same).
3.4. Geometric PSO Algorithm
The generic GPSO algorithm is illustrated in Algorithm
2. This differs from the standard PSO (Algorithm 1), in that:(i) there is no velocity,(ii)the equation of position update is the convex
combination,(iii)there is mutation, and(iv)the parameters
,
, and
are nonnegative
and add up to one.
Algorithm 2: Geometric PSO
algorithm.
The specific PSOs for the Euclidean, Manhattan, and
Hamming spaces use the randomized convex combination operators described in
Section 4 and space-specific mutations. The randomization introduced by the
randomized convex combination and by the mutation are of different types. The
former allows us to pick points at random exclusively within the convex hull.
The latter, as mentioned in Section 3.1, allows us to pick points outside the
convex hull.
4. Geometric PSO for Specific Spaces
4.1. Euclidean Space
The GPSO for the Euclidean space is not an
extension of the traditional PSO. We include it to show how the general notions
introduced in the previous section materialize in a familiar context. The
convex combination operator for the Euclidean space is the traditional convex
combination that produces points in the traditional convex hull.
In Section 3.3, we have mentioned how to interpret the
weights in a convex combination in terms of distances. In the following, we
show analytically how the weights of a convex combination affect the relative
distances to the equilibrium point. In particular, we show that the relative
distances are decreasing functions of the corresponding weights.
Theorem 4.
In a convex
combination, the distances to the equilibrium point are decreasing functions of
the corresponding weights.
Proof.
Let
,
, and
be three points
in
and let
be a convex
combination. Let us now decrease
to
such that
,
, and
still form a
convex combination and that the relative proportions of
and
remain
unchanged:
. This requires
and
to be
and
. The equilibrium point for the new convex combination
is
(4) The distance between
and
is
(5) and the distance between
and the new
equilibrium point is
(6) So when
decreases (
) and
and
maintain the
same relative proportions, the distance between the point
and the equilibrium
point
increases (
). Hence, the
distance between
and the
equilibrium point is a decreasing function of
. For symmetry, this applies to the distances between
and
and the
equilibrium point: they are decreasing functions of their corresponding weights
and
, respectively.
The traditional convex combination in the Euclidean
space respects the four requirements for a convex combination presented in
Section 3.3.
4.2. Manhattan Space
In the following, we first define a multiparental
recombination for the Manhattan space and then prove that it respects the four
requirements for being a convex combination presented in Section 3.3.
Definition 5 (box recombination family).
Given two parents
and
in
, a box recombination operator returns offspring
such that
for
.
Theorem 5 (geometricity of box recombination).
Any box
recombination is a geometric crossover under Manhattan distance.
Proof.
Theorem 5 is an immediate consequence of the product geometric crossover
(Theorem 1).
Definition 6 (three-parent box recombination family).
Given three
parents
,
, and
in
, a box recombination operator returns offspring
such that
for
.
Theorem 6 (geometricity of a three-parent box recombination).
Any
three-parent box recombination is a geometric crossover under Manhattan
distance.
Proof.
We
prove it by showing that any multiparent box recombination
can be
decomposed as a sequence of two simple box recombinations. Since the simple box
recombination is geometric (Theorem 5), this theorem is a simple corollary of
the multiparental geometric decomposition theorem (Theorem 2).
We will show that
followed by
can reach any
offspring
. For each
, we have
. Notice that
. We have two cases: (i)
in which case
is reachable by
the sequence
; (ii)
, then it must be in
in which case
is reachable by
the sequence
.
Definition 7 (weighted multiparent box recombination).
Given three parents
,
, and
in
and weights
,
, and
, a weighted box recombination operator returns
offspring
such that
for
, where
,
, and
are a convex
combination of randomly perturbed weights with expected values
,
, and
.
The difference between box recombination and linear
recombination (Euclidean space) is that in the latter, the weights
,
, and
are randomly
perturbed only once and the same weights are used for all the dimensions,
whereas the former one has a different randomly perturbed version of the
weights for each dimension.
The weighted multiparent box recombination belongs to
the family of multiparent box recombination because
for
, hence, it is geometric.
Theorem 7 (coherence between weights and distances).
In
weighted multiparent box recombination, the distances of the parents to the
expected offspring are decreasing functions of the corresponding weights.
Proof.
The
proof of Theorem 7 is a simple variation of that of Theorem 4.
In summary, in this section, we have introduced the
weighted multiparent box recombination and shown that it is a convex
combination operator satisfying the four requirements of a metric convex combination
for the Manhattan space: convex weights (Definition 6), convexity
(geometricity, Theorem 6), coherence (Theorem 7), and symmetry (self evident).
4.3. Hamming Space
In this section, we first define a multiparental
recombination for binary strings, that is, a straightforward generalization of
mask-based crossover with two parents and then prove that it respects the four
requirements for being a convex combination in the Hamming space presented in
Section 3.3.
Definition 8 (three-parent mask-based crossover family).
Given three parents
,
, and
in
, generate randomly a crossover mask of length
with symbols from the alphabet
. Build the offspring
filling each
position with the bit from the parent appearing in the crossover mask at the
corresponding position.
The weights
,
, and
of the convex
combination indicate, for each position in the crossover mask, the probability
of having the symbols
,
, or
.
Theorem 8 (geometricity of three-parent mask-based crossover).
Any three-parent mask-based crossover is a geometric crossover under
Hamming distance.
Proof.
We
prove it by showing that any three-parent mask-based crossover can be
decomposed as a sequence of two simple mask-based crossovers. Since
the simple mask-based crossover is geometric, this
theorem is a simple corollary of the multiparental geometric decomposition theorem
(Theorem 2).
Let
be the mask to
recombine
,
, and
, producing the offspring
. Let
be the mask
obtained by substituting all occurrences of
in
with
, and let
be the mask
obtained by substituting all occurrences of
in
with
. First, recombine
and
using
obtaining
. Then, recombine
and
using
where the
's in the mask
stand for alleles in
. The offspring produced by the second crossover is
, so the sequence of the two simple crossovers is
equivalent to the three-parent crossover. This is because the first crossover
passes to the offspring all genes it needs to take from
according to
, and the rest of the genes are all from
; the second crossover corrects those genes that
should have been taken from parent
according to
, but were taken from
instead.
Theorem 9 (coherence between weights and distances).
In
the weighted three-parent mask-based crossover, the
distances of the parents to the expected offspring are decreasing functions of
the corresponding weights.
Proof.
We
want to know the expected distance from parent
,
, and
to their
expected offspring
as a function
of the weights
,
, and
. To do so, we first determine, for each position in
the offspring, the probability of it being the same as
. Then from that, we can easily compute the expected
distance between
and
. We have that
(7) where
is the
probability of a bit of
at a certain
position to be the same as the bit of
at the same
position;
,
, and
are the
probabilities that a bit in
is taken from
parents
,
, and
, respectively (these coincide with the weights of the
convex combination
,
, and
);
and
are the
probabilities that a bit taken from
or
coincides with
the one in
at the same
location. These last two probabilities equal the number of common bits in
and
(and
and
) over the
length of the strings
. So
and
, where
is the Hamming
distance. So (7) becomes
(8) Hence, the
expected distance between the parent
and the
offspring
is
.Notice that this is a decreasing function of
because
increasing
forces
or
to decrease
since the sum of the weights is constant, hence,
decreases.
Analogously,
and
are decreasing
functions of their weights
and
, respectively.
In summary, in this section, we have introduced the
weighted multiparent mask-based crossover and shown that it is a convex
combination operator satisfying the four requirements of a metric convex
combination for the Hamming space: convex weights (Definition 8), convexity
(geometricity, Theorem 8), coherence (Theorem 9), and symmetry (self evident).
5. Geometric PSO for Other Representations
Before looking into how we can extend GPSO to other
solution representations, we will discuss the relation between
the three-parental geometric crossover and the
symmetry requirement for a convex combination.
For each of the spaces considered in Section 4, we
have first considered (or defined) a three-parental recombination and then
proven that it is a three-parental geometric crossover by showing that it can
actually be decomposed into two sequential applications of a geometric
crossover for the specific space.
However, we could have skipped the explicit
definition of a three-parental recombination altogether. In fact, to obtain
the three-parental recombination, we could have used two sequential
applications of a known two-parental geometric crossover for the specific
space. This composition is indeed a three-parental recombination (it combines
three parents) and it is decomposable by construction. Hence, it is a
three-parental geometric crossover. This, indeed, would have been simpler than
the route we took.
The reason we preferred to define explicitly a
three-parental recombination is that the requirement of symmetry of the convex
combination is true by construction: if the roles of any two parents are
swapped by exchanging in the three-parental recombination both positions and
the respective recombination weights, the resulting recombination operator is
equivalent to the original operator.
The symmetry requirement becomes harder to enforce and
prove for a three-parental geometric crossover obtained by two sequential
applications of a two-parental geometric crossover. We illustrate this in the
following.
Let us consider three parents
,
, and
with positive
weights
,
, and
which add up to
one. If we have a symmetric three-parental weighted geometric crossover
, the symmetry of the recombination is guaranteed by
the symmetry of the operator. So
is equivalent
to
. Hence, the requirement of symmetry on the weights of
the convex combination holds. If we consider a three-parental recombination
defined by using twice a two-parental genetic crossover
, we have
(9) with the
constraint that
and
are positive
and add up to one, and
and
are positive
and add up to one. Notice the inherent asymmetry in this expression: the
weights
and
are not
directly comparable with
because they
are relative weights between
and
. Moreover, there is the extra weight
. This asymmetry makes the requirement of symmetry
problematic to meet: given the desired
,
, and
, what values of
,
,
, and
should we
choose to obtain an equivalent symmetric three-parental weighted recombination
expressed as a sequence of two two-parental geometric crossovers?
For the Euclidean space, it is easy to answer this
question using simple algebra as follows:
(10) Since the convex combination of two points in the
Euclidean space is
, and
and
, then
(11)
However, the question may be less straightforward to
answer for other spaces, although, we could use the equation above as a
rule-of-thumb to map the weights of
and the weights
in the sequential
decomposition
to obtain a nearly symmetric convex combination.
Where does this discussion leave us in relation to the
extension of GPSO to other representations? We have seen that there are two
alternative ways to produce a convex combination for a new representation: (i)
explicitly define a symmetric three-parental recombination for the new
representation and then prove its geometricity by showing that it is
decomposable into a sequence of two two-parental geometric crossovers (explicit
definition), or (ii) use twice the simple geometric crossover to produce a
symmetric or nearly symmetric three-parental recombination (implicit
definition). The second option is also very interesting because it
allows us to extended automatically GPSO to all representations we have
geometric crossovers for (such as permutations, GP trees, and
variable-length sequences, to mention but a few), and
virtually to any other complex solution representation.
6. Experimental Results for Euclidean, Manhattan, and Hamming Spaces
We have run two groups of experiments: one for the
continuous version of the GPSO (EucliadeanPSO, or EPSO for short, and ManhattanPSO, or MPSO), and one for the
binary version (HammingPSO, or HPSO).
For the Euclidean and Manhattan versions, we have
compared the performances with those of a continuous PSO (BasePSO, or BPSO)
with constriction and inertia, whose parameters are as in Table 1. We have run
the experiments for dimensions 2, 10, and 30 on the following five-benchmark
functions: F1C Sphere [19], F2C Rosenbrock [19], F3C Ackley [20], F4C Griewangk [21], and F5C Rastrigin [22]. The Hamming version
has been tested on the De Jong's test suite [19]: F1 Sphere (30), F2 Rosenbrock (24), F3 Step (50), F4 Quartic (240), and F5 Shekel (34), where the numbers in
brackets are the dimensions of the problems in bits. All functions in the test
bed have global optimum 0 and are to be maximized.
Table 1: Parameters for
BPSO.
Since there is no equivalent GPSO with
(
, which does not respect the conditions in Theorem 3),
we have decided to set
,
, and
proportional to
,
, and
, respectively, and summing up to one (see Table 2).
Table 2: Parameters for
EPSO and MPSO.
For the binary version, the parameters of population
size, number of iterations, and
,
, and
have been tuned
on the sphere function and are as in Table 3. From the parameters tuning, it
appears that there is a preference for values of
close to zero.
This means that there is a bias towards the swarm and particle bests, and less
attraction towards the current particle position.
Table 3: Selected
parameters for HPSO.
For each set up, we performed 20 independent runs.
Table 4 shows the best and the mean fitness value (i.e., the fitness value at
the position where the population converges) found by the swarm when exploring
continuous spaces. This table summarizes the results for the three algorithms
presented, over the five test functions, for the two choices of population
size, giving an immediate comparison of the performances. Overall the GPSOs
(EPSO and MPSO), compare very favourably with BPSO, outperforming it in many
cases. This is particularly interesting since it suggests that the inertia term
(not present in GPSO) is not necessary for good performance.
Table 4: Test results for continuous versions: best and mean fitness values found by the swarm over 20
runs at last iteration (iteration 200).
In two dimensions, the results for all the functions
(for PSOs both with 20 and 50 particles) are nearly
identical, with BPSO and the two GPSOs both performing equally well. In higher
dimensions, it is interesting to see how dimensionality does not seem to affect
the quality of the results of GPSOs (while there is a fairly obvious decline in
the performance of BPSO when dimension increases).
Also, EPSO's and MPSO's results are virtually
identical. Let us recall from Section 4.2 the
difference between Euclidean and Manhattan spaces, that is, “the difference
between box recombination and linear recombination (Euclidean space) is that in
the latter, the weights are randomly perturbed only once and the same weights
are used for all the dimensions, whereas the former one has a different
randomly perturbed version of the weights for each dimension.” The results
obtained show, therefore, that at least in this context, randomly perturbing
the weights once for all dimensions, or perturbing them for each dimension,
does not seem to affect the end result.
Table 5 shows the mean of the best fitness value and
the best fitness value over the whole population for the binary version of the
algorithm. The algorithm compares well with results reported in the literature,
with HPSO obtaining near optimal results on all functions. Interestingly, the
algorithm works at its best when
, the weight for
(the particle
position), is zero. This corresponds to a degenerated PSO that makes decisions
without considering the current position of the particle.
Table 5: Test results for HPSO with selected parameters for
De Jong's test suite.
7. Geometric PSO for Sudoku
In this section, we will put into practice the ideas
discussed in Section 5 and propose a GPSO to solve the Sudoku puzzle. In
Section 7.1, we introduce the Sudoku puzzle. In Section 7.2, we present a
geometric crossover for Sudoku. In Section 7.3, we present a three-parental crossover
for Sudoku and show that it is a convex combination.
7.1. Description of Sudoku
Sudoku is a logic-based placement puzzle. The aim of
the puzzle is to enter a digit from 1 through 9 in each cell of a
grid made up of
subgrids
(called “regions”), starting with various digits given in some cells (the
“givens”). Each row, column, and region must contain only one instance of
each digit. In Figure 2, we show an example of Sudoku puzzle. Sudoku puzzles
with a unique solution are called proper Sudoku, and the majority of published
grids are of this type.
Figure 2: Example of Sudoku puzzle.
Published puzzles often are ranked in terms of
difficulty. Perhaps surprisingly, the number of “givens” has little
or no bearing on a puzzle's difficulty. This is based on the relevance and the
positioning of the numbers rather than the quantity of the numbers.
The
Sudoku puzzle
of any difficulty can be solved very quickly by a computer. The simplest way is
to use some brute force trial-and-error search employing back tracking.
Constraint programming is a more efficient method that propagates the
constraints successively to narrow down the solution space until a solution is
found or until alternate values cannot otherwise be excluded, in which case,
backtracking is applied. A highly efficient way of solving such constraint
problems is the Dancing Links Algorithm by Donald Knuth [23].
The general problem of solving Sudoku puzzles on
boards of
blocks is known
to be NP complete [24]. This means that, unless
, the exact solution
methods that solve very quickly the
boards take
exponential time in the board size in the worst case. However, it is unknown
whether the general Sudoku problem restricted to puzzles with unique solutions
remains NP complete or becomes polynomial.
Solving Sudoku puzzles can be expressed as a graph
coloring problem. The aim of the puzzle in its standard form is to construct a
proper 9 coloring of a particular graph, given a partial 9 coloring.
A valid Sudoku solution grid is also a Latin square.
Sudoku imposes the additional regional constraint. Latin square completion is
known to be NP complete. A further relaxation of the problem allowing
repetitions on columns (or rows) makes it polynomially solvable.
Admittedly, evolutionary algorithms and
meta-heuristics in general are not the best
techniques to solve Sudoku because they do not
exploit systematically the problem constraints to narrow down the search.
However, Sudoku is an interesting study case because it is a relatively simple
problem but not trivial since is NP complete, and the different types of
constraints make Sudoku an interesting playground for search operator design.
7.2. Geometric Crossover for Sudoku
Sudoku is a constraint satisfaction problem with four
types of constraints:(1) fixed elements,(2)rows are
permutations,(3)columns are
permutations,(4) and boxes are
permutations.
It can be cast as an optimization problem by choosing
some of the constraints as hard constraints that all solutions have to respect,
and the remaining constraints as soft constraints that can be only partially
fulfilled, and the level of fulfilment is the fitness of the solution. We
consider a space with the following characteristics:(i)hard constraints: fixed positions and
permutations on rows,(ii)soft constraints: permutations on columns
and boxes,(iii)distance: sum of swap distances between
paired rows (row-swap distance),(iv)feasible geometric mutation: swap two
nonfixed elements in a row, and(v)feasible geometric crossover: row-wise PMX
and row-wise cycle crossover.
The chosen mutation preserves both fixed
positions and permutations on rows (hard constraints) because swapping elements
within a row, which is a permutation, returns a permutation. The mutation is
1-geometric under row-swap distance. Row-wise PMX and row-wise cycle crossover recombine
parent grids applying, respectively, PMX and cycle crossover to each pair of
corresponding rows. In case of PMX, the crossover points can be selected to be
the same for all rows, or be random for each row. In terms of offspring that
can be generated, the second version of row-wise PMX includes all the offspring
of the first version.
Simple PMX and simple cycle crossover applied to
parent permutations always return permutations.
They also preserve fixed positions. This is because both are geometric under
swap distance and, in order to generate offspring on a minimal sorting path
between parents using swaps (sorting one parent into the order of the other
parent), they have to avoid swaps that change common elements in both parents
(elements that are already sorted). Therefore, also row-wise PMX and row-wise
cycle crossover preserve both hard constraints.
Using the product geometric crossover Theorem 1, it is
immediate to see that both row-wise PMX and row-wise cycle crossover are
geometric under row-swap distance since simple PMX and simple cycle crossover
are geometric under swap distance. Since simple cycle crossover is also
geometric under Hamming distance (restricted to permutations), row-wise cycle
crossover is also geometric under Hamming distance.
To restrict the search to the space of grids with
fixed positions and permutations on rows, the initial population must be seeded
with feasible random solutions taken from this space. Generating such solutions
can be done still very efficiently.
The fitness function (to maximize) is the sum of
the number of unique elements in each row plus the
sum of the number of unique elements in each column
plus the sum of the number of unique elements in each
box. So for a
grid, we have a
maximum fitness of
for a
completely correct Sudoku grid, and a minimum fitness little more than
because for
each row, column, and square, there is at least one unique element type.
It is possible to show that the fitness landscapes
associated with this space is smooth, making the search operators proposed a
good choice for Sudoku.
7.3. Convex Combination for Sudoku
In this section, we first define a multiparental
recombination for permutations and then prove that it respects the four
requirements for being a convex combination presented in Section 3.3.
Let us consider the example in Figure 3 to illustrate
how the multiparental sorting crossover works.
Figure 3: Example of
multiparental sorting crossover.
The mask is generated at random and is a vector of the
same length of the parents. The number of
's,
's, and
's in the mask is
proportional to the recombination weights
,
, and
of the parents.
Every entry of the mask indicates to which parent the other two parents need to
be equal to for that specific position. In a parent, the content of a position
is changed by swapping it with the content of another position in the parent.
The recombination proceeds as follows. The mask is scanned from the left to the
right. In position 1, the mask has a
. This means that at position 1, parent
and parent
have to become
equal to parent
. This is done by swapping the
elements 1 and 3 in parent
and the elements
1 and 3 in parent
. The recombination now continues on the updated
parents: parent
is left
unchanged and the current parent
and parent
are the
original parents
and
after the swap.
At position 2, the mask has
. This means that at position 2,
the current parent
and current
parent
have to become
equal to the current parent
. So at position 2, parent
and parent
have to get 5.
To achieve this, in parent
, we need to swap elements 2 and 5, and in parent
, we need to swap elements 2 and 5. The recombination
continues on the updated parents for position 3 and so on, up to the last
position in the mask. At this point, the three parents are now equal because at
each position, one element of the permutation has been fixed in that position
and it is automatically not involved in any further swap. Therefore, after all
positions have been considered, all elements are fixed. The permutation to
which the three parents converged is the offspring permutation. This
recombination sorts by swaps the three parents
towards each other according to the contents of the crossover mask, and the
offspring is the result of this multiple sorting. This recombination can be
easily generalized to any number of parents.
Theorem 10 (geometricity of three-parental sorting crossover).
Three-parental sorting crossover is a geometric crossover under swap
distance.
Proof.
A
three-parental sorting crossover with recombination mask
is equivalent
to a sequence of two two-parental sorting crossovers: the first
is between parents
and
with
recombination mask
obtained by
substituting all 3's with 2's in
. The offspring
so obtained is
recombined with
with
recombination mask
obtained by
substituting all 1's with 2's in
. So for Theorem 2, the three-parental sorting
crossover is geometric.
Theorem 11 (coherence between weights and distances).
In a weighted multiparent sorting crossover, the swap
distances of the parents to the expected offspring are decreasing functions of
the corresponding weights.
Proof.
The
weights associated to the parents are proportional to their frequencies in the
recombination mask. The more occurrences of a parent in the recombination mask,
the smaller the swap distance between this parent and the offspring. This is
because the mask tells the parent to copy at each position. So the higher the
weight of a parent, the smaller its distance to the offspring.
The weighted multiparental sorting crossover is a
convex combination operator satisfying the four requirements of a metric convex
combination for the swap space: convex weights sum to 1 by definition,
convexity (geometricity, Theorem 10), coherence (Theorem 11), and symmetry
(self evident).
The solution representation for Sudoku is a vector of
permutations. For the product geometric crossover theorem, the compound
crossover over the vector of permutations that applies a geometric crossover to
each permutation in the vector is a geometric crossover. Theorem 12 extends to
the case of a multiparent geometric crossover.
Theorem 12 (product geometric crossover for convex combinations).
A convex combination operator, applied to
each entry of a vector, results in a convex combination operator for the entire
vector.
Proof.
The
product geometric crossover theorem (Theorem 1) is true because the segment of
a product space is the Cartesian product of the segments of its projections. A
segment is the convex hull of two points (parents). More generally, it holds
that the convex hull (of any number of points) of a product space is the
Cartesian product of the convex hulls of its projections [17]. The product
geometric crossover then naturally generalizes to the multiparent case.
8. Experimental Results for Sudoku
In order to test the efficacy of the GPSO algorithm on
the Sudoku problem, we ran several experiments in order to thoroughly explore
the parameter space and variations of the algorithm. The algorithm in itself is
a straightforward implementation of the GPSO algorithm given in Section 3.4
with the search operators for Sudoku presented in Section 7.3.
The parameters we varied were swarm sociality (
) and memory (
), each of
which were in turn set to 0, 0.2, 0.4, 0.6, 0.8, and 1.0. Since the attraction
to each particle's position is defined as
, the space of this parameter was implicitly explored.
Likewise, mutation probability was set to either 0, 0.3, 0.7, or 1.0. The swarm
size was set to be either 20, 100, or 500 particles, but the number of updates
was set so that each run of the algorithm resulted in exactly 100 000 fitness
evaluations (thus performing 5000, 1000, or 200 updates). Further, each
combination was tried with ring topology, von Neumann topology (or lattice
topology), and global topology, which are the most common topologies.
As explained in Section 5, there are two alternative
ways of producing a convex combination: either using a convex combination
operator or simply applying twice a two-parental
weighted recombination with appropriate weights to obtain the convex combination.
Both ways to produce convex combination operators, explicit and implicit, were
tried on preliminary runs and turned out to produce indistinguishable results.
In the end, we used the convex combination operator.
8.1. Effects of Varying Coefficients
The best population size is 100. The other two sizes
we studied (20 and 500) were considerably worse. The best topology is the
lattice (von Neumann) topology. The other two topologies we studied were worse
(see Table 9).
From Tables 6–8, we can see that mutation rates of 0.3
and 0.7 perform better than no mutation at all. We can also see that parameter
settings with
(i.e., the
attraction of the particle's previous position) set to more than 0.4 generally
perform badly. The best configurations generally have
(i.e.,
sociality) set to 0.2 or 0.4,
(i.e., memory)
set to 0.4 or 0.6, and
set to 0 or 0.2.
This gives us some indication of the importance of the various types of
recombinations in GPSO as applied at least to this particular problem.
Surprisingly, the algorithm works at its best when the weight of the particle
position (
) is zero or
nearly zero. In the case of
set to 0, GPSO,
in fact, degenerates to a type of evolutionary algorithm with deterministic
uniform selection, mating with the population best with local replacement
between parents and offspring.
Table 6: Average of bests of 50 runs with population
size 100, lattice topology, and mutation 0.0, varying sociality (vertical) and
memory (horizontal).
Table 7: Average of bests of 50 runs with population
size 100, lattice topology, and mutation 0.3, varying sociality (vertical) and
memory (horizontal).
Table 8: Average of bests of 50 runs with population
size 100, lattice topology and mutation 0.7, varying sociality (vertical) and
memory (horizontal).
Table 9: Success rate of various methods.
8.2. PSO versus EA
Table 9 compares the success rate of the best
configurations of various methods we have tried. Success is here defined as the
number of runs (out of 50) where the global optimum (243) is reached. All the
methods were allotted the same number of function evaluations per run.
From the table, we can see that the von Neumann
topology clearly outperforms the other topologies we tested, and that a GPSO
with this topology can achieve a respectable success rate on this tricky
noncontinuous problem. However, the best genetic algorithm still significantly
outperforms the best GPSO we have found so far. (Notice that this is true only
when considering GPSO as an optimizer. The approximation behavior of the GPSO
is very good: with the right parameter setting, the GPSO reaches on average a
fitness of 242 out of 243 (see Tables 6–8).) We believe this to be at least
partly the effect of the even more extensive tuning of parameters and operators
undertaken in our GA experiments.
9. Conclusions and Future Work
We have extended the geometric framework with the
notion of multiparent geometric crossover, that is, a natural generalization of
two-parental geometric crossover: offspring are in the convex hull of the
parents. Then, using the geometric framework, we have shown an intimate
relation between a simplified form of PSO (without the inertia term) and
evolutionary algorithms. This has enabled us to generalize, in a natural,
rigorous, and automatic way, PSO for any type of search space for which a
geometric crossover is known.
We specialized the general PSO to Euclidean,
Manhattan, and Hamming spaces, obtaining three instances of the general PSO for
these spaces, EPSO, MPSO, and HPSO, respectively. We have performed extensive
experiments with these new GPSOs. In particular, we applied EPSO, MPSO, and
HPSO to standard sets of benchmark functions and obtained a few surprising
results. Firstly, the GPSOs have performed really well, beating the canonical
PSO with standard parameters most of the time. Secondly, they have done so
right out of the box. That is, unlike the early versions of PSO which required
considerable effort before a good general set of parameters could be found,
with GPSO, we have done very limited preliminary testing and parameter tuning,
and yet the new PSOs have worked well. This suggests that they may be quite
robust optimisers. This will need to be verified in future research. Thirdly,
HPSO works at its best with only weak attraction toward the current position of
the particle. With this configuration, GPSO almost degenerates to a type of
genetic algorithm.
An important feature of the GPSO algorithm is that it
allows one to automatically define PSOs for all spaces for which a geometric
crossover is known. Since geometric crossovers are defined for all of the most
frequently used representations and many variations and combinations of those,
our geometric framework makes it possible to derive PSOs for all such
representations. GPSO is rigorous generalization of the classical PSO to
general metric spaces. In particular, it applies to combinatorial spaces.
We have demonstrated how simple it is to specify the
general GPSO algorithm to the space of Sudoku grids (vectors of permutations),
using both an explicit and an implicit
definitions of convex combination. We have tested
the new GPSO on Sudoku and have found that (i) the communication topology makes
a huge difference and that the lattice topology is by far the best; (ii) as for
HPSO, the GPSO on Sudoku works better with weak attraction toward the current
position of the particle; (iii) the GSPO on Sudoku finds easily near-optimal
solutions but it does not always find the optimum. Admittedly, GPSO is not the
best algorithm for the Sudoku puzzle where the aim is to obtain the correct
solution all the times, not a nearly correct one. This suggests that GPSO would
be much more profitably applied to combinatorial problems for which one would
be happy to find near-optimal solutions quickly.
In summary, we presented a PSO algorithm that has been
quite successfully applied to a nontrivial combinatorial space. This shows that
GPSO is indeed a natural and promising generalization of classical PSO. In
future work, we will consider GPSO for even more challenging combinatorial
spaces such as the space of genetic programs. Also, since the inertia term is a
very important feature of classical PSO, we want to generalize it and test the
GPSO with the inertia term on combinatorial spaces.
Acknowledgment
The second and
fourth authors would like to thank EPSRC, Grant no. GR/T11234/01 “Extended
Particle Swarms,” for the financial support.
References
- T. Bäck, Evolutinary Algorithms in Theory and in Practice, Oxford University Press, Oxford, UK, 1996.
- M. Clerc, “Discrete particle swarm optimization, Illustrated by the traveling salesman problem,” in New Optimization Techniques in Engineering, Springer, New York, NY, USA, 2004.
- M. Clerc and J. Kennedy, “The particle swarm-explosion, stability, and convergence in a multidimensional complex space,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 1, pp. 58–73, 2002.
- M. Clerc, “When nearer is better,” 2007, Hal Open Archive.
- M. Deza and M. Laurent, Geometry of Cuts and Metrics, Springer, New York, NY, USA, 1991.
- K. De Jong, An analysis of the behaviour of a class of genetic adaptive systems, Ph.D. thesis, University of Michigan, Ann Arbor, Mich, USA, 1975.
- D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, Mass, USA, 1989.
- T. Jones, Evolutionary algorithms, fitness landscapes and search, Ph.D. thesis, University of New Mexico, Albuquerque, NM, USA, 1995.
- J. Kennedy and R. C. Eberhart, “A discrete binary version of the particle swarm algorithm,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (ICSMC '97), vol. 5, pp. 4104–4108, Orlando, Fla, USA, October 1997.
- J. Kennedy and R. C. Eberhart, Swarm Intelligence, Morgan Kaufmann, San Francisco, Calif, USA, 2001.
- D. E. Knuth, “Dancing links,” 2000, Preprint P159, Stanford University.
- A. Moraglio, J. Togelius, and S. Lucas, “Product geometric crossover for the Sudoku puzzle,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '06), pp. 470–476, Vancouver, BC, Canada, July 2006.
- A. Moraglio and R. Poli, “Topological interpretation of crossover,” in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '04), vol. 3102, pp. 1377–1388, Seattle, Wash, USA, June 2004.
- A. Moraglio and R. Poli, “Topological crossover for the permutation representation,” in Proceedings of the Workshops on Genetic and Evolutionary Computation (GECCO '05), pp. 332–338, Washington, DC, USA, June 2005.
- A. Moraglio and R. Poli, “Product geometric crossover,” in Proceedings of the 9th International Conference on Parallel Problem Solving from Nature (PPSN '06), pp. 1018–1027, Reykjavik, Iceland, September 2006.
- A. Moraglio and R. Poli, “Geometric crossover for sets, multisets and partitions,” in Proceedings of the 9th International Conference on Parallel Problem Solving from Nature (PPSN '06), pp. 1038–1047, Reykjavik, Iceland, September 2006.
- A. Moraglio and R. Poli, “Geometric landscape of homologous crossover for syntactic trees,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC '05), pp. 427–434, Edinburgh, UK, September 2005.
- A. Moraglio, R. Poli, and R. Seehuus, “Geometric crossover for biological sequences,” in Proceedings of the 9th European Conference on Genetic Programming, pp. 121–132, Budapest, Hungary, April 2006.
- H. Müehlenbein, M. Schomisch, and J. Born, “The Parallel genetic algorithm as function optimizer,” Parallel Computing, vol. 17, pp. 619–632, 1991.
- P. M. Pardalos and M. G. C. Resende, Handbook of Applied Optimization, Oxford University Press, Oxford, UK, 2002.
- Y. Shi and R. C. Eberhart, “A modified particle swarm optimizer,” in Proceedings of the IEEE Congress on Evolutionary Computation, pp. 69–73, Anchorage, Alaska, USA, May 1998.
- A. Törn and A. Zilinskas, Global Optimization, vol. 350 of Lecture Notes in Computer Science, Springer, Berlin, Germany, 1989.
- T. Yato and T. Seta, “Complexity and completeness of finding another solution and its application to puzzles,” 2005, Preprint, University of Tokyo.
- M. L. J. van de Vel, Theory of Convex Structures, North-Holland, Amsterdam, The Netherlands, 1993.