Computational Intelligence and Neuroscience


Research Article | Open Access

Volume 2011 |Article ID 121787 | 11 pages |

Multistrategy Self-Organizing Map Learning for Classification Problems

Academic Editor: Francois Benoit Vialatte
Received: 12 Jan 2011
Revised: 21 Apr 2011
Accepted: 23 Jun 2011
Published: 16 Aug 2011


Multistrategy learning of the Self-Organizing Map (SOM) and Particle Swarm Optimization (PSO) is commonly implemented in the clustering domain due to its capability of handling complex data characteristics. However, some of these multistrategy learning architectures have weaknesses such as slow convergence and a tendency to be trapped in local minima. This paper proposes multistrategy learning of the SOM lattice structure with Particle Swarm Optimisation, called ESOMPSO, for solving various classification problems. The enhancement of the SOM lattice structure is implemented by introducing a new hexagon formulation for better mapping quality in data classification and labeling. The weights of the enhanced SOM are optimised using PSO to obtain better output quality. The proposed method has been tested on various standard datasets with substantial comparisons against the existing SOM network and various distance measurements. The results show that our proposed method yields promising results, with better average accuracy and quantisation errors than the other methods, as well as a convincing significance test.

1. Introduction

In the classification process, large classes of objects are normally separated into smaller classes. This approach can be very complicated due to the challenge of identifying the separation criteria, especially for procedures involving complex data structures. In practice, Machine Learning (ML) techniques have been introduced by many researchers as alternative solutions to such problems. Among ML methods and tools, Artificial Neural Networks (ANN), Fuzzy Sets, Genetic Algorithms (GA), Swarm Intelligence (SI), and rough sets are commonly used by researchers.

However, the most popular ML method widely used by practitioners is the ANN [1]. Various applications of ANN implemented in practical problems such as meteorological forecasting, image processing, and agriculture are discussed in [2–4]. In an ANN model, simple neurons are connected together to form a series of connected networks. While a neural network does not have to be adaptive, its advantages arise with proper algorithms that update the weights of the connections to produce a desired output.

ANN and evolutionary computation methodologies have each been proven effective in solving certain classes of problems. For example, neural networks are very efficient at mapping input to output vectors, and evolutionary algorithms are very useful for optimization. ANN weaknesses can be addressed either by enhancing the structure of the ANN itself or by hybridizing it with evolutionary optimisation [5, 6]. Evolutionary computation covers population-based optimisation techniques such as Evolutionary Algorithms (EA) and Swarm Intelligence (SI). One of the techniques used in EA is the Genetic Algorithm (GA), inspired by biological evolution mechanisms such as inheritance, mutation, selection, and crossover. On the other hand, SI methods such as Particle Swarm Optimisation (PSO) and Ant Colony Optimisation (ACO) are motivated by flocks of birds, swarms of bees, ant colonies, and schools of fish.

Searching with an evolutionary method, such as in ANN learning, may overcome the handicaps of gradient-based methods. However, convergence is in general much slower, since these are general-purpose methods. Kennedy and Eberhart [7] proposed a very simple nonlinear optimisation technique called PSO, which requires little computational cost. The authors argued that PSO could train a Feedforward Neural Network (FNN) with a performance similar to the backpropagation (BP) method on the XOR and Iris benchmarks. Since then, several researchers have adopted PSO for FNN learning. However, most of the studies focus on the hybridisation of PSO and FNN. Few studies have been conducted on the hybridisation of PSO with the Self-Organizing Map (SOM) to solve complex problems.

Early studies show that the multistrategy PSO-SOM approach was first introduced by Shi and Eberhart [8] with a modified particle swarm optimizer. Subsequently, Xiao et al. [9, 10] used a hybrid SOM-PSO approach to produce better clustering of gene datasets. The authors used SOM learning and PSO to optimise the weights of the SOM. However, the performance of the SOM-PSO combination without the conscience factor was poorer than that of SOM alone; this factor is valuable as a competitive learning technique because it reduces the number of epochs necessary to produce a robust solution. In 2006, O'Neill and Brabazon [11] adopted PSO as an unsupervised SOM algorithm. The authors suggested using different distance metrics in calculating the distance between the input vectors and each member of the swarm to produce competitive results for data classification. However, that study did not consider the types of SOM lattice structure.

Moreover, Chandramouli [12] used SOM and PSO for an image classifier, and the author stated that SOM was dominant in image classification problems. However, a problem emerged in generating image classes that provide a concise visualisation of the image dataset. Therefore, the author used a dual-layer SOM structure and PSO to optimise the weights of the SOM neurons. In addition, Forkan and Shamsuddin [13] introduced a new method for surface reconstruction based on the hybridization of SOM and PSO. The authors used a growing grid structure in the Kohonen network to learn the sample data through grid mapping, and PSO to probe the optimum fitting points on the surface. In that study, the proposed Kohonen network was a 3D rectangular map enhanced using the growing grid method. However, the study did not focus on the lattice structure of the Kohonen network.

Sharma and Omlin [14] utilized the U-matrix of a SOM to determine cluster boundaries using a PSO algorithm. The authors compared the results with other clustering techniques such as k-means and hierarchical clustering. However, this study did not focus on the structure of the SOM architecture. Recently, Özçift et al. [15] proposed PSO for the optimisation of the SOM algorithm to reduce the training time without loss of clustering quality. The authors stated that the size of the lattice is related to the clustering quality of the SOM. Their optimisation technique successfully reduced the number of nodes considered in finding the best-matching unit (BMU) for a particular input. A larger grid size in SOM entails higher training time; furthermore, the larger the lattice, the more nodes must be considered in the BMU calculation, which raises the operating cost of the algorithm.

Since 1993, extensions of SOM network topologies such as self-organizing networks have been implemented in many applications. Fritzke [16] introduced Growing Cell Structures (GCS) for unsupervised learning in data visualisation, vector quantization, and clustering, while supervised GCS is suitable for data classification with Radial Basis Functions (RBF). In 1995, Fritzke extended the growing neural gas to a dynamic SOM known as the Growing SOM (GSOM). Hsu et al. [17] stated that GSOM can provide balanced performance in topology preservation, data visualization, and computational speed. Consequently, Chan et al. [18] used GSOM to improve the binning process, and later Forkan and Shamsuddin [13] used it for intelligent surface reconstruction.

Hybridization of SOM and an evolutionary method was proposed by Créput et al. [19] to address the vehicle routing problem with time windows. Their experimental results show that the proposed method improves SOM-based neural network applications. In addition, Khu et al. [20] implemented a combination of multiobjective GA and SOM to incorporate multiple observations for distributed hydrologic model calibration. SOM clustering reduced the number of objective functions, while the multiobjective GA provided better solutions to the optimization problems. Furthermore, Eisuke [21] investigated GA performance by combining GA with SOM to improve the search performance of a real-coded GA (RCGA). The results show that SOM-GA yields better solutions in less computation time than RCGA.

The quality of the Kohonen map is determined by its lattice structure, because the weights of each neuron in the neighborhood are updated according to this lattice structure. There are many types of SOM lattice structures: circle, rectangular, hexagonal, spherical (Figure 1), and torus (Figure 2). Many studies have compared the lattice structures of SOM, for instance, the traditional SOM against the spherical SOM [22, 23].

Spherical and torus SOMs, which replace the planar lattice, give a better view of the input data and provide closer links to edge nodes. They make 2D visualisation of multivariate data possible using the SOM's code vectors as the data source [24]. Spherical and torus SOM structures focus on the topological grid mapping rather than on improving the lattice structure itself. This is due to the border effect issues highlighted in previous studies by Ritter [25] and Marzouki and Yamakawa [26]. Furthermore, the usefulness of the spherical SOM for clustering and visualization is discussed in [27, 28].

According to Middleton et al. [29], the hexagonal lattice structure is effective for image processing since it can represent image pixels uniformly. Park et al. [30] used the hexagonal lattice to provide better visualization. The hexagonal lattice is preferred because it does not favor horizontal or vertical directions [31]. The number of nodes is determined as \(5 \times \sqrt{\text{number of samples}}\) [32]. Basically, the two largest eigenvalues of the training data are calculated, and the ratio between the side lengths of the map grid is set to the ratio between these two maximum eigenvalues. The actual side lengths are then set so that their product is close to the determined number of map units.
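The map-sizing heuristic above can be sketched as follows. This is an illustrative implementation only: the function name `som_grid_size` and the `factor` parameter are our own, and the eigenvalue-ratio convention follows the text literally.

```python
import numpy as np

def som_grid_size(data, factor=5):
    """Choose SOM grid side lengths from the training data.

    Number of map units ~ factor * sqrt(number of samples); the ratio of
    the side lengths follows the ratio of the two largest eigenvalues of
    the data covariance matrix, as described in the text.
    """
    n_units = factor * np.sqrt(data.shape[0])
    # Two largest eigenvalues of the covariance matrix (descending order).
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(data, rowvar=False)))[::-1]
    ratio = eigvals[0] / eigvals[1] if eigvals[1] > 0 else 1.0
    # Side lengths with x / y ~ ratio and x * y ~ n_units.
    x = max(1, int(round(np.sqrt(n_units * ratio))))
    y = max(1, int(round(np.sqrt(n_units / ratio))))
    return x, y
```

For the 150-sample Iris dataset this targets roughly 61 map units, split into sides whose ratio follows the data's principal directions.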

Astel et al. [33] also implemented the hexagonal neighborhood lattice and compared the SOM classification approach with cluster analysis and Principal Component Analysis (PCA) for a large environmental dataset. The results allowed detecting natural clusters of monitoring locations with similar water quality types and identifying the important discriminant variables responsible for the clustering. SOM clustering allows simultaneous observation of both spatial and temporal changes in water quality.

Wu and Takatsuka [34] used a fast spherical SOM for a geodesic data structure. The proposed method removed the border effect in SOM, but its limitation was slower computation time. Furthermore, Kihato et al. [24] implemented spherical and torus SOMs for the analysis, visualization, and prediction of syndrome trends. The method has been used by physicians to monitor patients' current health trends and status.

Since previous studies paid limited attention to the improvement of the SOM lattice structure, this study enhances the SOM lattice structure with an improved hexagonal lattice area. In the SOM competitive learning process, a wider lattice is needed for searching the winning nodes as well as for weight adjustment. This allows the SOM to obtain a good set of weights, improving the quality of data classification and labeling. Particle Swarm Optimisation (PSO) is employed to optimize the SOM's training weights accordingly. The hybrid SOM-PSO architecture, called Enhanced SOM with Particle Swarm Optimisation (ESOMPSO), is proposed with an improved lattice structure for better classification. The performance of the proposed ESOMPSO is validated based on classification accuracy and quantisation error (QE). The error deviations between the proposed methods are computed to further illustrate the efficiency of these approaches.

2. The Proposed Method

In this study, we propose multistrategy learning with the Enhancement of SOM with PSO (ESOMPSO) and an improved formulation of the hexagonal lattice structure. Unlike the conventional hexagonal lattice (given in (1)), a neighbourhood in the proposed formulation is governed by the influence \(N(j,t)\) instead of the neighbourhood width \(N(j)\). Since \(D(t)\) is a threshold value, it decreases gradually as training progresses. For this neighbourhood function, the distance is determined by considering the distance along each dimension; the dimension with the maximum value is taken as the distance of node \(j\) from the BMU, \(d(j)\). \(N(j)\) corresponds to a hexagonal lattice around \(n_{\text{win}}\) with neighbourhood width as below:
\[
R = 6 \times \tfrac{1}{2} \times r \times \sqrt{r^2 - \tfrac{1}{4}r^2}, \tag{1}
\]
where \(R\) is the standard hexagonal lattice, and
\[
N(j,t) = \begin{cases} 1, & d(j) \le D(t), \\ 0, & d(j) > D(t). \end{cases} \tag{2}
\]

The weights of all neurons within this hexagon are updated with \(N(j) = 1\), while the others remain unchanged. As training progresses, the neighborhood shrinks, so that only neurons very close to the winner are updated. The training stops when there are no more neurons in the neighborhood. Usually, the neighborhood function \(N(j,t)\) is chosen as an \(L\)-dimensional Gaussian function:
\[
N(j,t) = \exp\left(-\frac{d(j)^2}{2\sigma(t)^2}\right). \tag{3}
\]
The proposed SOM algorithm for the above process is shown below.

For each input vector \(V\), do the following.

(a) Initialisation. Set the initial synaptic weights to small random values, say in the interval \([0, 1]\), and assign a small positive value to the learning rate parameter.

(b) Competition. For each output node \(j\), calculate the value \(D(V, W_j)\) of the scoring function. For instance, the Euclidean distance is
\[
D(V, W_j) = \sqrt{\sum_{i=0}^{n} (V_i - W_{ij})^2}, \tag{4a}
\]
the Manhattan distance is
\[
D(V, W_j) = \sum_{i=0}^{n} |V_i - W_{ij}|, \tag{4b}
\]
and the Chebyshev distance is
\[
D(V, W_j) = \max_i |V_i - W_{ij}|. \tag{4c}
\]
Find the winning node \(J\) that minimizes \(D(V, W_j)\) over all output nodes.

(c) Cooperation. Identify all output nodes \(j\) within the neighborhood of \(J\) defined by the neighborhood size \(R\). For these nodes, do the following for all input records. Reduce the radius with an exponential decay function:
\[
\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\lambda}\right), \quad t = 1, 2, 3, \ldots, \tag{5}
\]
where \(\sigma_0\) is the initial radius, \(\lambda\) is the maximum iteration, and \(t\) is the current iteration. The formulation of the improved hexagonal lattice is
\[
R_{\text{new}} = (2r + 1)^2 + 2r^2, \tag{6}
\]
where \(R_{\text{new}}\) is the enhanced/improved hexagonal lattice and \(r\) is the neighborhood radius.

(d) Adaptation. Adjust the weights:
\[
W(t+1) = W(t) + \Theta(t)\,L(t)\,(V(t) - W(t)), \tag{7}
\]
where \(L\) is the learning rate and \(\Theta\) is the influence of a node's distance from the BMU:
\[
L(t) = L_0 \exp\left(-\frac{t}{\lambda}\right), \quad t = 1, 2, 3, \ldots, \tag{8}
\]
where \(L_0\) is the initial learning rate, and
\[
\Theta(t) = \exp\left(-\frac{\text{dist}^2}{2\sigma^2(t)}\right), \quad t = 1, 2, 3, \ldots, \tag{9}
\]
where \(\text{dist}\) is the distance of a node from the BMU and \(\sigma\) is the width of the neighborhood.

(e) Iteration. Adjust the learning rate and neighborhood size as needed until no changes occur in the feature map. Repeat from step (b) and stop when the termination criteria are met.

The improved hexagonal lattice area consists of six important points: right_border(x, y), left_border(x, y), up_right_border(x, y), up_left_border(x, y), bottom_right_border(x, y), and bottom_left_border(x, y) (see Algorithm 1). Figure 3 illustrates the formulation of the improved hexagonal lattice area. A detailed explanation of the proposed method is given in the next paragraphs.

radius = n, bmu(x, y)
right_border_x = bmu_x;
left_border_x = bmu_x;
right_border_y = bmu_y + 2*n;
left_border_y = bmu_y − 2*n;
up_right_border_x = bmu_x − n;
up_right_border_y = bmu_y + n;
bottom_right_border_x = bmu_x + n;
bottom_right_border_y = bmu_y + n;
up_left_border_x = bmu_x − n;
up_left_border_y = bmu_y − n;
bottom_left_border_x = bmu_x + n;
bottom_left_border_y = bmu_y − n;
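Algorithm 1 can be transcribed directly into code; a minimal sketch follows (the function name `hex_border_points` is our own, and the (x, y) coordinate convention follows the listing above):

```python
def hex_border_points(bmu_x, bmu_y, n):
    """Six border points of the improved hexagonal lattice area around
    the BMU at (bmu_x, bmu_y) with radius n, per Algorithm 1."""
    return {
        "right_border": (bmu_x, bmu_y + 2 * n),
        "left_border": (bmu_x, bmu_y - 2 * n),
        "up_right_border": (bmu_x - n, bmu_y + n),
        "bottom_right_border": (bmu_x + n, bmu_y + n),
        "up_left_border": (bmu_x - n, bmu_y - n),
        "bottom_left_border": (bmu_x + n, bmu_y - n),
    }
```

For instance, with a BMU at (4, 4) and radius 2, the right border lies at (4, 8) and the upper-left border at (2, 2).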

Subsequently, the weights from ESOMPSO learning are optimised by PSO. Particle Swarm Optimisation (PSO) is one of the Swarm Intelligence (SI) techniques inspired by the social behavior of bird flocking and fish schooling. PSO was pioneered by Kennedy and Eberhart in 1995 and later refined by Shi [35]. PSO is a global, population-based evolutionary optimisation algorithm for problems whose best solution can be represented as a point or surface in an \(n\)-dimensional space. Hypotheses are plotted in this space and seeded with an initial velocity, as well as a communication channel between the particles. In this study, the hybridisation approach of ESOMPSO is based on the Kohonen structure to improve the quality of data classification and labeling. An improved hexagonal lattice area is introduced for SOM learning enhancement, and PSO is integrated into the proposed SOM to evolve the weights prior to the weight adjustments. This is because PSO can find the best reduced search space for a particular input and help the algorithm take more nodes into consideration while determining the search space, instead of being trapped by the same node continuously [15]. The algorithm for the integrated ESOMPSO is shown below. At this stage, the enhanced SOM is applied for classification to obtain the weights, which are later optimised using PSO.
(1) The rectangular topology and hexagonal lattice structure of the SOM are initialized with feature vectors \(m_i\), \(i = 1, 2, \ldots, K\), at random, where \(K\) is the length of the feature vector.

(2) The input feature vector \(x\) is presented to the network, and the winner node \(J\), closest to the input pattern \(x\), is chosen using
\[
J = \arg\min_j \|x - m_j\|. \tag{10}
\]

(3) Initialise the population array of particles representing random solutions for a \(d\)-dimensional problem space.

(4) For each particle, evaluate the distance function
\[
D_{ij} = \sum_{l=1}^{k} |x_{ij} - x_{jl}|. \tag{11}
\]

(5) The personal best \(p_{\text{best}}\) is updated by the condition
\[
\text{if } f(p_{\text{best}_i}) > \text{current}_i, \text{ then } p_{\text{best}_i} = \text{current}_i. \tag{12}
\]

(6) The global best \(g_{\text{best}}\) is updated by the condition
\[
\text{if } f(g_{\text{best}_d}) > f(\text{current}_d), \text{ then } g_{\text{best}_d} = \text{current}_d. \tag{13}
\]

(7) Update the velocity \(V_{id}\) using
\[
V_{id} = W \times V_{id} + C_1 \left( G_{\text{best},d} - X_{id} \right) + C_2 \left( P_{\text{best},i} - X_{id} \right), \tag{14}
\]
where the constants \(C_1 > 0\) and \(C_2 > 0\) are called the cognitive and social parameters, and \(W > 0\) is a constant called the inertia parameter.

(8) Update the position \(X_{id}\) using
\[
X_{id} = X_{id} + V_{id}, \tag{15}
\]
where \(X_{id}\) is the new position and \(V_{id}\) is the new velocity.

(9) Repeat steps (2) to (8) until all input patterns are exhausted in the training.
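The velocity and position updates of steps (7) and (8) can be sketched as a single vectorised function. This is an illustrative implementation of (14) and (15) only: the parameter values w = 0.7 and c1 = c2 = 1.5 are our own assumptions, not taken from the paper, and, as written in (14), no random multipliers are applied to the attraction terms (the common PSO variant adds them).

```python
import numpy as np

def pso_step(X, V, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One PSO velocity and position update following (14) and (15).

    X, V: current particle positions and velocities; pbest, gbest: the
    personal-best and global-best positions pulling the particle.
    """
    V_new = w * V + c1 * (gbest - X) + c2 * (pbest - X)  # eq. (14)
    X_new = X + V_new                                    # eq. (15)
    return X_new, V_new
```

In the hybrid scheme, each particle's position vector encodes a candidate set of SOM weights, and this update is applied once per particle per iteration.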

3. Experimental Setup

To investigate the effectiveness of PSO in evolving the weights of the SOM, the proposed method was subjected to testing and validation. In the testing phase, data is presented to the network with target nodes for each input set. The reference attributes, or classifier, computed during the training process are used to classify the input dataset. The algorithm identifies the winning node, which determines the output of the network. The output of the network is then compared to the expected result to assess the ability of the network in the classification phase. This classification stage classifies test data into the correct predefined classes obtained during training. A number of data samples are presented to the network, and the percentage of correctly classified data is calculated; this percentage measures the accuracy and the learning ability of the network. The results are validated and compared using several performance measurements: quantisation error (QE) and classification accuracy. Later, the error differences between the proposed methods are computed for further validation.

The performance measurement of the proposed methods is based on quantisation error (QE) and classification accuracy (CA). QE is measured after the SOM's training, and CA is analysed during testing. The efficiency of the proposed methods is validated accordingly: smaller QE values and higher classification accuracy indicate promising results. QE is used for measuring the quality of the SOM map. The QE of an input vector is defined by the difference between the input vector and the closest codebook vector; it describes how accurately the neurons respond to the given dataset. For example, if the reference vector of the BMU calculated for a given testing vector \(x_i\) is exactly the same as \(x_i\), the error in precision is 0.0. The equation is given as follows.

Quantisation error:
\[
E_q = \frac{1}{N} \sum_{k=1}^{N} \left\| x_k(t) - w_{mk}(t) \right\|, \tag{16}
\]
where \(w_{mk}\) is the weight of the best-matching unit at time \(t\).

While the classification accuracy indicates how well the classes are separated on the map, the classification accuracy on new samples measures the network's generalisation, reflecting the quality of the SOM's mapping.

Classification accuracy:
\[
P(\%) = \frac{n}{N} \times 100, \tag{17}
\]
where \(n\) is the number of correctly classified patterns and \(N\) is the total number of testing data.
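Both measures are straightforward to compute; a minimal sketch of (16) and (17) follows (the function names are our own):

```python
import numpy as np

def quantisation_error(X, W):
    """Eq. (16): average distance between each input vector and its BMU.

    X: (N, d) array of input vectors; W: (M, d) array of codebook vectors.
    """
    dists = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return dists.min(axis=1).mean()  # distance to the BMU, averaged over inputs

def classification_accuracy(n_correct, n_total):
    """Eq. (17): percentage of correctly classified test patterns."""
    return n_correct / n_total * 100.0
```

A test vector whose BMU codebook vector equals the vector itself contributes zero to the quantisation error, matching the 0.0-precision example above.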

The goal of the conducted experiments is to investigate the performance of the proposed methods. The comparisons are made among ESOMPSO, SOM with PSO (SOMPSO), and the enhanced SOM (ESOM). The results are validated in terms of classification accuracy and quantisation error (QE) on standard machine learning datasets: Iris, XOR, Cancer, Glass, and Pendigits. The conducted experiments show that the proposed methods, ESOMPSO and SOMPSO, give better accuracy despite higher convergence times. As PSO and the improved lattice structure are implemented, the convergence time increases. This is due to the PSO process of searching for the \(g_{\text{best}}\) of the BMU as well as the wider coverage of node updates under the improved lattice structure.

A Self-Organizing Map (SOM) has two layers: an input layer and an output layer. The basic SOM architecture consists of a lattice that acts as the output layer, with the input nodes fully connected to it. In this study, the network architecture is designed based on the selected real-world classification problems. Table 1 provides the specification of each dataset.

Data type        Iris    XOR    Cancer    Glass    Pendigits

Input nodes      4       4      30        10       16
Output nodes     1       1      1         1        1
Data size        150     8      569       214      10992
Training size    120     6      379       149      7494
Testing size     30      2      190       65       3498

The input layer is comprised of input patterns with different nodes that are randomly chosen from the training dataset. Input patterns are presented to all output nodes (neurons) in the network simultaneously. The number of input nodes determines the amount of data required to be fed into the network, while the number of nodes in the Kohonen layer represents the maximum number of possible classes. Table 2 shows the class information for the Iris, XOR, Cancer, Glass, and Pendigits training datasets.

Dataset      Number of classes    Classes

Iris         3                    Class 1: Iris Virginica
                                  Class 2: Iris Setosa
                                  Class 3: Iris Versicolor
XOR          2                    Class 1: 0
                                  Class 2: 1
Cancer       2                    Class 1: Benign
                                  Class 2: Malignant
Glass        6                    Class 1: Building windows (float processed)
                                  Class 2: Building windows (nonfloat processed)
                                  Class 3: Vehicle windows (float processed)
                                  Class 4: Containers
                                  Class 5: Tableware
                                  Class 6: Headlamps
Pendigits    10                   Classes 0–9: the digits 0–9

The training starts once the dataset has been initialised and the input patterns have been selected. The learning phase of the SOM algorithm repeatedly presents numerous patterns to the network. The learning rule of the classifier organizes these training cases into a two-dimensional feature map, where patterns that resemble each other are mapped onto the same cluster. During the training phase, the class of a randomly selected input node is determined by labeling the output node that is most similar (the best-matching unit) to the input node compared to the other nodes in the Kohonen mapping structure. The outputs of the training are the resulting map containing the winning neurons and their associated weight vectors. Subsequently, these weight vectors are optimised by PSO. The classification accuracy is calculated to investigate the behavior of the network on the training data.

In the testing phase, for any input pattern, if the \(m\)th neuron is the winner, the pattern belongs to the \(m\)th cluster. In this case, we were able to test the capacity of the network to correctly classify a new independent test set into a reasonable class. An independent test set is similar to the input set but is not part of the training set; it can be seen as representative of the general case. There is no weight updating in the recall phase. A series of data that was not used in the learning phase, but had been previously labelled, was presented to the network. For each case, the response of the network (the label of the associated neuron) was compared to the expected result, and the percentage of correct responses was computed. The simulation results obtained from the standard SOM and Enhanced SOM classifiers were used for further analysis.

It is often reported in the literature that the success of Self-Organizing Map (SOM) formation depends critically on the initial weights and on the selection of the main parameters of the algorithm, namely, the learning rate parameter and the neighborhood set [36, 37]. These usually have to be determined by trial and error, which makes retraining the procedures time consuming. Due to time constraints, all parameter values were fixed and used consistently throughout all the experiments. According to [38], the number of map units is usually in the range of 100 to 600. Deboeck and Kohonen [39] recommend using ten times the dimension of the input patterns as the number of neurons, and this was adopted in these experiments.

There is no general guideline for choosing a good learning rate for a given learning problem. In the standard SOM, learning rates that are too large or too small can lead to poor network performance [40]. The neighborhood function and the number of neurons determine the granularity of the resulting mapping. A larger neighborhood is used at the beginning of training and is then gradually decreased to a suitable final radius; the larger the area over which the neighborhood function takes high values, the more rigid the map will be. In these experiments, the initial radius is set to half the size of the lattice. A more recent version of the feature map adopts a Gaussian function to describe the neighborhood and the learning rate. The Gaussian function is supposed to describe a more natural mapping and thus helps the algorithm converge in a more stable manner.

The accuracy of the map also depends on the number of iterations of the SOM algorithm. A rule of thumb states that, for good statistical accuracy, the number of iterations should be at least 500 times the number of neurons. According to [36], the total learning time is typically 100 to 10000 iterations; if the training is much longer, the clustering result becomes inaccurate. A more serious problem is that a topology-preserving mapping is not guaranteed even when a huge number of iterations is used. Here, the SOM classifiers were evaluated by measuring the performance of the clustering result based on the classification accuracy and the computation time [41]. To meet the requirement of the SOM's quality measurement, the quantisation error was calculated, defined as the average distance between every input vector and its BMU. The experiments on ESOMPSO were carried out for each selected dataset (Table 3).


Parameter                         Iris        XOR         Cancer      Glass       Pendigits

Input vectors (training)          120         6           379         149         7494
Input vectors (testing)           30          2           190         65          3498
Input dimension                   4           4           30          9           16
SOM mapping dimension (X, Y)      10 × 10     10 × 10     10 × 10     10 × 10     10 × 10
SOM lattice structure             Standard    Standard    Standard    Standard    Standard
ESOM lattice structure            Improved    Improved    Improved    Improved    Improved
Learning rate                     0.
Number of runs                    10          10          10          10          10
C1
C2
Δt
Number of particles               100         100         100         100         100
PSO problem dimension             10 × 10     10 × 10     10 × 10     10 × 10     10 × 10
Stop condition (minimum error)    0.0000193   0.0000193   0.0000193   0.0000193   0.0000193

4. Experimental Results and Analysis

The experiments were conducted with various datasets and distance measurements: Euclidean, Manhattan, and Chebyshev. The comparisons were made between the standard SOM and the standard SOM with the improved hexagonal structure, called ESOM. The standard SOM was trained using the standard hexagonal lattice, while ESOM was trained with the improved hexagonal lattice. The choice of distance measure influences the accuracy, efficiency, and generalisation ability of the results. From Table 4, ESOM with the Euclidean distance gives a promising accuracy of 86.9876%, followed by the Chebyshev distance with 84.2561% and the Manhattan distance with 80.4462%. The least quantisation error is 0.0108, on the Glass dataset. This shows that the improved lattice structure of ESOM has a significant impact on the classification accuracy.


Dataset     Measure               SOM EUC   SOM MAN   SOM CHEBY   ESOM EUC   ESOM MAN   ESOM CHEBY

Iris        Quantisation error    0.0348    0.0358    0.0419      0.0171     0.0244     0.0275
            Classification (%)    74.3333   60.0000   70.0000     76.6667    73.3333    74.333
XOR         Quantisation error    0.2009    0.2060    0.2159      0.1941     0.2458     0.2077
            Classification (%)    75.3436   68.4525   72.5632     86.9876    80.4462    84.2561
Cancer      Quantisation error    0.4541    0.4913    0.5037      0.4397     0.4771     0.4781
            Classification (%)    37.8947   43.1579   74.2105     77.8947    34.7368    71.5789
Glass       Quantisation error    0.0337    0.0307    0.0350      0.0108     0.0122     0.0117
            Classification (%)    50.9231   13.8462   36.9231     55.3846    50.7692    44.6154
Pendigits   Quantisation error    0.1986    0.2006    0.2103      0.1897     0.1957     1.2008
            Classification (%)    74.6427   44.5969   72.6415     76.3579    52.9445    69.1252

EUC: Euclidean distance, MAN: Manhattan distance, CHEBY: Chebyshev distance.

Similar experiments were conducted for the standard SOM with PSO, called SOMPSO, and for ESOMPSO, with the Euclidean, Manhattan, and Chebyshev distance measurements. SOMPSO was trained using the standard hexagonal lattice, while ESOMPSO was trained with the improved hexagonal lattice. The results were compared in terms of classification accuracy, quantisation error, and convergence error. As illustrated in Table 5, ESOMPSO provides the smallest error distance when searching for the particles nearest to the input vector. This shows that the improved lattice structure yields a significant gain in classification accuracy despite a slower convergence time, which is due to the larger lattice structure used in ESOMPSO. A larger grid size incurs higher training time; furthermore, the larger the lattice, the more nodes must be considered in the BMU calculation. However, the focus of this study is on the performance of the proposed method in terms of higher accuracy and lower QE.


Dataset     Measure               SOMPSO EUC   SOMPSO MAN   SOMPSO CHEBY   ESOMPSO EUC   ESOMPSO MAN   ESOMPSO CHEBY

Iris        Quantisation error    4.0799       4.0979       4.0875         1.8884        2.0125        2.0565
            Convergence error     0.0318       0.0358       0.0322         0.0243        0.0587        0.0347
            Convergence time      22 sec       22 sec       22 sec         240 sec       240 sec       240 sec
            Classification (%)    92.00        89.24        90.45          92.72         90.11         90.75
XOR         Quantisation error    0.5011       0.6455       0.5866         0.0048        0.0250        0.0145
            Convergence error     0.2500       0.3204       0.3050         0.1916        0.2591        0.2641
            Convergence time      10 sec       10 sec       10 sec         17 sec        17 sec        17 sec
            Classification (%)    94.11        85.25        88.47          95.22         86.14         90.24
Cancer      Quantisation error    0.0094       0.0145       0.0102         0.0050        0.0125        0.0078
            Convergence error     0.5951       0.6523       0.6424         0.4422        0.5371        0.4823
            Convergence time      80 sec       80 sec       80 sec         110 sec       110 sec       110 sec
            Classification (%)    90.69        75.23        78.89          91.77         77.35         82.05
Glass       Quantisation error    0.0046       0.0087       0.0052         0.0038        0.0060        0.0048
            Convergence error     0.0435       0.0541       0.1242         0.0157        0.0324        0.0224
            Convergence time      40 sec       40 sec       40 sec         60 sec        60 sec        60 sec
            Classification (%)    87.88        80.98        84.66          89.45         82.45         84.87
Pendigits   Quantisation error    0.0458       0.4752       0.4777         0.0587        0.5143        0.4221
            Convergence error     0.2060       0.2365       0.2241         0.1405        0.1569        0.1478
            Convergence time      110 sec      110 sec      110 sec        205 sec       205 sec       205 sec
            Classification (%)    75.44        70.25        72.48          85.62         70.85         72.89

EUC: Euclidean distance, MAN: Manhattan distance, CHEBY: Chebyshev distance.

Figures 4 and 5 depict the effectiveness of ESOMPSO, with better average accuracy and quantisation errors than the other methods. Regardless of the type of distance measurement, the results of the proposed method are significant. This is due to the improved lattice structure and to PSO optimising the weights. As discussed before, the improved formulation of the hexagonal lattice structure gives more coverage in the neighbourhood updating procedure. Hence, the probability of finding the salient nodes as winner nodes is higher, which shows up in the accuracy and quantisation results. However, the convergence time of the proposed method is slower due to the natural behaviour of the particles in searching for \(g_{\text{best}}\) globally and locally. ESOMPSO with the Euclidean distance gives the highest classification accuracy of 95.22% and the least quantisation error of 0.0038.

However, this tradeoff (higher accuracy at the cost of longer convergence time, and vice versa) does not greatly affect the success of the proposed methods, in line with the No Free Lunch Theorem [42]: a general-purpose universal algorithm is impossible, since an algorithm that performs well on one class of problems will suffer on others. In other words, higher accuracy depends not only on the type of dataset but also on the purpose for which the problem is undertaken.

From the findings, the selection of the SOM's lattice structure is crucial for updating the neighbourhood structures during network learning. The standard formulations of the basic and improved hexagonal lattice structures are illustrated in Figure 6. With the basic hexagonal formula, the average number of nodes updated after training was 10.39; the wider area was not covered, causing insufficient neighbourhood updating, so potential nodes might not be counted during the updating process. Now consider the improved hexagonal lattice structure, which gives wider and better coverage (Figure 6). Suppose the BMU coordinate is (4, 4) with current radius 𝑟 = 2; the radius decreases according to an exponential decay function. The improved neighbourhood hexagonal lattice area is defined in (2).
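Since Equation (2) is not reproduced in this excerpt, the sketch below uses the standard axial-coordinate hexagonal distance only to illustrate the mechanics being described: enumerating the lattice nodes inside the current radius around the BMU, with the radius shrinking by exponential decay. Grid size, coordinates, and the decay constant `tau` are illustrative assumptions.

```python
import math

def hex_distance(a, b):
    # standard hexagonal distance in axial coordinates (q, r)
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def neighbourhood(bmu, radius, width, height):
    # all lattice nodes within `radius` hex steps of the BMU
    return [(q, r) for q in range(width) for r in range(height)
            if hex_distance((q, r), bmu) <= radius]

def decayed_radius(r0, t, tau):
    # exponential decay of the neighbourhood radius over training steps
    return r0 * math.exp(-t / tau)
```

For a BMU at (4, 4) with radius 2 on a 10 x 10 lattice, this standard hex neighbourhood contains 19 nodes (1 + 6 + 12 per ring); the paper's improved formulation reportedly updates more nodes than the basic one, which is the effect being argued for here.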

Using the improved hexagonal lattice area, the number of updated nodes rises to 33. The coverage is better than that of the basic hexagonal lattice, with a difference of 22.61 potential nodes on average. This formulation improves the neighbourhood-updating process; hence, the better results of ESOMPSO are quite promising. The proposed methods are validated using the Kruskal-Wallis test [43] to probe the significance of the results. The test was applied to the accuracies of all the proposed methods, and the mean ranks are given in Table 6. ESOMPSO with Euclidean distance generates the highest mean rank, confirming that this method yields the highest accuracy among the methods, as claimed in our experiments. The generated 𝑃 value is 0.004, which is less than the significance level 𝛼 = 0.05; hence, the proposed methods differ significantly from one another.
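The Kruskal-Wallis statistic itself is straightforward to reproduce. Below is a minimal pure-Python version of the H statistic (with the usual tie correction) that could be applied to per-dataset accuracy lists like those above; the grouping shown in the usage note is illustrative. In practice the 𝑃 value is then read from a chi-squared distribution with k − 1 degrees of freedom (e.g. via scipy.stats).

```python
def kruskal_h(groups):
    """Kruskal-Wallis H statistic for k independent samples, with tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # assign 1-based ranks, averaging over runs of tied values
    ranks, ties, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        avg = (i + j + 1) / 2          # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[pooled[k]] = avg
        t = j - i
        ties += t ** 3 - t             # accumulate tie-correction term
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(ranks[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h / (1 - ties / (n ** 3 - n)) if ties else h
```

For instance, `kruskal_h([accs_esompso_euc, accs_esompso_man, accs_esompso_cheby])` would give the statistic underlying the reported 𝑃 values, with each list holding one accuracy per dataset.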

Methods        Number of datasets     Mean rank based on accuracy
                                      EUC        MAN        CHEBY

𝑃 value                               0.004      0.008      0.025

5. Conclusion

This paper has presented multistrategy learning by proposing an Enhanced Self-Organizing Map with Particle Swarm Optimization (ESOMPSO) for classification problems. The proposed method was successfully implemented on machine learning datasets: XOR, Cancer, Glass, Pendigits, and Iris. The analysis compared, for each dataset, the results produced by the Self-Organizing Map (SOM), the Enhanced Self-Organizing Map (ESOM), the Self-Organizing Map with Particle Swarm Optimization (SOMPSO), and ESOMPSO under different distance measurements. It reveals that ESOMPSO with Euclidean distance generates promising results, with the highest accuracy and the least quantisation errors (referring to Figures 5 and 6), compared to SOM, ESOM, and SOMPSO. This major impact of the proposed method is due to the improved formulation of the hexagonal lattice structure, which gives wider distribution and hence wider exploration and exploitation by the particle swarm optimization (PSO) particles in searching for a better 𝑔best.


Acknowledgments

The authors would like to thank the Research Management Centre (RMC), Universiti Teknologi Malaysia, and the Soft Computing Research Group (SCRG) for their support in making this study a success.


References

  1. M. Negnevitsky, Artificial Intelligence: A Guide to Intelligent Systems, Addison Wesley, Harlow, England, 2nd edition, 2005.
  2. S. Chattopadhyay, D. Jhajharia, and G. Chattopadhyay, “Univariate modelling of monthly maximum temperature time series over northeast India: neural network versus Yule-Walker equation based approach,” Meteorological Applications, vol. 18, no. 1, pp. 70–82, 2011.
  3. Y. L. Wong and S. M. Shamsuddin, “Writer identification for Chinese handwriting,” International Journal of Advances in Soft Computing and Its Applications (IJASCA), vol. 2, no. 2, 2010.
  4. S. Hasan and M. N. M. Sap, “Pest clustering with self organizing map for rice productivity,” International Journal of Advances in Soft Computing and Its Applications (IJASCA), vol. 2, no. 2, 2010.
  5. S. M. Shamsuddin, M. N. Sulaiman, and M. Darus, “An improved error signal for the backpropagation model for classification problems,” International Journal of Computer Mathematics, vol. 76, no. 3, pp. 297–305, 2001.
  6. C. Surajit and B. Goutami, “Artificial neural network with backpropagation learning to predict mean monthly total ozone in Arosa, Switzerland,” International Journal of Remote Sensing, vol. 28, no. 20, pp. 4471–4482, 2007.
  7. J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” in Proceedings of the International Conference on Neural Networks, vol. 4, pp. 1942–1948, IEEE Service Center, Piscataway, NJ, USA, 1995.
  8. Y. Shi and R. C. Eberhart, “A modified particle swarm optimizer,” in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69–73, IEEE Press, Piscataway, NJ, USA, 1998.
  9. X. Xiao, E. R. Dow, R. Eberhart, Z. B. Miled, and R. J. Oppelt, “Gene clustering using self-organizing maps and particle swarm optimization,” in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS '03), IEEE Press, Nice, France, April 2003.
  10. X. Xiao, E. R. Dow, R. Eberhart, Z. B. Miled, and R. J. Oppelt, “A hybrid self-organizing maps and particle swarm optimization approach,” Concurrency and Computation: Practice and Experience, vol. 16, no. 9, pp. 895–915, 2004.
  11. M. O'Neill and A. Brabazon, “A particle swarm algorithm for unsupervised learning,” in Proceedings of the Self-Organizing Swarm (SOSwarm '06), IEEE World Congress on Computational Intelligence, Vancouver, Canada, July 2006.
  12. K. Chandramouli, “Particle swarm optimization and self organizing maps based image classifier,” in Proceedings of the IEEE 2nd International Workshop on Semantic Media Adaptation and Personalization, pp. 225–228, December 2007.
  13. F. Forkan and S. M. Shamsuddin, “Kohonen-swarm algorithm for unstructured data in surface reconstruction,” in Proceedings of the IEEE 5th International Conference on Computer Graphics, Imaging and Visualization, 2008.
  14. A. Sharma and C. W. Omlin, “Performance comparison of particle swarm optimization with traditional clustering algorithms used in self organizing map,” International Journal of Computational Intelligence, vol. 5, no. 1, pp. 1–12, 2009.
  15. A. Özçift, M. Kaya, A. Gülten, and M. Karabulut, “Swarm optimized organizing map (SWOM): a swarm intelligence based optimization of self-organizing map,” Expert Systems with Applications, vol. 36, no. 7, pp. 10640–10648, 2009.
  16. B. Fritzke, “Growing cell structures: a self-organizing network for unsupervised and supervised learning,” Neural Networks, vol. 7, no. 9, pp. 1441–1460, 1994.
  17. A. L. Hsu, I. Saeed, and K. Halgamuge, “Dynamic self-organising maps: theory, methods and applications,” Foundations of Computational Intelligence Volume 1, vol. 201, pp. 363–379, 2009.
  18. C. K. K. Chan, A. L. Hsu, S. L. Tang, and S. K. Halgamuge, “Using growing self-organising maps to improve the binning process in environmental whole-genome shotgun sequencing,” Journal of Biomedicine and Biotechnology, vol. 2008, no. 1, 2008.
  19. J. C. Créput, A. Koukam, and A. Hajjam, “Self-organizing maps in evolutionary approach for the vehicle routing problem with time windows,” IJCSNS International Journal of Computer Science and Network Security, vol. 7, no. 1, pp. 103–110, 2007.
  20. S. T. Khu, H. Madsen, and F. di Pierro, “Incorporating multiple observations for distributed hydrologic model calibration: an approach using a multi-objective evolutionary algorithm and clustering,” Advances in Water Resources, vol. 31, no. 10, pp. 1387–1398, 2008.
  21. K. Eisuke, S. Kan, and Z. Fei, “Investigation of self-organizing map for genetic algorithm,” Advances in Engineering Software, vol. 41, no. 2, pp. 148–153, 2010.
  22. D. Brennan and M. M. van Hulle, “Comparison of flat SOM with spherical SOM. A case study,” in The Self-Organizing Maps and the Development: From Medicine and Biology to the Sociological Field, H. Tokutaka, M. Ohkita, and K. Fujimura, Eds., pp. 31–41, Springer, Tokyo, Japan, 2007.
  23. C. Hung, “A constrained neural learning rule for eliminating the border effect in online self-organising maps,” Connection Science, vol. 20, no. 4, pp. 1–20, 2008.
  24. P. K. Kihato, H. Tokutaka, M. Ohkita et al., “Spherical and torus SOM approaches to metabolic syndrome evaluation,” in Proceedings of the ICONIP, vol. 4985 of Lecture Notes in Computer Science (LNCS), pp. 274–284, Springer, Heidelberg, Germany, 2008.
  25. H. Ritter, “Self-organizing maps on non-Euclidean spaces,” in Kohonen Maps, E. Oja and S. Kaski, Eds., pp. 95–110, Elsevier, New York, NY, USA, 1999.
  26. K. Marzouki and T. Yamakawa, “Novel algorithm for eliminating folding effect in standard SOM,” in Proceedings of the European Symposium on Artificial Neural Networks (ESANN '05), pp. 563–570, Bruges, Belgium, 2005.
  27. D. Nakatsuka and M. Oyabu, “Usefulness of spherical SOM for clustering,” in Proceedings of the 19th Fuzzy System Symposium Collected Papers, pp. 67–70, Japan, 2003.
  28. T. Matsuda et al., “Decision of class borders on spherical SOM and its visualization,” in Neural Information Processing, Lecture Notes in Computer Science, vol. 5864, pp. 802–811, 2009.
  29. L. Middleton, J. Sivaswamy, and G. Coghill, “Logo shape discrimination using the HIP framework,” in Proceedings of the 5th Biannual Conference on Artificial Neural Networks and Expert Systems (ANNES '01), pp. 59–64, 2001.
  30. Y. S. Park, J. Tison, S. Lek, J. L. Giraudel, M. Coste, and F. Delmas, “Application of a self-organizing map to select representative species in multivariate analysis: a case study determining diatom distribution patterns across France,” Ecological Informatics, vol. 1, no. 3, pp. 247–257, 2006.
  31. T. Kohonen, Self-Organizing Maps, vol. 30, Springer Series in Information Sciences, Berlin, Germany, 3rd extended edition, 2001.
  32. J. Vesanto and E. Alhoniemi, “Clustering of the self-organizing map,” IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 586–600, 2000.
  33. A. Astel, S. Tsakovski, P. Barbieri, and V. Simeonov, “Comparison of self-organizing maps classification approach with cluster and principal components analysis for large environmental data sets,” Water Research, vol. 41, no. 19, pp. 4566–4578, 2007.
  34. Y. Wu and M. Takatsuka, “Spherical self-organizing map using efficient indexed geodesic data structure,” Neural Networks, vol. 19, no. 6, pp. 900–910, 2006.
  35. R. C. Eberhart and Y. Shi, “Particle swarm optimization: developments, applications and resources,” in Proceedings of the IEEE Congress on Evolutionary Computation, Seoul, Korea, 2001.
  36. Y. Norfadzila, Multilevel Learning in Kohonen SOM Network for Classification Problems, M.S. thesis, Faculty of Computer Science and Information System, Universiti Teknologi Malaysia, Johor, Malaysia, 2006.
  37. H. Ying, F. Tian-Jin, C. Jun-Kuo, D. Xiang-Qiang, and Z. Ying-Hua, “Research on some problems in Kohonen SOM algorithm,” in Proceedings of the 1st Conference on Machine Learning and Cybernetics, Beijing, China, 2002.
  38. J. Vesanto and E. Alhoniemi, “Clustering of the self-organizing map,” IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 586–600, 2000.
  39. G. Deboeck and T. Kohonen, Visual Explorations in Finance with Self-Organizing Maps, Springer, London, UK, 1998.
  40. A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons, New York, NY, USA, 2nd edition, 2007.
  41. M. Hagenbuchner, A. Sperduti, and A. C. Tsoi, “A self-organizing map for adaptive processing of structured data,” IEEE Transactions on Neural Networks, vol. 14, no. 3, pp. 491–505, 2003.
  42. D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.
  43. W. H. Kruskal and W. A. Wallis, “Use of ranks in one-criterion variance analysis,” Journal of the American Statistical Association, vol. 47, no. 260, pp. 583–621, 1952.

Copyright © 2011 S. Hasan and S. M. Shamsuddin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
