Computational Intelligence and Neuroscience

Volume 2017 (2017), Article ID 4263064, 15 pages

https://doi.org/10.1155/2017/4263064

## Prototype Generation Using Self-Organizing Maps for Informativeness-Based Classifier

^{1}Graduate Program in Electrical Engineering and Computing, Mackenzie Presbyterian University, Sao Paulo, SP, Brazil

^{2}Computing and Informatics Faculty & Graduate Program in Electrical Engineering and Computing, Mackenzie Presbyterian University, Sao Paulo, SP, Brazil

Correspondence should be addressed to Leandro A. Silva

Received 31 January 2017; Revised 13 June 2017; Accepted 15 June 2017; Published 25 July 2017

Academic Editor: Toshihisa Tanaka

Copyright © 2017 Leandro Juvêncio Moreira and Leandro A. Silva. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The $k$ nearest neighbor is one of the most important and simple procedures for the data classification task. The $k$NN, as it is called, requires only two parameters: the number of neighbors $k$ and a similarity measure. However, the algorithm has some weaknesses that hinder its use in real problems. Since the algorithm has no model, the object under classification must be compared exhaustively with the entire training dataset. Another weakness is the optimal choice of the parameter $k$ when the object analyzed is in an overlap region. To mitigate these negative aspects, this work proposes a hybrid algorithm that uses the Self-Organizing Maps (SOM) artificial neural network and a classifier whose similarity measure is based on information. Since SOM has vector quantization properties, it is used as a Prototype Generation approach to select a reduced training dataset for the classification approach based on the nearest neighbor rule with an informativeness measure (the informativeness-based NN). The SOMNN combination was tested extensively, and the results show that the proposed approach achieves good accuracy on databases where the object classes are not well defined in the border region.

#### 1. Introduction

The main task of a data classifier is to predict the class of an object under analysis. The simplest procedure for data classification tasks is the $k$ nearest neighbor ($k$NN) algorithm. Its classification strategy comprises three operations: (i) an unlabeled sample is compared to the training dataset through a similarity measure; (ii) the labeled objects are sorted in order of similarity to the unlabeled sample; and finally, (iii) the unlabeled sample receives the majority class of its $k$ nearest neighbor objects. Because of its simple algorithm (three basic operations) and reduced number of parameters (the similarity measure and the number of nearest neighbors), this instance-based learning algorithm is widely used in the data mining community as a benchmarking algorithm [1–5].
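The three operations above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `knn_classify`, the toy data, and the choice of Euclidean distance as the similarity measure are assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(train_X, train_y, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training objects."""
    # (i) compare the unlabeled sample with every training object
    #     (Euclidean distance is assumed as the similarity measure)
    distances = [math.dist(x, sample) for x in train_X]
    # (ii) sort the labeled objects by similarity (smallest distance first)
    order = sorted(range(len(train_X)), key=lambda i: distances[i])
    # (iii) assign the majority class among the k nearest neighbors
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated classes in 2D
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
label = knn_classify(train_X, train_y, (0.15, 0.15), k=3)
```

Note that step (i) visits every training object, which is the exhaustive comparison discussed in the following paragraph.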

Since the $k$NN algorithm has no model, the unlabeled sample must be compared exhaustively with all the labeled objects stored in the database, which increases the computational time of the process. In addition, the decision boundaries are defined by the instances stored in the training set, so the algorithm has low tolerance to noise; that is, all training dataset objects are considered relevant patterns. Finally, the optimal choice of $k$ depends on the dataset, mainly when the object analyzed is in a boundary region, so this parameter must be tuned according to the application [6–9].

To overcome the drawbacks above, the literature offers different approaches, such as similarity measures alternative to the Euclidean distance to minimize misclassification in boundary regions [10], methods that avoid searching the whole space of the training set [11], and dataset summarization to find representative objects of the training set [9]. For the dataset summarization approach, there are two main strategies to reduce the dataset volume: one based on instance selection and the other based on prototypes. The approaches based on pattern (or instance) selection aim to find a representative and reduced set of objects from the training dataset that has the same or higher classification accuracy as the raw dataset [8, 12–15]. The strategies based on prototypes, on the other hand, are divided into two approaches: Prototype Selection (PS) [16] and Prototype Generation (PG) [13, 17–19]. The approaches are equivalent; both can be used to identify an optimal subset of representative prototypes, discarding noise and redundancy. The difference is that PG can also be used to generate an artificial dataset that replaces the raw dataset. Using prototypes, or a reduced set of training objects represented by prototypes, minimizes some of the NN drawbacks previously mentioned, such as the exhaustive comparison with the entire training dataset.

Silva and Del-Moral-Hernandez [5] presented combination methods that use the winning neuron and topology preservation concepts of the Self-Organizing Maps (SOM) neural network to define a reduced subset of training objects that are highly similar to the object under classification [5, 20]. This object subset is retrieved and then used by the NN to execute the classification task. In other words, the SOM performs a preprocessing step for the NN classifier, recovering the similar objects from the winning neuron and from its adjacent neighbors in the SOM map [21].

With respect to the drawback of tuning the parameter $k$, Zhang et al. proposed a computational learning approach for this parameter [22]. Song et al., on the other hand, proposed a metric based on informativeness to perform the classification in boundary regions, where the choice of $k$ is more sensitive [10]. The main idea of this informativeness-based NN algorithm is to investigate the most informative nearest objects instead of the closest ones. This approach outperforms NN with Euclidean distance; however, it further increases the complexity of the comparison and consequently the processing time [23].

Inspired by the use of PG [5, 20, 21], we introduce a hybrid approach. In the first step, the SOM, whose vector quantization and topology preservation properties make it suitable as a preprocessing stage, presents to the classifier a reduced set of objects that are highly similar to the unknown object under investigation. Next, the informativeness-based NN algorithm assigns a class to the unknown object based on the most informative objects of the selected set. In initial exploratory experiments, we observed promising results for accuracy and classification time [23].

Here we formally detail how SOMNN works as a hybrid architecture for classification problems. We also introduce an experimental methodology to analyze the SOMNN classifier qualitatively on three artificial datasets with different distributions in the class overlap region. In addition, we perform experiments on 21 publicly available databases (7 times more than in the previous study) from the UCI repository, using the 5-fold cross validation samples provided on the complementary website of the paper published by Triguero et al. [9]. The results are analyzed using accuracy, kappa, prototype reduction, and time as performance indices.

The rest of the paper is organized as follows: in Section 2, a brief explanation of Prototype Generation and the taxonomy proposed by [9] are shown; Self-Organizing Maps and the methods to use them in classification with NN are presented in Section 3. In Section 4, the experimental methodology is introduced. Experimental results, discussion, and comparative results are given in Section 5. In the last section, the conclusions are provided.

#### 2. Theoretical Fundamentals

##### 2.1. A Brief Introduction to Prototype Generation

For a better understanding of the Prototype Generation idea, let us consider an object $\mathbf{x}$ of a dataset, defined as a set of $m$ descriptive attributes and a class attribute $y$; that is, $\mathbf{x} = [x_1, x_2, \ldots, x_m, y]$. Then, let us assume that $X_{tr}$ is a training dataset with $n$ samples of $\mathbf{x}$. The purpose of Prototype Generation (PG) is to obtain a reduced set, $X_{red}$, with $r$ instances selected or generated from $X_{tr}$, but with $r \ll n$. The cardinality of this reduced set must be sufficiently small to decrease the evaluation time taken by a classifier (NN, for example) while maintaining the classification accuracy. In fact, data reduction approaches aim mainly to summarize the raw dataset without damaging its analytical properties, that is, without degrading accuracy.

For the PG methods, prototypes are used by classifiers instead of raw datasets, or they are used to generate an artificial dataset. Data generation can be interesting in some cases to eliminate data noise or to handle datasets with unbalanced classes. Since the possibilities of usage are diverse, the literature presents different methods, approaches, and algorithms. For this reason, Triguero et al. [9] proposed a taxonomy of PG methods aimed at addressing NN drawbacks, organized as a three-level hierarchy (generation mechanisms, resulting generation set, and type of reduction), and also reviewed the PG algorithms in the literature (see [9] for a detailed explanation).

In the next sections, we briefly introduce Self-Organizing Maps and the proposed approach, the combination of SOM and NN.

##### 2.2. A Brief Summary for the Kohonen Self-Organizing Maps

The Kohonen Self-Organizing Map (SOM) is a type of neural network that consists of neurons located on a regular low-dimensional grid, usually two-dimensional (2D). Typically, the lattice of the 2D grid is either hexagonal or rectangular [24]. The SOM learning or training process is an iterative algorithm that aims to represent the distribution of the input pattern objects on that regular grid of neurons. Similar input patterns are associated with the same neuron or with adjacent neurons of the grid.

For the SOM training, a dataset is chosen and divided into two distinct sets. The training set, here called $X_{tr}$, is used to train the SOM. The other set ($X_{te}$) is used to test the trained SOM. After this division, the SOM training starts. Formally, an object is randomly selected from $X_{tr}$ during training, defined as $\mathbf{x} = [x_1, x_2, \ldots, x_m]$, where the element $x_i$ is an attribute or feature of the object, which belongs to $\mathbb{R}$. The object is similar to what was defined before, but without the class information. Additionally, each neuron $j$ of the SOM grid has a weight vector $\mathbf{w}_j = [w_{j1}, w_{j2}, \ldots, w_{jm}]$, where $j = 1, 2, \ldots, N$; here $N$ is the total number of neurons of the map.

During the learning process, the input pattern $\mathbf{x}$ is randomly selected from the training set and compared with the weight vectors $\mathbf{w}_j$ of the map, which are initialized randomly. The comparison between $\mathbf{x}$ and $\mathbf{w}_j$ is usually made through the Euclidean distance. The shortest distance indicates the closest neuron $c$, whose weight vector $\mathbf{w}_c$ is updated to get closer to the selected input pattern $\mathbf{x}$. Formally, neuron $c$ is defined as follows:

$$c = \arg\min_{j} \lVert \mathbf{x} - \mathbf{w}_j \rVert$$

The closest weight vector and its neighbors are updated using the Kohonen algorithm [24]. The topological neighborhood is defined so that the farther a neuron is from the closest neuron $c$, the lower the intensity of its update. The intensity of the neighborhood function also depends on the training time: it has a high value in the initial iterations and is reduced at each subsequent iteration. See Kohonen [24] for a complete explanation of the training rule of the SOM map.
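The training procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the configuration used in the paper: it assumes a rectangular grid, a Gaussian neighborhood function, an exponential decay of both learning rate and neighborhood radius, and inputs scaled to [0, 1]. The names `train_som`, `best_matching_unit`, `lr0`, and `sigma0` are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # Grid coordinates of each neuron on a rectangular lattice
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Weight vectors are initialized randomly, as described in the text
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    for t in range(total_steps):
        x = rng.choice(data)                  # randomly selected input pattern
        c = best_matching_unit(weights, x)    # closest neuron
        # ASSUMPTION: exponential decay of learning rate and radius over time
        decay = math.exp(-t / total_steps)
        lr, sigma = lr0 * decay, sigma0 * decay
        for j, pos in enumerate(grid):
            # Gaussian neighborhood: neurons far from c on the grid move less
            h = math.exp(-math.dist(pos, grid[c]) ** 2 / (2.0 * sigma ** 2))
            weights[j] = [w + lr * h * (xi - w) for w, xi in zip(weights[j], x)]
    return weights, grid


# Hypothetical toy data with two groups of similar patterns
data = [(0.1, 0.1), (0.15, 0.05), (0.9, 0.9), (0.85, 0.95)]
weights, grid = train_som(data, rows=2, cols=2, epochs=100)
```

Each update moves a weight vector a fraction `lr * h` of the way toward the input, so the winning neuron (where `h` equals 1) moves most and distant neurons move least.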

##### 2.3. Building a Prototype Generation Based on SOM

Once the training phase has been completed, each input pattern object from the training set is grouped with its closest neuron. The idea of using SOM as a PG technique is that the index of each instance becomes part of the list of its nearest neuron. The list of each neuron is here called the Best Matching Unit List (BMUL), formally defined as

$$\mathrm{BMUL}_j = \{\, i : j = \arg\min_{l} \lVert \mathbf{x}_i - \mathbf{w}_l \rVert \,\},$$

where $j$ is the number of the map neuron and $\mathrm{BMUL}_j$ is the list with the indexes $i$ of the input pattern objects whose nearest neuron is $j$.

The relationship between the instances of the training set and the list of the best matching unit is many-to-one. That is, some units $j$, which we could call microclusters, are associated with one or more instances, while other units may have no associations; that is, the list can be empty ($\mathrm{BMUL}_j = \emptyset$).
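A minimal sketch of how such lists could be built from a trained map is given below. The function name `build_bmu_lists` and the weight values are hypothetical and chosen only to show the many-to-one relationship, including a neuron whose list stays empty.

```python
import math


def build_bmu_lists(weights, train_X):
    """Map each neuron index to the indexes of the training objects it wins."""
    # Every neuron starts with an empty list; some may remain empty
    bmu_lists = {j: [] for j in range(len(weights))}
    for i, x in enumerate(train_X):
        # Nearest neuron for object i (Euclidean distance)
        j = min(range(len(weights)), key=lambda n: math.dist(weights[n], x))
        bmu_lists[j].append(i)
    return bmu_lists


# Hypothetical weight vectors standing in for a trained SOM
weights = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_X = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (1.2, 0.8)]
bmu_lists = build_bmu_lists(weights, train_X)
# bmu_lists == {0: [0, 2], 1: [1, 3], 2: []}
```

Only indexes are stored, so the lists can later be used to retrieve the labeled training objects associated with a given neuron.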

The classification method proposed herein explores two important characteristics of the SOM: vector quantization and topological ordering [24]. To better understand these features, consider the representation in Figure 1, with the input pattern objects (filled circles) used for training a SOM map and the weight vectors of each neuron (squares) after the training phase. In this figure, each weight vector represents a microcluster of input patterns, which is the quantization characteristic. The relationship between the weight vectors can be interpreted as a boundary, which can be understood as a Voronoi region, as exemplified by the shaded area in Figure 1. Operationally, the classification strategy introduced and explored herein is a two-step process. In the first step, when a test sample (the unfilled circle in Figure 1) is compared to the weight vectors of the trained SOM map (the squares in Figure 1), the algorithm defines the closest unit according to the following equation: