Abstract

This paper describes an enhancement of the fuzzy lattice reasoning (FLR) classifier for pattern classification based on a positive valuation function. Fuzzy lattice reasoning was recently described as a lattice data domain extension of the fuzzy ARTMAP neural classifier based on a lattice inclusion measure function. In this work, we improve the performance of the FLR classifier by defining a new nonlinear positive valuation function. As a consequence, the modified algorithm achieves better classification results. The effectiveness of the modified FLR is demonstrated on several well-known pattern recognition benchmarks.

1. Introduction

Much attention has been paid lately to applications of lattice theory [1] in different fields, including neural networks [2]. Artificial neural networks whose computation is based on lattice algebra have become known as morphological neural networks [3, 4]. Lattices are popular in mathematical morphology, including image processing applications [5, 6]. Moreover, algebraic lattices have been used for modeling associative memories [7]. In [8], the storage capacity limitation of associative memories [9, 10] was eliminated by proposing one-way and bidirectional lattice associative memories. Furthermore, lattices are used implicitly in some neural networks such as fuzzy ART and min-max [11, 12], as explained in [2, 13]. A practical advantage of lattice theory is the ability to model both uncertain information and disparate types of lattice-ordered data [14]. The term fuzzy lattice was proposed by Nanda in 1989 on the basis of the concept of a fuzzy partial-order relation [15]. Several authors have employed the notion "fuzzy lattice" in mathematics, emphasizing algebraic properties of lattice ideals [16, 17]. Furthermore, the notion of a fuzzy concept lattice has been studied in [18–20]. Sussner and Esmi [21] introduced the morphological perceptron with a fusion of fuzzy lattices for competitive learning. Fuzzy lattices have also been used in clustering and classification algorithms. More specifically, independently of the development of morphological neural networks, Petridis and Kaburlasos [13] found inspiration in lattice theory and versions of the ART model and devised another successful approach to lattice-based computational intelligence. They proposed a fundamentally new and inherently hierarchical approach in neural computing named fuzzy lattice neurocomputing (FLN) [14]. Moreover, the fuzzy lattice reasoning (FLR) classifier was introduced for inducing descriptive, decision-making knowledge (rules) in a mathematical data domain including the space R^N, and it has been successfully applied to a variety of problems such as ambient ozone estimation [22] and air quality assessment [23]. Decision making in FLR is based on an inclusion measure function; in turn, the definition of an inclusion measure is based on a positive valuation function.

The original FLR model employs a linear positive valuation function to define an inclusion measure. Liu et al. [24] proposed a nonlinear valuation function (arctan) for computing the inclusion measure function and successfully applied it to several data set benchmarks.

In this work, we apply FLR algorithm to solve pattern classification problems without feature extraction and improve its performance based on a new nonlinear positive valuation function. As a consequence, the modified algorithm achieves better classification results. The effectiveness of the modified FLR is demonstrated by examples on several well-known benchmarks.

The layout of this paper is as follows. In Section 2, the mathematical background of fuzzy lattices is reviewed. Section 3 explains modified fuzzy lattice reasoning classifier model. Section 4 provides experimental results that demonstrate the performance of modified FLR. Finally, Section 5 summarizes the results of this work.

2. Mathematical Background

A lattice (L, ≤) is a partially ordered set (or, simply, poset) such that any two of its elements a, b ∈ L have a greatest lower bound a ∧ b = inf{a, b} and a least upper bound a ∨ b = sup{a, b}. The lattice operations ∧ and ∨ are also called meet and join, respectively. A lattice (L, ≤) is called complete when each of its subsets has a least upper bound and a greatest lower bound in L [1]. A nonvoid complete lattice has a least element and a greatest element, denoted by O and I, respectively. The inverse ≥ of an order relation ≤ is itself an order relation; the order ≥ is called the dual of ≤, symbolically ≤∂. A lattice (L, ≤) can be the Cartesian product of N constituent lattices L_1, …, L_N, that is, L = L_1 × ⋯ × L_N. The meet and join of the product lattice are defined componentwise as follows:

\[
a \wedge b = (a_1,\ldots,a_N) \wedge (b_1,\ldots,b_N) = (a_1 \wedge b_1,\ldots,a_N \wedge b_N), \qquad
a \vee b = (a_1,\ldots,a_N) \vee (b_1,\ldots,b_N) = (a_1 \vee b_1,\ldots,a_N \vee b_N). \tag{1}
\]
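For illustration, on real vectors the product-lattice operations in (1) reduce to componentwise minima and maxima, as in the following minimal C++ sketch (the identifiers are illustrative, not from our implementation):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Componentwise meet (min) and join (max) in a product lattice, as in (1).
std::vector<double> meet(const std::vector<double>& a, const std::vector<double>& b) {
    std::vector<double> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = std::min(a[i], b[i]);
    return r;
}

std::vector<double> join(const std::vector<double>& a, const std::vector<double>& b) {
    std::vector<double> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = std::max(a[i], b[i]);
    return r;
}
```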

A valuation on a crisp lattice L is a real-valued function v: L → R which satisfies v(a) + v(b) = v(a ∨ b) + v(a ∧ b), for a, b ∈ L. A valuation is called monotone if and only if a ≤ b in L implies v(a) ≤ v(b), and positive if and only if a < b implies v(a) < v(b). We remark that the role of a positive valuation function v is to quantify lattice elements with real numbers. Choosing a suitable valuation function is problem dependent. Decision making by the FLR is based on an inclusion measure function; therefore, a proper positive valuation function might improve performance.

Definition 1. An inclusion measure σ in a complete lattice L with least element O and greatest element I is a mapping σ: L × L → [0,1] that satisfies the following conditions [2]:
(i) σ(a, O) = 0, for a ≠ O,
(ii) σ(a, a) = 1, for all a ∈ L,
(iii) a ≤ b ⇒ σ(c, a) ≤ σ(c, b), for all a, b, c ∈ L (consistency property).

An inclusion measure thus indicates the degree to which one lattice element is contained in another.

Theorem 2. A positive valuation function v: L → R in a lattice (L, ≤) with v(O) = 0 is a sufficient condition for two inclusion measures [2]:

\[
k(x,u) = \frac{v(u)}{v(x \vee u)}, \quad (x \vee u \neq O), \qquad
s(x,u) = \frac{v(x \wedge u)}{v(x)}, \quad (x \neq O). \tag{2}
\]

In our experiments, the data have been normalized in the lattice L = [0,1]^N, that is, the unit N-dimensional hypercube, where N is the dimension of the input data. Furthermore, we propose the following nonlinear positive valuation function:

\[
v(x) = \sum_{i=1}^{N} x_i \ln\bigl(x_i + \gamma\bigr), \quad \forall x \in \bigl([0,1]^N, \le\bigr), \ \gamma \in [1, +\infty), \tag{3}
\]

where γ is called the location parameter. Without loss of generality, let a = (a_1, …, a_N) ≥ b = (b_1, …, b_N), that is, a_i ≥ b_i, i = 1, …, N. Then

\[
\begin{aligned}
v(a \vee b) + v(a \wedge b)
&= \sum_{i=1}^{N} v_i\bigl(a_i \vee b_i\bigr) + \sum_{i=1}^{N} v_i\bigl(a_i \wedge b_i\bigr)\\
&= \sum_{i=1}^{N} \bigl(a_i \vee b_i\bigr)\ln\bigl((a_i \vee b_i) + \gamma\bigr) + \sum_{i=1}^{N} \bigl(a_i \wedge b_i\bigr)\ln\bigl((a_i \wedge b_i) + \gamma\bigr)\\
&= \sum_{i=1}^{N} a_i \ln\bigl(a_i + \gamma\bigr) + \sum_{i=1}^{N} b_i \ln\bigl(b_i + \gamma\bigr)
 = \sum_{i=1}^{N} v_i\bigl(a_i\bigr) + \sum_{i=1}^{N} v_i\bigl(b_i\bigr) = v(a) + v(b).
\end{aligned}
\tag{4}
\]

Furthermore, the proposed valuation function is strictly increasing; thus for any a, b ∈ ([0,1]^N, ≤), a < b ⇒ v(a) < v(b). Finally, the function (3) maps the least element of the lattice ([0,1]^N, ≤) to zero: v(O) = Σ_{i=1}^{N} 0 · ln(0 + γ) = 0.
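A minimal C++ sketch of the valuation (3) and the inclusion measure k of Theorem 2, together with a numerical check of the identity (4) (the identifiers are illustrative, not from our implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Proposed valuation (3): v(x) = sum_i x_i * ln(x_i + gamma), with gamma >= 1.
double valuation(const std::vector<double>& x, double gamma) {
    double v = 0.0;
    for (double xi : x) v += xi * std::log(xi + gamma);
    return v;
}

// Componentwise join and meet in [0,1]^N.
std::vector<double> vmax(const std::vector<double>& a, const std::vector<double>& b) {
    std::vector<double> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = std::max(a[i], b[i]);
    return r;
}
std::vector<double> vmin(const std::vector<double>& a, const std::vector<double>& b) {
    std::vector<double> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = std::min(a[i], b[i]);
    return r;
}

// Inclusion measure k(x,u) = v(u) / v(x v u) of Theorem 2:
// the degree to which x is included in u (equals 1 when x <= u).
double k(const std::vector<double>& x, const std::vector<double>& u, double gamma) {
    return valuation(u, gamma) / valuation(vmax(x, u), gamma);
}

int main() {
    const double gamma = 1.0;
    std::vector<double> a = {0.8, 0.2}, b = {0.3, 0.6};
    // Numerical check of identity (4): v(a v b) + v(a ^ b) = v(a) + v(b).
    double lhs = valuation(vmax(a, b), gamma) + valuation(vmin(a, b), gamma);
    double rhs = valuation(a, gamma) + valuation(b, gamma);
    assert(std::abs(lhs - rhs) < 1e-12);
    std::cout << "k(a,b) = " << k(a, b, gamma) << '\n';
}
```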

The aforementioned valuation function operates in a more flexible manner compared with other valuations proposed in the literature. First, the performance of FLR can be optimized by selecting different values of the location parameter. Second, if the multiplicative factor x is treated as constant, the function reduces to v(x) = ln(x + γ) in the space R, which might be a proper valuation function in some special applications. One may object that the latter does not satisfy the condition v(O) = 0; in this case, it can be redefined as follows [14]:

\[
v'(x) = v(x) - v(O), \quad \forall x \in L. \tag{5}
\]

Figure 1(a) plots the positive valuation function 𝑣(π‘₯)=ln(π‘₯+𝛾), whereas Figure 1(b) plots 𝑣(π‘₯)=π‘₯ln(π‘₯+𝛾) for 𝛾=1.

A lattice (L, ≤) is totally ordered if and only if for any a, b ∈ L, either a ≤ b or b ≤ a. The lattice ([0,1]^N, ≤) under the componentwise order is not a totally ordered lattice for N ≥ 2.

A fuzzy lattice is a pair (L, μ), where (L, ≤) is a crisp lattice and (L × L, μ) is a fuzzy set with membership function μ: L × L → [0,1] such that μ(a, b) = 1 ⇔ a ≤ b. Note that given a lattice L for which a positive valuation function v: L → R can be defined with v(O) = 0, both (L, k) and (L, s) are fuzzy lattices [2].

Consider the set R of real numbers. It turns out that (R̄ = R ∪ {−∞, +∞}, ≤), under the inequality relation ≤ between a, b ∈ R̄, is a complete lattice with least element −∞ and greatest element +∞ [25].

For a lattice (L, ≤), we define the set of (closed) intervals as τ(L) = {[a, b] | a, b ∈ L and a ≤ b}. We remark that (τ(L), ≤) is a lattice with the ordering relation, lattice join, and lattice meet defined as follows [25]:

\[
[a,b] \le [c,d] \equiv c \le a,\ b \le d, \qquad
[a,b] \vee [c,d] = [a \wedge c,\ b \vee d], \qquad
[a,b] \wedge [c,d] = [a \vee c,\ b \wedge d]. \tag{6}
\]

Including a least (the empty) interval, denoted by [I, O], in (τ(L), ≤) leads to a complete lattice (τ_O(L), ≤) = (τ(L) ∪ {[I, O]}, ≤). Note that in the case L = R̄, the lattice (τ_O(L), ≤) coincides with the conventional intervals (sets) in R̄. Our particular interest here is in the complete lattice (τ_O([0,1]), ≤) with greatest element [0,1] and least element [1,0].

A map φ from a poset P to a poset Q is an isomorphism if both "x ≤ y in P ⇔ φ(x) ≤ φ(y) in Q" and "φ is onto Q" hold. Based on the positive valuation function v of the lattice (L, ≤) and an isomorphism θ: (L, ≤∂) → (L, ≤), a valuation function v_{τO} on (τ_O(R̄), ≤) is defined as

\[
v_{\tau_O}([a,b]) = v(\theta(a)) + v(b). \tag{7}
\]

As a consequence, the degree of inclusion of one interval in another in the lattice (τ_O(L), ≤) is computed as follows [25]:

\[
k_{\tau_O}([a,b],[c,d]) = \frac{v_{\tau_O}([c,d])}{v_{\tau_O}([a,b] \vee [c,d])}, \qquad
s_{\tau_O}([a,b],[c,d]) = \frac{v_{\tau_O}([a,b] \wedge [c,d])}{v_{\tau_O}([a,b])}. \tag{8}
\]

For two N-dimensional hypercubes A = [a_1, b_1] × ⋯ × [a_N, b_N] and B = [c_1, d_1] × ⋯ × [c_N, d_N], the following inclusion measure between A and B is defined:

\[
k_{\tau_O}(A,B) = \frac{v_{\tau_O}(B)}{v_{\tau_O}(A \vee B)}
= \frac{\sum_{i=1}^{N} \bigl[\, v(\theta(c_i)) + v(d_i) \,\bigr]}{\sum_{i=1}^{N} \bigl[\, v(\theta(a_i \wedge c_i)) + v(b_i \vee d_i) \,\bigr]}. \tag{9}
\]
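The following C++ sketch implements the hypercube inclusion measure (9), assuming the isomorphism θ(x) = 1 − x (the one employed in our experiments, Section 4) and the logarithmic valuation (3) applied per coordinate; the hyperbox representation and identifiers are illustrative:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

struct Interval { double lo, hi; };   // one constituent interval [a_i, b_i]
using Box = std::vector<Interval>;    // N-dimensional hyperbox

const double GAMMA = 1.0;             // location parameter of valuation (3)

double v(double x)     { return x * std::log(x + GAMMA); }  // per-coordinate valuation
double theta(double x) { return 1.0 - x; }                  // isomorphism from the dual order

// Inclusion measure k_tauO(A,B) of hyperbox A in hyperbox B, equation (9).
double kTauO(const Box& A, const Box& B) {
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i) {
        num += v(theta(B[i].lo)) + v(B[i].hi);               // v_tauO(B)
        den += v(theta(std::min(A[i].lo, B[i].lo)))          // v_tauO(A v B)
             + v(std::max(A[i].hi, B[i].hi));
    }
    return num / den;
}

int main() {
    Box A = {{0.2, 0.4}, {0.1, 0.3}};   // A is contained in B, hence k(A,B) = 1
    Box B = {{0.1, 0.6}, {0.0, 0.5}};
    std::cout << "k(A,B) = " << kTauO(A, B) << '\n';
    std::cout << "k(B,A) = " << kTauO(B, A) << '\n';   // strictly less than 1
}
```

Since A is contained in B, the join A ∨ B equals B and k(A,B) = 1, whereas k(B,A) < 1, in line with the consistency property of Definition 1.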

3. FLR Model

This section presents a classifier for extracting rules from the input data based on fuzzy lattices. One of FLR's important properties is the ability to deal with disparate types of data, including real vectors, fuzzy sets, symbols, graphs, images, waves, and any combination of the aforementioned, which shows the ability of FLR to combine different types of data. Furthermore, FLR can handle both complete and noncomplete lattices, and it can cope with both points and intervals. Moreover, stable learning is carried out both incrementally and fast, in a single pass through the training data. In some applications, we may encounter "missing" or "do not care" data. In this case, FLR can manage "missing" and "do not care" data by replacing them with the least element O and the greatest element I, respectively. For example, if the constituent lattice is ([0,1], ≤), then we can replace "missing" and "do not care" data by the intervals O = [1,0] and I = [0,1], respectively [13, 14, 22].

It should be mentioned that an input datum to the FLR classifier (model) is represented as (a_i, C_k), where C_k is the class label of the datum a_i, and it can be interpreted as a rule "if a_i then C_k." We remark that a single real number a ∈ R corresponds to the trivial interval [a, a]. Learning and generalization in FLR are based on the computation of hyperboxes in the space R^N; that is, a rule induced by FLR corresponds to an N-dimensional hyperbox.

Suppose a knowledge base KB = {(a_1, C_1), …, (a_c, C_c)} is given; KB may initially be empty. Decision making in FLR is based on an inclusion measure. During the learning phase, when an input datum (a_0, C_0) is presented to the network, the degrees of inclusion between the input and each stored rule in KB are calculated as k(a_0, a_1), …, k(a_0, a_c), respectively. The fuzzy lattice reasoning classifier chooses the rule with argmax_{i ∈ {1,…,c}} {k(a_0, a_i)} as the winner. If the winner rule a_J and the input datum a_0 have the same class label and the size of a_0 ∨ a_J, denoted by Z, is less than a user-defined threshold, then the winner rule is updated. Note that the size of an interval [a, b] is computed as Z([a, b]) = v(b) − v(a). Otherwise, this process is repeated; if no more rules are left, then the input datum (a_0, C_0) becomes a new member of KB. The training procedure is described in Algorithm 1.

S0. The first input (a_0, C_0) is memorized. At any instant, there are c known classes C_1, …, C_c in memory; initially c = 0.
S1. Present the next input (a_0, C_0) to the initially "set" family of rules.
S2. If no rules are "set", then
      store the input (a_0, C_0),
      c = c + 1,
      go to S1.
    Else
      compute k(a_0, a_i), i = 1, …, c, for the "set" rules.
S3. Competition among the "set" rules:
      the winner is the rule (a_J, C_J) such that J = argmax_{i=1,…,c} {k(a_0, a_i)}.
S4. The Assimilation Condition:
      both Z(a_0 ∨ a_J) ≤ ρ and C_0 = C_J.
S5. If the Assimilation Condition is satisfied, then
      replace a_J by a_0 ∨ a_J.
    Else
      "reset" the winner (a_J, C_J) and go to S2.
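A compact C++ rendering of Algorithm 1 is sketched below. It reuses the hyperbox quantities of Section 2; the data layout and identifiers are illustrative, not the original implementation, and input points enter as trivial boxes [x, x]:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Interval { double lo, hi; };
using Box = std::vector<Interval>;
struct Rule { Box box; int label; };

const double GAMMA = 1.0;
double v(double x)     { return x * std::log(x + GAMMA); }
double theta(double x) { return 1.0 - x; }

Box join(const Box& a, const Box& b) {            // a v b: smallest box containing both
    Box r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        r[i] = { std::min(a[i].lo, b[i].lo), std::max(a[i].hi, b[i].hi) };
    return r;
}

double k(const Box& A, const Box& B) {            // inclusion measure (9)
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i) {
        num += v(theta(B[i].lo)) + v(B[i].hi);
        den += v(theta(std::min(A[i].lo, B[i].lo))) + v(std::max(A[i].hi, B[i].hi));
    }
    return num / den;
}

double sizeZ(const Box& b) {                      // Z([a,b]) = v(b) - v(a), summed per dimension
    double z = 0.0;
    for (const Interval& iv : b) z += v(iv.hi) - v(iv.lo);
    return z;
}

// One pass of Algorithm 1 over the training data; rho is the threshold size.
void train(std::vector<Rule>& rules,
           const std::vector<std::pair<Box, int>>& data, double rho) {
    for (const auto& [x, label] : data) {
        std::vector<bool> isSet(rules.size(), true);        // S1: all rules initially "set"
        for (;;) {
            int J = -1; double best = -1.0;                 // S2-S3: competition
            for (std::size_t i = 0; i < rules.size(); ++i)
                if (isSet[i]) {
                    double ki = k(x, rules[i].box);
                    if (ki > best) { best = ki; J = (int)i; }
                }
            if (J < 0) { rules.push_back({x, label}); break; } // no "set" rules: store input
            Box merged = join(x, rules[J].box);
            if (sizeZ(merged) <= rho && rules[J].label == label) {  // S4: Assimilation Condition
                rules[J].box = merged;                              // S5: replace a_J by a_0 v a_J
                break;
            }
            isSet[J] = false;                                       // S5: "reset" the winner
        }
    }
}
```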

Note that ρ is the threshold size, which specifies the maximum size of a hyperbox to be learned. The decision boundaries that can be formed by FLR endowed with the logarithmic valuation function are illustrated in the following example.

Example 3. The Simpson benchmark is a two-dimensional data set consisting of 24 points, which is used for testing the performance of clustering algorithms [26]. This is a perceptual grouping problem in vision, which deals with detecting the right partition of an image into subsets [27]. We have divided the data into three classes.

Figure 2 shows the decision surfaces of the FLR endowed with the proposed logarithmic valuation function, with no misclassified data. Due to lack of space, we have classified the data four times, with location parameter γ = 1 and four different values of the vigilance parameter ρ. Note that the size of a hyperbox is tuned by the vigilance parameter ρ; more specifically, larger values of ρ result in larger, and hence fewer, hyperboxes.

As stated in the previous section, one of the FLR properties is the ability of knowledge representation. Indeed, FLR is capable of extracting implicit features from the data and representing them as rules. Each rule is represented as "if (a_1 AND ⋯ AND a_M) then C_i", where a_j, j = 1, …, M, are attributes, each corresponding to an interval, and C_i, i = 1, …, c, are class labels. Table 1 shows three induced rules corresponding to Figure 2(d).

4. Experimental Results and Discussions

4.1. Benchmark Dataset Description

In this section, we evaluate the classification performance of the optimized FLR in a series of experiments on six well-known benchmarks.

4.1.1. Object Recognition

We evaluate the classification performance of the FLR model using images from the Columbia image database [28]. The Columbia Object Image Library (COIL-100) is a database of color images of 100 objects. We selected the 10 objects from the dataset shown in Figure 3. The objects were placed on a motorized turntable against a black background. The turntable was rotated through 360° to vary object pose with respect to a fixed color camera. Images of the objects were taken at pose intervals of 5°, which corresponds to 72 poses per object. The images were size normalized. There are 720 instances of dimension 128×128 divided into 10 separate classes, 72 per class; only six randomly selected instances per object were used for the whole training set, and the remaining patterns were used for the testing set. The aim is the correct classification of the testing data into their corresponding classes.

4.1.2. Image Segmentation

The image segmentation data set was donated by the Vision Group, University of Massachusetts, and is included in the Machine Learning Repository of the University of California, Irvine [29]. The data set relates numerous analyses of the colors in subdivided images to the type of surface in the image. Each image was divided into small subsections, each of which comprises one data point. Each data point is composed of 18 different attributes, including one that determines what the image shows: brick face, foliage, grass, sky, window, concrete, or dirt. This data set consists of 210 samples for training and 2100 samples for testing. The goal is to distinguish between the seven classes.

4.1.3. Pen-Based Recognition of Handwritten Digits

The pen-based recognition of handwritten digits dataset was taken from the UCI repository of machine learning databases [29]. It was created by collecting 250 digit samples from 44 writers. The data have 16 continuous attributes distributed over 10 separate classes. A training set is given explicitly. For a faster simulation, we resized the training set by randomly selecting six instances per class. The distribution of the digits 0 through 9 in the dataset is shown in Figure 4.

4.1.4. Letter Recognition

The letter recognition benchmark was taken from the UCI repository of machine learning databases [29]. The data set consists of 20,000 unique letter images generated by randomly distorting pixel images of the 26 uppercase letters from 20 different commercial fonts. The parent fonts represent a full range of character types, including script, italic, serif, and gothic. The features of each of the 20,000 characters were summarized in terms of 16 primitive numerical attributes. A training set is not given explicitly; we divided the data into a training set consisting of 10 percent of the patterns, selected randomly, and a testing set consisting of the remaining patterns. Examples of the character images are presented in Figure 5.

4.1.5. Semeion Handwritten Digit Recognition

The Semeion handwritten digit benchmark was taken from the UCI repository of machine learning databases [29]. This dataset consists of 1593 handwritten digits from around 80 persons, scanned and stretched into a 16×16 rectangular box in a gray scale of 256 values; each pixel of each image was then scaled into a Boolean value using a fixed threshold. Each person wrote all the digits from 0 to 9 twice on a paper, the first time in the normal way and the second time in a fast way. We used 10 percent of the data for the whole training set.

4.1.6. Optical Recognition of Handwritten Digits

The optical recognition of handwritten digits benchmark was taken from the UCI repository of machine learning databases [29]. In this data set, 32×32 bitmaps of handwritten digits from a total of 43 people are divided into nonoverlapping blocks of 4×4, and the number of on-pixels is counted in each block. This generates an 8×8 input matrix in which each element is an integer in the range 0 to 16; this reduces dimensionality and gives invariance to small distortions. Training and testing sets are given explicitly, comprising 3823 and 1797 64-dimensional samples, respectively. For a faster simulation, we employed 10 percent of the given training set for actual training.

Table 2 briefly shows the characteristics of the selected benchmark data sets.

4.2. Experiments and Results

In order to provide a meaningful comparison, all the algorithms were implemented in the same environment using the C++ object-oriented programming language, with the same partitioning of the data sets into training and testing sets, the same order of input patterns, and a full range of parameters; moreover, we employed the isomorphic function θ(x) = 1 − x. Furthermore, all the N-dimensional data were normalized into the space [0,1]^N by the function x_norm = (x − x_min)/(x_max − x_min), where x_min and x_max stand for the least and greatest attribute values, respectively, in a data dimension. In this work, the FLR algorithm endowed with the linear valuation function x, the nonlinear arctan valuation function, and the nonlinear logarithmic valuation function is denoted by FLR_x, FLR_a, and FLR_l, respectively. To compare learning capability, Table 3 compares the experimental results of FLR_l with those produced by FLR_x and FLR_a, the SOM [30], fuzzy ART [11], and GRNN [31].
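For reference, a minimal C++ sketch of this min-max normalization applied per attribute (the guard for a constant attribute is our own addition, not part of the formula above):

```cpp
#include <algorithm>
#include <vector>

// Normalize one attribute (column) into [0,1]: x_norm = (x - x_min) / (x_max - x_min).
void normalizeColumn(std::vector<double>& col) {
    auto [mn, mx] = std::minmax_element(col.begin(), col.end());
    const double lo = *mn, hi = *mx;
    if (hi == lo) { for (double& x : col) x = 0.0; return; }  // constant attribute: map to 0
    for (double& x : col) x = (x - lo) / (hi - lo);
}
```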

In all our experiments, in order to achieve the best performance, we considered GRNN for different values of the variance parameter between 0 and 0.5 in steps of 0.001. For fuzzy ART, we set the choice parameter to 0.01, and the values of the vigilance and learning parameters were varied between 0 and 1 in steps of 0.01. Computational experiments for the SOM algorithm were carried out using M×M (M = 1, …, 10) grids of units and 100 epochs. Since the results produced by SOM depend on the initialization of the weights, we chose the weights that yielded the best results on the testing set over 10 random initializations. The only parameter of the FLR algorithm that must be tuned is the threshold size parameter ρ. In our simulations, ρ was varied from 0.01 up to N, where N is the dimension of the input data, in steps of 0.01, except for the object recognition data set, where ρ was varied in steps of 50 due to the high-dimensional input data. For FLR_l only, the location parameter γ must also be tuned; we varied γ between 1 and 50 in steps of 1.

Table 3 cites the classification accuracy and ranking of the different methods for each benchmark. Each table cell, belonging to a specific learning algorithm and data set, contains the percentage of correct classification of that model on the corresponding data set. The number in brackets in each cell shows the ranking of the method on that data set. The best results are shown in bold face.

As can be seen in Table 3, FLR_l obtains competitive results in comparison with the other methods, ranking first in five cases. Table 4 shows the average classification accuracy over all data sets for each learning algorithm, along with the resulting ranks: first the average of each column of Table 3 is calculated, and then the corresponding ranking is shown within brackets. As can be seen, FLR_l achieves the best ranking among all the methods.

In Table 5, the comparison is made according to the sum of the ranks in Table 3 per column. Although this quantity is of lower precision for reporting results in some cases, it is common in nonparametric statistics. As can be seen in this table, FLR_l, FLR_x, and FLR_a obtain the first, second, and third rankings, respectively. It should be pointed out that although there is no universal learning algorithm that achieves the best results on all benchmarks, the results obtained by FLR_l confirm that our proposed model is an efficient classifier compared with established classifiers from the literature.

5. Conclusion

In this work, we introduced an improvement of the fuzzy lattice reasoning (FLR) classifier using a new nonlinear positive valuation function. We investigated the performance of the new FLR model on several well-known classification problems. Experimental results demonstrated that our proposed method outperforms established classification models in terms of classification accuracy on the testing data.