Abstract

A similarity classifier based on Bonferroni mean based operators is introduced. The new Bonferroni mean based variant of the similarity classifier is also extended to cover a new Bonferroni-OWA variant. The new Bonferroni-OWA based similarity classifier raises the question of how to accomplish the weighting needed and for this reason we also examine a number of linguistic quantifiers for weight generation. The new proposed similarity classifier variants are tested on four real world medical research related data sets. The results are compared with results from two previously presented similarity classifiers, one based on the generalized mean and another based on an arithmetic mean operator. The results show that comparatively better classification accuracy can be reached with the proposed new similarity classifier variants.

1. Introduction

In this paper we introduce a new generalization of the similarity classifier that is based on using Bonferroni mean operators in the aggregation of similarities. The Bonferroni mean aggregation operator was introduced in [1] and extended in [2–6]. Research on the Bonferroni mean is currently increasingly active (see, e.g., [7–10]). The Bonferroni mean operator consists of two parts: each argument of the outer arithmetic mean is the product of one argument and the average of all the other remaining inner arguments; this feature makes it a unique operator in terms of aggregation [2]. The arithmetic mean and the generalized mean are special cases of the Bonferroni mean (see, e.g., [2]), which makes it a flexible and versatile operator. Both the generalized mean and the arithmetic mean have previously been used in similarity classifiers [11].

In this paper we also apply an ordered weighted averaging (OWA) based variant of the Bonferroni mean, the so-called “Bonferroni-OWA operator,” proposed by Yager [5]. The basic OWA operator has previously been studied in connection with similarity classifiers in [12], but the Bonferroni-OWA operator is applied in this context for the first time. To use the OWA operator effectively, a vector of associated weights is required; here we have chosen to generate these weights with linguistic quantifiers. Linguistic quantifiers give a parametrized way of producing weights for the Bonferroni-OWA operator, which adds flexibility but also introduces the need to find a proper parameter value. Parameter values can be examined, and good parameter values found, by, for example, sensitivity analysis. For the interested reader, more on linguistic quantifiers and their applications can be found, for example, in [13–18]. By using different linguistic quantifiers, we show how several new and different variants of the Bonferroni-OWA based similarity classifier can be created, and we examine the newly created variants. The algorithms examined here have been implemented in MATLAB, and the new classifiers and their variants are tested on four different medical research data sets.

In the field of medical research, classification is a key concept and the use of classifiers is warranted in many practical problems, such as patient diagnosis and the prognosis of various human conditions and pathologies [19]. Medical diagnosis of common diseases like breast cancer, lung cancer, hepatitis, thyroid disease, and many others requires high accuracy. However, in real world (medical) problems, it is most often not possible to achieve perfect classification accuracy due to the complexity of the analyzed conditions and the complications caused by the available data [20]. The complications connected to the data can have several different causes, for example, a small (limited) number of data samples that makes accurate generalization impossible, a very large number of attributes and/or variables that creates complexity, and the difficulty of determining the relevance of the considered attributes. Even small improvements in classification accuracy connected to medical diagnoses can be valuable, since they can help save human lives. Similarity based classifiers (see [21]) have been shown to work well on medical diagnosis problems (see, e.g., [11, 22]); they offer advantages such as fast speed and high classification accuracy and have already been shown to work rather well with small sets of samples (see, e.g., [20]). For more information about fuzzy classification and clustering methods, see [23–29].

The rest of the paper is organized as follows. In the second section we briefly go through the aggregation operators, the weight generation schemes for the new OWA based classifier variants, and the similarity measures applied in the paper. In the third section we introduce the new similarity classifiers and the new variants. In the fourth section we first briefly introduce the medical research data sets used and then examine the achieved results. The paper closes with discussion and conclusions.

2. Preliminaries

2.1. Aggregation Operators

The choice of an aggregation operator that is used in a similarity classifier is a fundamental issue, as it affects the final classification accuracy of the classifier. Several aggregation operators that can be used are available in the existing literature; in this paper we concentrate on averaging type aggregation operators [30]. In what follows, we briefly go through the aggregation operators that we use in our new classifier; the interested reader may find more information on aggregation operators, for example, in [2, 4, 18, 30–38].

One of the most common aggregation operators is the arithmetic mean, from which several different generalizations exist, for example, the generalized mean and the ordered weighted average (OWA). The aggregation operator is an important component that is used in similarity classifiers and in this paper, we specifically propose and examine the use of the Bonferroni mean and the Bonferroni-OWA as aggregation operators to be used in a similarity classifier, to create new similarity classifiers. The presented new variants of the similarity classifier are compared with previously presented methods that use the generalized mean and the arithmetic mean. Both the generalized mean and the arithmetic mean are special cases of the Bonferroni mean [2]. The generalized mean is defined as follows.

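Definition 1. Let $M$ be an averaging operator; a generalized mean of an $n$-tuple $x = (x_1, \ldots, x_n)$ is defined by [30]
$$M_m(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{m}\right)^{1/m}, \tag{1}$$
where $m \neq 0$ and $x_i \geq 0$ for all $i$.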

By varying the value of the parameter $m$ several other means can be derived from the generalized mean (e.g., the arithmetic mean, when $m = 1$, the harmonic mean, when $m = -1$, and the geometric mean, when $m \to 0$).
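As a quick numerical check of Definition 1, the following minimal sketch evaluates the generalized mean for a few exponents. The function name `generalized_mean` and the sample values are illustrative assumptions and are not taken from the original MATLAB implementation; the case of exponent zero is handled as the geometric-mean limit and therefore assumes strictly positive inputs.

```python
import math


def generalized_mean(values, m):
    """Generalized (power) mean of non-negative values with exponent m.

    Hypothetical helper for illustration; m == 0 is treated as the
    limiting case (geometric mean) and requires strictly positive values.
    """
    n = len(values)
    if m == 0:
        # Limit m -> 0 of the power mean is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** m for v in values) / n) ** (1.0 / m)


x = [0.2, 0.4, 0.8]                  # illustrative values only
arithmetic = generalized_mean(x, 1)   # exponent 1: arithmetic mean
harmonic = generalized_mean(x, -1)    # exponent -1: harmonic mean
geometric = generalized_mean(x, 0)    # limiting case: geometric mean
```

For positive inputs the three values are ordered harmonic ≤ geometric ≤ arithmetic, which is a convenient sanity check when tuning the exponent.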

One other type of generalization of the arithmetic mean is the ordered weighted averaging operator. The ordered weighted averaging operator was introduced by Yager in [16]. Later on several researchers have developed new aggregation operators based on the OWA; for example, see [4, 39, 40]. The OWA operator is also an averaging operator that is characterized by a “reordering step” that allows emphasizing the importance of selected data values. The OWA operator is defined as follows.

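Definition 2. Let $w = (w_1, \ldots, w_n)$, $w_i \in [0,1]$, be a weighting vector such that $\sum_{i=1}^{n} w_i = 1$, and let $x = (x_1, \ldots, x_n)$ be an $n$-tuple. An OWA operator associated with $w$ is defined as
$$\mathrm{OWA}_w(x) = \sum_{i=1}^{n} w_i\, x_{(i)}, \tag{2}$$
where for any $i$, $x_{(i)}$ is the $i$th largest element of the collection $x$ arranged in a descending order.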
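The reordering step in Definition 2 can be sketched in a few lines. The function name `owa` and the example numbers are assumptions made for illustration only; the sketch simply sorts the arguments in descending order before applying the weights.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to the sorted arguments.

    Hypothetical helper; weights must be non-negative and sum to one.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    ordered = sorted(values, reverse=True)  # the reordering step
    return sum(w * v for w, v in zip(weights, ordered))


scores = [0.3, 0.9, 0.5]                       # illustrative values only
as_max = owa(scores, [1.0, 0.0, 0.0])          # all weight on the largest
as_min = owa(scores, [0.0, 0.0, 1.0])          # all weight on the smallest
as_mean = owa(scores, [1 / 3, 1 / 3, 1 / 3])   # equal weights: arithmetic mean
```

The three weight vectors show how the same operator moves between the maximum, the minimum, and the arithmetic mean, which is the flexibility the weight generation schemes exploit.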

As it is our intention to apply the OWA together with the Bonferroni mean, we next present the Bonferroni mean operator and its OWA extension, the so-called Bonferroni-OWA operator, following the work by Yager in [5]. The Bonferroni mean operator was formally introduced in [1] and discussed extensively by other researchers in, for example, [2–5]. Recently, several researchers have successfully utilized the generalized Bonferroni mean in practical problems [41–45]. The Bonferroni mean is defined as follows.

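Definition 3. Let $x = (x_1, \ldots, x_n)$, $x_i \in [0,1]$, be a vector with at least one $x_i \neq 0$ and let $p, q \geq 0$, $p + q > 0$, be parameters. The general Bonferroni mean of $x$ is defined by [1]
$$B^{p,q}(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j \neq i}^{n} x_j^{q}\right)\right)^{1/(p+q)}. \tag{3}$$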

It has been shown that the Bonferroni mean is an averaging operator and that it satisfies the necessary axioms (see [5]). Following (3), the Bonferroni mean operator can be viewed as the $(p+q)$th root of the arithmetic mean, where each argument is the product of each $x_i^{p}$ with the arithmetic mean of the remaining $x_j^{q}$, $j \neq i$; see [2]. Equation (3) was further modified to include several other means, by replacing either the inner or the outer mean. One of the results involves using the OWA operator in place of the inner mean and is called the Bonferroni-OWA; for more details see [2, 5]. The Bonferroni-OWA is defined as follows.

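Definition 4. Let $w = (w_1, \ldots, w_{n-1})$, $w_j \in [0,1]$, be a weighting vector such that $\sum_{j=1}^{n-1} w_j = 1$ and let $x = (x_1, \ldots, x_n)$, $x_i \in [0,1]$, be a vector with at least one $x_i \neq 0$. With parameters $p, q \geq 0$, $p + q > 0$, a Bonferroni-OWA operator is defined by [5]
$$\mathrm{BON\text{-}OWA}(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{p}\,\mathrm{OWA}_w\!\left(V^{i}\right)\right)^{1/(p+q)}, \tag{4}$$
where $V^{i}$ is the vector of all $x_j^{q}$ with $j \neq i$.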
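To make the two-part structure of these operators concrete, the sketch below implements the Bonferroni mean and a Bonferroni-OWA variant in the standard form attributed to [1] and [5]. The function names, the parameter names `p` and `q`, and the choice of a weight vector of length one less than the number of arguments are assumptions for illustration; the paper's own implementation was done in MATLAB and may differ in detail.

```python
def bonferroni_mean(x, p, q):
    """Bonferroni mean: outer mean of x_i^p times the mean of the other x_j^q."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Inner arithmetic mean over all arguments except x[i].
        inner = sum(x[j] ** q for j in range(n) if j != i) / (n - 1)
        total += x[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


def bonferroni_owa(x, p, q, weights):
    """Bonferroni-OWA: the inner mean is replaced by an OWA of the other x_j^q."""
    n = len(x)
    if len(weights) != n - 1:
        raise ValueError("expected n - 1 weights for the inner OWA")
    total = 0.0
    for i in range(n):
        others = sorted((x[j] ** q for j in range(n) if j != i), reverse=True)
        inner = sum(w * v for w, v in zip(weights, others))
        total += x[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


x = [0.2, 0.4, 0.9]                               # illustrative values only
as_arithmetic = bonferroni_mean(x, 1, 0)           # reduces to the arithmetic mean
plain = bonferroni_mean(x, 1, 1)
with_equal_weights = bonferroni_owa(x, 1, 1, [0.5, 0.5])
```

With equal inner weights the Bonferroni-OWA value coincides with the Bonferroni mean, and setting the second exponent to zero recovers the generalized mean, which is one way to see why the benchmark classifiers are special cases.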

When the OWA operator is used, the need to generate the weights that the OWA uses arises; we propose to do this by applying linguistic quantifiers introduced by Zadeh [46] and Yager in [14, 15].

2.2. Linguistic Quantifiers and OWA Weight Generation

Linguistic quantifiers are quantifiers that use a scale of linguistic expressions to summarize the properties of a class of objects without enumerating them; this way they offer an imprecise and a flexible methodology for the quantification of objects; Ying [47] offers a compact review of the literature focused on linguistic quantifiers for the interested reader. Yager [15] classified linguistic quantifiers into three main categories: Regular Increasing Monotone (RIM), Regular Decreasing Monotone (RDM), and Regular Unimodal (RUM) quantifiers. These categories are options for when weight generation systems are envisioned; here we concentrate on RIM quantifiers and apply them. RIM quantifiers were defined by Yager [14] as follows.

Definition 5. A fuzzy subset $Q$ of the real line is called a Regular Increasing Monotone (RIM) quantifier if it satisfies the following conditions: (1) $Q(0) = 0$, (2) $Q(1) = 1$, and (3) $Q(x) \geq Q(y)$ if $x > y$.

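During the ordered weighted aggregation process, terms like most, at least, many, and all are captured by an appropriate linguistic quantifier with parameter $\alpha$. Following [14, 16], for any RIM quantifier $Q$, weights for the OWA operator are calculated from
$$w_i = Q\!\left(\frac{i}{n}\right) - Q\!\left(\frac{i-1}{n}\right), \tag{5}$$
where $i = 1, 2, \ldots, n$, so that $\sum_{i=1}^{n} w_i = Q(1) - Q(0) = 1$.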

In this paper we consider five different RIM quantifiers: the “basic,” “polynomial,” “quadratic,” “exponential,” and “trigonometric” RIM quantifiers. In what follows, we denote these by $Q_1, \ldots, Q_5$, with the subscripts following the order given above. Next we briefly present each of the five selected RIM quantifiers and show how they can be applied in creating weight generating schemes for the OWA.

(1) The basic linguistic quantifier, $Q_1$, is defined by the equation
$$Q_1(r) = r^{\alpha}, \quad \alpha \geq 0, \tag{6}$$
which is associated with the weights $w_i$; by application of (5) and (6) we obtain
$$w_i = \left(\frac{i}{n}\right)^{\alpha} - \left(\frac{i-1}{n}\right)^{\alpha}. \tag{7}$$

(2) The linguistic quantifier $Q_2$, proposed by Schweizer and Sklar [48], is called a polynomial quantifier for the purposes of this research. For a particular value of its parameter the polynomial and the basic RIM quantifiers coincide; otherwise they behave differently. Applying the polynomial RIM quantifier to weight generation through (5) gives $w_i = Q_2(i/n) - Q_2((i-1)/n)$.

(3) The quadratic linguistic quantifier, $Q_3$, was suggested by Ribeiro and Marques Pereira in [49]. $Q_3$ has two parameters: one controls the maximum value of weight generation, and the other controls the ratio between the maximum and the minimum values of the generating function; see [49]. Applying it to weight generation through (5) gives $w_i = Q_3(i/n) - Q_3((i-1)/n)$. For the purposes of practical implementation, we have used a fixed parameter value, but we acknowledge that the parameter value can be tuned for optimal performance.

(4) The exponential linguistic quantifier, $Q_4$, applied to weight generation through (5), gives $w_i = Q_4(i/n) - Q_4((i-1)/n)$.

(5) The trigonometric linguistic quantifier, $Q_5$, applied to weight calculation through (5), gives $w_i = Q_5(i/n) - Q_5((i-1)/n)$.

These operators, with the generated weighting vectors, are applied in the aggregation of similarities.
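The weight generation step can be sketched generically: any RIM quantifier, passed in as a function, is differenced on an evenly spaced grid. The sketch assumes that the basic quantifier has the common power form (the argument raised to a non-negative parameter); the names `rim_weights` and `basic_quantifier` and the parameter values are hypothetical and chosen only for illustration.

```python
def rim_weights(quantifier, n):
    """OWA weights w_i = Q(i/n) - Q((i-1)/n) for i = 1..n."""
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]


def basic_quantifier(alpha):
    """Assumed power-form RIM quantifier Q(r) = r**alpha with alpha >= 0."""
    return lambda r: r ** alpha


uniform = rim_weights(basic_quantifier(1.0), 4)  # alpha = 1: equal weights
skewed = rim_weights(basic_quantifier(2.0), 4)   # alpha = 2: later positions get more weight
```

Because the differences telescope, the weights sum to one for any quantifier that runs from zero to one, so other quantifiers can be plugged in without changing `rim_weights`. A parameter sweep over `alpha` is then a simple way to carry out the sensitivity analysis mentioned in the introduction.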

2.3. Similarity Measures

In this paper we use similarity measures in a generalized Łukasiewicz-structure (see [50]) to compare objects. The motivation for this choice is that, in the Łukasiewicz-structure, the mean of many similarities has been shown to be a similarity [51]. This approach has also been used previously in determining similarities in similarity classifiers; see details in [21, 22]. By choosing the Łukasiewicz-structure, two objects can be compared for all participating features. Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ be two objects in a set $X$ with entries across all $n$ features. Comparing the two objects feature by feature gives $n$ similarities, that is, $S_i(x_i, y_i)$, $i = 1, \ldots, n$. Thus, we have the similarity between $x$ and $y$ defined as follows [50]:
$$S(x, y) = \frac{1}{n}\sum_{i=1}^{n} S_i(x_i, y_i). \tag{16}$$
An equivalence relation between two objects in the Łukasiewicz-structure was defined in [52] as $x \leftrightarrow y = 1 - |x - y|$. It was shown in [50] that this relation can be generalized as
$$x \leftrightarrow y = \left(1 - \left|x^{\lambda} - y^{\lambda}\right|\right)^{1/\lambda}. \tag{17}$$
Combining (16) and (17) leads to a similarity measure that can be used to calculate the similarity between two vectors with $n$ entries. This has been discussed earlier in [50] and further applied in [11, 12, 21]. Thus, with the arithmetic mean, we can write the similarity between two objects $x$ and $y$ as
$$S(x, y) = \frac{1}{n}\sum_{i=1}^{n}\left(1 - \left|x_i^{\lambda} - y_i^{\lambda}\right|\right)^{1/\lambda}, \tag{18}$$
where $\lambda$ is the parameter for the similarity measure in the generalized Łukasiewicz-structure.

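Several other means can be used instead of the arithmetic mean in (18). In the equations below, $\lambda$ denotes the parameter of the similarity measure and $s_i = \left(1 - \left|x_i^{\lambda} - y_i^{\lambda}\right|\right)^{1/\lambda}$ denotes the similarity on feature $i$. With the generalized mean, the similarity can be modified to include the generalized mean parameter $m$, which gives
$$S(x, y) = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{m}\right)^{1/m}. \tag{19}$$
If one replaces the generalized mean with the Bonferroni mean, with parameters $p$ and $q$, one arrives at a similarity with the following form:
$$S(x, y) = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j \neq i}^{n} s_j^{q}\right)\right)^{1/(p+q)}. \tag{20}$$
Now, to apply the Bonferroni-OWA to the similarity, the inner mean in (20) is replaced with the OWA operator, and the similarity can be rewritten as
$$S(x, y) = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\sum_{j=1}^{n-1} w_j\, s_{(j)}^{q}\right)\right)^{1/(p+q)}, \tag{21}$$
where $w = (w_1, \ldots, w_{n-1})$ is a weighting vector such that $\sum_{j=1}^{n-1} w_j = 1$ and $s_{(j)}$ is the $j$th largest element of the reordered similarities $s_l$, $l \neq i$. In the next section, we explain how classification based on the presented similarity measures is done.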
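The similarity measures described above can be sketched as follows, under the assumption that all feature values are scaled to the unit interval. The function names and the parameter name `lam` (the similarity measure parameter) are illustrative choices, and the Bonferroni aggregation follows the standard two-part form rather than any confirmed detail of the original implementation.

```python
def feature_similarities(x, y, lam):
    """Generalized Lukasiewicz equivalence applied feature by feature."""
    return [(1.0 - abs(a ** lam - b ** lam)) ** (1.0 / lam) for a, b in zip(x, y)]


def similarity_arithmetic(x, y, lam):
    """Arithmetic mean of the feature-wise similarities."""
    s = feature_similarities(x, y, lam)
    return sum(s) / len(s)


def similarity_bonferroni(x, y, lam, p, q):
    """Bonferroni mean of the feature-wise similarities."""
    s = feature_similarities(x, y, lam)
    n = len(s)
    total = 0.0
    for i in range(n):
        inner = sum(s[j] ** q for j in range(n) if j != i) / (n - 1)
        total += s[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


a = [0.2, 0.9]   # illustrative samples, assumed scaled to [0, 1]
b = [0.4, 0.6]
sim_mean = similarity_arithmetic(a, b, lam=1.0)
sim_bonf = similarity_bonferroni(a, b, lam=1.0, p=1.0, q=1.0)
sim_self = similarity_bonferroni(a, a, lam=2.0, p=1.0, q=2.0)
```

Comparing a vector with itself yields a similarity of one for any parameter values, which is a useful check that the measure behaves like a similarity.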

3. Similarity Classifier with Bonferroni Mean Operators

A new Bonferroni mean based similarity classifier and its OWA variant are introduced in this section. Before going into details of these new classifiers, we briefly describe the main components typically found in similarity classifiers.

It is possible to determine the similarity between two or more samples in a given data set; the main idea is to compare samples and, as a result of the comparison, to provide a numerical value that represents their similarity. Typically for similarity classifiers, resulting values closer to 1 indicate high similarity between objects and values closer to 0 indicate low similarity. For classification tasks, the challenge is typically the partitioning of the attribute space in such a way that samples with the same characteristics are allocated into the same classes; for example, see [53]. Once the assignment of samples into individual classes is done properly the classification procedure can proceed.

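Suppose a data matrix $X$ is to be classified into $N$ different classes across $n$ attributes, $f_1, \ldots, f_n$. The initial step is to find mean vectors for each class; these are often called ideal vectors. For example, for class $k$, such a vector is denoted as $v_k = (v_k(f_1), \ldots, v_k(f_n))$, where the entry $v_k(f_i)$ is the mean value of attribute $f_i$ over the elements in class $k$. There are several ways of determining the ideal vectors $v_k$; for example, one can use the generalized mean; see also [31] for other methods of computing means that can be applied as ideal vectors in this context. The generalized mean, as it is usable in this context, is defined as
$$v_k(f_i) = \left(\frac{1}{\sharp X_k}\sum_{x \in X_k} x(f_i)^{m}\right)^{1/m}, \tag{22}$$
where the parameter $m$ (that comes from the generalized mean) is fixed and $\sharp X_k$ denotes the number of samples in class $k$. To determine to which class an arbitrary sample $x$ belongs, it is compared to the ideal vectors of the different classes. The comparison can be done by computing the similarity for the attributes in the earlier described generalized Łukasiewicz-structure [50]. With $s_i = \left(1 - \left|x(f_i)^{\lambda} - v_k(f_i)^{\lambda}\right|\right)^{1/\lambda}$ denoting the similarity on attribute $f_i$, the similarity between a sample $x$ and the ideal vector $v_k$ of a given class with the Bonferroni mean based similarity measure is given by
$$S(x, v_k) = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j \neq i}^{n} s_j^{q}\right)\right)^{1/(p+q)} \tag{23}$$
for $k = 1, \ldots, N$, where $\lambda$ is a parameter from the similarity measure and $p$ and $q$ are parameters from the Bonferroni mean operator; see [1]. In the same manner, we write the similarity measure with the Bonferroni-OWA as
$$S(x, v_k) = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\sum_{j=1}^{n-1} w_j\, s_{(j)}^{q}\right)\right)^{1/(p+q)}, \tag{24}$$
where $w = (w_1, \ldots, w_{n-1})$ is a weighting vector such that $\sum_{j=1}^{n-1} w_j = 1$ and $s_{(j)}$ is the $j$th largest element of the reordered similarities $s_l$, $l \neq i$.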

The sample $x$ is assigned to the class with which it has the highest similarity value, for example, in accordance with
$$\mathrm{class}(x) = \arg\max_{k = 1, \ldots, N} S(x, v_k). \tag{25}$$
A pseudocode algorithm for the main part of the process is given in Algorithm 1.

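Require: test samples $x_1, \ldots, x_M$, ideal vectors $v_1, \ldots, v_N$, parameters $\lambda$, $p$, $q$
for $u = 1$ to $M$ do
  for $k = 1$ to $N$ do
   for $i = 1$ to $n$ do
    $s(u,k,i) = \left(1 - \left|x_u(f_i)^{\lambda} - v_k(f_i)^{\lambda}\right|\right)^{1/\lambda}$
   end for
  end for
end for
for $u = 1$ to $M$ do
  for $k = 1$ to $N$ do
   for $i = 1$ to $n$ do
    $b(u,k,i) = \frac{1}{n-1}\sum_{j \neq i} s(u,k,j)^{q}$
   end for
   $S(u,k) = \left(\frac{1}{n}\sum_{i=1}^{n} s(u,k,i)^{p}\, b(u,k,i)\right)^{1/(p+q)}$
  end for
end for
for $u = 1$ to $M$ do
  $\mathrm{class}(u) = \arg\max_{k} S(u,k)$
end for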

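In order to use the similarity with the Bonferroni-OWA in Algorithm 1 we need to replace the Bonferroni mean based similarity with the Bonferroni-OWA based similarity. In this case, all the other steps are the same, but, for a given sample and class with attribute-wise similarities $s_i$, the inner aggregate $b_i$ and the overall similarity $S$ become
$$b_i = \sum_{j=1}^{n-1} w_j\, s_{(j)}^{q}, \qquad S = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\, b_i\right)^{1/(p+q)}, \tag{26}$$
where $w$ and $s_{(j)}$ are defined in accordance with (24). For purposes of finding $w$, different weight generating linguistic quantifiers were used.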
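The classification procedure described in this section can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the data are assumed to be scaled to the unit interval, the ideal vectors use the generalized mean with a fixed exponent `m`, and all function and parameter names (`ideal_vectors`, `bonferroni_similarity`, `classify`, `lam`, `p`, `q`) are hypothetical.

```python
def ideal_vectors(samples, labels, m=1.0):
    """One ideal vector per class: attribute-wise generalized mean of its members."""
    ideals = {}
    for c in sorted(set(labels)):
        members = [s for s, l in zip(samples, labels) if l == c]
        n_attr = len(members[0])
        ideals[c] = [
            (sum(row[i] ** m for row in members) / len(members)) ** (1.0 / m)
            for i in range(n_attr)
        ]
    return ideals


def bonferroni_similarity(x, v, lam, p, q):
    """Bonferroni mean of the attribute-wise Lukasiewicz similarities."""
    s = [(1.0 - abs(a ** lam - b ** lam)) ** (1.0 / lam) for a, b in zip(x, v)]
    n = len(s)
    total = 0.0
    for i in range(n):
        inner = sum(s[j] ** q for j in range(n) if j != i) / (n - 1)
        total += s[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


def classify(x, ideals, lam=1.0, p=1.0, q=1.0):
    """Assign x to the class whose ideal vector is most similar to it."""
    return max(ideals, key=lambda c: bonferroni_similarity(x, ideals[c], lam, p, q))


# Toy training data, assumed already scaled to [0, 1].
train_x = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
train_y = [0, 0, 1, 1]
ideals = ideal_vectors(train_x, train_y, m=1.0)
pred_low = classify([0.1, 0.25], ideals)
pred_high = classify([0.85, 0.8], ideals)
```

Switching to the Bonferroni-OWA variant would only change the computation of `inner`, replacing the plain average with a weighted sum over the sorted remaining similarities.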

4. Data Sets and Obtained Results

4.1. Experimental Setting for Examination of Results

The experiments were carried out by splitting each studied data set into two parts, one part for training and the other for testing. The random data set divisions were repeated 30 times in each experiment, and the resulting classification accuracies with corresponding means and variances (from the thirty runs) were recorded. Individual surface plots for each new similarity classifier variant were also plotted; these help to identify proper parameter values. Statistical comparison of classification accuracies from the new classifiers with accuracies from the two benchmarks was made by using typical sample statistics ($t$-test).
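The repeated random splitting described above can be sketched as a small evaluation loop. The split fraction is not stated in the text, so the half-and-half split used here is an assumption, as are the function names and the toy threshold rule that stands in for a real classifier.

```python
import random
import statistics


def repeated_holdout(samples, labels, fit_predict, runs=30, train_fraction=0.5, seed=0):
    """Mean and variance of test accuracy over repeated random train/test splits.

    fit_predict(train_x, train_y, test_x) must return predicted labels for test_x.
    The train_fraction default is an assumption, not a value from the paper.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(runs):
        rng.shuffle(indices)
        cut = int(train_fraction * len(indices))
        train, test = indices[:cut], indices[cut:]
        predicted = fit_predict(
            [samples[i] for i in train],
            [labels[i] for i in train],
            [samples[i] for i in test],
        )
        correct = sum(pr == labels[i] for pr, i in zip(predicted, test))
        accuracies.append(correct / len(test))
    return statistics.mean(accuracies), statistics.variance(accuracies)


def threshold_rule(train_x, train_y, test_x):
    """Toy stand-in for a classifier; it ignores the training data."""
    return [1 if row[0] > 0.5 else 0 for row in test_x]


data = [[i / 20.0] for i in range(20)]              # illustrative data only
target = [1 if row[0] > 0.5 else 0 for row in data]
mean_acc, var_acc = repeated_holdout(data, target, threshold_rule, runs=30)
```

The per-run accuracies collected inside the loop are also what a two-sample $t$-test between a new classifier and a benchmark would take as input.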

4.2. Data Sets

Data sets used in testing our new classifier were retrieved from the UCI Machine Learning Repository [54]. Properties of each set used, including the number of classes and attributes and number of instances, are summarized in Table 1.

Further detailed attribute information for the fertility, blood transfusion service center, and echocardiogram data sets is presented in Tables 2, 3, and 4.

4.3. Experimental Results

In this section we present the obtained results from the experiments. Mean accuracies from 30 separate runs were computed for each data set and for each classifier combination. The resulting classification accuracies and the variances obtained are reported in Tables 5, 6, 7, and 8 separately for each data set. For the purposes of benchmarking the new similarity classifiers and their variants, we also report results obtained with arithmetic mean and generalized mean based similarity classifiers.

For the fertility data set, the highest classification accuracy was obtained using the Bonferroni mean based similarity classifier. Complete results and their corresponding variances, including the accuracy of the Bonferroni-OWA based classifier with the polynomial quantifier variant and the accuracies of the generalized mean based and the arithmetic mean based classifiers, are presented in Table 5. In the same table, accuracies marked with a star are statistically significantly different from the result of the arithmetic mean based similarity classifier, and those marked with a circle are significantly different from the result of the generalized mean based similarity classifier, at the chosen confidence level. The results with the new Bonferroni mean based similarity classifier exhibit a statistically significant difference (improved classification performance) to both benchmark cases. The Bonferroni-OWA based similarity classifier that uses the polynomial quantifier for weight generation was found to be statistically significantly better than the arithmetic mean based similarity classifier. For the rest of the Bonferroni-OWA variants a statistically significant difference to the benchmark cases could not be established.

Figure 1 shows the surface of the classification results, and the corresponding variances, obtained with the Bonferroni mean based similarity classifier from a set of runs with different combinations of values for the similarity measure parameter and one of the Bonferroni mean parameters. The other Bonferroni mean parameter was fixed at the value 6; this value was not set randomly but was the result of an optimization. The best performance was reached with both varied parameters set to approximately 6.

When the new similarity classifiers were used to classify the blood transfusion service center data set, the highest mean accuracy was obtained with the basic linguistic quantifier variant of the Bonferroni-OWA based similarity classifier; the accuracy achieved with the Bonferroni mean based similarity classifier is reported alongside it in Table 6. With this data set, all results with the proposed similarity classifiers were statistically significantly different from the results of the arithmetic mean based similarity classifier (indicated with a star). There was also a statistically significant difference between the results of the generalized mean based method and three variants of the Bonferroni-OWA based similarity classifier (denoted with a circle in Table 6). Figure 2 shows how different combinations of the parameter values affect the resulting classification accuracy and the corresponding variance of the classification accuracy.

With the echocardiogram data set, the highest mean classification accuracy was obtained by using the trigonometric linguistic quantifier variant of the Bonferroni-OWA based similarity classifier; the mean accuracy of the Bonferroni mean based similarity classifier is given in Table 7. All results received with the new proposed similarity classifiers and their variants were statistically significantly different from the results obtained by using the arithmetic mean based similarity classifier, at the chosen confidence level. A statistically significant difference could not be verified between the new methods and the generalized mean based approach. In Table 7 mean accuracies and variances are reported for all new similarity classifiers and their variants and for the benchmark cases; a star indicates that the achieved accuracy is statistically significantly different from the accuracy obtained with the arithmetic mean based classifier.

Figure 3 shows how different parameter value combinations affect the mean classification accuracy and the variance of the accuracy, when the trigonometric linguistic quantifier variant of the Bonferroni-OWA based similarity classifier is used.

When the lung cancer data set was examined, the highest mean accuracy was obtained with the Bonferroni mean based similarity classifier. Among the Bonferroni-OWA based similarity classifiers, the quadratic linguistic quantifier variant functioned best. Detailed results are presented in Table 8. A statistically significant difference could not be verified between the new methods and the benchmark cases. Figure 4 shows how different parameter value combinations affect the mean classification accuracy and the variance of the accuracy, when the Bonferroni mean based similarity classifier is used. To produce the figure, one parameter of the Bonferroni mean was fixed after a set of experimental runs to optimize performance; the highest reported accuracy was achieved at one combination of the remaining parameter values.

In Table 9 we summarize the best achieved mean classification accuracies with the new similarity classifiers (and variants) and both benchmark cases. As can be seen from the table, there is no clear “winner” among the new classifiers, but in all cases the proposed similarity classifiers outperformed the benchmark cases; specifically, the Bonferroni mean based similarity classifier was the best in two out of four cases and beat both benchmark cases in the remaining two. In a way, the better performance with regard to the selected benchmarks might have been expected, as both benchmarks are special cases of the Bonferroni mean.

5. Discussion

In this paper we have proposed a new Bonferroni mean based similarity classifier and a new Bonferroni-OWA based similarity classifier with five variants that are based on different linguistic quantifier based weight generation schemes for the OWA used in the classifier. The classification performance of the proposed similarity classifiers was tested on four real world medical data sets, for each of which thirty runs were made and the mean classification performance was recorded. As a benchmark, we compared the results from the proposed similarity classifiers with two previously presented similarity classifiers, based on the generalized mean and on the arithmetic mean. The mean classification performance of the proposed similarity classifiers was better than the performance of the benchmarks; however, the difference in performance was not statistically significant on all data sets. Nevertheless, the evidence suggests that the proposed similarity classifiers perform at least as well as, and often better than, the benchmark similarity classifiers. We note that the performance of these classifiers is data dependent.

In future research on similarity classifiers, a multiclassifier approach could be considered, in which each classifier has one vote on the class of a sample and the final class of the sample is decided by the consensus of the classifiers.

Competing Interests

The authors declare that there are no competing interests regarding the publication of this paper.