Complexity

Volume 2018, Article ID 1342562, 8 pages

https://doi.org/10.1155/2018/1342562

## Kernel Neighborhood Rough Sets Model and Its Application

^{1}School of Data Science, Guizhou Institute of Technology, No. 1 Caiguan Road, Guiyang 550003, China^{2}School of Computer Science, Leshan Normal University, Binhe Road, Leshan, Sichuan 614000, China

Correspondence should be addressed to Kai Zeng; moc.anis@kniliakgnez

Received 7 March 2018; Revised 13 June 2018; Accepted 18 July 2018; Published 23 August 2018

Academic Editor: Danilo Comminiello

Copyright © 2018 Kai Zeng and Siyuan Jing. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Rough set theory has been successfully applied to many fields, such as data mining, pattern recognition, and machine learning. Kernel rough sets and neighborhood rough sets are two important models that differ in terms of granulation. The kernel rough sets model, which has fuzziness, is susceptible to noise in the decision system. The neighborhood rough sets model can handle noisy data well but cannot describe the fuzziness of the samples. In this study, we define a novel model called kernel neighborhood rough sets, which integrates the advantages of the neighborhood and kernel models. Moreover, the model is used in the problem of feature selection. The proposed method is tested on the UCI datasets. The results show that our model outperforms classic models.

#### 1. Introduction

Rough set theory, which was proposed by Pawlak in 1982, is a powerful mathematical method to study incomplete and imprecise information. This theory has been successfully applied to many fields, such as data mining, decision-making, pattern recognition, machine learning, and intelligent control [1–4]. Kernel rough sets [5] and neighborhood rough sets [6] are two important models in rough set theory.

Hu innovatively proposed the kernel rough sets model [5, 7]. A Gaussian kernel rough sets model-based feature selection method was discussed in [8]. The information fusion problem of imperfect images has also been studied based on Hu’s research [9]. Ghosh et al. proposed an efficient Gaussian kernel-based fuzzy rough sets approach for feature selection [10]. A novel fuzzy rough sets model was constructed by combining the hybrid distance and the Gaussian kernel in [11]. A new feature selection method based on kernel fuzzy rough sets and a memetic algorithm were proposed for the transient stability assessment of power systems [12]. In these studies, the information granules are constructed in the kernel structure. The “min” and “max” aggregation operations are used in approximation calculations [13–15]. That is, the decision for a sample is dependent on the nearest sample [7]. The computation of the lower approximation comes with risks if there is noise in the datasets [16]. Data noise can lead to an increase in the classification error rate by using the kernel rough sets model [16].

Neighborhood is an important concept in classification and clustering. To formulate the notion of approximation, the neighborhood system was introduced into the relational model by Lin [17–19]. Yao presented a framework for the formulation, interpretation, and comparison of neighborhood systems and rough sets approximations [20]. Hu et al. investigated the issue of heterogeneous feature subset selection based on neighborhood rough sets [6, 21]. Based on neighborhood granulation, samples are constructed as a family of neighborhood granules to approximate the object sets. The neighborhood model can handle noisy data well based on the tolerance neighborhood relation and probabilistic theory [22]. However, the main limitation of this model is that it cannot describe the fuzziness of samples [16].

Overall, the kernel rough sets model, which has fuzziness, is susceptible to noise in the decision system. The neighborhood rough sets model can handle noisy data but cannot describe the fuzziness of samples. That is, we can construct a new model by the combination of advantage of kernel and neighborhood rough sets.

On the other hand, increasing amounts of high-dimensional data must be processed for some real applications. Currently, feature selection plays an important role in machine learning and data mining. Neighborhood rough sets and kernel rough sets are widely used in feature selection [23–26]. We also can deal with the feature selection problem by using the new rough sets model.

Based on the motivations above, the contributions of this paper include the following: (1) We define a novel model, the kernel neighborhood rough sets model, which integrates the advantages of the neighborhood and kernel models. (2) Moreover, the model is used in the problem of feature selection. (3) The proposed method is tested on the UCI datasets. The results show that our model yields a better performance than classic models.

This paper is organized as follows. In Section 2, some basic concepts regarding neighborhood rough sets and kernel rough sets are briefly reviewed. In Section 3, the kernel neighborhood rough sets (KNRS) model is investigated in detail. Section 4 shows the application of KNRS to feature evaluation and feature selection. Numerical experiments are reported in Section 5. Finally, Section 6 concludes the paper.

#### 2. Preliminary Knowledge

In this section, we review the kernel rough sets (KRS) model [5] and the neighborhood rough sets (NRS) model [6].

##### 2.1. Kernel Rough Sets (KRS) Model

*Definition 1. *Suppose is a nonempty finite set of objects and is a Gaussian kernel function , where is the Euclidean distance. Therefore, is a kernel approximation space, where
(1),(2),(3).

*Definition 2. *Given a kernel approximation space , is a fuzzy subset of , and we define the lower and upper approximations of on the space as follows:

##### 2.2. Neighborhood Rough Sets (NRS) Model

Pawlak rough sets theory works only on data sets with discrete attributes [27]. Hu et al. introduced a neighborhood rough sets model for heterogeneous data to avoid discretization. The definition is as follows [6, 28].

*Definition 3. *Suppose is a nonempty finite set of objects and is a given distance function. is a set of features. Therefore, is a neighborhood approximation space, where
(1), if and only if , ,(2),(3).

*Definition 4. *Given a neighborhood approximation space , , and , is a neighborhood of whose center is and the radius is , where .

Here, can be considered to be the neighborhood granule.

*Remark 1. *Given two points and in -dimensional Euclidean space, the distance between them can be computed as
where and .

*Definition 5. *Given a neighborhood approximation space , for any subset , we define the lower and upper approximations of on the space , respectively, as follows:
The definitions of the lower and upper approximations are the most important concepts in KRS and NRS.

#### 3. Kernel Neighborhood Rough Sets (KNRS) Model

In this section, we study the KNRS model. The definitions and theorems of KNRS are discussed in detail. The kernel neighborhood decision system is also investigated.

##### 3.1. Kernel Neighborhood Rough Sets

*Definition 6. *Given a kernel neighborhood approximation space , , where is a Gaussian kernel function, ; thus, is a kernel neighborhood granule of , where

*Definition 7. *Given a kernel neighborhood approximation space , , where is a kernel function, for any fuzzy subset ; we define the lower and upper approximations of on the space , respectively, as follows:
The method defined above is crisp and has no noise tolerance ability. Here, we propose an improved model that is called variable precision lower and upper approximation.

*Definition 8. *Given a kernel neighborhood approximation space , , where is a kernel function, for any fuzzy subset ; the variable precision lower and upper approximations of are defined as follows, where denotes the cardinality of the specified set:
Then, as in [22].

*Example 1. *Consider a kernel neighborhood approximation space , where , , , , and is a kernel function. The details are listed in Table 1.