Abstract

With the development of various techniques, especially the Internet of Things (IoT), a large amount of quantitative (either numeric or categorical) data is generated, transmitted, and stored in modern society. People hope to understand the phenomena of interest from the collected quantitative data by applying different data analysis methods. Exploring the structure of data (e.g., the cluster centers or prototypes) has long been a central topic in data mining and knowledge discovery, yet the modeling and analysis process still focuses on a low-level abstraction of the data: normally, the discovered structure is represented only by some numeric data points. In this study, we argue that a low-level abstraction may not be a user-friendly way for people to grasp the knowledge contained in the data. Instead, we explore the structure of the data from the perspective of symbolic analysis. Specifically, two modes of abstraction are proposed. In the vertical mode (i.e., the values of each feature are abstracted), the numeric prototypes are characterized by symbolic prototypes so that people are not stuck in the minor details of each feature. In the horizontal mode (i.e., the values of each prototype are abstracted), linguistic summarization is used to describe all the features of each symbolic prototype so that people can immediately grasp the essential information conveyed by the symbolic prototype. We conduct comprehensive experimental studies on publicly available data to illustrate the feasibility and validity of the proposed symbolic analysis process.

1. Introduction

An increasing volume of data is being generated, transmitted, and stored nowadays due to well-developed techniques such as the Internet of Things (IoT) [18]. People prefer analyzing the data generated from real-world phenomena and building models to describe those phenomena so that later, with the developed models, they can understand the surrounding environment and be more confident in making decisions. For a long time, quantitative models (constructed directly on the basis of quantitative data) have been favored due to the ubiquitous, easy-to-collect quantitative (either numeric or categorical) data and the rigorous mathematical foundations behind them, say, linear regression models, polynomials, and neural networks. However, quantitative models are not always feasible or effective when the phenomenon of interest is too complex to be completely understood. It happens that people only have some intuitive understanding, e.g., some common sense, of the phenomenon, which hinders building a sophisticated quantitative model. Besides, even when quantitative models are effective, e.g., the neural networks constructed for pattern recognition, it is still not quite clear how such a model functions (the well-known black-box problem). All these drawbacks of quantitative models bring attention to qualitative modeling techniques [9].

The crux of qualitative modeling is that, instead of directly using the quantitative values, qualitative concepts (e.g., describing the value of a variable or describing the changing trend of a variable) serve as the building blocks of such models, based on which further processing is conducted. This entire process could be regarded as a symbolic analysis process. Many different kinds of qualitative (or symbolic) models have been proposed so far, which could be roughly categorized into five groups.

(i) Physical process-based models [10, 11]. This kind of method highlights the concept of process; i.e., many natural phenomena or physical situations could be regarded as a certain qualitative process. To describe this process, many new concepts such as objects, the quantity space representation of the objects, individual views, and histories have been proposed.

(ii) Qualitative differential equation- (QDE-) based models. There are two branches of methods in this type of model. One branch revolves around the QSIM model proposed in [12], where a qualitative abstraction of the ordinary differential equation is pursued; the final QDE is represented by a set of qualitative constraints. Many QDEs [13-19] have been proposed to extend the applicable scenarios of QSIM. The other branch focuses on the construction of a QDE based on the so-called confluence [20]; here, the behavior of a physical system is divided into different qualitative states, each of which is then described by a set of confluences.

(iii) Logic-based models. Two branches of methods are observed in this group. One focuses on building an advanced expert system [21]; here, a quantity is first described by some qualitative symbols, say intervals, and then logic is used to describe the relationships therein. The other branch [22-24] focuses on applying Inductive Logic Programming (ILP) to construct the qualitative models such that background knowledge could be incorporated.

(iv) Qualitative tree-based models. In this type of model, a structure similar to that of the conventional decision tree is formulated. The nodes of the tree are still numeric thresholds of the features; however, the leaves are represented by qualitative constraints. When the constraint describes the monotonicity of the output with respect to all the features, we have the QUIN model [9, 25]; when it describes that with respect to only one feature (i.e., the partial derivative), we have either the Padé model [26] (for numeric data) or the Qube model [27] (for categorical data).

(v) Others. Here, we observe three other qualitative models. In [28], polynomials are used to approximate the numeric data, from which the qualitative constraints given in the QSIM model [12] are derived. In [29], a semiquantitative extension of the QSIM model is conducted; i.e., fuzzy sets are used to build the quantity space, rules are used to realize the transitions between states, and fuzzy mathematics is used for qualitative simulation. In [30], an OLAP model based on the qualitative representation derived from dynamic Bayesian networks (DBN) is constructed, making it possible to build high-level states and actions in a continuous environment for an intelligent agent.

Summarizing all these qualitative models, a common point is that the qualitative representation of quantity stands in a key position during the entire modeling process. The qualitative representation is normally given by a series of so-called landmarks (symbols); for example, we could use the three symbols +, 0, and - to represent the sign of the magnitude of a variable or that of its changing trend (e.g., + represents an increasing trend). Despite the significance of the qualitative representation, its derivation is rather coarse, and there is a lack of a mechanism to obtain it automatically from the numeric data. This motivates our first objective: to form a symbolic description of numeric data based on prototype-based clustering algorithms [31]. The symbolic prototypes (represented as a series of symbols, say, labels such as 1, 2, and 3) are formed based on the numeric prototypes.

As another observation, it seems that the modeling process in the aforementioned literature still focuses on a low-level abstraction of the data. By low level, we mean that a global abstraction of all the features of a prototype is missing in the current literature. In other words, people cannot immediately grasp the most essential information contained in the data due to the lack of a human-centric manner of knowledge representation. This motivates the second objective of this research. We intend to build a higher-level qualitative model to describe each symbolic prototype. Specifically, given a symbolic prototype, we address the issue of how to qualitatively describe the symbols of all the features based on the concept of linguistic summarization.

The aforementioned two objectives naturally bring the originality of the study as follows: the proposed symbolic description of the raw quantitative data delivers a high-level abstraction of the data in a vertical mode. By vertical, we mean that for each feature of the prototypes, attention is paid to the relative locations of the prototypes rather than the detailed quantitative values. The proposed linguistic summarization of the symbolic prototypes delivers a high-level abstraction of the data in a horizontal mode. By horizontal, we mean that for each symbolic prototype, we make a linguistic summarization of the symbols of all the features. With both modes of abstraction, we deliver a more human-centric manner of understanding the knowledge in the original raw data. The symbolic description followed by the linguistic summarization forms the symbolic analysis highlighted in this paper.

The remainder of the paper is organized as follows: in Section 2, we propose the method of symbolic characterization of the numeric data (i.e., vertical mode of abstraction) such that the symbolic prototypes (information granules) are obtained. We investigate the property (i.e., stability) of the obtained symbolic prototypes in Section 3. In Section 4, the linguistic summarization of information granules (i.e., horizontal mode of abstraction) is proposed. We conduct experimental studies on the IoT data to illustrate the proposed two modes of abstraction in Section 5. The paper is concluded in Section 6.

2. Symbolic Characterization of Numeric Data

Description of the data of interest has always been a major target in the field of data mining and knowledge discovery. To find the representatives in the data, many clustering concepts and algorithms [31-33] have been proposed and applied, resulting in numeric representatives (e.g., the prototypes) of the data. To make the obtained structure more descriptive, in recent years, the concept of granular computing [34-38] has been used to generalize these numeric prototypes to granular prototypes (e.g., hyperboxes). For either of these two categories of methods, the essence of the prototypes is values defined in the domains of the variables (features/attributes). However, these results are not always beneficial, especially when they are directly provided to a human being for understanding the structure of the data. For example, suppose we have three prototypes $v_1$, $v_2$, and $v_3$ in a two-dimensional space whose values for the first feature are 0.17, 0.01, and 0.28, respectively. Such numbers do not make it easy for a human being to grasp the structure of the data intuitively. Hence, this motivates us to make a symbolic characterization of the data structure. Specifically, instead of directly using these numbers, we represent the value of a feature of a prototype by a certain symbol (or label), and this label specifies the location (in ascending order) of this value among the values of this feature over all the prototypes. Using the above example, for 0.17 in $v_1$, since 0.01 < 0.17 < 0.28, it ranks in the second place among all the values of the first feature; thus, we replace it by the symbol 2. Similarly, we replace 0.01 by 1 and 0.28 by 3. With this mechanism, the first entries of the new symbolic prototypes become 2, 1, and 3, and the second feature is labeled in the same way. In this case, for each prototype, we can immediately have a sense of the position of a value of a certain feature among all the prototypes. These new prototypes provide us with high-level information which is much easier for a human being to understand.

With the illustrated example, in what follows, we briefly summarize the method of symbolic characterization of the data structure. Since devising new clustering algorithms is not our focus, in this study, we specifically use fuzzy clustering algorithms, Fuzzy C-Means (FCM) [31, 32, 39] in particular. Obviously, one may still resort to other prototype-based clustering methods (e.g., k-means) to get the prototypes of the data. Suppose the data we are interested in are represented by $X = \{x_1, x_2, \ldots, x_N\}$, where the $k$th data point $x_k$ is a vector in an $n$-dimensional space spanned over the features $f_1, f_2, \ldots, f_n$. By setting the number of clusters $c$ and the fuzzification coefficient $m$, clustering the data with FCM is realized by minimizing the following objective function:

$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \lVert x_k - v_i \rVert^2$   (1)

with the squared weighted distance expressed as

$\lVert x_k - v_i \rVert^2 = \sum_{j=1}^{n} \frac{(x_{kj} - v_{ij})^2}{\sigma_j^2}$   (2)

where $\sigma_j$ is the standard deviation of the $j$th feature of the data, the fuzzification coefficient $m$ is greater than 1 (its commonly used value is 2), and, unless otherwise specified, $\lVert \cdot \rVert$ stands for the weighted Euclidean distance. This weighted distance is used to normalize features with different dimensions. The data set is clustered into $c$ clusters coming in the form of the partition matrix $U = [u_{ik}]$, $i = 1, 2, \ldots, c$, $k = 1, 2, \ldots, N$, and a collection of prototypes represented by the prototype matrix $V = [v_1, v_2, \ldots, v_c]^T$. The $k$th data point is described by the $k$th column of the partition matrix $U$. The FCM clustering algorithm is summarized as Algorithm 1.

Input: X, c, m
Output: U, V
Initialize the partition matrix U randomly (each column sums to one)
Repeat
    Update the prototypes: $v_i = \sum_{k=1}^{N} u_{ik}^m x_k / \sum_{k=1}^{N} u_{ik}^m$
    Update the partition matrix: $u_{ik} = 1 / \sum_{j=1}^{c} (\lVert x_k - v_i \rVert / \lVert x_k - v_j \rVert)^{2/(m-1)}$
Until $\lVert U - U' \rVert < \varepsilon$, where $U'$ is the partition matrix of the previous iteration and $\varepsilon$ is a small positive number.
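To make Algorithm 1 concrete, the following is a minimal NumPy sketch of the FCM loop; the function name fcm and all implementation details are ours, not the paper's, and the feature weighting by the standard deviations follows the distance in (2).

import numpy as np

def fcm(X, c, m=2.0, eps=1e-6, max_iter=300, seed=None):
    """Fuzzy C-Means: returns partition matrix U (c x N) and prototypes V (c x n)."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    sigma = X.std(axis=0) + 1e-12                    # feature weights of (2)
    U = rng.random((c, N))
    U /= U.sum(axis=0, keepdims=True)                # each column of U sums to one
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)           # prototype update
        d2 = (((X[None, :, :] - V[:, None, :]) / sigma) ** 2).sum(axis=2)
        inv = np.fmax(d2, 1e-12) ** (-1.0 / (m - 1))
        U_new = inv / inv.sum(axis=0, keepdims=True)           # partition update
        if np.abs(U_new - U).max() < eps:            # stopping rule of Algorithm 1
            U = U_new
            break
        U = U_new
    return U, V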

Suppose that the structure of the data is finally represented as a series of prototypes $v_1, v_2, \ldots, v_c$. Focusing on the $j$th feature, the feature values of all the prototypes read as $v_{1j}, v_{2j}, \ldots, v_{cj}$. By sorting these values in ascending order, we easily know their positions in the ordered array. We then replace the feature values by their equivalent labels, that is, $s_{1j}, s_{2j}, \ldots, s_{cj}$, where each $s_{ij}$ is an integer between 1 and $c$. Hence, finally, the numeric prototypes are transformed into the symbolic ones, represented by $s_1, s_2, \ldots, s_c$. This transformation is shown in Figure 1 for a two-dimensional data set. We see that the symbolic characterization delivers a qualitative description of the data, specifically of each feature, because each label $s_{ij}$ describes the position of the value of the $j$th feature of the $i$th prototype relative to the other prototypes. This kind of description reflects the information from the other prototypes and could be regarded as a vertical abstraction, in the sense that the abstraction is conducted over numeric values of the same feature but different prototypes. Hence, we term the symbolic characterization of numeric data the vertical mode of abstraction.
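A short sketch of this vertical abstraction follows; the function name symbolize is ours, and the second-feature values in the toy matrix are assumed for illustration (only the first feature repeats the 0.17/0.01/0.28 example above).

import numpy as np

def symbolize(V):
    """V: (c, n) numeric prototypes -> (c, n) integer labels in 1..c."""
    # argsort applied twice yields the 0-based rank of each value per column
    return np.argsort(np.argsort(V, axis=0), axis=0) + 1

V = np.array([[0.17, 0.40],
              [0.01, 0.90],
              [0.28, 0.20]])
print(symbolize(V)[:, 0])   # -> [2 1 3], matching the example in the text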

3. Stability of Information Granules

Under the ideal condition, clustering algorithms are expected to be applied directly to the entire data to make use of the information provided by all the data (which reflects the entire phenomenon). However, in reality, quite often, only a part of the entire phenomenon is observed by an organization or user, which results in a subset of the entire data. Since the phenomenon is often observed by multiple users, multiple subsets are generated. In this part, we are interested in analyzing the similarity or stability of the symbolic prototypes among the different subsets.

An intuitive illustration of the notion of stability is provided in Figure 2. Suppose that we have four data sets, and three numeric prototypes $v_1$, $v_2$, and $v_3$ are found for each data set along with their corresponding symbolic prototypes $s_1$, $s_2$, and $s_3$. The only difference among the four plots is the location of the third prototype, which is moved upward gradually from Figures 2(a) to 2(d). We find that the structures in Figures 2(a) and 2(b) remain stable because the relationships (relative positions) among the prototypes are the same. Those relationships are slightly changed in Figure 2(c) and are totally different in Figure 2(d), where the three prototypes are nearly aligned. Although only one numeric prototype keeps changing, the labels of all the numeric prototypes may change depending on the stability of the data structure. As observed, since the data structures in Figures 2(a) and 2(b) are most similar, their corresponding symbolic prototypes are exactly the same; the structures in Figures 2(a) and 2(d) are significantly distinct, and their corresponding symbolic prototypes (especially the labels for the second feature) are nearly all different. In what follows, we study how the stability of the data structures could be measured through the symbolic prototypes.

Suppose that we generate two subsets, represented as $X_1$ and $X_2$, from the entire data. By applying the symbolic characterization method to $X_1$ and $X_2$, we represent the resulting symbolic prototypes as $S^1 = \{s_1^1, s_2^1, \ldots, s_c^1\}$ and $S^2 = \{s_1^2, s_2^2, \ldots, s_c^2\}$. We propose the following index to measure the distance between two symbolic prototypes:

$d(s_a, s_b) = \frac{1}{n} \sum_{j=1}^{n} [\![ s_{aj} \neq s_{bj} ]\!]$   (3)

where $[\![ \cdot ]\!]$ equals one when the condition holds and zero otherwise.

The distance measure in (3) is a "coarse distance" because we only count how many same-positioned entries are different. For example, if we have $s_a = (1, 2, 3)$ and $s_b = (1, 2, 1)$, then only the third entries are different and we have $d(s_a, s_b) = 1/3$. Obviously, in the extreme cases, if two symbolic prototypes are identical, the distance equals zero; if they are completely different (i.e., the values of the same-positioned entries are always different), the distance is one.

With the distance measure for any pair of symbolic prototypes, we define the stability index of the information granules as follows when two subsets are considered:

$\sigma(X_1, X_2) = 1 - \frac{1}{c} \sum_{i=1}^{c} \min_{j = 1, \ldots, c} d(s_i^1, s_j^2)$   (4)

By (4), for any given symbolic prototype $s_i^1$, we measure its distance to all the symbolic prototypes generated from the other data set and pair it with the one having the shortest distance to it. As can be envisioned, if the distributions of the symbolic prototypes in $X_1$ and $X_2$ are similar (i.e., the symbolic prototypes are stable under the different environments), $\sigma(X_1, X_2)$ is a value close to one; otherwise, it is close to zero.
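A compact sketch of (3) and (4) as reconstructed above (the function names coarse_distance and stability are ours):

import numpy as np

def coarse_distance(sa, sb):
    """Fraction of same-positioned labels that differ; always in [0, 1]."""
    sa, sb = np.asarray(sa), np.asarray(sb)
    return float(np.mean(sa != sb))

def stability(S1, S2):
    """S1, S2: (c, n) symbolic prototypes of two subsets; implements (4)."""
    # for each prototype of S1, the distance to its closest match in S2
    d = np.array([[coarse_distance(s1, s2) for s2 in S2] for s1 in S1])
    return 1.0 - float(d.min(axis=1).mean())

print(coarse_distance([1, 2, 3], [1, 2, 1]))   # 1/3: only the third labels differ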

In what follows, we consider the case of multiple subsets. Suppose that we sample $P$ subsets from $X$, represented as $X_1, X_2, \ldots, X_P$. Their symbolic prototypes are denoted by $S^p = \{s_1^p, s_2^p, \ldots, s_c^p\}$, $p = 1, 2, \ldots, P$. By using (4), we first measure the stability of the information granules for any pair of subsets, that is,

$\sigma_{pq} = \sigma(X_p, X_q), \quad p, q = 1, 2, \ldots, P, \; p \neq q$   (5)

Obviously, (5) could be regarded as a more detailed (indexed) version of (4). The overall stability of the information granules over all the subsets is defined as

$\bar{\sigma} = \frac{2}{P(P-1)} \sum_{p=1}^{P-1} \sum_{q=p+1}^{P} \sigma_{pq}$   (6)

which could be regarded as the average of the stability values over all the related pairs of subsets.
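The overall index (6) then averages the pairwise values; a minimal sketch, assuming the stability helper from the previous sketch is in scope:

from itertools import combinations

def overall_stability(S_list):
    """Average of the pairwise stability values over all P(P-1)/2 subset pairs."""
    pairs = list(combinations(range(len(S_list)), 2))
    return sum(stability(S_list[p], S_list[q]) for p, q in pairs) / len(pairs)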

We use the four data structures shown in Figure 2 to illustrate the calculation of the stability of the data structures. Here, we have $c = 3$ and $P = 4$. Besides, to make it clear, we use $s_i^p$ to represent the $i$th prototype of the $p$th data set, $i = 1, 2, 3$ and $p = 1, 2, 3, 4$. We show the process of obtaining the stability index between any two data sets in the columns of Table 1. For example, when considering the structures of $X_1$ and $X_2$ in Figures 2(a) and 2(b), by calculating the distance between $s_1^1$ and every symbolic prototype in $S^2$, we match $s_1^1$ with $s_1^2$. Similarly, $s_2^1$ is matched with $s_2^2$, and $s_3^1$ is matched with $s_3^2$. Afterwards, with (5), we have $\sigma_{12} = 1$, which demonstrates the high stability between the structures of $X_1$ and $X_2$. By the same process, the stability index values between the other data pairs are also obtained. Note that in Table 1, an entry listing two prototypes means that the given prototype can be matched with either of them because their distances to it are the same. Now, if we consider the stability among the four data structures, the overall value is obtained with (6). Since the stability values are always located in [0, 1], the obtained values generally reflect our intuition about the stability among the data structures.

4. Linguistic Summarization of Information Granules

With the same notation used in the previous section, the symbolic data structure is represented as $S = \{s_1, s_2, \ldots, s_c\}$, where $s_i = (s_{i1}, s_{i2}, \ldots, s_{in})$. In this section, based on the concept of linguistic summarization [40, 41], we propose the horizontal mode of abstraction, in the sense that a higher-level qualitative description is applied to each of the formed symbolic prototypes to qualitatively describe the values of a group of features.

For example, focusing on the symbolic prototype $s_i$, we could describe it by a sentence like the following: "Most attributes of information granule $s_i$ assume high values."

Note that two linguistic terms in italic bold font are used in the above sentence. The term most is a linguistic "quantifier" of the number of attributes that satisfy a certain condition (status), while the term high is a linguistic "descriptor" of the symbol (label) of a certain feature. Obviously, instead of most, quantifiers like a few, around half, etc., could also be used. These quantifiers are intrinsically fuzzy sets defined on the interval [1, n]; for example, the term most could be illustrated as the fuzzy set in Figure 3(a). As for the descriptors, terms like low and median could be used; these terms are intrinsically fuzzy sets defined on the interval [1, c]; for example, the term low could be illustrated as the fuzzy set in Figure 3(b). We may use Q and D, respectively, to represent the set of available linguistic quantifiers and the set of possible linguistic descriptors.

For simplicity, the linguistic characterization of an information granule (i.e., the symbolic prototype $s_i$) could be represented as a formula of the form

"$q$ attributes of information granule $s_i$ assume $d$ values" ($\tau$)   (7)

where $q \in Q$, $d \in D$, and $\tau \in [0, 1]$ indicates to what degree the summarization is valid.

To get the value of $\tau$, we follow the straightforward idea initially provided in [40], although many alternative methods could be found in the literature. Suppose we focus on a certain linguistic descriptor $d$ and quantifier $q$; the detailed procedure is given as follows:

Step 1. For each symbol $s_{ij}$, calculate its membership degree to the linguistic descriptor $d$ as $\mu_d(s_{ij})$.

Step 2. Denote by $r = \sum_{j=1}^{n} \mu_d(s_{ij})$ (8) the cumulated number of features that satisfy the given descriptor, where $r$ is similar to the $\sigma$-count.

Step 3. Determine the membership degree of $r$ with respect to the quantifier $q$ as $\tau = \mu_q(r)$.

Since $|Q|$ quantifiers and $|D|$ descriptors are used, we finally obtain $|Q| \times |D|$ pieces of linguistic summarizations, each attached with its corresponding validity value. Then, the summarizations whose values of validity exceed some given threshold $\lambda$, say 0.8, could be retained due to their high reliability.

Example 1. We use an example to illustrate how the validity degree of a summarization is calculated and what the summarization results look like when multiple descriptors and multiple quantifiers are used. Suppose we are provided with a symbolic prototype $s_i$ with the number of clusters $c = 5$; the descriptors used are {low, intermediate, high}, and the quantifiers are {a few, around half, most}. The corresponding membership functions of these linguistic terms are provided by users and are shown in Figure 4. Here, for simplicity, only piecewise linear membership functions are considered, and we only show the formulas for the membership functions of the three descriptors as (9)-(11); the formulas for the quantifiers are obtained in a similar way (replacing $c$ by $n$).

Descriptor low for a label $x \in [1, c]$:

$\mu_{\mathrm{low}}(x) = \max\left(0, \; \frac{(c+1)/2 - x}{(c+1)/2 - 1}\right)$   (9)

Descriptor intermediate for a label $x \in [1, c]$:

$\mu_{\mathrm{intermediate}}(x) = \max\left(0, \; 1 - \frac{|x - (c+1)/2|}{(c+1)/2 - 1}\right)$   (10)

Descriptor high for a label $x \in [1, c]$:

$\mu_{\mathrm{high}}(x) = \max\left(0, \; \frac{x - (c+1)/2}{c - (c+1)/2}\right)$   (11)

Now, let us focus on the summarization "most attributes of information granule $s_i$ assume low values" for the given symbolic prototype. We follow the steps mentioned above to get the validity degree of this sentence.

Step 1. We calculate the membership degree of each label $s_{ij}$ to the linguistic descriptor low with (9).

Step 2. We sum the obtained membership values over all the elements of the symbolic prototype, resulting in the $\sigma$-count $r$.

Step 3. With the membership function of most in Figure 4, we easily obtain the validity degree $\tau = \mu_{\mathrm{most}}(r)$.

By exploring all the possible descriptors and quantifiers, we finally obtain 9 pieces of summarizations, along with their validity degrees, as in Table 2. All the linguistic terms used for the descriptors and quantifiers are shown in italics. We sort the summarizations in descending order of the validity degree. One may set a threshold beyond which the linguistic summarizations are retained or simply claim that the top (say, three) summarizations are retained. Throughout this paper, we adopt the latter strategy; hence, in this example, we retain the first three linguistic summarizations (with their values of validity degree shown in boldface). These summarizations are generally consistent with our intuition about the symbolic prototype.
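The whole procedure is easy to script. Below is a self-contained sketch of the three-step computation for all nine quantifier/descriptor pairs; the piecewise-linear memberships follow our reconstruction of (9)-(11) (Figure 4 is not reproduced here), and the prototype s is an assumed example rather than the one used in this example.

import numpy as np

c, n = 5, 5
mid = (c + 1) / 2.0                      # anchor of the label scale 1..c
descriptors = {                          # fuzzy sets on the labels, per (9)-(11)
    "low":          lambda x: np.clip((mid - x) / (mid - 1), 0.0, 1.0),
    "intermediate": lambda x: np.clip(1 - np.abs(x - mid) / (mid - 1), 0.0, 1.0),
    "high":         lambda x: np.clip((x - mid) / (c - mid), 0.0, 1.0),
}
qmid = (n + 1) / 2.0                     # same shapes on the attribute count 1..n
quantifiers = {
    "a few":       lambda r: float(np.clip((qmid - r) / (qmid - 1), 0.0, 1.0)),
    "around half": lambda r: float(np.clip(1 - abs(r - qmid) / (qmid - 1), 0.0, 1.0)),
    "most":        lambda r: float(np.clip((r - qmid) / (n - qmid), 0.0, 1.0)),
}

s = np.array([1, 2, 1, 5, 1])            # assumed symbolic prototype (n = 5 labels)
results = []
for dname, mu_d in descriptors.items():
    r = float(mu_d(s.astype(float)).sum())          # Steps 1-2: the sigma-count
    for qname, mu_q in quantifiers.items():
        tau = mu_q(r)                               # Step 3: validity degree
        results.append((tau, f"{qname} attributes assume {dname} values"))
for tau, text in sorted(results, reverse=True)[:3]: # retain the top three
    print(f"{text} ({tau:.2f})")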

5. Experimental Studies

In this section, we check the stability of the information granules and see the performance of the linguistic summarizations on some publicly available IoT data.

5.1. Stability of Information Granules

We use some publicly available IoT data sets to demonstrate the stability of information granules. These data sets could be found in either the UCI machine learning repository (https://archive.ics.uci.edu/ml/index.php) or the KEEL data repository (https://sci2s.ugr.es/keel/datasets.php). We list the basic information of these data sets in Table 3. From each given data set, a sampling rate $\gamma$ is used to get $P$ subsets, and we are interested in how the structures among these subsets change (the stability of these structures) with the increasing number of clusters $c$. The flow of our experiments is as follows: the FCM algorithm is performed on each of the subsets to get the numeric prototypes; these prototypes are transformed into the symbolic prototypes according to the method introduced in Section 2, accompanied by the process of getting the value of the stability index proposed in Section 3. The sequential process of FCM clustering, symbolic prototype extraction, and stability index calculation is repeated $K$ times (i.e., the $K$-fold experiment) so that the mean and standard deviation of the stability index are obtained. During all the experiments, the fuzzification coefficient $m$ is set as 2, $K$ is set as 10, and $c$ ranges from 2 to 50 with a step size of one.
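A sketch of this experimental flow, assuming the fcm, symbolize, and overall_stability helpers sketched in the earlier sections are importable (all names are ours):

import numpy as np

def run_trial(X, c, P=2, rate=1.0, m=2.0, seed=None):
    rng = np.random.default_rng(seed)
    S_list = []
    for _ in range(P):
        idx = rng.choice(len(X), size=int(rate * len(X)), replace=False)
        _, V = fcm(X[idx], c, m=m)        # numeric prototypes of one subset
        S_list.append(symbolize(V))       # vertical abstraction (Section 2)
    return overall_stability(S_list)      # stability index (Section 3)

def stability_curve(X, c_values, K=10, **kw):
    """Mean and standard deviation of the index over K repetitions per c."""
    out = {}
    for c in c_values:
        vals = [run_trial(X, c, **kw) for _ in range(K)]
        out[c] = (np.mean(vals), np.std(vals))
    return out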

5.1.1. Identical Subsets

We start the experiment with the simplest case; i.e., we set $P = 2$ and $\gamma = 1$. In other words, for a given data set, the two sampled subsets are identical to the given data set. The trends of the stability of the structures of the two subsets with the increasing number of clusters are shown in Figure 5. Obviously, for each data set, three phases of the trend could be identified. The structures of the two subsets remain quite stable when the number of clusters is small (e.g., around 5), because the value of the stability index is equal to or quite close to one. Those high values of stability are highlighted with the solid ellipses. Then, the stability of the structures drops significantly with the increasing cluster number; these trends are highlighted by the dotted ellipses. Afterwards, the trends generally remain at a low level, as illustrated by the dashed ellipses. Now, if we consider more subsets, i.e., we set $P = 10$ and $\gamma = 1$, the trends of the stability of the structures are shown in Figure 6. Observations similar to the case of $P = 2$ are made.

It seems reasonable to obtain such a type of changing trend (with three phases) of the stability; we give the reasons as follows (a simple illustration with synthetic data appears in the sketch after this list).

(a) The phase of high stability. When the number of clusters is low, it is easier to obtain similar cluster prototypes. If two data subsets are similar (in this experiment, they are the same), then there is a large possibility that the obtained high-level abstractions are similar (i.e., a value of stability close to one).

(b) The phase of sharply decreasing stability. When the level of detail increases, it frequently happens that the obtained structures start to be distinct from each other (even if the two subsets are the same). We illustrate this with a simple example. Suppose that a two-dimensional synthetic data set is generated as three Gaussian clusters; the synthetic data are shown as the black dots in each plot of Figure 7. If we cluster those data into three clusters, we obtain the data structure (three prototypes denoted by the circles) in Figure 7(a). If we repeat the clustering process two more times, the derived prototypes are as shown in Figures 7(b) and 7(c), respectively. Clearly, the structures among the different runs are nearly the same. Now, if we cluster the data into nine clusters (i.e., to explore more details of the data), the obtained prototypes (derived from three repeated clustering processes on the data set) are shown in Figures 7(d) to 7(f). Note that the locations of these prototypes differ significantly among the plots. Specifically, the number of prototypes in the bottom-left cluster ranges from 2 to 4 in the repeated experiments. This example illustrates why, when more details of the data are explored, less stability among the data structures (even of the same data set) could be encountered.

(c) The phase of converging to a stable level. This phenomenon is also explainable; after all, we only considered a rough index in (3) to measure the difference between two symbolic prototypes. For instance, when the cluster number c = 3, we may have two symbolic prototypes such as (1, 2, 3) and (2, 3, 1); their distance equals one according to (3). Now suppose that c = 6 and the two symbolic prototypes are (1, 3, 5) and (2, 4, 6); we obtain the same distance as in the case where c = 3. Here, we see that although c has been increased, the distance between the pair of symbolic prototypes may remain the same, which may further lead to an unchanged value of the stability index.
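The sketch below reproduces the flavor of the Figure 7 illustration; the cluster centers and spreads are our assumptions (the original values are not reproduced here), and SciPy's k-means serves as a stand-in for FCM.

import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.5]])   # assumed centers
X = np.vstack([rng.normal(mu, 0.6, size=(200, 2)) for mu in centers])

for c in (3, 9):
    runs = [kmeans2(X, c, minit="++", seed=r)[0] for r in range(3)]
    # crude instability proxy: per-coordinate spread of the sorted prototypes
    spread = np.stack([np.sort(V, axis=0) for V in runs]).std(axis=0).mean()
    print(c, round(float(spread), 3))   # larger value -> less stable structure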

5.1.2. Different Subsets

In the former section, we only considered the case where all the subsets are identical. It is interesting to check the results when different subsets are encountered. By setting the sampling rate $\gamma$ to values other than one, we obtain different subsets. Then, the stability of the information granules among these subsets is calculated in the same way as in the previous section. Here, the point we are interested in is how many times we observe a high value of the stability (the threshold for a high value is set as 0.9) even though the subsets are different. Taking Figure 6(f) as an example, we observe four points whose index values are greater than 0.9; we then conclude that the number of cases with a high value of stability is 4. By ranging the sampling rate from 0.1 to 1, we show in Figure 8 the results obtained on three UCI data sets when the number of subsets $P$ is set as 2 and 10, respectively. The general observation is that with the increasing sampling rate (hence, the subsets become more and more similar to each other), we obtain more cases where a high stability of the information granules is achieved.

We summarize some possible insights from the experiments in this section as follows. In reality, the same phenomenon is observed by many different agents, and it is difficult for each of them to obtain all the related data. Even so, if the agents adopt the symbolic description of the structure of their own data, and as long as the number of prototypes is kept at a low level, there is a large possibility that the symbolic prototypes across the different agents are similar to each other. In other words, without utilizing outside data, one agent could still nicely grasp the essential structure of the entire phenomenon with the method of symbolic description.

5.2. Linguistic Summarization of Information Granules

In this part, we conduct experiments on two publicly available data sets, stock and CBM (see their basic information in Table 3), to comprehensively show the linguistic summarizations of the symbolic prototypes of the two data sets. For each data set, the FCM algorithm is used to cluster the data into $c$ clusters, and the numeric prototypes are transformed into the symbolic prototypes $s_1, s_2, \ldots, s_c$. These symbolic prototypes are shown as the radar plots in Tables 4 and 5. The histogram of the labels in each symbolic prototype is also provided in these tables; it is later used as an auxiliary tool to validate the formed linguistic summarizations. The linguistic descriptors and quantifiers defined in Figure 4 are still used here to form the linguistic summarizations; thus, as in the illustrative example in Section 4, there are 9 pieces of summarizations for each symbolic prototype. We sort these summarizations in descending order in terms of their values of the validity degree and report those with the three largest values in Tables 4 and 5. If we compare the obtained three linguistic summarizations with their corresponding symbolic prototype and histogram, most of the time, we find that they make sense. For example, if we check the second linguistic summarization derived on stock, i.e., "a few attributes of information granule assume intermediate values (0.56)," we find that it is consistent with the corresponding histogram because there is only one label "3," which could be regarded as a few to some extent. More detailed information on the validity degree of each linguistic summarization of a specific symbolic prototype is documented in Table 6. Here, for each data set, one may read the table column-wise; in each column, we show the values of validity degree of all the possible linguistic summarizations for each symbolic prototype. Then, we highlight the three largest values of validity degree in boldface along with their orders among all nine values of validity.

6. Conclusions

To have a better understanding of the knowledge contained in the ubiquitous quantitative (either numeric or categorical) data generated with novel techniques such as IoT, in this study, we proposed a symbolic analysis method to represent the data structure in a more human-centric manner. Specifically, two different abstraction modes have been proposed. With the vertical mode, the numeric prototypes are represented by symbolic ones: for each feature, the prototypes are arranged in ascending order and represented by labels such as 1, 2, ..., c. In this way, people could pay more attention to the relative locations of the prototypes rather than the detailed quantitative values. With the horizontal mode, each symbolic prototype is described by a linguistic summarization considering all the labels of the features; a sentence could be used to reveal the information contained in the symbolic prototype. With both modes of abstraction, we deliver a more human-centric manner of understanding the knowledge in the original raw data. Furthermore, although we focus on the symbolic description of data in this study, high-level causality analysis with symbolic rules could be an interesting research direction. Obviously, a detailed study of the "symbolic" fuzzy rule-based model [42-45] deserves future investigation.

Data Availability

The data sets could be found in either the UCI machine learning repository or the KEEL data repository.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 72001032, Grant 72002152, Grant 72071021, and Grant 71904020 and in part by China Postdoctoral Science Foundation under Grant 2020M673148.