Fuzzy Collaborative Clustering-Based Ranking Approach for Complex Objects
This paper discusses the ranking problem for complex objects, where each object is composed of patterns described by individual attribute information as well as relational information between patterns, and presents a fuzzy collaborative clustering-based ranking approach for this kind of problem. In this approach, a referential object is employed to guide the ranking process. To achieve the final ranking result, fuzzy collaborative clustering is carried out on the patterns of the referential object using the collaborative information obtained from each ranked object. Since the collaborative information of the ranking objects is represented by cluster centers and/or partition matrices, we give two forms of the proposed approach. With the aid of fuzzy collaborative clustering, the ranking results are obtained by comparing the difference of the referential object before and after collaboration with respect to the ranking objects. This approach differs fundamentally from previous ranking methods because of its fully collaborative clustering mechanism. Moreover, synthetic examples show that the proposed ranking algorithms are valid.
For decision makers, the goal of ranking is to discover a mechanism that induces an increasing or decreasing order over the set of objects to be ranked. The ranking problem appears in many fields, such as information retrieval, selection and evaluation [2, 3], image similarity measurement, and income distribution. During the ranking process, usually only the attribute information is considered. This attribute information may be given by fuzzy numbers [6, 7], intuitionistic fuzzy numbers, or interval-valued intuitionistic fuzzy numbers. Generally speaking, in these cases the ranking problem can be regarded as a multiple criteria decision making problem [10, 11], and the detailed decision making process can be carried out by many approaches [11–13].
Most ranking algorithms receive attribute information and output a real-valued sequence as the ranking result. Sometimes, besides the attribute information, the relational information of the patterns is also provided for ranking, especially in the field of machine learning [14–16]. Up to now, there have been several developments in theory and algorithms for learning over relational information. For example, Agarwal developed an algorithmic framework for learning ranking functions on graph data; Mihalcea presented an unsupervised method for automatic sentence extraction using graph-based ranking algorithms; Lee et al. focused on the flexibility of recommendation and proposed a graph-based multidimensional recommendation method; Agarwal considered graph learning problems through ranking objects relative to one another; and so on [21, 22]. Regardless of the problem investigated or the algorithm constructed, the patterns to be sorted are in general described by only one kind of information: either attribute information or relational information.
Obviously, the above-mentioned methods concentrate on attribute information or relational information separately. In practice, this is not sufficient when the attribute information and the relational information of the patterns are available at the same time. In fact, many ranking problems need to take both types of information into consideration in order to improve the degree of knowledge recognition.
Bearing this in mind, in this paper we discuss the ranking problem for complex objects. Each complex object consists of simple patterns that are described by attribute information as well as by the relational information between patterns. Because of the complexity of the objects, and inspired by the well-known TOPSIS method, our ranking approach introduces a referential object in advance, where the number of patterns of the referential object is the same as that of each complex object. To accomplish the ranking, a fuzzy collaborative clustering mechanism is proposed and carried out on the patterns of the referential object using the collaborative information. The collaborative information refers to the partition matrices and/or cluster centers of the complex objects, and it can be exchanged freely during the ranking process. The essence of the ranking algorithm is to compare the difference of the referential object before and after collaboration with respect to the ranking objects; throughout the ranking process, only the collaborative information participates in the sequence calculation.
The remainder of this paper is organized as follows. In Section 2, we recall some basic concepts, such as the mathematical description of complex objects and fuzzy collaborative clustering. In Section 3, we discuss the ranking problem for complex objects when the collaborative information consists of cluster centers derived from the corresponding complex objects by some clustering approach. In Section 4, the partition matrices are regarded as the collaborative information and the corresponding ranking approach is constructed. In Section 5, some synthetic examples are simulated to illustrate the validity of the proposed ranking approaches. Finally, Section 6 concludes this paper.
In this section, we first introduce the concept of complex objects and then briefly review the fuzzy collaborative clustering algorithm. For simplicity, we omit the presentation of the fuzzy c-means clustering algorithm [24–26], as well as other associated research topics [27–29].
2.1. Mathematical Description of Complex Objects
Mathematically, a complex object can be expressed as a tuple $X = (P, R)$, where $P = \{p_1, p_2, \dots, p_n\}$ is a nonempty finite set whose elements $p_i$ ($i = 1, 2, \dots, n$) are patterns, $R = \{r_{ij}\}$ is a relational information set, and $r_{ij}$ ($i, j = 1, 2, \dots, n$) is the possible relational information between the patterns $p_i$ and $p_j$.
Formally, if $p_i$ is nothing but a symbolic representation of the $i$th pattern, then only the relational information set $R$ can be used during the process of data analysis. Otherwise, the pattern can be expressed as $p_i = (p_{i1}, p_{i2}, \dots, p_{is})$, where $p_{ij}$ represents the value of $p_i$ with respect to the attribute $a_j$. In general, the value $p_{ij}$ need not be a real number; it may also be a set, an interval, or a fuzzy number [32, 33], and so forth.
Certainly, the foregoing data representation formats are also suitable for the entries of the relational information set $R$. In particular, the relationship between any two patterns may be described by more than one relation; that is, $r_{ij} = (r_{ij}^1, r_{ij}^2, \dots, r_{ij}^q)$ with $q > 1$, where $r_{ij}^t$ describes the $t$th relationship between the patterns. Notice that, in this paper, we adhere to the hypothesis that $q = 1$, that each $r_{ij}$ is a real number belonging to the unit interval $[0, 1]$, and that $r_{ij} = r_{ji}$ for every complex object.
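As a concrete illustration, a complex object with real-valued attributes and a single symmetric relation taking values in the unit interval (the hypothesis adopted above) might be represented as follows; the class and field names are our own and purely illustrative.

```python
import numpy as np

class ComplexObject:
    """A complex object: n patterns with attribute vectors plus a relation matrix.

    attributes: (n, s) array, row i holds the attribute values of pattern p_i.
    relations:  (n, n) array with entries in [0, 1]; relations[i, j] is the
                relational information between patterns p_i and p_j.
    """
    def __init__(self, attributes, relations):
        self.attributes = np.asarray(attributes, dtype=float)
        self.relations = np.asarray(relations, dtype=float)
        n = self.attributes.shape[0]
        # Enforce the hypothesis assumed above: square, symmetric, entries in [0, 1].
        assert self.relations.shape == (n, n)
        assert np.allclose(self.relations, self.relations.T)
        assert self.relations.min() >= 0.0 and self.relations.max() <= 1.0

# Three patterns with two attributes each and a symmetric relation matrix.
obj = ComplexObject(
    attributes=[[0.1, 0.2], [0.4, 0.1], [0.9, 0.8]],
    relations=[[1.0, 0.7, 0.1],
               [0.7, 1.0, 0.2],
               [0.1, 0.2, 1.0]],
)
```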
2.2. Fuzzy Collaborative Clustering
It is well known that the nature of clustering analysis is to find the potential structure of the data. If one wants to search for the common structure of several different data sets whose information cannot be shared freely, the collaborative clustering algorithm [34, 35] plays an important role.
Collaborative clustering can be divided into horizontal collaborative clustering, vertical collaborative clustering, and hybrid collaborative clustering. Because we have assumed that the numbers of patterns of all ranking objects are equal, in what follows we use the horizontal collaborative clustering algorithm; thus we give a brief introduction to it.
Given data sets $X[1], X[2], \dots, X[P]$, each containing $N$ patterns, the collaboration-based objective function for $X[ii]$ can be constructed as
$$Q[ii] = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2}[ii]\, d_{ik}^{2}[ii] + \sum_{\substack{jj=1 \\ jj \neq ii}}^{P} \alpha[ii, jj] \sum_{i=1}^{c}\sum_{k=1}^{N} \big(u_{ik}[ii] - u_{ik}[jj]\big)^{2} d_{ik}^{2}[ii], \tag{1}$$
where $d_{ik}[ii]$ is the distance between the pattern $x_k$ and the cluster center $v_i$ in $X[ii]$, $u_{ik}[ii]$ is the membership of $x_k$ in the $i$th cluster, and the nonnegative parameter $\alpha[ii, jj]$ represents the collaborative degree of $X[jj]$ to $X[ii]$.
The concrete optimization of the objective function in (1) proceeds in the same way as in the fuzzy c-means clustering algorithm [24–26] or the partially supervised fuzzy clustering algorithm [36, 37], namely by applying the Lagrange multiplier method. The resulting partition matrix can be expressed as
$$u_{st}[ii] = \varphi_{st}[ii] + \frac{1 - \sum_{j=1}^{c} \varphi_{jt}[ii]}{\sum_{j=1}^{c} d_{st}^{2}[ii] \big/ d_{jt}^{2}[ii]}, \qquad \varphi_{st}[ii] = \frac{\sum_{jj \neq ii} \alpha[ii, jj]\, u_{st}[jj]}{1 + \sum_{jj \neq ii} \alpha[ii, jj]}, \tag{2}$$
where $s = 1, \dots, c$ and $t = 1, \dots, N$. Similarly, the cluster centers can be expressed as
$$v_{s}[ii] = \frac{\sum_{k=1}^{N} \phi_{sk}[ii]\, x_{k}}{\sum_{k=1}^{N} \phi_{sk}[ii]}, \qquad \text{where } \phi_{sk}[ii] = u_{sk}^{2}[ii] + \sum_{jj \neq ii} \alpha[ii, jj] \big(u_{sk}[ii] - u_{sk}[jj]\big)^{2}. \tag{3}$$
In fact, the process of collaborative clustering can be partitioned into two phases. In the first stage, the fuzzy c-means clustering algorithm is executed for each data set separately, which yields a partition matrix and cluster centers for every data set. In the second stage, the collaborated partition matrices are computed by (2) and the cluster centers by (3).
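The two-phase process can be sketched in Python as follows. This is a minimal illustration for fuzzifier $m = 2$ and two data sets, not the authors' implementation; the collaborated membership and center updates follow the general form of horizontal collaborative clustering, and all function names are ours.

```python
import numpy as np

def fcm(X, c, n_iter=100, seed=0, eps=1e-9):
    """Phase 1: standard fuzzy c-means (fuzzifier m = 2); returns (U, V)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)               # rows sum to 1
    for _ in range(n_iter):
        W = U ** 2
        V = (W.T @ X) / W.sum(axis=0)[:, None]      # cluster centers
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + eps
        inv = 1.0 / d2
        U = inv / inv.sum(axis=1, keepdims=True)    # membership update for m = 2
    return U, V

def collaborate(X, U, U_other, alpha, n_iter=20, eps=1e-9):
    """Phase 2: refine (U, V) of one data set using the partition matrix of
    another, weighted by the nonnegative collaboration degree alpha.
    Assumes the cluster labels of the two partitions are aligned."""
    for _ in range(n_iter):
        # Centers weighted by u^2 plus the collaboration-induced weights.
        phi = U ** 2 + alpha * (U - U_other) ** 2
        V = (phi.T @ X) / phi.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + eps
        inv = 1.0 / d2
        g = alpha * U_other / (1.0 + alpha)          # pull toward the partner's memberships
        U = g + (1.0 - g.sum(axis=1, keepdims=True)) * inv / inv.sum(axis=1, keepdims=True)
    return U, V

rng = np.random.default_rng(1)
X1 = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(1, 0.1, (10, 2))])
X2 = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(1, 0.1, (10, 2))])
U1, V1 = fcm(X1, c=2)
U2, V2 = fcm(X2, c=2, seed=3)
U1c, V1c = collaborate(X1, U1, U2, alpha=0.5)       # X2 collaborates with X1
```

Note that the collaborated memberships remain a valid fuzzy partition: each row is nonnegative and sums to one by construction.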
3. Cluster Center-Based Ranking Approach for Complex Objects
Here we pay attention to the construction of the ranking algorithm when the collaborative information of each complex object is provided in terms of cluster centers. We suppose that all ranking objects $X_1, X_2, \dots, X_M$ have the same number of patterns, as does the referential object $X_0$.
In the approach given in this section, there are three kinds of collaborative information, producing corresponding cluster center matrices: if it is derived from the attribute information of the complex object $X_l$, we denote the collaborative information by $\tilde{V}_a[l]$; if it is derived from the relational information of $X_l$, we denote it by $\tilde{V}_r[l]$; and if it is derived from both the attribute information and the relational information of $X_l$, we denote it by $\tilde{V}_{ar}[l]$. It should be noted that if the collaborative information is composed of two unrelated parts (one derived from attribute information and the other from relational information), then the pair $(\tilde{V}_a[l], \tilde{V}_r[l])$ arises naturally. Certainly, the information contained in these matrices can be combined and embedded in collaborative fuzzy c-means clustering.
With the above description, one knows that the collaborative information has at least three possible representation types, and different types carry different meanings. Whichever type the collaborative information belongs to, once the collaborative mechanism is introduced to rank the objects $X_l$ for $l = 1, 2, \dots, M$, the objective function of $X_0$ collaborated by $X_l$ can be constructed as follows:
$$Q_l = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2}\, d_{ik}^{2} + \sum_{t \in \{a, r, ar\}} \beta_t\, \delta_t \sum_{i=1}^{c} D^{2}\big(v_i, \tilde{v}_i^{t}[l]\big), \tag{5}$$
where $d_{ik}$ is the distance between the pattern $x_k$ and the center $v_i$, and $\delta_t$ is a label function with respect to the collaborative information. Moreover, $D(v_i, \tilde{v}_i^{t}[l])$ is the distance between $v_i$ and the collaborative information $\tilde{v}_i^{t}[l]$:
$$D^{2}\big(v_i, \tilde{v}_i^{t}[l]\big) = \big\| v_i - \tilde{v}_i^{t}[l] \big\|^{2}. \tag{6}$$
Obviously, the second term in (5) describes the collaboration influence of $X_l$ on $X_0$, and the parameter $\beta_t$ can be viewed as the collaborative coefficient, which is relevant to the type of the collaborative information. Meanwhile, (6) represents the distance between a cluster center of the referential object and the corresponding cluster center of the complex object $X_l$.
Remark 1. The collaborative coefficient can be determined from the type of the collaborative information:
(i) If the collaborative information is $\tilde{V}_a[l]$, then $\beta_r = 0$ and $\beta_{ar} = 0$.
(ii) If the collaborative information is $\tilde{V}_r[l]$, then $\beta_a = 0$ and $\beta_{ar} = 0$.
(iii) If the collaborative information is $\tilde{V}_{ar}[l]$, then $\beta_a = 0$ and $\beta_r = 0$.
(iv) If the collaborative information is the pair $(\tilde{V}_a[l], \tilde{V}_r[l])$, then $\beta_{ar} = 0$, but $\beta_a \neq 0$ and $\beta_r \neq 0$.
As can be seen from Remark 1, the type of collaborative information determines which components of $\beta = (\beta_a, \beta_r, \beta_{ar})$ are equal to 0, but the values of the nonzero components are not prescribed. Whichever type the collaborative information belongs to, the concrete values of $\beta$ depend entirely on the requirements of the practical ranking problem, and they can be calculated by many methods.
So far, one can regard the objective function in (5) as an optimization problem with the constraint $\sum_{i=1}^{c} u_{ik} = 1$ for every $k$. In what follows, we apply the Lagrange multiplier method to solve it. First, the Lagrange function of (5) can be written as
$$L = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2} d_{ik}^{2} + \sum_{t \in \{a,r,ar\}} \beta_t \delta_t \sum_{i=1}^{c} D^{2}\big(v_i, \tilde{v}_i^{t}[l]\big) - \sum_{k=1}^{N} \lambda_k \Big( \sum_{i=1}^{c} u_{ik} - 1 \Big). \tag{7}$$
Computing the derivative of $L$ with respect to $u_{ik}$ and setting it equal to 0, we have
$$2 u_{ik} d_{ik}^{2} - \lambda_k = 0, \tag{8}$$
that is,
$$u_{ik} = \frac{\lambda_k}{2 d_{ik}^{2}}. \tag{9}$$
On account of the constraint $\sum_{i=1}^{c} u_{ik} = 1$, the Lagrange multiplier can be determined as
$$\lambda_k = \frac{2}{\sum_{j=1}^{c} 1/d_{jk}^{2}}. \tag{10}$$
Inserting the above expression for $\lambda_k$ into (9) yields
$$u_{ik} = \frac{1}{\sum_{j=1}^{c} d_{ik}^{2} \big/ d_{jk}^{2}}. \tag{11}$$
The computation of the cluster centers is straightforward, as no constraints are imposed on them. By taking the derivative of (5) with respect to the variable $v_i$ and setting the result equal to 0, we have
$$-2 \sum_{k=1}^{N} u_{ik}^{2} \big(x_k - v_i\big) + \sum_{t \in \{a,r,ar\}} \beta_t \delta_t \frac{\partial D^{2}\big(v_i, \tilde{v}_i^{t}[l]\big)}{\partial v_i} = 0. \tag{12}$$
For the derivative of $D^{2}$, by (6), we have
$$\frac{\partial D^{2}\big(v_i, \tilde{v}_i^{t}[l]\big)}{\partial v_i} = 2 \big(v_i - \tilde{v}_i^{t}[l]\big). \tag{13}$$
Therefore, taking (13) into (12) yields
$$v_i = \frac{A_i + \sum_{t \in \{a,r,ar\}} \beta_t \delta_t\, \tilde{v}_i^{t}[l]}{B_i}, \tag{14}$$
where $A_i = \sum_{k=1}^{N} u_{ik}^{2} x_k$ and $B_i = \sum_{k=1}^{N} u_{ik}^{2} + \sum_{t \in \{a,r,ar\}} \beta_t \delta_t$.
Up to now, the partition matrix in (11) and the cluster centers in (14) can be used to describe the clustering results of the referential object $X_0$ under the collaboration of the complex object $X_l$. Notice that, throughout this section, the tilde notation denotes the cluster centers or collaborative information of the ranking object $X_l$. To avoid unnecessary ambiguity, in the rest of this subsection we write $\hat{V}[l]$ for the computing results of (14).
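A hedged sketch of the derived updates: assuming the collaboration term penalizes the squared distance between the referential centers and the supplied centers, the membership update keeps its fuzzy c-means form (cf. (11)) while each center (cf. (14)) becomes a weighted average of the data term and the collaborative center. The function name and the single collaborative coefficient `beta` are illustrative, not the paper's notation.

```python
import numpy as np

def collaborated_clustering(X0, V_tilde, beta, n_iter=50, seed=0, eps=1e-9):
    """Cluster the referential data X0 while a ranking object's cluster
    centers V_tilde (shape (c, d)) pull the referential centers toward
    them with strength beta >= 0 (fuzzifier m = 2)."""
    V_tilde = np.asarray(V_tilde, dtype=float)
    c = V_tilde.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.random((X0.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** 2
        # Center update: data term plus the collaborative pull (cf. (14)).
        V = (W.T @ X0 + beta * V_tilde) / (W.sum(axis=0)[:, None] + beta)
        # Membership update: unchanged FCM form for m = 2 (cf. (11)).
        d2 = ((X0[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + eps
        inv = 1.0 / d2
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

X0 = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
V_tilde = np.array([[0.0, 0.1], [1.0, 1.0]])
U, V = collaborated_clustering(X0, V_tilde, beta=2.0)
```

As `beta` grows, the referential centers are drawn arbitrarily close to the collaborative centers, which matches the intuition that a large collaborative coefficient lets the ranking object dominate the referential clustering.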
By the collaboration of $X_l$, one obtains the new partition matrix $\hat{U}[l]$ and cluster centers $\hat{V}[l]$ of the referential object $X_0$. Certainly, if the difference between $U$ and $\hat{U}[l]$ is smaller, then $X_0$ and $X_l$ have a greater similarity in terms of the partition matrix. Similarly, if the difference between $V$ and $\hat{V}[l]$ is smaller, then $X_0$ and $X_l$ have a greater similarity in terms of the cluster centers.
Bearing this in mind, the equation
$$S_U(l) = s\big(U, \hat{U}[l]\big) \tag{15}$$
can be used to compute the similarity of the partition matrices of the referential object $X_0$ before and after collaboration, where $s(\cdot, \cdot)$ represents a similarity measure and can be calculated by many methods. Similarly, the equation
$$S_V(l) = s\big(V, \hat{V}[l]\big) \tag{16}$$
can be used to calculate the similarity of the cluster centers $V$ and $\hat{V}[l]$. By (15) and (16), the similarity of the clustering results of the referential object before and after the collaboration of $X_l$ can be determined by the following equation:
$$S(l) = \omega\, S_U(l) + (1 - \omega)\, S_V(l), \qquad \omega \in [0, 1]. \tag{17}$$
Obviously, if the similarity between $X_0$ and $X_{l_1}$ is greater than that between $X_0$ and $X_{l_2}$, then $X_{l_1}$ is nearer to $X_0$ than $X_{l_2}$ in terms of the collaborative information, and therefore the complex object $X_{l_1}$ should be sorted in front of the complex object $X_{l_2}$. The cluster center-based collaborative clustering ranking approach for complex objects is summarized in Algorithm 1.
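Algorithm 1 itself is not reproduced here, but its core loop (collaborate once per ranking object, measure how much the referential clustering moved, then sort) can be sketched as follows. The similarity $s(A, B) = 1/(1 + \|A - B\|_F)$ and the equal weighting of the two similarities are one simple choice among the many the paper allows; all names are ours.

```python
import numpy as np

def similarity(A, B):
    """A simple similarity measure: 1 / (1 + Frobenius distance)."""
    return 1.0 / (1.0 + np.linalg.norm(np.asarray(A) - np.asarray(B)))

def rank_objects(U0, V0, collaborated):
    """Rank objects by how similar the referential clustering stays after
    each object's collaboration.

    collaborated: list of (U_hat, V_hat) pairs, one per ranking object.
    Returns (order, scores): object indices sorted from most to least
    similar, so the object nearest the referential object comes first."""
    scores = [0.5 * similarity(U0, Uh) + 0.5 * similarity(V0, Vh)
              for Uh, Vh in collaborated]
    order = sorted(range(len(scores)), key=lambda l: scores[l], reverse=True)
    return order, scores

# Referential clustering result and two collaborated results: the second
# barely moved the referential clustering, the first moved it a lot.
U0 = np.array([[1.0, 0.0], [0.0, 1.0]])
V0 = np.array([[0.0, 0.0], [1.0, 1.0]])
far = (np.array([[0.5, 0.5], [0.5, 0.5]]), V0 + 2.0)
near = (U0.copy(), V0 + 0.01)
order, scores = rank_objects(U0, V0, [far, near])
```

Here the object whose collaboration left the referential clustering nearly unchanged is ranked first, which is exactly the sorting criterion stated above.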
4. Partition Matrix-Based Ranking Approach for Complex Objects
In the partition matrix-based ranking approach for complex objects, the collaborative information provided by each ranking object is the corresponding partition matrix. As before, we suppose that the referential object $X_0$ and the ranking objects $X_l$ for $l = 1, 2, \dots, M$ have the same number of patterns.
Similar to what was discussed in Section 3, there are three kinds of collaborative information, producing corresponding partition matrices: if it is derived from the attribute information of $X_l$, we denote it by $\tilde{U}_a[l]$; if it is derived from the relational information of $X_l$, we denote it by $\tilde{U}_r[l]$; and if it is derived from both the attribute information and the relational information of $X_l$, we denote it by $\tilde{U}_{ar}[l]$. Certainly, if the collaborative information is composed of two unrelated parts (one derived from attribute information and the other from relational information), then the pair $(\tilde{U}_a[l], \tilde{U}_r[l])$ arises naturally.
Let $c$ denote the cluster number of the referential object and let $c_a[l]$, $c_r[l]$, and $c_{ar}[l]$ denote the cluster numbers underlying $\tilde{U}_a[l]$, $\tilde{U}_r[l]$, and $\tilde{U}_{ar}[l]$, respectively; these numbers need not coincide with $c$. Because of the complexity of the representation of the collaborative information, we first discuss $\tilde{U}_a[l]$, $\tilde{U}_r[l]$, and $\tilde{U}_{ar}[l]$ separately. By the description of collaborative clustering, if the cluster number of the collaborative partition matrix equals $c$, then the objective function can be constructed as follows:
$$Q_l = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2} d_{ik}^{2} + \eta \sum_{i=1}^{c}\sum_{k=1}^{N} \big(u_{ik} - \tilde{u}_{ik}[l]\big)^{2} d_{ik}^{2}, \tag{18}$$
where $\eta$ is the decision maker's confidence in the collaborative information.
Obviously, the second term in (18) is the collaboration of $X_l$ to $X_0$, and $\eta$ can be viewed as the collaborative coefficient. Note that the mechanism of the parameter $\eta$ is the same as that of the parameter $\beta$ in Section 3.
If the cluster number of the collaborative partition matrix differs from $c$, the objective function in (18) becomes invalid. By taking into consideration the fact that each row of the partition matrix represents a classification distribution, as well as the concept of proximity proposed in [39, 40], (18) can be rewritten as follows:
$$Q_l = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2} d_{ik}^{2} + \eta \sum_{k_1=1}^{N}\sum_{k_2=1}^{N} \Big(\mathrm{Prox}(k_1, k_2) - \widetilde{\mathrm{Prox}}[l](k_1, k_2)\Big)^{2}, \tag{19}$$
where the proximity induced by a partition matrix is computed by the formula $\mathrm{Prox}(k_1, k_2) = \sum_{i} \min\big(u_{ik_1}, u_{ik_2}\big)$. Under the constraint $\sum_{i=1}^{c} u_{ik} = 1$, the objective function in (19) can be solved by the Lagrange multiplier method.
Next, we discuss how to construct the objective function when the collaborative information is the pair $(\tilde{U}_a[l], \tilde{U}_r[l])$. By taking both parts into consideration, the objective function can be constructed as
$$Q_l = \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2} d_{ik}^{2} + \eta_a \delta_a\, D_a + \eta_r \delta_r\, D_r, \tag{20}$$
where $\delta_a$ is a label function with respect to $\tilde{U}_a[l]$ and $\delta_r$ is a label function with respect to $\tilde{U}_r[l]$. Moreover, $D_a$ can be defined as
$$D_a = \sum_{i=1}^{c}\sum_{k=1}^{N} \big(u_{ik} - \tilde{u}_{ik}^{a}[l]\big)^{2} d_{ik}^{2}, \tag{21}$$
and $D_r$, which represents the proximity-based distance between the objects, can be defined as
$$D_r = \sum_{k_1=1}^{N}\sum_{k_2=1}^{N} \Big(\mathrm{Prox}(k_1, k_2) - \widetilde{\mathrm{Prox}}_r[l](k_1, k_2)\Big)^{2}, \tag{22}$$
where the proximities are determined by the formula $\mathrm{Prox}(k_1, k_2) = \sum_{i} \min\big(u_{ik_1}, u_{ik_2}\big)$.
Remark 2. Similarly, (18)–(22) describe the collaboration of the complex object $X_l$ to the referential object $X_0$. The collaborative coefficients $\eta_a$ and $\eta_r$ and the label functions $\delta_a$ and $\delta_r$ can be determined from the type of the collaborative information:
(i) If the collaborative information is $\tilde{U}_a[l]$ and its cluster number equals $c$, then only the direct collaboration term as in (18) is present; otherwise only the proximity-based term as in (19) is present.
(ii) If the collaborative information is $\tilde{U}_r[l]$, the same rule applies according to whether its cluster number equals $c$.
(iii) If the collaborative information is $\tilde{U}_{ar}[l]$, the same rule likewise applies.
(iv) If the collaborative information is the pair $(\tilde{U}_a[l], \tilde{U}_r[l])$, then the four possible combinations of the two cluster numbers with $c$ determine, for each of the two collaboration terms, whether it takes the direct form or the proximity-based form.
Evidently, the type of collaborative information has a critical influence on the value selection of the parameters $\eta_a$ and $\eta_r$. Whichever type it belongs to, the parameters can be predetermined or provided by the decision makers during the ranking process.
Just as described in the foregoing sections, we next employ the Lagrange multiplier method to solve the ranking problem based on (20). Let the Lagrange function of (20) be
$$L = Q_l - \sum_{k=1}^{N} \lambda_k \Big( \sum_{i=1}^{c} u_{ik} - 1 \Big), \tag{23}$$
where $\lambda_k$ is the Lagrange multiplier.
Computing the derivative of $L$ with respect to $u_{ik}$ and setting it equal to 0, we have
$$2 u_{ik} d_{ik}^{2} + \eta_a \delta_a \frac{\partial D_a}{\partial u_{ik}} + \eta_r \delta_r \frac{\partial D_r}{\partial u_{ik}} - \lambda_k = 0. \tag{24}$$
In addition, taking (21) into the term $\partial D_a / \partial u_{ik}$, we have
$$\frac{\partial D_a}{\partial u_{ik}} = 2 \big(u_{ik} - \tilde{u}_{ik}^{a}[l]\big) d_{ik}^{2}. \tag{25}$$
Similarly, by computing, we have
$$\frac{\partial D_r}{\partial u_{ik}} = 2 \sum_{k' \in T_{ik}} \Big(\mathrm{Prox}(k, k') - \widetilde{\mathrm{Prox}}_r[l](k, k')\Big), \tag{26}$$
where $T_{ik}$ is an index set described by $k' \in T_{ik}$ if and only if $u_{ik} \leq u_{ik'}$, so that $\min(u_{ik}, u_{ik'}) = u_{ik}$; otherwise the corresponding derivative vanishes. Taking (25) and (26) into (24), we have
$$2 u_{ik} d_{ik}^{2} \big(1 + \eta_a \delta_a\big) = \lambda_k + A_{ik}, \tag{27}$$
where
$$A_{ik} = 2 \eta_a \delta_a\, \tilde{u}_{ik}^{a}[l]\, d_{ik}^{2} - 2 \eta_r \delta_r \sum_{k' \in T_{ik}} \Big(\mathrm{Prox}(k, k') - \widetilde{\mathrm{Prox}}_r[l](k, k')\Big). \tag{28}$$
Obviously, (27) can be rewritten as
$$u_{ik} = \frac{\lambda_k + A_{ik}}{2 d_{ik}^{2} \big(1 + \eta_a \delta_a\big)}. \tag{29}$$
By taking the above equation into the constraint $\sum_{i=1}^{c} u_{ik} = 1$, we have
$$\lambda_k = \frac{1 - \sum_{j=1}^{c} A_{jk} \big/ \big(2 d_{jk}^{2} (1 + \eta_a \delta_a)\big)}{\sum_{j=1}^{c} 1 \big/ \big(2 d_{jk}^{2} (1 + \eta_a \delta_a)\big)}. \tag{30}$$
Taking (30) into (29), we have
$$u_{ik} = \frac{A_{ik}}{2 d_{ik}^{2} \big(1 + \eta_a \delta_a\big)} + \frac{1 - \sum_{j=1}^{c} A_{jk} \big/ \big(2 d_{jk}^{2} (1 + \eta_a \delta_a)\big)}{\sum_{j=1}^{c} d_{ik}^{2} \big/ d_{jk}^{2}}. \tag{31}$$
The computation of the cluster centers is straightforward, as no constraints are imposed on them. By taking the derivative of (20) with respect to the variable $v_i$ and setting the result equal to 0, we have
$$\sum_{k=1}^{N} \Big( u_{ik}^{2} + \eta_a \delta_a \big(u_{ik} - \tilde{u}_{ik}^{a}[l]\big)^{2} \Big) \big(x_k - v_i\big) = 0, \tag{32}$$
which yields
$$v_i = \frac{\sum_{k=1}^{N} \Big( u_{ik}^{2} + \eta_a \delta_a \big(u_{ik} - \tilde{u}_{ik}^{a}[l]\big)^{2} \Big) x_k}{\sum_{k=1}^{N} \Big( u_{ik}^{2} + \eta_a \delta_a \big(u_{ik} - \tilde{u}_{ik}^{a}[l]\big)^{2} \Big)}. \tag{33}$$
Up to now, the partition matrix determined by (31) and the cluster centers determined by (33) can be applied to describe the clustering results of the referential object $X_0$ under the collaboration of the complex object $X_l$. Notice that, throughout this section, the tilde notation denotes the partition matrix or the collaborative information of the complex object $X_l$. To avoid unnecessary ambiguity, in the rest of this subsection we write $\hat{U}[l]$ for the computing results of (31).
Similarly, the similarity between the clustering results of the referential object before and after the collaboration of $X_l$ can be determined by an equation analogous to (17). The partition matrix-based collaborative clustering ranking approach for complex objects is summarized in Algorithm 2.
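When two partition matrices use different cluster numbers or an unknown cluster correspondence, comparing their entries directly is meaningless; the proximity concept cited above sidesteps this. A sketch, assuming the common proximity definition $\mathrm{Prox}(k_1, k_2) = \sum_i \min(u_{ik_1}, u_{ik_2})$, which is invariant to relabeling the clusters (function names are ours):

```python
import numpy as np

def proximity(U):
    """Proximity matrix induced by a partition matrix U of shape (n, c):
    Prox[k1, k2] = sum_i min(U[k1, i], U[k2, i]).  It is symmetric, has a
    unit diagonal (each row of U sums to 1), and does not depend on the
    cluster labels, so partitions with different c become comparable."""
    return np.minimum(U[:, None, :], U[None, :, :]).sum(axis=2)

def proximity_similarity(U1, U2):
    """Similarity of two partitions through their proximity matrices."""
    return 1.0 / (1.0 + np.linalg.norm(proximity(U1) - proximity(U2)))

# The same hard partition written with permuted cluster labels: the
# proximity matrices coincide, so the similarity is maximal.
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
U_perm = U[:, ::-1]
```

This label-invariance is the reason proximity matrices, rather than raw partition matrices, are compared when the cluster numbers of the referential and ranking objects do not match.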
5. Experimental Studies
In this section, we carry out an experimental analysis of the proposed ranking approaches for complex objects by simulating some data sets.
5.1. Example of Cluster Center-Based Ranking Approach
Given ranking objects $X_l$ for $l = 1, 2, \dots, M$ and a referential object $X_0$, Figures 1 and 2 show the attribute information and the partition matrix of $X_0$, respectively. The referential object has three cluster centers $v_1$, $v_2$, and $v_3$. The collaborative information of the complex objects $X_l$ is listed in Table 1.
Since the cluster number of the referential object is $c = 3$ and the collaborative information consists of cluster centers, the corresponding case of Remark 1 holds. For the determination of the collaborative coefficient $\beta$, we use a formula based on the collaborative information, from which the parameter value for each ranking object is obtained.
According to the description of Algorithm 1, the similarity values of the ranking objects are computed. Hence, one ranking sequence is obtained if only the cluster centers are taken into consideration, another is obtained if only the partition matrix is taken into consideration, and a third is obtained by following the full route of Algorithm 1.
5.2. Example of Partition Matrix-Based Ranking Approach
Because the cluster number of the referential object and the cluster numbers of the collaborative partition matrices do not all coincide, both forms of the objective function are needed, and the coefficients $\eta_a$ and $\eta_r$ must be determined. For their determination, we apply formulas analogous to those of Section 5.1, and the label functions $\delta_a$ and $\delta_r$ are set accordingly. By computing, the new partition matrix and cluster centers of the referential object are listed in Figure 5 and Table 3.
Up to now, by computing, the similarity values are obtained. Similarly, one ranking sequence results if only the partition matrix is considered, another if only the cluster centers are considered, and a third if Algorithm 2 is applied to compute the possible sequence of the complex objects.
In this paper, the ranking problem for complex objects has been discussed carefully by considering the merits of collaborative clustering and the privacy of information. During the construction of the ranking algorithm, a referential object is proposed in advance, and the collaborative information of the complex objects is provided by means of partition matrices and/or cluster centers. Whatever the type of the collaborative information, that is, whether it is a partition matrix or a cluster center, we regard the task as an objective-function-based optimization problem and solve it by the Lagrange multiplier method, after which the complex objects are sorted by comparing the difference of the referential object before and after collaborative clustering with respect to the ranking objects. To illustrate the validity of the proposed ranking algorithms, we constructed synthetic examples, and the experimental results show that the proposed algorithms are valid.
It is well known that, whatever result a ranking algorithm leads to, the topic of ranking is a soft decision making problem. In future work, we will pursue further analysis of the ranking approaches for complex objects. Another interesting issue is how to apply the proposed ranking methods for data with relational information to large and sparse data in an effective way.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work is supported by the National Natural Science Foundation of China (no. 31460297), Scientific Research Funds of Yunnan Provincial Department of Education (project name: Research on Dynamic Graph Data Clustering), Yunnan Applied Basic Research Youth Projects (project name: Research on the Sorting of Graph Data), and the Talent Introduction Research Project of Yunnan Minzu University.
J. Figueira, S. Greco, and M. Ehrgott, Multiple Criteria Decision Analysis: State of the Art Surveys, Kluwer Academic Publishers, London, UK, 2005.
G.-H. Tzeng and J.-J. Huang, Multiple Attribute Decision Making: Methods and Applications, CRC Press, Boca Raton, Fla, USA, 2011.
S. Džeroski, Relational Data Mining, Springer, Berlin, Germany, 2010.
R. Mihalcea, “Graph-based ranking algorithms for sentence extraction, applied to text summarization,” in Proceedings of the Meetings of the Association for Computational Linguistics, pp. 1–4, 2004.
R. Bekkerman, M. Bilenko, and J. Langford, Scaling up Machine Learning: Parallel and Distributed Approaches, Cambridge University Press, Cambridge, UK, 2012.
J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1981.
C. Luo, Introduction to Fuzzy Sets, Beijing Normal University Press, Beijing, China, 1989.