On Clustering Detection Based on a Quadratic Program in Hypergraphs

Tang, Qingsong

doi:https://doi.org/10.1155/2022/4840964

Journal of Mathematics

Special Issue

Novel Approaches in Graph and Complexity-Based Data Analysis and Processing

Research Article | Open Access

Volume 2022 | Article ID 4840964 | https://doi.org/10.1155/2022/4840964

Academic Editor: Ewa Rak

Received 03 Oct 2021

Revised 19 Nov 2021

Accepted 21 Dec 2021

Published 11 Jan 2022

Abstract

A proper cluster is usually defined as a maximally coherent group from a set of objects, using pairwise or more complicated similarities. In general hypergraphs, the clustering problem refers to the extraction of subhypergraphs with a higher internal density, for instance, maximal cliques in hypergraphs. Determining the clustering structure within hypergraphs is a significant problem in data mining. Various works on detecting clusters in graphs and uniform hypergraphs have been published in the past decades. Recently, it has been shown that the maximum $\{1,2\}$-clique size in $\{1,2\}$-hypergraphs is related to the global maxima of a certain quadratic program based on the structure of the given nonuniform hypergraph. In this paper, we first extend this result to relate strict local maxima of this program to certain maximal cliques, including 2-cliques and $\{1,2\}$-cliques. We also explore the connection between edge-weighted clusters and strict local optimum solutions of a class of polynomials resulting from nonuniform $\{1,2\}$-hypergraphs.

1. Introduction

Many important phenomena depend on the structure of graphs or hypergraphs, for example, the spread of disease in a society, image segmentation in image analysis, or feature extraction in networks. To understand the structure of a hypergraph, we often start by studying subhypergraphs with denser relations inside and sparser connections to other subhypergraphs. Detecting such clusters of closely related objects therefore remains one of the most interesting problems in bioinformatics, social science, and data mining. Clustering is the process of partitioning a set of objects into meaningful subsets so that objects in the same group are similar and objects in different groups are dissimilar. It is a method of data exploration, a way of looking for patterns or structure of interest in the data.

The majority of clustering approaches in the literature assume that object similarities are expressed as pairwise relations, that is, in terms of 2-graphs. Pairwise clustering has also been studied for edge-weighted and vertex-weighted graphs (see [1–6]). For uniform hypergraphs, there are various works on clustering with applications such as face clustering, perceptual grouping, parametric motion segmentation, and image categorization using high-order relations, since approximating more complicated similarities by pairwise interactions can lead to a substantial loss of information (see [7–14]).

In real-world cases, similarities within a group of objects may be modeled more appropriately by nonuniform weighted edges in general hypergraphs. As an illustration, think of a society of people with different income levels. It makes perfect sense to define similarity measures over one person and over two persons that indicate how close they are. To be specific, two persons who know each other get pairwise weight 1, and weight 0 otherwise; this pairwise relationship can be modeled by the well-known adjacency matrix of the society. Further, a person whose income exceeds a certain threshold is assigned weight 1 on the single edge containing that person, and a person whose income is below the threshold is assigned weight 0 on that single edge. A subset of the society can then be described by a weight vector on its members. Naturally, the internal coherency of a cluster can be represented by a maximization problem over this society, which is the same as the graph-Lagrangian formed from a nonuniform $\{1,2\}$-hypergraph that models the relationships in this society (the detailed definition is given in the next section). Therefore, it is interesting to detect different types of clusters, say $\{1,2\}$-cliques or 2-cliques. Clearly, this example can be generalized to any model fitting problem, where the deviation of a set of points from the model provides a measure of their dissimilarity. The problem of data clustering using more comprehensive dissimilarities (uniform or nonuniform) is usually referred to as hypergraph clustering, since any instance of this problem can be represented by a hypergraph whose vertices are the objects to be clustered and whose (edge-weighted) hyperedges (uniform or nonuniform) encode similarities of different orders.

In 1965, Motzkin and Straus determined the maximum value of a class of homogeneous quadratic multilinear functions in $n$ variables over the standard simplex of $n$-dimensional Euclidean space, where the homogeneous quadratic multilinear function is associated with the edge set of a graph with $n$ vertices. The Motzkin–Straus result established a connection between the order of a maximum complete subgraph and the graph-Lagrangian of a graph. It also provided a new proof of a theorem of Turán, which pushed forward the study of extremal problems in graph theory. In [15], the Motzkin–Straus result was extended to a characterization of local maxima in simple graphs. For Motzkin–Straus type results in nonuniform hypergraphs, it has recently been shown in [16] that the global maxima of a certain quadratic program are related to the maximum $\{1,2\}$-clique size in $\{1,2\}$-hypergraphs.

In this paper, we extend uniform hypergraph clustering results to nonuniform hypergraphs and determine the maximum value of a class of nonhomogeneous multilinear functions in $n$ variables over the standard simplex of $n$-dimensional Euclidean space. Specifically, we first extend the result of [16] to relate strict local maxima of this program to certain maximal cliques (either 2-cliques or $\{1,2\}$-cliques or both). We also explore the connection between edge-weighted clusters and strict local optimum solutions of a class of polynomials in the given hypergraphs. The nonhomogeneous multilinear functions discussed in this paper are associated with nonuniform hypergraphs.

This paper is organized as follows. In Section 2, we give a brief introduction to the main concepts, terminology, and related results. In Section 3, we characterize certain maximal cliques (either 2-cliques or $\{1,2\}$-cliques or both) in terms of strict local optimum solutions of a class of polynomials formed from unweighted $\{1,2\}$-graphs. In Section 4, we discuss the parametrization Lagrangian and cliques in $\{1,2\}$-graphs. In Section 5, we consider maximum vertex-weighted cliques in $\{1,2\}$-graphs. In Section 6, we extend the results to edge-weighted $\{1,2\}$-graphs by means of dominant sets. Conclusions are given in Section 7.

2. Definitions and Related Results

A hypergraph $H=(V,E)$ consists of a vertex set $V$ and an edge set $E$, where every edge in $E$ is a subset of $V$. The set $T(H)=\{|e| : e\in E\}$ is called the set of edge types of $H$. We also say that $H$ is a $T(H)$-graph. For example, if $T(H)=\{1,2\}$, then we say that $H$ is a $\{1,2\}$-graph. If all edges have the same cardinality $r$, then $H$ is an $r$-uniform hypergraph. A 2-uniform hypergraph is called a graph. A hypergraph is nonuniform if it has at least two edge types. For any $r\in T(H)$, the $r$th-level hypergraph $H^r$ is the hypergraph consisting of all edges of $H$ with $r$ vertices, and $E^r$ denotes the edge set of $H^r$. We write $H_n^T$ for a hypergraph $H$ on $n$ vertices with $T(H)=T$. Given a subset $U\subseteq V$, the induced subgraph, denoted by $H[U]$, is the hypergraph on $U$ with the edge set $\{e\in E : e\subseteq U\}$. An edge $\{i_1,i_2,\ldots,i_r\}$ in a hypergraph is simply written as $i_1 i_2\cdots i_r$ throughout the paper.

For a positive integer $n$, let $[n]$ denote the set $\{1,2,\ldots,n\}$. For a finite set $V$ and a positive integer $r$, let $\binom{V}{r}$ denote the family of all $r$-subsets of $V$. The complete hypergraph $K_n^T$ is a hypergraph on $n$ vertices with edge set $\bigcup_{r\in T}\binom{[n]}{r}$. For example, $K_n^{\{r\}}$ is the complete $r$-uniform hypergraph on $n$ vertices. $K_n^{[r]}$ is the nonuniform hypergraph with all possible edges of cardinality at most $r$. The complete graph on $n$ vertices, $K_n^{\{2\}}$, is also called a clique. We also let $[n]^{(r)}$ represent the complete $r$-uniform hypergraph on vertex set $[n]$.
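The set-based definitions above translate almost directly into code. The following is a minimal sketch, assuming a hypergraph is stored as a set of `frozenset` edges; the function names and the toy hypergraph `H` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def edge_types(edges):
    """Set of edge cardinalities, i.e. the edge types of the hypergraph."""
    return {len(e) for e in edges}


def level_hypergraph(edges, r):
    """Edges with exactly r vertices (the r-th level hypergraph)."""
    return {e for e in edges if len(e) == r}


def induced_subhypergraph(edges, subset):
    """Edges that lie entirely inside the given vertex subset."""
    u = frozenset(subset)
    return {e for e in edges if e <= u}


def complete_hypergraph(n, types):
    """All r-subsets of {1, ..., n} for every r in `types`."""
    return {frozenset(c) for r in types for c in combinations(range(1, n + 1), r)}


# Hypothetical toy hypergraph on vertices {1, 2, 3}: two 1-edges and two 2-edges.
H = {frozenset(e) for e in [(1,), (2,), (1, 2), (2, 3)]}

types_of_H = edge_types(H)                  # edge types of H
singles = level_hypergraph(H, 1)            # the 1-edges of H
H_on_12 = induced_subhypergraph(H, {1, 2})  # subhypergraph induced on {1, 2}
K3 = complete_hypergraph(3, {1, 2})         # complete {1,2}-hypergraph on 3 vertices
```

Storing edges as frozensets keeps 1-edges and 2-edges in one container, which is convenient when the edge types are mixed.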

For a -graph , for , we denote the -neighborhood of a vertex by . Similarly, we will denote the -neighborhood of a pair of vertices by . We denote the complement of by . Denote .

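Definition 1. For an $r$-uniform graph $G$ with vertex set $[n]$, edge set $E(G)$, and a vector $x=(x_1,\ldots,x_n)\in\mathbb{R}^n$, we associate a homogeneous polynomial in $n$ variables, denoted by $\lambda(G,x)$, as follows: $\lambda(G,x)=\sum_{i_1 i_2\cdots i_r\in E(G)} x_{i_1}x_{i_2}\cdots x_{i_r}$. Let $\Delta=\{x=(x_1,\ldots,x_n): \sum_{i=1}^n x_i=1,\ x_i\ge 0 \text{ for } i=1,\ldots,n\}$. The graph-Lagrangian of $G$, denoted by $\lambda(G)$, is the maximum of the above homogeneous multilinear polynomial of degree $r$ over the standard simplex $\Delta$. Precisely,
$$\lambda(G)=\max\{\lambda(G,x): x\in\Delta\}. \tag{1}$$
The value $x_i$ is called the weight of the vertex $i$. A vector $x$ is called a feasible weighting for $G$ if and only if $x\in\Delta$. A vector $y\in\Delta$ is called an optimal weighting for $G$ if and only if $\lambda(G,y)=\lambda(G)$.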

Remark 1. $\lambda(G)$ was called the Lagrangian of $G$ in the literature [17–20]. The terminology “graph-Lagrangian” was suggested by Franco Giannessi.
The characteristic vector of a set $S\subseteq[n]$, denoted by $x^S$, is the vector in $\Delta$ defined as
$$x_i^S=\frac{1_{i\in S}}{|S|}, \tag{2}$$
where $|S|$ denotes the cardinality of $S$ and $1_P$ is the indicator function returning 1 if property $P$ is satisfied and 0 otherwise.
In [21], Motzkin and Straus provided the following simple expression for the graph-Lagrangian of a 2-graph.

Theorem 1 (see Theorem 1 in [21]). If $G$ is a 2-graph with $n$ vertices in which a largest clique has order $t$, then $\lambda(G)=\lambda(K_t^{\{2\}})=\frac{1}{2}\left(1-\frac{1}{t}\right)$. Furthermore, the characteristic vector of a maximum clique of $G$ is an optimal weighting for $G$.

This result solves the optimization problem for this type of quadratic function over the standard simplex of a Euclidean space.
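The Motzkin–Straus value can be checked numerically on a small example. The sketch below assumes the usual form of the quadratic, the sum of $x_i x_j$ over the edges $ij$, and compares its value at the characteristic vector of a maximum clique with $\frac{1}{2}(1-\frac{1}{t})$. The toy graph and the brute-force clique search are hypothetical illustrations, not part of the original paper, and the search is only sensible for tiny graphs.

```python
from itertools import combinations


def graph_lagrangian_value(edges, x):
    """Evaluate the quadratic sum over edges ij of x_i * x_j."""
    return sum(x[i] * x[j] for i, j in edges)


def characteristic_vector(subset, n):
    """Uniform weight 1/|S| on S and zero elsewhere."""
    s = set(subset)
    return [1.0 / len(s) if i in s else 0.0 for i in range(n)]


def is_clique(subset, edge_set):
    """True if every pair of vertices in the subset is an edge."""
    return all(frozenset(p) in edge_set for p in combinations(subset, 2))


def max_clique(n, edges):
    """Brute-force search from the largest size downwards."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if is_clique(subset, edge_set):
                return subset
    return ()


# Hypothetical toy graph: a triangle 0-1-2 with a path 2-3-4 attached.
n = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

clique = max_clique(n, edges)
t = len(clique)
value = graph_lagrangian_value(edges, characteristic_vector(clique, n))
bound = 0.5 * (1 - 1 / t)
```

On this graph the maximum clique is the triangle, and `value` agrees with `bound` up to floating-point rounding.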

In [16], Peng et al. generalized the concept of graph-Lagrangian to nonuniform hypergraphs as given below.

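Definition 2. For a hypergraph $H_n^T$ with $T(H)=T$ and a vector $x=(x_1,\ldots,x_n)\in\mathbb{R}^n$, define
$$L(H_n^T,x)=\sum_{j\in T}\Bigl(j!\sum_{i_1 i_2\cdots i_j\in E^j} x_{i_1}x_{i_2}\cdots x_{i_j}\Bigr). \tag{3}$$
Let $\Delta=\{x=(x_1,\ldots,x_n): \sum_{i=1}^n x_i=1,\ x_i\ge 0 \text{ for } i=1,\ldots,n\}$. The Lagrangian of $H_n^T$, denoted by $L(H_n^T)$, is defined as
$$L(H_n^T)=\max\{L(H_n^T,x): x\in\Delta\}. \tag{4}$$
The value $x_i$ is called the weight of vertex $i$. A vector $y\in\Delta$ is called an optimal weighting for $H$ if $L(H,y)=L(H)$.
In [22], Peng and Yao gave a generalization of the Motzkin–Straus result to $\{1,2\}$-graphs.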

Theorem 2 (see Theorem 1.4 in [22]). If $H$ is a $\{1,2\}$-graph with $n$ vertices and the order of its maximum complete $\{1,2\}$-subgraph is $t$ (where $t\ge 2$), then $L(H)=L(K_t^{\{1,2\}})=2-\frac{1}{t}$. Furthermore, the characteristic vector of a maximum clique of $H$ is an optimal weighting for $H$.
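As a numerical illustration of this kind of result, the sketch below evaluates a $\{1,2\}$-graph polynomial on a small example. It assumes the convention in which each $j$-edge term carries a factor $j!$, so that the polynomial is the sum of $x_i$ over 1-edges plus twice the sum of $x_i x_j$ over 2-edges; this normalization, the toy hypergraph, and the coarse grid search are assumptions made for illustration only. Under that assumption the value at the characteristic vector of a complete $\{1,2\}$-subgraph on $t$ vertices is $2-\frac{1}{t}$.

```python
from itertools import product


def lagrangian_12(one_edges, two_edges, x):
    """Sum of x_i over 1-edges plus 2 * sum of x_i * x_j over 2-edges.

    The factor 2 is the assumed j! weighting for j = 2.
    """
    linear = sum(x[i] for i in one_edges)
    quadratic = 2.0 * sum(x[i] * x[j] for i, j in two_edges)
    return linear + quadratic


def simplex_grid(n, steps):
    """Yield all points of the standard simplex with coordinates k / steps."""
    for counts in product(range(steps + 1), repeat=n):
        if sum(counts) == steps:
            yield [k / steps for k in counts]


# Hypothetical toy {1,2}-graph on vertices 0..3: every vertex is a 1-edge,
# the 2-edges form a triangle 0-1-2 plus the extra edge 2-3.
one_edges = [0, 1, 2, 3]
two_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

t = 3  # order of the largest complete {1,2}-subgraph, here {0, 1, 2}
x_clique = [1 / 3, 1 / 3, 1 / 3, 0.0]
value_at_clique = lagrangian_12(one_edges, two_edges, x_clique)

# A coarse grid search as a sanity check, not as a proof of optimality.
best_on_grid = max(lagrangian_12(one_edges, two_edges, x)
                   for x in simplex_grid(4, 12))
```

The grid step 1/12 is chosen so that the characteristic vector of the triangle is itself a grid point; a finer grid would only be needed for graphs whose optimal weightings have other denominators.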

In [23, 24], Gu et al. and Tang et al. obtained further Motzkin–Straus type results for some nonuniform hypergraphs.

3. Graph-Lagrangians and Cliques in $\{1,2\}$-Graphs: Unweighted Case

There is a one-to-one correspondence between strict local optima of optimization problem (4) and maximal cliques (2-cliques or $\{1,2\}$-cliques) in $\{1,2\}$-graphs.

Theorem 3. A subset $S$ of vertices in $H$ is a maximum clique of a $\{1,2\}$-hypergraph $H$ if and only if its characteristic vector $x^S$ is a global maximum of optimization problem (4).

Proof. One direction is immediate from Theorem 2. For the other direction, suppose that is a global maximum of optimization problem (4). From (4), , where is a maximum clique in . Let and . From Theorem 2, and . However, only if . This implies . So must be a clique from Theorem 2 and is a maximum clique.

Proposition 1. Let be a subset of vertices of a -hypergraph . Then is a maximal clique of if and only if its characteristic vector satisfies

Proof. Suppose that is a maximal clique. From the definition of maximal cliquefor all ; andfor all . Hence satisfies (5).
For the other direction, if is not a clique, then, for some vertex , there exists a vertex satisfying and or . Hence,This contradicts for all . So must be a clique. If is not a maximal clique, then there must exist a clique . Let . ThenThis contradicts for .

Lemma 1 (KKT necessary condition, [25]). If feasible weighting is a local solution of optimization problem (4), then there exists such that, for all ,

The following corollary follows from Proposition 1 immediately.

Corollary 1. If $S$ is a maximal clique of a $\{1,2\}$-hypergraph $H$, then its characteristic vector $x^S$ satisfies the first-order KKT necessary conditions.

Definition 3. For a $\{1,2\}$-hypergraph $H$, a maximal clique $S$ is said to be strictly maximal if, for all $i\notin S$, the number of 2-edges crossing $i$ and $S$ is less than $|S|-1$.
Note that, for a clique $S$ to be maximal, it suffices that the number of edges crossing $i$ and $S$ be no more than $|S|-1$ for all $i\notin S$. However, for $S$ to be a strictly maximal clique, this number needs to be strictly less than $|S|-1$.
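The distinction between maximal and strictly maximal cliques can be tested by counting, for each outside vertex, its neighbors inside the clique. The sketch below works on the 2-edges only and assumes the threshold $|S|-1$ (the usual one for ordinary graphs, as in [15]); the function names and the two toy graphs are hypothetical.

```python
def neighbors_in(vertex, subset, edge_set):
    """Number of vertices of `subset` joined to `vertex` by a 2-edge."""
    return sum(1 for u in subset if frozenset((vertex, u)) in edge_set)


def classify_clique(subset, n, edges):
    """Classify a vertex subset by the edge-count criterion (assumed threshold |S| - 1)."""
    edge_set = {frozenset(e) for e in edges}
    s = sorted(subset)
    size = len(s)
    # Inside a clique, every vertex is adjacent to the other |S| - 1 members.
    if any(neighbors_in(v, s, edge_set) != size - 1 for v in s):
        return "not a clique"
    outside_counts = [neighbors_in(v, s, edge_set) for v in range(n) if v not in s]
    if any(c == size for c in outside_counts):
        return "clique, not maximal"
    if any(c == size - 1 for c in outside_counts):
        return "maximal, not strictly maximal"
    return "strictly maximal"


# Hypothetical graph A: triangle 0-1-2, and vertex 3 adjacent to 0 and 1.
edges_a = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3)]
# Hypothetical graph B: triangle 0-1-2, and vertex 3 adjacent to 2 only.
edges_b = [(0, 1), (0, 2), (1, 2), (2, 3)]

result_a = classify_clique({0, 1, 2}, 4, edges_a)
result_b = classify_clique({0, 1, 2}, 4, edges_b)
result_c = classify_clique({0, 1}, 4, edges_b)
```

In graph A the triangle $\{0,1,2\}$ is maximal but shares two of its three vertices with the triangle $\{0,1,3\}$, so it is not strictly maximal; in graph B no outside vertex comes that close.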

Lemma 2. Consider a -hypergraph which contains two cliques and of equal cardinality . Let . Then, for every satisfying , we have the following:(a)If has exactly edges crossing and , then (b)If has fewer than edges crossing and , then The proof of this lemma is similar to the proof of Theorem 6 in [15]. So we omit the details here.

Lemma 3. Let be a strict local maximum of optimization problem (4); then, , there exists an edge such that .

Proof. Suppose, for a contradiction, that there exist and in such that for any . We define new weighting for as follows. Let be an arbitrarily small positive constant. Let for , , and ; then is clearly legal weighting for , andThis contradicts being a strict local maximum of optimization problem (4). Hence Lemma 3 holds.
Now we are ready to prove the main result of this section.

Theorem 4. Let be a subset of vertices of a -hypergraph .(a)If , then is a strict maximal -clique of if and only if is a strict local maximum of optimization problem (4)(b)If , then is a strict maximal {2}-clique of if and only if is a strict local maximum of optimization problem (4)

Proof. (a) Suppose that is a strict local maximum of optimization problem (4); then the KKT conditions (10) hold for some . We will show that , where . Then, by Proposition 1, is a maximal clique. Suppose that for a contradiction. For every two vertices in by Lemma 3. We will show that all the vertices in are contained in . Then is a clique and . Assume that there exist some vertices in not contained in . Since , there must exist a vertex contained in . Assume that but it is not contained in ; thenThis contradicts by Lemma 1 Hence, all the vertices in are contained in .
To see as a strictly maximal clique, suppose to the contrary that adjacent to exactly vertex in , and let denote the only vertex in not adjacent to . Then set as a clique of the same cardinality as . Because , there are no edges crossing and , since are nonadjacent. From Lemma 2, for all , we have which contradicts the hypothesis that is a strict maximum of optimization problem (4). This proves the first part of the theorem.
For the other part, suppose that is a strictly maximal clique. To prove that is a strict local maximum of optimization problem (4), we apply the second-order sufficiency conditions for constrained optimization. First, from Corollary 1, satisfies the KKT conditions. Note that, in this case, the Lagrange multipliers ’s are given byIt remains to be shown that the Hessian of the Lagrangian associated with the optimization in (4) is negative definite on the subspacewhere . Since is a strict maximal clique, we havefor all . Now, let ; thenThis completes the proof. (b) This is similar to that in (a). We omit the details here.

4. The Parametrization Lagrangian and Cliques in $\{1,2\}$-Graphs

Definition 4. For a hypergraph with and a vector , defineLet , where is a real number between . The parametrization Lagrangian of , denoted by , is defined asThe value is called the weight of vertex . A vector is called optimal weighting for if .
LetThe following is easy to see.

Lemma 4. The set is nonempty if and only if . For , consists simply of the character vector of .

Theorem 5. Let be a -hypergraph with clique number . Then(a) for (b) for (c) if and only if (d)(e)For , the set of global optimal solutions of (18) is given by(f)The set of global optimal solutions of is , where is an (optimal) -clique of order . Hence, there is a one-to-one correspondence between the global optimal solutions of and the optimal cliques in .

Proof. Proof of (a). Let . Let be any feasible solution of . Then On the other side, let be a clique of size in for some (note that is not assumed to be integral). Since , is not empty. Now, for an arbitrary , Hence, for . Proof of (b). Let . Since any that is feasible for is also feasible for , and since , we have . Proof of (c). Since is an increasing function of , (a) and (b) together imply (c). Proof of (d). Note that the feasible region of is the feasible region of for in the range . Combining with (a) and (b) implies (d). Proof of (e). By equation (22), every , where is a clique of size at least , satisfies On the other side, for an arbitrary of , we have Hence, if , then . This happens if and only if So the support of forms a clique in . Let be this clique. Clearly, . Lemma 4 implies that . Proof of (f). The result follows from (e) and Lemma 4.

5. Maximum Vertex-Weighted Cliques in $\{1,2\}$-Graphs

Given a nonnegative weight vector , for any subset of the vertex set, denotes the sum of the weights of vertex in . The vertex-weighted clique number is the maximum of over all cliques of . Note that is the usual clique number of the hypergraph. Given a positive weight vector , define a set of matrices as follows:

For a given a matrix , consider the following optimization problem:

Theorem 6. Let be a -graph. Then for any positive weight vector , and .

In the Proof of Theorem 6, we will impose an additional condition on a solution to a global optimum to problem (27): (∗) is minimal; that is, if is a feasible solution for satisfying , then . We need the following lemmas.

Lemma 5. Let be a global optimum of optimization problem (27) with minimum support; then there exists an edge such that .

Proof. Let be a global optimum of optimization problem (27) with minimum support. Let . Suppose, for a contradiction, that there exist and in such that for any . We define a new feasible solution to (27) as follows. Let for , , and ; then is clearly a feasible solution (27) with smaller support compared to . By KKT necessary condition , andsince . This contradicts being a global optimum of optimization problem (27) with minimum support.

Claim 1. Either for all or for all .

Proof. Suppose that but for a contradiction. By the KKT condition, . By Lemma 5, ; therefore . This is a contradiction.
Now we are ready to prove Theorem 6.

Proof. of Theorem 6. Let be a global optimum of optimization problem (27) with minimum support. By Lemma 5 and Claim 1, induces -clique or a 2-clique of .
If induces a -clique, thenand its minimum over the simplex is atfor . So the optimal value of is . For the solution to be global optimal, must be the maximum -clique in .
If induces a 2-clique, thenand the left is similar to the case where induces -clique.

6. Dominant Set for $\{1,2\}$-Graphs

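Let $G=(V,E,w)$ be an edge-weighted graph with edge weight $w_{ij}$ for $ij\in E$. The weighted adjacency matrix $A=(a_{ij})$ is defined as $a_{ij}=w_{ij}$ if $ij\in E$ and $a_{ij}=0$ otherwise. The average weighted degree of $i$ with regard to a nonempty subset $S\subseteq V$ is defined as
$$\operatorname{awdeg}_S(i)=\frac{1}{|S|}\sum_{j\in S}a_{ij}.$$

If $j\notin S$, define
$$\phi_S(i,j)=a_{ij}-\operatorname{awdeg}_S(i).$$

The weight of $i\in S$ with regard to $S$ is defined as
$$w_S(i)=\begin{cases}1, & \text{if } |S|=1,\\ \sum_{j\in S\setminus\{i\}}\phi_{S\setminus\{i\}}(j,i)\,w_{S\setminus\{i\}}(j), & \text{otherwise},\end{cases}$$
and the total weight of $S$ is $W(S)=\sum_{i\in S}w_S(i)$.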

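Definition 5 (see [4]). A nonempty subset of vertices $S\subseteq V$ such that $W(T)>0$ for any nonempty $T\subseteq S$ is said to be dominant if
(1) $w_S(i)>0$ for all $i\in S$
(2) $w_{S\cup\{i\}}(i)<0$ for all $i\notin S$
Set the weighted characteristic vector $x^S$ as follows: $x_i^S=w_S(i)/W(S)$ if $i\in S$ and $x_i^S=0$ otherwise. Pavan and Pelillo connect the dominant set to the following quadratic program:
$$\text{maximize } f(x)=x^{T}Ax \quad \text{subject to } x\in\Delta, \tag{36}$$
and establish a correspondence between the global (local) maxima of (36) and the dominant sets of a graph.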
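The recursive weights behind the dominant-set definition can be computed directly for small vertex sets. The sketch below follows the definitions of Pavan and Pelillo [4] as I understand them: the average weighted degree of $i$ in $S$ is the mean of $a_{ij}$ over $j\in S$, $\phi_S(i,j)=a_{ij}-\operatorname{awdeg}_S(i)$, $w_S(i)=1$ for a singleton and otherwise the sum over $j\in S\setminus\{i\}$ of $\phi_{S\setminus\{i\}}(j,i)\,w_{S\setminus\{i\}}(j)$, and $W(S)$ is the sum of the $w_S(i)$. The function names and the toy weight matrix are hypothetical, and the plain recursion takes exponential time, so it is only meant for very small sets.

```python
from itertools import combinations


def awdeg(A, S, i):
    """Average weighted degree of vertex i with regard to the set S."""
    return sum(A[i][j] for j in S) / len(S)


def phi(A, S, i, j):
    """Similarity of i and j relative to the average similarity of i to S."""
    return A[i][j] - awdeg(A, S, i)


def weight(A, S, i):
    """Recursive weight w_S(i); i is expected to be a member of S."""
    S = frozenset(S)
    if len(S) == 1:
        return 1.0
    R = S - {i}
    return sum(phi(A, R, j, i) * weight(A, R, j) for j in R)


def total_weight(A, S):
    """Total weight W(S), the sum of w_S(i) over i in S."""
    return sum(weight(A, S, i) for i in S)


def is_dominant(A, S):
    """Check the precondition W(T) > 0 and the two dominant-set conditions."""
    S = frozenset(S)
    n = len(A)
    subsets_ok = all(total_weight(A, T) > 0
                     for k in range(1, len(S) + 1)
                     for T in combinations(S, k))
    internal = all(weight(A, S, i) > 0 for i in S)
    external = all(weight(A, S | {i}, i) < 0 for i in range(n) if i not in S)
    return subsets_ok and internal and external


def weighted_characteristic_vector(A, S):
    """Vector with entries w_S(i) / W(S) on S and zero elsewhere."""
    total = total_weight(A, S)
    return [weight(A, S, i) / total if i in S else 0.0 for i in range(len(A))]


# Hypothetical weights: vertices 0, 1, 2 are strongly similar, vertex 3 is weakly attached.
A = [[0.0, 1.0, 1.0, 0.1],
     [1.0, 0.0, 1.0, 0.1],
     [1.0, 1.0, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]

group_is_dominant = is_dominant(A, {0, 1, 2})
pair_is_dominant = is_dominant(A, {0, 1})
x_group = weighted_characteristic_vector(A, {0, 1, 2})
```

Here the pair $\{0,1\}$ fails the external condition because vertex 2 still has positive weight with respect to it, while the full group $\{0,1,2\}$ passes both conditions.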

Theorem 7 (see [4]). If $S$ is a dominant subset of vertices, then its weighted characteristic vector $x^S$ is a strict local solution of program (36). Conversely, if $x$ is a strict local solution of program (36), then its support $\sigma=\sigma(x)$ is a dominant set, provided that $w_{\sigma\cup\{i\}}(i)\neq 0$ for all $i\notin\sigma$.
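In practice, local solutions of a program of this type are often searched for with replicator dynamics, the iteration that Pavan and Pelillo [4] use for dominant sets. The sketch below assumes that program (36) maximizes $x^{T}Ax$ over the standard simplex and that $A$ is symmetric with nonnegative entries; the function name, the iteration count, the support threshold, and the toy matrix are hypothetical choices.

```python
def replicator_dynamics(A, iterations=200):
    """Discrete replicator iteration x_i <- x_i * (Ax)_i / (x^T A x).

    Starts from the barycenter of the simplex; A is assumed symmetric and nonnegative.
    """
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iterations):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        value = sum(x[i] * Ax[i] for i in range(n))
        if value == 0.0:
            break  # no positive weight left to redistribute
        x = [x[i] * Ax[i] / value for i in range(n)]
    return x


# Hypothetical weights: vertices 0, 1, 2 are strongly similar, vertex 3 is weakly attached.
A = [[0.0, 1.0, 1.0, 0.1],
     [1.0, 0.0, 1.0, 0.1],
     [1.0, 1.0, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]

x = replicator_dynamics(A)
support = [i for i, xi in enumerate(x) if xi > 1e-6]  # threshold is an arbitrary choice
```

On this matrix the weight of vertex 3 decays at every step and the iterate approaches the uniform vector on $\{0,1,2\}$, whose support is the tightly connected group. The starting point matters in general: different starting points can lead to different local solutions.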

Here we consider the quadratic program related to edge-weighted -graph. Let be a -graph on vertex set with edge sets . Let be the edge weight vector of and let be the edge weight matrix of .

Let if and if ; let if and if ; that is, . Consider the following quadratic program:where and . A vector satisfies the Karush-Kuhn-Tucker (KKT) conditions for problem (37), that is, the first-order necessary conditions for local optimality, if there exist real constants (Lagrange multipliers) and with for all , such thatfor all , , and . Note that and . Equality (38) is equivalent tofor all , , and . So, if we define the dominant set of as the dominant set of , then, by Theorem 7, we have the following.

Theorem 8. If $S$ is a dominant subset of vertices, then its weighted characteristic vector $x^S$ is a strict local solution of program (37). Conversely, if $x$ is a strict local solution of program (37), then its support $\sigma=\sigma(x)$ is a dominant set, provided that $w_{\sigma\cup\{i\}}(i)\neq 0$ for all $i\notin\sigma$.

7. Conclusion

In this paper, we study the connection between the local maxima of a class of quadratic programs and certain maximal cliques, including 2-cliques and $\{1,2\}$-cliques, of $\{1,2\}$-hypergraphs. We also explore the connection between edge-weighted clusters and strict local optimum solutions of a class of polynomials resulting from nonuniform $\{1,2\}$-hypergraphs. In the future, we will try to extend these results to general hypergraphs.

Data Availability

No data were used to support this study.

Conflicts of Interest

The author declares no conflicts of interest.

Acknowledgments

The author thanks Professor Xiangde Zhang and Professor Cheng Zhao for their helpful discussions and also thanks Professor Cheng Zhao for presenting the preliminary version of this work at the 30th Midwestern Conference on Combinatorics and Combinatorial Computing (MCCCC30): https://about.illinoisstate.edu/mcccc30/list-of-participants/. This research was partially supported by the Chinese Universities Scientific Fund (no. N180504008).

References

[1] I. M. Bomze, “Evolution towards the maximum clique,” Journal of Global Optimization, vol. 10, no. 2, pp. 143–164, 1997.
[2] M. Budinich, “Exact bounds on the order of the maximum clique of a graph,” Discrete Applied Mathematics, vol. 127, no. 3, pp. 535–543, 2003.
[3] S. Busygin, “A new trust region technique for the maximum weight clique problem,” Discrete Applied Mathematics, vol. 154, no. 15, pp. 2080–2096, 2006.
[4] M. Pavan and M. Pelillo, “Dominant sets and pairwise clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, pp. 167–172, 2007.
[5] J. Hou and M. Pelillo, “A simple feature combination method based on dominant sets,” Pattern Recognition, vol. 46, no. 11, pp. 3129–3139, 2013.
[6] L. E. Gibbons, D. W. Hearn, P. M. Pardalos, and M. V. Ramana, “Continuous characterizations of the maximum clique problem,” Mathematics of Operations Research, vol. 22, no. 3, pp. 754–768, 1997.
[7] A. K. Jain, “Data clustering: 50 years beyond K-means,” Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
[8] S. Agarwal, J. Lim, L. Zelnik-Manor, P. Perona, D. Kriegman, and S. Belongie, “Beyond pairwise clustering,” in Proceedings of the IEEE Conference Computer Vision and Pattern Recognition, vol. 2, pp. 838–845, San Diego, CA, USA, June 2005.
[9] V. M. Govindu, “Tensor decomposition for geometric grouping and segmentation,” in Proceedings of the IEEE Conference Computer Vision and Pattern Recognition, pp. 1150–1157, San Diego, CA, USA, June 2005.
[10] A. Shashua, R. Zass, and T. Hazan, “Multi-way clustering using super-symmetric non-negative tensor factorization,” Computer Vision-ECCV 2006, vol. 3954, pp. 595–608, 2006.
[11] Y. Huang, Q. Liu, F. Lv, Y. Gong, and D. N. Metaxas, “Unsupervised image categorization by hypergraph partition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 6, pp. 1266–1273, 2011.
[12] S. Rota Bulo and M. Pelillo, “A game-theoretic approach to hypergraph clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1312–1327, 2013.
[13] H. Liu, L. J. Latecki, and S. Yan, “Dense subgraph partition of positive hypergraphs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 541–554, 2015.
[14] H. Liu, L. J. Latecki, and S. Yan, “Robust clustering as ensembles of affinity relations,” in Proceedings of the 24th Annual Conference on Neural Information Processing Systems 2010, Vancouver, Canada, December 2010, Advances in Neural Information Processing Systems 23.
[15] M. Pelillo and A. Jagota, “Feasible and infeasible maxima in a quadratic program for maximum clique,” Journal of Artificial Neural Networks, vol. 2, pp. 411–420, 1995.
[16] Y. J. Peng, H. Peng, Q. S. Tang, and C. Zhao, “An extension of Motzkin-Straus theorem to non-uniform hypergraphs and its applications,” Discrete Applied Mathematics, vol. 200, pp. 170–175, 2015.
[17] P. Frankl and Z. Füredi, “Extremal problems whose solutions are the blowups of the small Witt-designs,” Journal of Combinatorial Theory-Series A, vol. 52, no. 1, pp. 129–147, 1989.
[18] Y. Peng and C. Zhao, “A Motzkin-Straus type result for 3-uniform hypergraphs,” Graphs and Combinatorics, vol. 29, no. 3, pp. 681–694, 2013.
[19] P. Frankl and V. Rödl, “Hypergraphs do not jump,” Combinatorica, vol. 4, no. 2-3, pp. 149–159, 1984.
[20] J. M. Talbot, “Lagrangians of hypergraphs,” Combinatorics, Probability & Computing, vol. 11, no. 2, pp. 199–216, 2002.
[21] T. S. Motzkin and E. G. Straus, “Maxima for graphs and a new proof of a theorem of Turán,” Canadian Journal of Mathematics, vol. 17, pp. 533–540, 1965.
[22] Y. J. Peng and Y. P. Yao, “On Motzkin-Straus type of results and Frankl-Füredi conjecture for hypergraphs,” 2013, https://arxiv.org/abs/1312.3034.
[23] R. Gu, X. Li, Y. Peng, and Y. Shi, “Some Motzkin-Straus type results for non-uniform hypergraphs,” Journal of Combinatorial Optimization, vol. 31, no. 1, pp. 223–238, 2016.
[24] Q. Tang, Y. Peng, X. Zhang, and C. Zhao, “On Motzkin-Straus type results for non-uniform hypergraphs,” Journal of Combinatorial Optimization, vol. 34, no. 2, pp. 504–521, 2016.
[25] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, Springer Science Business Media, LLC, Berlin, Germany, 3rd edition, 2008.

Copyright

Copyright © 2022 Qingsong Tang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
