Beyond the Expanders
Expander graphs are widely used in communication problems and in the construction of error-correcting codes. In such graphs, information gets through very quickly. This is typically not true for social or biological networks, though we may find a partition of the vertices such that the induced subgraphs on them and the bipartite subgraphs between any pair of them exhibit regular behavior of information flow within or between the vertex subsets. Implications between spectral and regularity properties are discussed.
We want to go beyond the expander graphs that, for four decades, have played an important role in communication networks; for a summary, see, for example, Chung and Hoory et al. Roughly speaking, the expansion property means that each subset of the graph’s vertices has “many” neighbors (combinatorial view), and hence, information gets through such a graph very “quickly” (probabilistic view). We will not give exact definitions of an expander here, as those contain many parameters which are not used later. We rather refer to the spectral and random walk characterization of such graphs, as discussed, among others, by Alon and by Meilă and Shi.
The general framework of an edge-weighted graph will be used. Expanders have a spectral gap bounded away from zero, where, for a connected graph, this gap is defined as the minimum distance between the normalized Laplacian spectrum (apart from the trivial zero eigenvalue) and the endpoints of the interval $[0,2]$, the possible range of the spectrum. The larger the spectral gap, the more our graph resembles a random graph and exhibits some quasirandom properties; for example, the edge densities within any subset and between any two subsets of its vertices do not differ too much from what is expected, see the Expander Mixing Lemma (Lemma 2.2 of Section 2). Quasirandom properties and the spectral gap of random graphs with given expected degrees are discussed in Chung and Graham and in Coja-Oghlan and Lanka.
However, the spectral gap does not appear at the ends of the normalized Laplacian spectrum in the case of generalized random or generalized quasirandom graphs that, in the presence of $k$ underlying clusters, have $k$ eigenvalues (including the zero) separated from 1, while the bulk of the spectrum is located around 1, see for example, . These structures are usual in social or biological networks having $k$ clusters of vertices (that belong to social groups or similarly functioning enzymes) such that the edge density within the clusters and between any pair of the clusters is homogeneous. Such a structure is theoretically guaranteed for any large graph by the Szemerédi Regularity Lemma  with possibly large $k$, where $k$ does not depend on the number of vertices; it merely depends on the constant ruling the regularity of the cluster pairs.
Our conjecture is that $k$ so-called structural eigenvalues (separated from 1) in the normalized Laplacian spectrum are indications of such a structure, while the near 1 eigenvalues are responsible for the pairwise regularities. The clusters themselves can be recovered by applying the $k$-means algorithm to the vertex representatives obtained from the eigenvectors corresponding to the structural eigenvalues (apart from the zero). For the $k=2$ case, we will give an exact relation between the eigenvalue separation (of the nontrivial structural eigenvalue from the bulk of the spectrum) and the volume regularity of the cluster pair that is obtained by the $k$-means algorithm applied to the coordinates of the transformed eigenvector belonging to the nontrivial structural eigenvalue, see Theorem 3.1 of Section 3. To eliminate the trivial eigenvalue-eigenvector pair, we shall rather use the normalized modularity spectrum of  that plays an important role in finding the extrema of some penalized versions of the Newman-Girvan modularity introduced in .
2. Preliminaries and Statement of Purpose
Let $G=(V,\mathbf{W})$ be a graph on $n$ vertices, where the $n\times n$ symmetric matrix $\mathbf{W}$ has nonnegative real entries and zero diagonal. Here $w_{ij}$ is the similarity between vertices $i$ and $j$, where 0 similarity means no connection/edge at all. A simple graph is a special case of it with 0-1 weights. Without loss of generality,
$$\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}=1 \tag{2.1}$$
will be supposed. Hence, $\mathbf{W}$ is a joint distribution, with marginal entries $d_i=\sum_{j=1}^{n}w_{ij}$, which are the generalized vertex degrees collected in the main diagonal of the diagonal degree matrix $\mathbf{D}=\operatorname{diag}(d_1,\dots,d_n)$. In [11, 12], we investigated the spectral gap of the normalized Laplacian $\mathbf{L}_D=\mathbf{I}-\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}$, where $\mathbf{I}$ denotes the identity matrix of appropriate size.
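The setup above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names and the three-vertex path example are hypothetical, the weight matrix is given as a list of lists, and every vertex is assumed to have positive degree so that the square roots in the normalized Laplacian $\mathbf{I}-\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}$ are well defined.

```python
from math import sqrt


def normalize_weights(W):
    """Scale a symmetric nonnegative weight matrix so that all entries sum to 1."""
    total = sum(sum(row) for row in W)
    return [[w / total for w in row] for row in W]


def generalized_degrees(W):
    """Row sums d_i = sum_j w_ij of the weight matrix."""
    return [sum(row) for row in W]


def normalized_laplacian(W):
    """I - D^{-1/2} W D^{-1/2}; assumes every degree is positive."""
    d = generalized_degrees(W)
    n = len(W)
    return [[(1.0 if i == j else 0.0) - W[i][j] / sqrt(d[i] * d[j])
             for j in range(n)] for i in range(n)]


# Hypothetical toy graph: the path 0 - 1 - 2 with 0-1 weights.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
W = normalize_weights(A)
d = generalized_degrees(W)
L = normalized_laplacian(W)
```

After normalization the degrees form a probability distribution on the vertices, and the vector with coordinates $\sqrt{d_i}$ is mapped to zero by the normalized Laplacian.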
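Suppose that our graph is connected ($\mathbf{W}$ is irreducible). Let $0=\lambda_0<\lambda_1\le\dots\le\lambda_{n-1}\le 2$ denote the eigenvalues of the symmetric normalized Laplacian $\mathbf{L}_D$ with corresponding unit-norm, pairwise orthogonal eigenvectors $\mathbf{u}_0,\dots,\mathbf{u}_{n-1}$. Namely, $\mathbf{u}_0=(\sqrt{d_1},\dots,\sqrt{d_n})^T$. In the random walk setup, $\mathbf{D}^{-1}\mathbf{W}$ is the transition matrix (its entry in the $(i,j)$th position is the conditional probability of moving from vertex $i$ to vertex $j$ in one step, given that we are in $i$), which is a stochastic matrix with eigenvalues $1-\lambda_i$ and corresponding eigenvectors $\mathbf{D}^{-1/2}\mathbf{u}_i$ ($i=0,\dots,n-1$). “Good” expanders have a $\lambda_1$ bounded away from zero, which also implies the separation of the isoperimetric number
$$h(G)=\min_{U\subset V:\ \mathrm{Vol}(U)\le 1/2}\frac{w(U,\overline{U})}{\mathrm{Vol}(U)}, \tag{2.3}$$
where for $X,Y\subset V$: $w(X,Y)=\sum_{i\in X}\sum_{j\in Y}w_{ij}$ is the weighted cut between $X$ and $Y$, while $\mathrm{Vol}(U)=\sum_{i\in U}d_i$ is the volume of $U$. In view of (2.1), $\mathrm{Vol}(V)=1$; this is why the minimum is taken on vertex sets having volume at most $1/2$. In , we proved that $\frac{\lambda_1}{2}\le h(G)\le\min\{1,\sqrt{2\lambda_1}\}$, while in the $\lambda_1\le 1$ case the stronger upper estimation $h(G)\le\sqrt{\lambda_1(2-\lambda_1)}$ holds.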
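For a very small graph, the isoperimetric number can be evaluated by exhaustive search, which may help in getting a feeling for the definition. The sketch below assumes the definition $h(G)=\min w(U,\overline{U})/\mathrm{Vol}(U)$ over nonempty vertex sets of volume at most $1/2$, a weight matrix already normalized to total weight 1, and no isolated vertices; the function name and the four-vertex path are hypothetical. The search is exponential in the number of vertices and is not meant for real networks.

```python
from itertools import combinations


def isoperimetric_number(W, tol=1e-12):
    """Brute-force h(G) for a normalized weight matrix W (list of lists)."""
    n = len(W)
    d = [sum(row) for row in W]
    best = float("inf")
    for size in range(1, n):
        for U in combinations(range(n), size):
            vol = sum(d[i] for i in U)
            if vol > 0.5 + tol:  # only sets of volume at most 1/2 compete
                continue
            inside = set(U)
            cut = sum(W[i][j] for i in U for j in range(n) if j not in inside)
            best = min(best, cut / vol)
    return best


# Hypothetical example: the path 0 - 1 - 2 - 3 with equal weights summing to 1.
edges = [(0, 1), (1, 2), (2, 3)]
W = [[0.0] * 4 for _ in range(4)]
for i, j in edges:
    W[i][j] = W[j][i] = 1.0 / (2 * len(edges))
h = isoperimetric_number(W)
```

In this example the minimum is attained by cutting the middle edge, where the cut weight is $1/6$ and the volume of either side is $1/2$.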
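If a network does not have a “large” $\lambda_1$ (compared to the natural lower bound), or equivalently, in view of the above inequalities, it has a relatively “small” isoperimetric number, then the 2-partition of the vertices giving the minimum in (2.3) indicates a bottleneck, or equivalently, a low conductivity edge-set between two disjoint vertex clusters such that the random walk gets through with small probability between them, but, as some equivalent notions will indicate, it is rapidly mixing within the clusters. To find the clusters, the coordinates of the transformed eigenvector $\mathbf{D}^{-1/2}\mathbf{u}_1$ will be used. In , we proved that for the weighted 2-variance of this vector's coordinates
$$S_2^2(\mathbf{D}^{-1/2}\mathbf{u}_1)\le\frac{\lambda_1}{\lambda_2} \tag{2.6}$$
holds. For a general $k$, the notion of $k$-variance, in the Analysis of Variance sense, is the following. The weighted $k$-variance of the $k$-dimensional vertex representatives $\mathbf{x}_1,\dots,\mathbf{x}_n$ comprising the row vectors of the $n\times k$ matrix $\mathbf{X}$ is defined as
$$S_k^2(\mathbf{X})=\min_{(V_1,\dots,V_k)\in\mathcal{P}_k}\sum_{a=1}^{k}\sum_{j\in V_a}d_j\|\mathbf{x}_j-\mathbf{c}_a\|^2,$$
where $\mathbf{c}_a=\frac{1}{\mathrm{Vol}(V_a)}\sum_{j\in V_a}d_j\mathbf{x}_j$ is the weighted center of cluster $V_a$ and $\mathcal{P}_k$ denotes the set of $k$-partitions of the vertices. We remark that $S_k^2(\mathbf{D}^{-1/2}\mathbf{u}_0,\dots,\mathbf{D}^{-1/2}\mathbf{u}_{k-1})=S_k^2(\mathbf{D}^{-1/2}\mathbf{u}_1,\dots,\mathbf{D}^{-1/2}\mathbf{u}_{k-1})$, since $\mathbf{D}^{-1/2}\mathbf{u}_0=\mathbf{1}$ is the all 1’s vector.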
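For scalar representatives, the weighted 2-variance can be computed exactly by scanning threshold splits of the sorted coordinates, because in one dimension the optimal two clusters are separated by a cut point. The sketch below assumes the weighted within-cluster sum of squares $\sum_a\sum_{j\in V_a}d_j(x_j-c_a)^2$ as the objective, positive weights, and at least two vertices; the function name and the sample data are hypothetical.

```python
def weighted_two_variance(coords, weights):
    """Return (S_2^2, low_cluster, high_cluster) for scalar representatives."""
    order = sorted(range(len(coords)), key=lambda i: coords[i])

    def within(cluster):
        # Weighted sum of squared deviations from the weighted cluster center.
        vol = sum(weights[i] for i in cluster)
        center = sum(weights[i] * coords[i] for i in cluster) / vol
        return sum(weights[i] * (coords[i] - center) ** 2 for i in cluster)

    best = None
    for cut in range(1, len(order)):
        low, high = order[:cut], order[cut:]
        value = within(low) + within(high)
        if best is None or value < best[0]:
            best = (value, sorted(low), sorted(high))
    return best


# Hypothetical data: four vertices with equal degrees 1/4.
coords = [0.0, 0.1, 1.0, 1.1]
weights = [0.25, 0.25, 0.25, 0.25]
s2, cluster_low, cluster_high = weighted_two_variance(coords, weights)
```

For vector-valued representatives ($k>2$) no such exact scan is available in general, and a $k$-means heuristic would be the usual substitute.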
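The above results were generalized for minimizing the normalized $k$-way cut
$$f_k(P_k,G)=\sum_{a=1}^{k}\frac{w(V_a,\overline{V_a})}{\mathrm{Vol}(V_a)}=k-\sum_{a=1}^{k}\frac{w(V_a,V_a)}{\mathrm{Vol}(V_a)}$$
of the $k$-partition $P_k=(V_1,\dots,V_k)$ over the set $\mathcal{P}_k$ of all possible $k$-partitions. Hence, $f_k(G)=\min_{P_k\in\mathcal{P}_k}f_k(P_k,G)$ is the minimum normalized $k$-way cut of the underlying weighted graph $G=(V,\mathbf{W})$. In fact, $f_2(G)$ is the symmetric version of the isoperimetric number $h(G)$ and $f_2(G)\le 2h(G)$. In , we proved that
$$\sum_{i=1}^{k-1}\lambda_i\le f_k(G)\le c^2\sum_{i=1}^{k-1}\lambda_i,$$
where the upper estimation is relevant only in the case when $S_k^2(\mathbf{D}^{-1/2}\mathbf{u}_1,\dots,\mathbf{D}^{-1/2}\mathbf{u}_{k-1})$ is small enough and the constant $c$ depends on this minimum $k$-variance of the vertex representatives.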
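The normalized Newman-Girvan modularity is defined in  as the penalized version of the Newman-Girvan modularity  in the following way. The normalized $k$-way modularity of the $k$-partition $P_k=(V_1,\dots,V_k)$ is
$$Q_k(P_k,G)=\sum_{a=1}^{k}\frac{1}{\mathrm{Vol}(V_a)}\sum_{i,j\in V_a}(w_{ij}-d_id_j)=\sum_{a=1}^{k}\frac{w(V_a,V_a)}{\mathrm{Vol}(V_a)}-1,$$
and $Q_k(G)=\max_{P_k}Q_k(P_k,G)$ is the maximum normalized $k$-way Newman-Girvan modularity of the underlying weighted graph $G=(V,\mathbf{W})$. For given $k$, maximizing this modularity is equivalent to minimizing the normalized cut and can be solved by the same spectral technique. In fact, it is more convenient to use the spectral decomposition of the normalized modularity matrix $\mathbf{M}_D=\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}-\sqrt{\mathbf{d}}\sqrt{\mathbf{d}}^T$, where $\sqrt{\mathbf{d}}=(\sqrt{d_1},\dots,\sqrt{d_n})^T$, with eigenvalues that are the numbers $1-\lambda_i$ with eigenvectors $\mathbf{u}_i$ ($i=1,\dots,n-1$) and the zero with corresponding unit-norm eigenvector $\sqrt{\mathbf{d}}$. In [9, 12], we also show that a spectral gap between $1-\lambda_{k-1}$ and $1-\lambda_k$ is an indication of $k$ clusters with low intercluster connections; further, the intracluster connections ($w_{ij}$) between vertices $i$ and $j$ of the same cluster are higher than expected under the hypothesis of independence (in view of which vertices $i$ and $j$ are connected with probability $d_id_j$). In the random walk framework, the random walk stays within the clusters with high probability.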
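The equivalence between maximizing the normalized modularity and minimizing the normalized cut can be checked numerically on a toy graph. The sketch assumes the usual forms of the two objectives, namely $f_k=\sum_a w(V_a,\overline{V_a})/\mathrm{Vol}(V_a)$ for the normalized cut and $Q_k=\sum_a \mathrm{Vol}(V_a)^{-1}\sum_{i,j\in V_a}(w_{ij}-d_id_j)$ for the normalized modularity, under which $Q_k=k-1-f_k$ when the total weight is 1. All function names and the example graph (two triangles joined by one edge) are hypothetical.

```python
def volume(W, U):
    """Vol(U): sum of the generalized degrees of the vertices in U."""
    return sum(W[i][j] for i in U for j in range(len(W)))


def weighted_cut(W, X, Y):
    """w(X, Y): total weight between the vertex sets X and Y."""
    return sum(W[i][j] for i in X for j in Y)


def normalized_cut(W, partition):
    n = len(W)
    total = 0.0
    for cluster in partition:
        rest = [j for j in range(n) if j not in cluster]
        total += weighted_cut(W, cluster, rest) / volume(W, cluster)
    return total


def normalized_modularity(W, partition):
    d = [sum(row) for row in W]
    total = 0.0
    for cluster in partition:
        inner = sum(W[i][j] - d[i] * d[j] for i in cluster for j in cluster)
        total += inner / volume(W, cluster)
    return total


# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
W = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    W[i][j] = W[j][i] = 1.0 / (2 * len(edges))
partition = [[0, 1, 2], [3, 4, 5]]
f = normalized_cut(W, partition)
Q = normalized_modularity(W, partition)
```

Here the two triangles form the partition, so only the single joining edge contributes to the cut.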
Conversely, minimizing the above modularity will result in clusters with high inter- and low intra-cluster connections. In , we proved that The existence of “large” (significantly larger than 1) eigenvalues in the normalized Laplacian spectrum, or equivalently, the existence of negative eigenvalues (separated from 0) in the normalized modularity spectrum is an indication of clusters with the above property. In the random walk setup, the walk stays within the clusters with low probability.
These two types of network structures are frequently called community and anticommunity structure, respectively. Some networks exhibit a more general, still regular behavior: the vertices can be classified into $k$ clusters such that the information flow within them and between any pair of them is homogeneous. In terms of random walks, the walk stays within clusters or switches between clusters with probabilities characteristic for the cluster pair. That is, if the random walk moves from a vertex of cluster $V_a$ to a vertex of cluster $V_b$, then the probability of doing this does not depend on the actual vertices; it merely depends on their cluster memberships, $a,b=1,\dots,k$.
In this context, we examined the following generalized random graph model, which corresponds to the ideal case: given the number of clusters $k$, the vertices of the graph independently belong to the clusters; further, conditioned on the cluster memberships, vertices $i\in V_a$ and $j\in V_b$ are connected with probability $p_{ab}$, independently of each other, $1\le a,b\le k$. Applying the results  for the spectral characterization of some noisy random graphs, we are able to prove that the normalized modularity spectrum of a generalized random graph is the following: there exists a positive number $\theta$, independent of $n$, such that there are exactly $k-1$ so-called structural eigenvalues of the normalized modularity matrix $\mathbf{M}_D$ that are greater than $\theta-o(1)$, while all the others are $o(1)$ in absolute value. It is equivalent that the normalized Laplacian $\mathbf{L}_D$ has $k$ eigenvalues (including the zero) separated from 1.
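A generalized random graph of this kind is easy to simulate. The following sketch is a simplification under stated assumptions: the cluster sizes are fixed in advance rather than drawn at random, the connection probabilities are given as a symmetric matrix `P` indexed by cluster labels, and the function name and parameters are hypothetical.

```python
import random


def sample_generalized_random_graph(cluster_sizes, P, seed=0):
    """Return (labels, A): cluster labels and a symmetric 0-1 adjacency matrix.

    Vertices i and j with labels a and b are joined with probability P[a][b],
    independently for every pair i < j; the diagonal stays zero.
    """
    rng = random.Random(seed)
    labels = [a for a, size in enumerate(cluster_sizes) for _ in range(size)]
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < P[labels[i]][labels[j]]:
                A[i][j] = A[j][i] = 1
    return labels, A


# Degenerate check case: certain edges inside clusters, none between them.
labels, A = sample_generalized_random_graph([2, 3], [[1.0, 0.0], [0.0, 1.0]])
```

With intermediate probabilities, for example denser diagonal entries of `P` than off-diagonal ones, the same function produces graphs with a planted community structure.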
The $k=1$ case corresponds to quasirandom graphs, and the above characterization corresponds to the eigenvalue separation of such graphs, discussed in . The authors also prove some implications between the so-called quasirandom properties. For example, for dense graphs, “good” eigenvalue separation is equivalent to “low” discrepancy (of the induced subgraphs’ densities from the overall edge density).
For the $k\ge 2$ case, generalized quasirandom graphs were introduced by Lovász and Sós. These graphs are deterministic counterparts of generalized random graphs with the same spectral properties. In fact, the authors define so-called generalized quasirandom graph sequences by means of graph convergence that also implies the convergence of spectra. Though the spectrum itself does not carry enough information about the cluster structure of the graph, it does so together with some classification properties of the structural eigenvectors. We want to prove some implication between the spectral gap and the volume regularity of the cluster pairs, also using the structural eigenvectors.
The notion of volume regularity was introduced by Alon et al. . We shall use a slightly modified version of this notion.
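Definition 2.1. Let $G=(V,\mathbf{W})$ be a weighted graph with $\mathrm{Vol}(V)=1$. The disjoint vertex pair $(A,B)$ is $\alpha$-volume regular if for all $X\subset A$, $Y\subset B$, we have
$$|w(X,Y)-\rho(A,B)\,\mathrm{Vol}(X)\,\mathrm{Vol}(Y)|\le\alpha\sqrt{\mathrm{Vol}(A)\,\mathrm{Vol}(B)},$$
where $\rho(A,B)=\frac{w(A,B)}{\mathrm{Vol}(A)\,\mathrm{Vol}(B)}$ is the relative inter-cluster density of $(A,B)$.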
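For small vertex sets, the smallest constant for which a pair is volume regular can be found by enumerating all subset pairs. The sketch below assumes the form of the definition in which $|w(X,Y)-\rho(A,B)\mathrm{Vol}(X)\mathrm{Vol}(Y)|$ is compared with $\alpha\sqrt{\mathrm{Vol}(A)\mathrm{Vol}(B)}$ and $\rho(A,B)=w(A,B)/(\mathrm{Vol}(A)\mathrm{Vol}(B))$; the function names and the example (a complete bipartite graph on $2+2$ vertices) are hypothetical, and the enumeration is exponential in the cluster sizes.

```python
from itertools import combinations
from math import sqrt


def nonempty_subsets(items):
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def smallest_alpha(W, A, B):
    """Smallest alpha for which the disjoint pair (A, B) is alpha-volume regular."""
    d = [sum(row) for row in W]

    def vol(U):
        return sum(d[i] for i in U)

    def cut(X, Y):
        return sum(W[i][j] for i in X for j in Y)

    rho = cut(A, B) / (vol(A) * vol(B))
    worst = max(abs(cut(X, Y) - rho * vol(X) * vol(Y))
                for X in nonempty_subsets(A) for Y in nonempty_subsets(B))
    return worst / sqrt(vol(A) * vol(B))


# Hypothetical example: complete bipartite graph between {0, 1} and {2, 3}.
A, B = [0, 1], [2, 3]
W = [[0.0] * 4 for _ in range(4)]
for i in A:
    for j in B:
        W[i][j] = W[j][i] = 1.0 / 8.0
alpha = smallest_alpha(W, A, B)
```

The complete bipartite pair is perfectly homogeneous, so every subset pair has exactly the expected weight and the deviation vanishes.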
Our definition was inspired by the Expander Mixing Lemma stated for example, in  for regular graphs and in  for simple graphs in the context of quasirandom properties. Now we formulate it for edge-weighted graphs on a general degree sequence. We also include the proof as a preparation for the proof of Theorem 3.1 of Section 3.
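Lemma 2.2 (Expander Mixing Lemma for Weighted Graphs). Let $G=(V,\mathbf{W})$ be a weighted graph and suppose that $\mathrm{Vol}(V)=1$. Then for all $X,Y\subset V$:
$$|w(X,Y)-\mathrm{Vol}(X)\,\mathrm{Vol}(Y)|\le\|\mathbf{M}_D\|\cdot\sqrt{\mathrm{Vol}(X)\,\mathrm{Vol}(Y)},$$
where $\|\mathbf{M}_D\|$ is the spectral norm of the normalized modularity matrix of $G$.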
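Proof. Let $X\subset V$, $Y\subset V$, and let $\mathbf{1}_U\in\mathbb{R}^n$ denote the indicator vector of $U\subset V$. Further, $\mathbf{x}:=\mathbf{D}^{1/2}\mathbf{1}_X$ and $\mathbf{y}:=\mathbf{D}^{1/2}\mathbf{1}_Y$.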
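We use the spectral decomposition $\mathbf{M}_D=\sum_{i=1}^{n}\mu_i\mathbf{u}_i\mathbf{u}_i^T$, where $\mu_1,\dots,\mu_{n-1}$ are eigenvalues of $\mathbf{M}_D$ with unit-norm, pairwise orthogonal eigenvectors $\mathbf{u}_1,\dots,\mathbf{u}_{n-1}$, and $\mu_n=0$ with corresponding unit-norm eigenvector $\mathbf{u}_n=\sqrt{\mathbf{d}}=(\sqrt{d_1},\dots,\sqrt{d_n})^T$. We remark that $\sqrt{\mathbf{d}}$ is also an eigenvector of $\mathbf{L}_D$ corresponding to the eigenvalue zero, hence $\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}=\sum_{i=1}^{n-1}\mu_i\mathbf{u}_i\mathbf{u}_i^T+\sqrt{\mathbf{d}}\sqrt{\mathbf{d}}^T$. Let $\mathbf{x}=\sum_{i=1}^{n}a_i\mathbf{u}_i$ and $\mathbf{y}=\sum_{i=1}^{n}b_i\mathbf{u}_i$ be the expansions of $\mathbf{x}$ and $\mathbf{y}$ in the orthonormal basis $\mathbf{u}_1,\dots,\mathbf{u}_n$ with coordinates $a_i=\mathbf{x}^T\mathbf{u}_i$ and $b_i=\mathbf{y}^T\mathbf{u}_i$, respectively. Observe that $w(X,Y)=\mathbf{1}_X^T\mathbf{W}\mathbf{1}_Y=\mathbf{x}^T\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}\mathbf{y}$, and $a_n=\mathrm{Vol}(X)$, $b_n=\mathrm{Vol}(Y)$; further, $\sum_{i=1}^{n}a_i^2=\|\mathbf{x}\|^2=\mathrm{Vol}(X)$ and $\sum_{i=1}^{n}b_i^2=\|\mathbf{y}\|^2=\mathrm{Vol}(Y)$. Based on these,
$$|w(X,Y)-\mathrm{Vol}(X)\,\mathrm{Vol}(Y)|=\Bigl|\sum_{i=1}^{n-1}\mu_ia_ib_i\Bigr|\le\|\mathbf{M}_D\|\sum_{i=1}^{n-1}|a_ib_i|\le\|\mathbf{M}_D\|\sqrt{\sum_{i=1}^{n}a_i^2\sum_{i=1}^{n}b_i^2}=\|\mathbf{M}_D\|\sqrt{\mathrm{Vol}(X)\,\mathrm{Vol}(Y)},$$
where we also used the triangle and the Cauchy-Schwarz inequalities.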
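The inequality of the lemma can be checked numerically on a small graph. The sketch below is only illustrative: it assumes that the normalized modularity matrix has entries $w_{ij}/\sqrt{d_id_j}-\sqrt{d_id_j}$, estimates its spectral norm by power iteration on its square (a swapped-in numerical technique that can slightly underestimate the norm, hence the tolerance), and enumerates all subset pairs, which is feasible only for a handful of vertices. All names are hypothetical, and the example is the complete graph on five vertices, where the norm equals $1/4$.

```python
from itertools import combinations
from math import sqrt


def modularity_matrix(W):
    d = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / sqrt(d[i] * d[j]) - sqrt(d[i] * d[j])
             for j in range(n)] for i in range(n)]


def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def spectral_norm(M, iterations=500):
    """Estimate max |eigenvalue| of a symmetric M by power iteration on M^2."""
    x = [float(i + 1) for i in range(len(M))]  # fixed, non-constant start vector
    for _ in range(iterations):
        y = matvec(M, matvec(M, x))
        norm_y = sqrt(sum(v * v for v in y))
        if norm_y == 0.0:
            return 0.0
        x = [v / norm_y for v in y]
    return sqrt(sum(v * v for v in matvec(M, x)))


def mixing_lemma_holds(W, tol=1e-9):
    n = len(W)
    d = [sum(row) for row in W]
    bound = spectral_norm(modularity_matrix(W))
    subsets = [U for size in range(1, n + 1)
               for U in combinations(range(n), size)]
    for X in subsets:
        for Y in subsets:
            w_xy = sum(W[i][j] for i in X for j in Y)
            vol_x = sum(d[i] for i in X)
            vol_y = sum(d[i] for i in Y)
            if abs(w_xy - vol_x * vol_y) > bound * sqrt(vol_x * vol_y) + tol:
                return False
    return True


# Hypothetical example: complete graph on 5 vertices, total weight 1.
n = 5
W = [[0.0 if i == j else 1.0 / (n * (n - 1)) for j in range(n)] for i in range(n)]
norm = spectral_norm(modularity_matrix(W))
ok = mixing_lemma_holds(W)
```

A library eigenvalue routine would be the more robust choice for larger matrices; the power iteration is used here only to keep the sketch free of external dependencies.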
We remark that the spectral gap of $G$ is $1-\|\mathbf{M}_D\|$; hence, in view of Lemma 2.2, the density between any two subsets of “good” expanders is near to what is expected. On the contrary, in the above definition of volume regularity, the pairs are disjoint, and a “small” $\alpha$ indicates that the pair is like a bipartite expander, see for example, .
In the next section, we shall prove the following statement for the $k=2$ case: if one eigenvalue jumps out of the bulk of the normalized modularity spectrum, then clustering the coordinates of the corresponding transformed eigenvector into 2 parts (by minimizing the 2-variance of its coordinates) will result in an $\alpha$-volume regular partition of the vertices, where $\alpha$ depends on the spectral gap.
Our conjecture is that we may go further: if $k-1$ (so-called structural) eigenvalues jump out of the normalized modularity spectrum, then clustering the representatives of the vertices, obtained by the corresponding eigenvectors in the usual way, into $k$ clusters will result in $\alpha$-volume regular pairs, where $\alpha$ depends on the spectral gap (between the structural eigenvalues and the bulk of the spectrum) and the $k$-variance of the vertex representatives based on the eigenvectors corresponding to the structural eigenvalues.
3. Eigenvalue Separation and Volume Regularity ($k=2$ Case)
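Theorem 3.1. Let $G=(V,\mathbf{W})$ be an edge-weighted graph on $n$ vertices, with generalized degrees $d_1,\dots,d_n$ and $\mathbf{D}=\operatorname{diag}(d_1,\dots,d_n)$. Suppose that $\mathrm{Vol}(V)=1$ and there are no dominant vertices: $d_i=\Theta(1/n)$, $i=1,\dots,n$, as $n\to\infty$. Let the eigenvalues of $\mathbf{M}_D$, enumerated in decreasing absolute values, be
$$1>|\mu_1|=\theta>\varepsilon=|\mu_2|\ge|\mu_3|\ge\dots\ge|\mu_n|=0.$$
The partition $(V_1,V_2)$ of $V$ is defined in such a way that it minimizes the weighted 2-variance of the coordinates of $\mathbf{D}^{-1/2}\mathbf{u}_1$, where $\mathbf{u}_1$ is the unit-norm eigenvector belonging to $\mu_1$. Then the pair $(V_1,V_2)$ is $\mathcal{O}\bigl(\sqrt{(1-\theta)/(1-\varepsilon)}\bigr)$-volume regular.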
Remark 3.2. The statement of Theorem 3.1 has relevance only if the largest absolute value eigenvalue of the normalized modularity matrix is much larger than the absolute values of all the others. In this case, the spectral gap between the largest absolute value eigenvalue and the others in the normalized modularity spectrum indicates a regular 2-partition of the graph that can be constructed based on the eigenvector belonging to the structural eigenvalue.
Remark 3.3. The statement of the above theorem also has relevance in the dense case (supported by the condition that there are no dominant vertices). We remark that in the sparse case there is an exceptional set comprising low degree vertices. Authors of  prove equivalences of quasirandom properties for the core of the graph (the vertex set deprived of the exceptional set). Using the modularity spectrum of this core, the above theorem remains valid for it.
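The construction of the 2-partition in Theorem 3.1 can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the normalized modularity matrix has entries $w_{ij}/\sqrt{d_id_j}-\sqrt{d_id_j}$, finds the eigenvector of the largest absolute value eigenvalue by power iteration on the squared matrix (a swapped-in numerical technique that relies on this eigenvalue being strictly dominant and nonzero), rescales its coordinates by $d_i^{-1/2}$, and then picks the threshold split minimizing the weighted 2-variance. Function names and the example graph (two triangles joined by one edge) are hypothetical.

```python
from math import sqrt


def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def dominant_eigenvector(M, iterations=1000):
    """Power iteration on M^2; assumes a strictly dominant nonzero |eigenvalue|."""
    x = [float(i + 1) for i in range(len(M))]  # fixed, non-constant start vector
    for _ in range(iterations):
        y = matvec(M, matvec(M, x))
        norm_y = sqrt(sum(v * v for v in y))
        x = [v / norm_y for v in y]
    return x


def spectral_bipartition(W):
    n = len(W)
    d = [sum(row) for row in W]
    M = [[W[i][j] / sqrt(d[i] * d[j]) - sqrt(d[i] * d[j]) for j in range(n)]
         for i in range(n)]
    u = dominant_eigenvector(M)
    coords = [u[i] / sqrt(d[i]) for i in range(n)]  # transformed eigenvector
    order = sorted(range(n), key=coords.__getitem__)

    def within(cluster):
        vol = sum(d[i] for i in cluster)
        center = sum(d[i] * coords[i] for i in cluster) / vol
        return sum(d[i] * (coords[i] - center) ** 2 for i in cluster)

    # In one dimension the optimal 2-clustering is a threshold split.
    cut = min(range(1, n), key=lambda c: within(order[:c]) + within(order[c:]))
    return set(order[:cut]), set(order[cut:])


# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
W = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    W[i][j] = W[j][i] = 1.0 / (2 * len(edges))
V1, V2 = spectral_bipartition(W)
```

On this example the dominant eigenvector changes sign between the two triangles, so the split recovers them; which triangle is returned first depends on the arbitrary sign of the eigenvector.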
Proof of Theorem 3.1. We use the notations in Lemma 2.2’s proof. Let , . For short, , , , and . With the notations and ,
Using the spectral decomposition and the fact that , we can write (3.2) as
where and is the expansion of and in the orthonormal basis with coordinates and , respectively.
By Lemma 2.2, for all : that is also applicable to the special two-partition satisfying the postulates. Hence, is governed by the spectral norm of the normalized modularity matrix. In , as a by-product of the proof of (2.6) it came out that to get the partition , the coordinates of should be divided into two parts at the median, hence and are approximately equal to (the approximation is good for large if the underlying graph does not have dominant vertex-weights). Further, the estimation also follows. This applies to the case. In the case, the optimum is obtained by minimizing the 2-variance of the coordinates of the transformed eigenvector for which the following relation—like (2.6)—can be proved: and the optimum cut-point of the coordinates is also not far from the median. Summarizing, , where .
Therefore, (3.3) can be estimated from above with As for the second term, .
Using the Cauchy-Schwarz inequality, the last term can be estimated from above with since , and , . The first term is reminiscent of an equation for the coordinates of orthogonal vectors. Therefore, we project the vectors , onto the subspace . In fact, , therefore . The vector can be decomposed as where is the component orthogonal to . For the squared distance between and , in , we proved that it is equal to the weighted 2-variance and in (3.5) we estimated it from above with . In the case similar estimation works using (3.6). First we estimate . The problem is that the pairwise orthogonal vectors and are not in the same subspace of as, in general, . However, by an argument proved in , we can find orthogonal, unit-norm vectors such that where, in view of , . Let . Since and are coordinates of the orthogonal vectors in the basis , and because of , Therefore, using (3.10) and the fact that .
Now we estimate . Going back to (3.9), we have and similarly, that in view of yields Summarizing, the second and third terms in (3.7) are estimated from above with and , respectively. Because of and , by an easy calculation it follows that they are less than for “large” enough. Therefore, will do.
The author wishes to thank Vera T. Sós, László Lovász, and Miklós Simonovits for their useful advice. Research was supported by the Hungarian National Research Grants OTKA 76481 and OTKA-NKTH 77778.
F. R. K. Chung, Spectral Graph Theory, vol. 92 of CBMS Regional Conference Series in Mathematics, American Mathematical Society, 1997.
N. Alon, “Eigenvalues and expanders,” Combinatorica, vol. 6, no. 2, pp. 86–96, 1986.
M. Meilă and J. Shi, “Learning segmentation by random walks,” in Proceedings of the 13th Conference of Neural Information Processing Systems (NIPS '01), T. K. Leen, T. G. Dietterich, and V. Tresp, Eds., pp. 873–879, MIT Press, Cambridge, Mass, USA, 2001.
F. Chung and R. Graham, “Quasi-random graphs with given degree sequences,” Random Structures & Algorithms, vol. 32, no. 1, pp. 1–19, 2008.
M. Bolla, “Penalized versions of the Newman–Girvan modularity and their relation to multiway cuts and k-means clustering,” In press.
N. Alon, A. Coja-Oghlan, H. Hàn, M. Kang, V. Rödl, and M. Schacht, “Quasi-randomness and algorithmic regularity for graphs with general degree distributions,” SIAM Journal on Computing, vol. 39, no. 6, pp. 2336–2362, 2010.