Complexity

Volume 2018, Article ID 2547270, 17 pages

https://doi.org/10.1155/2018/2547270

## Twin Subgraphs and Core-Semiperiphery-Periphery Structures

Departamento de Matemática Aplicada a las TIC & Information Processing and Telecommunications Center, ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain

Correspondence should be addressed to Ricardo Riaza; ricardo.riaza@upm.es

Received 12 December 2017; Revised 6 February 2018; Accepted 12 February 2018; Published 18 April 2018

Academic Editor: Sergio Gómez

Copyright © 2018 Ricardo Riaza. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A standard approach to reduce the complexity of very large networks is to group together sets of nodes into clusters according to some criterion which reflects certain structural properties of the network. Beyond the well-known modularity measures defining communities, there are criteria based on the existence of similar or identical connection patterns of a node or sets of nodes to the remainder of the network. A key notion in this context is that of *structurally equivalent* or *twin* nodes, displaying exactly the same connection pattern to the remainder of the network. Our first goal is to extend this idea to subgraphs of arbitrary order of a given network, by means of the notions of T-twin and F-twin subgraphs. This research, which leads to graph-theoretic results of independent interest, is motivated by the need to provide a systematic approach to the analysis of core-semiperiphery-periphery (CSP) structures, a notion which is widely used in network theory but which somehow lacks a formal treatment in the literature. The goal is to provide an analytical framework accommodating and extending the idea that the unique (ideal) core-periphery (CP) structure is a 2-partitioned $K_2$, a fact which is here understood to rely on the true and false twin notions for vertices already known in network theory. We provide a formal definition of such CSP structures in terms of core eccentricities and periphery degrees, with semiperiphery vertices acting as intermediaries between both. The T-twin and F-twin notions then make it possible to reduce the large number of resulting structures, paving the way for the decomposition and enumeration of CSP structures. We compute explicitly the resulting CSP structures up to order six. We illustrate the scope of our results by analyzing a subnetwork of the well-known network of metal manufactures trade arising from 1994 world trade statistics.

#### 1. Introduction

The notion of a core-periphery (CP) structure can be traced back at least to some research on economic and commercial networks developed in the late 1970s and early 1980s [1–3], largely emanating from the influential work of Wallerstein on world systems analysis [4]. These ideas were revisited and addressed in a more formal framework by Borgatti and Everett in [5]. For these authors, the two key ideas in the definition of a core-periphery structure in a network context are those of a dense, cohesive core of heavily interconnected nodes and a sparse periphery of nodes, essentially lacking any connections among themselves; by contrast, the connection pattern between the core and the periphery admits several definitions and, actually, the core-periphery connection densities differ from one model to another. In idealized models, core nodes are fully connected to one another and periphery nodes are isolated (within the periphery subnetwork), whereas the core and the periphery may either be fully connected or totally disconnected. Since then, a great deal of research has been directed to the detection of such core-periphery structures in real networks, to measuring how well they approximate the ideal ones, and to the development of analytical and computational tools to classify nodes in such networks (cf. [6–12] and references therein). Other approaches to the definition of a core-periphery structure can be found in [13–15].

Even though the idea of a core-semiperiphery-periphery (CSP) structure can also be found in the aforementioned sociological works (cf. [3, 4]) and despite the fact that this concept has been widely used since then (see, e.g., [8, 11, 16, 17]), the network literature seems to lack a formal definition and a systematic classification of these CSP structures. In the aforementioned paper by Borgatti and Everett [5], these authors indicate that there are many reasonable options to define a CSP structure and, further, discrete partitions with more than three classes. The difficulty does not seem to lie in providing a formal definition but in classifying the resulting “reasonable options,” quoting these authors; more precisely, there is a need for a notion of similar or equivalent subgraphs making it possible to somehow reduce the number of different CSP structures. When dealing with core-periphery structures, there is a well-known subgraph similarity notion which makes this reduction feasible, namely, that of *structural equivalence* defining so-called *twin nodes* (broadly, two vertices are twins if they have the same neighbors; a distinction is made between *true twins* and *false twins* depending on whether both vertices are adjacent or not; details are given in Section 2). Essentially, under structural equivalence, $K_2$ will be the unique core-periphery structure: details are provided later, but the reader can think for the moment, for example, of the star as a network with a unique core (the central node) to which peripheral nodes are attached; all leaves have the same set of neighbors, namely, the central node, and are therefore structurally equivalent (more precisely, they will be false twins); then, after identifying all leaves in light of this twin notion for vertices, the quotient graph amounts to $K_2$.

But in the network literature there is no equivalence notion for “similar” higher order subgraphs, which would pave the way to a systematic reduction of CSP structures (once these are defined). As explained in detail in Section 2 (see, specifically, Section 2.3), the goal of this paper is to fill this gap by introducing a mathematical framework allowing for a systematic classification of CSP networks and other partitioned structures. The key idea is to introduce the concept of *twin subgraphs*, a notion which extends to arbitrary order that of twin (structurally equivalent) vertices. This mathematical framework will be developed in Sections 3 and 4, which address graph-theoretic problems of independent interest (i.e., problems which go beyond the possible application of these notions to the classification of CSP structures). These sections introduce and elaborate on the idea of *F-twin* and *T-twin* subgraphs, which in a sense are dual to each other and generalize several known properties of false twin and true twin vertices; for example, distinct connected components of F-twin pairs will be proved to be disjoint and nonadjacent, whereas disjoint T-twin pairs will be fully connected to each other. With this background, the classification of CSP networks will then be tackled in Section 5. In Section 6 we present the lines along which these structures can be identified in real cases by analyzing a subnetwork of the network of manufactures of metal arising from 1994 world trade statistics. These data are available and analyzed in [16], in the spirit of the aforementioned seminal work [4], and nowadays define a widely used benchmark for the positional analyses of networks. Finally, Section 7 outlines some directions for future research.

#### 2. Background on Graphs, Twins, and Core-Periphery Networks

##### 2.1. Graph-Theoretic Notions

We refer the reader to [18–21] for excellent introductions to graph theory. Throughout the paper we will work with undirected graphs without parallel edges or self-loops, so that edges can be thought of as pairs of distinct vertices (also termed *nodes*). Given a graph $G$, its vertex and edge sets will be written as $V(G)$ and $E(G)$, respectively, or simply as $V$ and $E$ if there is no possible ambiguity. We will only work with finite graphs; that is, the order (number of vertices) will be finite in all cases. With notational abuse, we will often write $u \in G$ to mean $u \in V(G)$ and for . Analogously, we will say that two graphs are disjoint when their vertex sets are disjoint (note that the latter implies that the edge sets are disjoint as well).

A *path* of length $k \geq 0$ is a graph with $k+1$ distinct vertices $v_0, v_1, \ldots, v_k$ and $k$ edges $e_1, \ldots, e_k$, with $e_i$ joining $v_{i-1}$ and $v_i$. Since we are not allowing parallel edges, a path is uniquely defined by its vertex set. We say that $v_0$ and $v_k$ are *linked* by such a path. When $k \geq 1$, sometimes the vertex set will be implicitly assumed to inherit the order defined by the indices and we will then speak of a path *from $v_0$ to $v_k$*. The *distance*, $d(u, v)$, between a pair of distinct vertices $u$, $v$ in the same connected component of a given graph is the length of a shortest path linking them. The *eccentricity* of a vertex in a connected graph is the maximum distance to other vertices. The distance between two disjoint subgraphs $H_1$ and $H_2$ lying in the same connected component of a given graph $G$ is defined as

$$d(H_1, H_2) = \min \{ d(u, v) : u \in V(H_1),\ v \in V(H_2) \}.$$

We say that two disjoint subgraphs $H_1$ and $H_2$ are not adjacent if there is no adjacent pair $u$, $v$ with $u \in V(H_1)$ and $v \in V(H_2)$; if both subgraphs lie in the same connected component of $G$, this is equivalent to saying that $d(H_1, H_2) \geq 2$.
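The distance and eccentricity notions above can be illustrated with a minimal sketch based on breadth-first search. The adjacency-dictionary representation, the function names, and the five-vertex path used as an example are assumptions made here for illustration only; they are not taken from the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, v):
    """Maximum distance from v to the other vertices (graph assumed connected)."""
    return max(bfs_distances(adj, v).values())

def subgraph_distance(adj, verts_a, verts_b):
    """Minimum distance between a vertex of verts_a and a vertex of verts_b."""
    return min(bfs_distances(adj, a)[b] for a in verts_a for b in verts_b)

# Illustrative example: the path on five vertices 0-1-2-3-4.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ecc = [eccentricity(path5, v) for v in range(5)]

# The vertex sets {0, 1} and {3, 4} are at distance 2, hence not adjacent.
gap = subgraph_distance(path5, {0, 1}, {3, 4})
```

In this example the central vertex has eccentricity two, its neighbors have eccentricity three, and the two end vertices have eccentricity four.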

We will denote by $N(v)$ the set of neighbors of a given vertex $v$ (namely, the set of vertices adjacent to $v$), and write $N[v] = N(v) \cup \{v\}$. The *degree* of a vertex $v$ is the number of elements in $N(v)$. We will call a vertex of degree one a *leaf* (note that this term is often reserved to cases in which the whole graph is acyclic, i.e., a disjoint union of trees) and will say that it is *attached* to its unique adjacent vertex.

The *null graph* defined by $V = \emptyset$ will be denoted by $K_0$; $K_n$ with $n \geq 1$ stands for the complete graph on $n$ vertices. The complement of a graph $G$ of order $n$ (namely, $K_n - E(G)$) will be written as $\overline{G}$, and $\overline{K_n}$ will stand for the *empty graph* on $n$ vertices. Cycles, paths, and stars on $n$ vertices will be written as $C_n$, $P_n$, and $S_n$, respectively, with $n \geq 3$ for cycles. As usual, the union and intersection of $G_i = (V_i, E_i)$ ($i = 1, 2$) are the graphs $(V_1 \cup V_2, E_1 \cup E_2)$ and $(V_1 \cap V_2, E_1 \cap E_2)$, respectively. The *join* $G_1 + G_2$ of two graphs with disjoint vertex sets $V_1$ and $V_2$ is the graph obtained after enlarging $G_1 \cup G_2$ with all possible edges joining the vertices of $G_1$ to those of $G_2$ (sometimes we express the latter by saying that $G_1$ and $G_2$ are fully connected to one another).
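As a small sketch of the complement and join operations just described, the snippet below builds the join of a complete graph on two vertices with an empty graph on three vertices. Graphs are represented here as (vertex set, edge set) pairs with edges stored as two-element frozensets; this representation and all function names are illustrative assumptions.

```python
from itertools import combinations

def complete_graph(n, offset=0):
    """Complete graph on the n vertices offset, ..., offset + n - 1."""
    verts = set(range(offset, offset + n))
    edges = {frozenset(pair) for pair in combinations(verts, 2)}
    return verts, edges

def complement(graph):
    """Same vertex set; two distinct vertices are adjacent iff they were not."""
    verts, edges = graph
    all_pairs = {frozenset(pair) for pair in combinations(verts, 2)}
    return verts, all_pairs - edges

def join(g1, g2):
    """Union of two disjoint graphs plus every edge between them."""
    (v1, e1), (v2, e2) = g1, g2
    if v1 & v2:
        raise ValueError("vertex sets must be disjoint")
    cross = {frozenset((a, b)) for a in v1 for b in v2}
    return v1 | v2, e1 | e2 | cross

core = complete_graph(2)                      # complete graph on {0, 1}
periphery = complement(complete_graph(3, 2))  # empty graph on {2, 3, 4}
verts, edges = join(core, periphery)
```

The resulting graph has one edge inside the first block, none inside the second, and all six edges between the two blocks.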

A *partitioned* graph is simply a graph whose vertex set is split into (pairwise disjoint) classes. A *$k$-partitioned graph* is a partitioned graph with $k$ nonempty partition classes. Obviously, a partitioned graph defines an equivalence relation in the set of vertices. The *quotient graph* (often called a *supergraph*) of a partitioned graph is defined as a graph whose vertex set is the quotient set (i.e., vertices in the quotient graph correspond to the partition classes in the original graph), with two distinct vertices in the quotient being adjacent if and only if the original graph has at least one edge which joins vertices belonging to the corresponding pair of classes.
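The quotient construction can be sketched in a few lines. The edge-list input, the dictionary of labelled classes, and the star used as an example are assumptions chosen for illustration.

```python
def quotient_graph(edges, classes):
    """Quotient of a partitioned graph.

    edges:   iterable of vertex pairs (u, v)
    classes: dict mapping a class label to its (pairwise disjoint) vertex set
    """
    label_of = {v: label for label, members in classes.items() for v in members}
    q_edges = set()
    for u, v in edges:
        a, b = label_of[u], label_of[v]
        if a != b:  # edges inside a class do not show up in the quotient
            q_edges.add(frozenset((a, b)))
    return set(classes), q_edges

# Illustrative example: a star with center 0 and four leaves,
# partitioned into the center and the set of leaves.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
classes = {"core": {0}, "periphery": {1, 2, 3, 4}}
q_verts, q_edges = quotient_graph(star_edges, classes)
```

Here the quotient has two vertices joined by a single edge, regardless of the number of leaves.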

An *isomorphism* of two graphs $G_1$ and $G_2$ is a bijection $\varphi : V_1 \to V_2$ (with $V_i = V(G_i)$) which preserves adjacencies, that is, any given pair of vertices $u$ and $v$ in $G_1$ are adjacent if and only if $\varphi(u)$ and $\varphi(v)$ are adjacent in $G_2$. An isomorphism of partitioned graphs is a graph isomorphism which keeps the classes invariant.

##### 2.2. Twins

Different analytical and computational issues arise in connection with the existence and the distribution of isomorphic copies of certain subgraphs of a given graph: see, for example, [21–25] and references therein. From a different perspective, some attention has been focused on vertices which share the same connection pattern within a graph. Such vertices receive (at least) two different names in the literature, namely, *twins* and *structurally equivalent vertices*, as detailed in the sequel. Two (distinct) vertices $u$ and $v$ are *false twins* (resp., *true twins*) if $N(u) = N(v)$ (resp., $N[u] = N[v]$) [26–29]. The exclusion of self-loops yields $v \notin N(v)$ and this implies that false twins are not adjacent. In the dual case, true twins are necessarily adjacent to each other: for these reasons, true and false twins are also called *adjacent* and *nonadjacent twins* (see, e.g., [26, 29, 30]). True twins correspond to 1-twins in the terminology of [31, 32]. By contrast, in the social network analysis literature twin vertices $u$ and $v$ are said to be *(weakly) structurally equivalent*: this means that the transposition of $u$ and $v$ yields an automorphism of the graph (cf. [33, 34]), a condition which is easily seen to be equivalent to $u$ and $v$ being (false or true) twins in the sense indicated above.
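The twin conditions can be checked directly by comparing neighbor sets, with or without each vertex added to its own set. The sketch below also tests, on one small example, the stated equivalence with the transposition of the two vertices being an automorphism. The adjacency-dictionary representation, the function names, and the example graph are assumptions made for illustration.

```python
from itertools import combinations

def twin_type(adj, u, v):
    """Return 'false', 'true', or None for a pair of distinct vertices."""
    if u == v:
        raise ValueError("twins are defined for distinct vertices")
    if adj[u] == adj[v]:                  # same neighbors
        return "false"
    if adj[u] | {u} == adj[v] | {v}:      # same neighbors once each vertex is added
        return "true"
    return None

def swap_is_automorphism(adj, u, v):
    """Check whether exchanging u and v maps the edge set onto itself."""
    def swap(x):
        return v if x == u else u if x == v else x
    edges = {frozenset((a, b)) for a in adj for b in adj[a]}
    return {frozenset(map(swap, e)) for e in edges} == edges

# Illustrative example: vertices 0 and 1 are adjacent and both are joined to
# 2, 3, 4, which are pairwise nonadjacent.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}

agreement = all(
    (twin_type(adj, u, v) is not None) == swap_is_automorphism(adj, u, v)
    for u, v in combinations(adj, 2)
)
```

In this example vertices 0 and 1 are true twins, any two of 2, 3, 4 are false twins, and no other pair is a twin pair.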

The F-twin and T-twin notions that will be introduced in Sections 3 and 4 for arbitrary subgraphs somehow combine the two ideas at the beginning of the paragraph above. Twin subgraphs will be isomorphic copies of each other and, additionally, they will share the connection pattern to the remainder of the graph; in other words, our approach will define a structural equivalence notion for (isomorphic) subgraphs which extends the one already defined for single vertices. Consistently, twin subgraphs will retain, *mutatis mutandis*, certain properties already known for twin vertices, such as the aforementioned adjacency properties (which will hold for disjoint twin subgraphs; cf. Corollaries 6 and 17), the duality between F-twins and T-twins in the sense that a pair of twins of one type defines a pair of the other on the complement graph (Theorem 18), or the fact that twins will have the same distance multisets to the vertex set of the graph (cf. Proposition 8). In particular, twin subgraphs will define *homometric* sets (Corollary 9; cf. [35–37]). Both notions will induce a classification in the family of isomorphic copies of each induced subgraph, extending the way in which false and true twin concepts classify the vertices of a graph. These, together with other related results, will be extensively discussed in Sections 3 and 4.

##### 2.3. Core-Periphery Networks

Consider one of the “idealized” core-periphery (CP) networks mentioned in Section 1, namely, the one defined by a 2-partitioned graph with the following two classes of vertices:

(i) *Core vertices*, which are fully connected to each other and also to the vertices in the second class (defined below)

(ii) *Periphery vertices*, totally disconnected from each other (and fully connected to the core, in light of the first requirement above).

As indicated in the Introduction, other core-periphery connection patterns are possible, although the one above is often used as a starting point in different analytical and computational approaches to this topic (see, e.g., [5, 16]). These core-periphery networks are simply 2-partitioned graphs of the form $K_m + \overline{K_n}$ (see Section 2.1 for notation; when using a 2-partitioned structure in $K_m + \overline{K_n}$, we assume throughout the document and without further mention that the two partition classes are the vertex sets of $K_m$ and $\overline{K_n}$). Cases with a unique core vertex amount to the star $S_{n+1}$. In the simplest setting ($m = n = 1$) we get a 2-partitioned $K_2$, with a single core and a single periphery vertex; note that $\overline{K_1} = K_1$, and we prefer to use the latter notation for the singleton graph.

Aiming at later developments, let us note that, in a certain sense, $K_2$ is substantially different from all other joins $K_m + \overline{K_n}$. Actually, we may think of $K_2$ as the quotient graph of any other join of the form $K_m + \overline{K_n}$. But, in order to extend these ideas to support the definition and classification of more complex structures, we emphasize that the reduction above comprises more than a quotient reduction. Indeed, all core vertices (namely, those of $K_m$) are true twins as defined in Section 2.2 above and, analogously, all periphery vertices (the ones in $\overline{K_n}$) are false twins. In this context, $K_2$ arises not only as the reduction of other joins, but also as the unique *twin-free* network meeting the requirements (i) and (ii) above. From this point of view we may think of $K_2$ as the unique core-periphery *structure* (we use the latter term to make a distinction with the CP networks above, which are allowed to display twin vertices). To avoid any misunderstanding, let us clarify that $K_2$ is twin-free only as a 2-partitioned graph; that is, we cannot consider both vertices as (true) twins because they belong to different partition classes; see the beginning of Section 5.
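The reduction described above can be sketched as a loop that repeatedly deletes one vertex of any twin pair lying in the same partition class. This is only an illustrative procedure written under assumptions made here (adjacency dictionaries, a label per vertex, a join with three core and four periphery vertices); it is not presented as the paper's algorithm.

```python
def twin_reduce(adj, label):
    """Delete one vertex of a same-class twin pair until no such pair is left."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        verts = sorted(adj)
        for i, u in enumerate(verts):
            for v in verts[i + 1:]:
                if label[u] != label[v]:
                    continue  # twins are only identified within a class
                false_twins = adj[u] == adj[v]
                true_twins = adj[u] | {u} == adj[v] | {v}
                if false_twins or true_twins:
                    for w in adj.pop(v):
                        adj[w].discard(v)
                    changed = True
                    break
            if changed:
                break
    return adj

# Illustrative join: core vertices 0..2 (fully connected to everything),
# periphery vertices 3..6 (pairwise nonadjacent).
m, n = 3, 4
join_adj = {v: set() for v in range(m + n)}
for u in range(m):
    for v in range(m + n):
        if u != v:
            join_adj[u].add(v)
            join_adj[v].add(u)
labels = {v: "core" if v < m else "periphery" for v in join_adj}

reduced = twin_reduce(join_adj, labels)
# Ignoring the partition (a single class) lets the last two vertices merge too.
collapsed = twin_reduce(join_adj, {v: "x" for v in join_adj})
```

With the two classes kept apart, the procedure stops at a single edge between one core and one periphery vertex. If the partition is ignored, the two remaining vertices are true twins and the graph collapses to a single vertex, which is why twin-freeness is stated for the 2-partitioned graph.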

However, when scaling these ideas to define formally core-semiperiphery-periphery (CSP) structures and eventually other structures with more partition classes, one finds the problem that there is no appropriate analog of the twin notions mentioned above for subgraphs with more than one vertex. Since the intuitive idea behind the concept of a core is that of a set of heavily connected vertices, the true twin notion for single vertices may well apply to reduce the number of admissible core subgraphs in these higher order structures; by contrast, in the literature one finds no way to reduce conveniently the semiperiphery-periphery subgraph.

To put it in the simplest possible setting, compare the CP network $K_1 + \overline{K_2}$ (Figure 1(a)), which amounts to a 2-partitioned path $P_3$ with one class (the core, painted black in the figure) defined by the central node, with a 3-partitioned path $P_5$ in which the three classes are defined by the central vertex (core), the two vertices with eccentricity three (semiperiphery vertices, grey), and the two leaves (periphery vertices, white) (Figure 1(b)). We may think of the latter as a so-called *spider graph* with a central vertex (the core) and two legs, each one a $P_2$ attached to the core by a single articulation (the semiperiphery vertices).