#### Abstract

The exact computation of network k-terminal reliability is an NP-hard problem, and many approximation methods have been proposed as alternatives, among which the neural network-based approaches are believed to be the most effective and promising. However, the existing neural network-based methods either ignore the local structures in the network topology or process the local structures as Euclidean data, while the network topology represented by the graph is in fact non-Euclidean. Seeing that the Graph Convolution Neural network (GCN) is a generalization of convolution operators onto non-Euclidean data structure, in an effort to fill in the gap, this paper proposes a GCN-based framework for the estimation of communication network reliability. First, a dataset with sufficient sample size is constructed, by calculating the k-terminal reliability via the exact contraction-deletion method for the generated network samples. Then, an estimation model based on GCN is built, where several graph convolution layers process input information and extract node-level structural features from the network topology, a concatenation layer fuses the structural features into a graph-level representation feature, and a multi-layer perceptron computes the k-terminal reliability as output. To demonstrate the practicality and rationality of our proposed model, comparative experiments are carried out on 12 datasets, the results of which show that our proposed GCN model has an average of 59.60% and 57.52% improvement over existing methods on homogeneous datasets and heterogeneous datasets, respectively.

#### 1. Introduction

Communication networks, as fundamental infrastructure for many critical systems such as power transmission systems [1–3], emergency response systems [4, 5], and aircraft navigation systems [6], have become an indispensable part of day-to-day life [7]. Disruptions in communication systems can severely undermine the performance of these critical systems, and sometimes even lead to destructive or catastrophic events. Therefore, the analysis of communication network reliability is a practical issue of great importance.

For a given system, reliability can be loosely defined as the probability that the system remains operational after some of its components fail [8–10]. For a network system, the notion of reliability is mainly associated with probabilistic connectedness. For an arbitrary network $G = (V, E)$ in which nodes are perfectly reliable and links fail independently, there are three popular network reliability measures, distinguished by the number of specified nodes [11–13]. The first is 2-terminal reliability, defined as the probability that a communication path exists between a given pair of nodes, i.e., a source node $s$ and a target node $t$. The second is all-terminal reliability, defined as the probability that all nodes in $V$ are able to communicate with each other via working paths. The last is k-terminal reliability, defined as the probability that every pair of nodes in a specified subset $K \subseteq V$ is connected by a series of consecutive working links. It is easy to see that k-terminal reliability reduces to 2-terminal reliability when $|K| = 2$, and to all-terminal reliability when $K = V$. Thus, in this paper we consider the k-terminal reliability measure, since it is the most general of the three.

Many methods have been proposed to compute the exact k-terminal network reliability, including state enumeration [14–16], the minimal cut method [17, 18], the sum-of-disjoint-products method [19–21], and the factoring theorem [22–25]. Unfortunately, computational complexity remains a problem for all of these methods. Computing k-terminal network reliability exactly is NP-hard [26–30], and the computation time grows exponentially with the number of nodes or links. To overcome this problem, several approximation methods have been put forward, and so far the neural network-based approach is believed to be the most efficient and promising among them [31, 32].
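To make the exponential cost concrete, the definition of k-terminal reliability can be evaluated directly by enumerating every link state. The following is a minimal sketch under stated assumptions, not the implementation of any cited method; the function names and the triangle example are hypothetical.

```python
from itertools import product

def terminals_connected(nodes, up_edges, terminals):
    """Return True if all terminals lie in one component of the working edges."""
    terminals = set(terminals)
    if len(terminals) <= 1:
        return True
    adj = {v: set() for v in nodes}
    for u, v in up_edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(terminals))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return terminals <= seen

def k_terminal_reliability_enum(nodes, edges, probs, terminals):
    """Sum the probability of every link state that keeps the terminals connected."""
    total = 0.0
    # 2 ** len(edges) states: this is the exponential blow-up.
    for state in product([True, False], repeat=len(edges)):
        weight = 1.0
        for works, p in zip(state, probs):
            weight *= p if works else (1.0 - p)
        up = [e for e, works in zip(edges, state) if works]
        if terminals_connected(nodes, up, terminals):
            total += weight
    return total

# Hypothetical example: a triangle whose links all have reliability 0.9.
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2), (0, 2)]
probs = [0.9, 0.9, 0.9]
r_two = k_terminal_reliability_enum(nodes, edges, probs, [0, 2])     # 2-terminal
r_all = k_terminal_reliability_enum(nodes, edges, probs, [0, 1, 2])  # all-terminal
```

With only three links the loop visits 8 states; a 30-node network with a few dozen links would already require billions, which is why such enumeration is impractical beyond small cases.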

In [33], He F. and Qi H. proposed to use an Artificial Neural Network (ANN) for all-terminal network reliability estimation, where the network topology and link reliability are taken as input and the all-terminal reliability value obtained via Monte Carlo simulation is used as the target. Furthermore, Srivaree-Ratana C., Konak A. and Smith A. E. [34] added the reliability upper bound as an input to the ANN model and integrated the idea of ensemble learning by training a specialized ANN for network samples with reliability values higher than 0.9. In [35], the authors categorized the network samples as homogeneous or heterogeneous depending on whether the link reliability values are identical or not, and applied the ANN model to both types of networks. To shorten the input feature and reduce the number of parameters of the ANN model, Altiparmak F. et al. [36] put forward a general encoding scheme based on the degree centrality of nodes for homogeneous networks.

The basic idea of a neural network-based estimation model is that there is a highly complex, nonlinear relationship between a network’s topology and its reliability, which can be learned by a neural network. However, network reliability is not determined by one link or one node alone; rather, it is determined by the local structures formed by several nodes and links in conjunction with each other. Although in the ANN models the adjacency matrix flattened into a long vector can solely determine the network topology, this type of node-level feature is inadequate for network-level representations because it overlooks the potential correlation between nodes and links that are close to one another. Stemming from this idea, Alex Davila-Frias and Om Prakash Yadav [37] proposed to use a Convolutional Neural Network (CNN) for the all-terminal reliability estimation of homogeneous networks. The network topology is processed as a pixel image determined by its adjacency matrix, and several convolution layers are stacked to extract the high-level structural features of the network.

Unfortunately, one problem remains with the CNN-based estimation model. The network topology represented by a graph is, in essence, non-Euclidean data, whereas the CNN model is best known for its dominant performance on Euclidean data structures like text [38, 39], images [40–42], and videos [43, 44]. In [37], because the network topology is encoded as a pixel image derived from the adjacency matrix, the “local” structures are learned for nodes and links with consecutive indices, rather than for nodes and links that are actually near each other in the topology. If the nodes are reindexed, the input features can change dramatically. Without an appropriate node indexing strategy in place, the high-level features extracted by the convolution filters lack a sound rationale. Besides, the CNN model in [37] assumes homogeneous link reliability, which is an unrealistic simplification of many real-world systems and makes it unsuitable for certain applications.

In light of this, this paper proposes a general framework based on the Graph Convolution Network (GCN) [45–49] for the k-terminal reliability estimation of both homogeneous and heterogeneous communication networks. In this framework, the dataset is constructed by generating network topologies and link reliability vectors, then computing their corresponding k-terminal reliability. Here, the exact contraction-deletion method is chosen, as opposed to the Monte Carlo simulation approach used in most previous research, to improve the precision and practicality of the estimation model. Next, a GCN-based estimation model is designed, starting with several graph convolution layers to extract the high-level structural features, followed by a concatenation layer to fuse the features into one topology representation vector, and ending with a Multilayer Perceptron (MLP) to learn the relationship between inputs and targets. Since GCN can be viewed as a generalization of CNN to non-Euclidean data structures [48], we believe it is better suited to the task of network reliability estimation, which is corroborated by the experiment results in later sections.

The main contribution of our proposed model is three-fold:

(1) Several network k-terminal reliability datasets are constructed via the exact contraction-deletion method. Compared to the approximate simulation method used in previous studies, our approach can provide more precise training data to the estimation model.

(2) Since the network topology is essentially non-Euclidean data, the GCN is used to extract the structural features. To the best of our knowledge, our proposed model is the first to utilize GCN in a network reliability estimation problem.

(3) Experiment results show that our proposed GCN model outperforms the existing estimation models, with an average improvement in Root Mean Squared Error (RMSE) of 59.60% on homogeneous datasets and 57.52% on heterogeneous datasets.

The remainder of this paper is organized as follows. Section 2 introduces some graph notation and the general pipeline of GCN; Section 3 describes the proposed framework for k-terminal network reliability in detail; Section 4 reports the experiment results of our proposed GCN model on several datasets, together with a comparative analysis against other neural network-based models that demonstrates the advantages of our proposed model; finally, Section 5 closes the paper with some concluding remarks.

#### 2. Graph Convolution Neural Network

Many real-world systems from various fields of research can be represented by graphs, including communication networks, social networks [50, 51], protein-protein interaction networks [52], and chemistry molecular graphs [53]. In other words, graphs can be viewed as a special kind of data structure that models the pairwise relationships (edges) between a set of objects (nodes).

In essence, graphs are non-Euclidean data structures. Thus, convolutional neural networks that have outstanding performances at the task of processing Euclidean data structures like images and videos, need to be generalized before they can be directly applied to graphs [48]. From this point of view, GCN can be seen as a generalization of CNN to non-Euclidean data structure, while maintaining the three advantages of CNN, namely, local connection, shared weights, and use of multiple layers. This section briefly introduces some preliminaries of GCN.

A graph is defined as $G = (V, E)$, where $V$ is a set of nodes and $E$ a set of links. The adjacency matrix of a graph with $n$ nodes is defined as an $n \times n$ matrix $A$ where $A_{ij} = 1$ if there exists a link from node $v_i$ to node $v_j$, and $A_{ij} = 0$ otherwise. A graph is said to be undirected if its adjacency matrix is symmetric, i.e., $A = A^{\top}$; otherwise it is directed. Since the focus of our work is communication networks where information flows both ways, in this paper we assume the graphs are undirected.

The general pipeline of the GCN model is shown in Figure 1. Firstly, for each graph $G$, the input layer processes three types of information: node features embedded as $d$-dimensional hidden features $h_i^{0}$, edge features embedded as $d$-dimensional hidden features $e_{ij}^{0}$, and the graph itself.

Then, each GCN layer computes $d$-dimensional representation features for each node by aggregating information passed from its neighbouring nodes and edges. Let $h_i^{\ell}$ and $e_{ij}^{\ell}$ be the hidden feature vectors associated with node $v_i$ and edge $(v_i, v_j)$ at GCN layer $\ell$, respectively. The hidden feature vector $h_i^{\ell+1}$ of node $v_i$ at the next layer is obtained by first aggregating the hidden feature vectors of the neighbouring nodes and edges of $v_i$, then applying a nonlinear transformation to the aggregated information and the node's own hidden feature vector at layer $\ell$. Similarly, the hidden feature vector of edge $(v_i, v_j)$ at layer $\ell + 1$ can be updated from the hidden features $e_{ij}^{\ell}$, $h_i^{\ell}$, and $h_j^{\ell}$. Generally speaking, the update of hidden features at layer $\ell$ is

$$h_i^{\ell+1} = f_v\Big(h_i^{\ell},\ \mathrm{AGG}_v\big(\{(h_j^{\ell}, e_{ij}^{\ell}) : v_j \in \mathcal{N}(v_i)\}\big)\Big), \qquad e_{ij}^{\ell+1} = f_e\Big(e_{ij}^{\ell},\ \mathrm{AGG}_e\big(h_i^{\ell}, h_j^{\ell}\big)\Big),$$

where $\mathcal{N}(v_i)$ denotes the neighbourhood of node $v_i$, $\mathrm{AGG}_v(\cdot)$ and $\mathrm{AGG}_e(\cdot)$ are the aggregation functions, and $f_v(\cdot)$ and $f_e(\cdot)$ are the transformation functions of node and edge, respectively. In each GCN layer, by aggregating the information of neighbouring nodes and edges, the updated hidden feature can capture the local structures of the graph. For instance, the central node in Figure 1 (coloured in green) learns from its immediate neighbouring nodes and edges (coloured in orange) at the first GCN layer. Then, at the second GCN layer, the central node can learn from nodes and edges 2 hops away from itself (coloured in blue), because at the first layer the central node's neighbours have also learned from their respective neighbours. In this way, by stacking $L$ GCN layers, the node representation can implicitly learn from its $L$-hop neighbourhood.
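The growth of the receptive field with depth can be checked on a toy example. The sketch below is a deliberately simplified assumption: it uses sum aggregation followed by a ReLU, with no trainable weights and no edge features, so it is not the GCN operator used in this paper. Each node starts with a one-hot feature, so a non-zero entry at position k means that information from node k has reached that node.

```python
def message_passing_layer(adjacency_list, hidden):
    """One round of sum aggregation: ReLU(own feature + sum of neighbour features)."""
    updated = {}
    for node, neighbours in adjacency_list.items():
        dim = len(hidden[node])
        aggregated = [sum(hidden[nb][k] for nb in neighbours) for k in range(dim)]
        updated[node] = [max(0.0, own + agg)
                         for own, agg in zip(hidden[node], aggregated)]
    return updated

# Hypothetical path graph 0 - 1 - 2 - 3 with one-hot initial features.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
h0 = {v: [1.0 if k == v else 0.0 for k in range(4)] for v in path}

h1 = message_passing_layer(path, h0)  # node 0 has seen node 1 only
h2 = message_passing_layer(path, h1)  # node 0 has now seen node 2 (2 hops)
h3 = message_passing_layer(path, h2)  # node 0 has now seen node 3 (3 hops)
```

Node 0 only receives a contribution from node 3 after the third round, which matches the hop-by-hop argument made for Figure 1.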

Next, the hidden features of nodes after the GCN layers are fused into one feature vector. The fusion procedure differs across task types. For classification or regression tasks at the node level, the hidden feature of each node can be taken directly as the feature vector. For classification or regression tasks at the edge or graph level, hidden features of multiple nodes need to be aggregated or concatenated to form the feature vector. Such a feature vector is then passed to an MLP to compute task-dependent outputs. Finally, the output of the task-specific MLP is fed to a loss function, and the parameters of the GCN model are trained end to end via backpropagation.

#### 3. Proposed Framework

As illustrated in Figure 2, our proposed framework for communication network reliability mainly contains the following procedures:

Step 1. Generate a set of network topologies.

Step 2. Generate a link reliability vector for each network topology.

Step 3. Compute the exact network reliability via the contraction-deletion method.

Step 4. Convert the network topologies and corresponding link reliability vectors into input features, with the computed network reliability from Step 3 as the target.

Step 5. Design the architecture of the GCN model.

Step 6. Split the dataset, then train and evaluate the GCN model.

In this section, the details of the proposed framework are explained in two subsections: the dataset generation process (Step 1 to Step 3), and the reliability estimation model based on GCN (Step 4 to Step 6).

##### 3.1. Dataset Generation

To train a machine learning model that can capture the highly complex relationship between network topology and reliability, it is important to obtain a proper dataset with a sufficient sample size. To this end, we first generate a set of network topologies. For a network with a node set of size $n$, an $n \times n$ adjacency matrix $A$ can represent the network topology, where $A_{ij} = 1$ indicates a link between node $v_i$ and node $v_j$, and $A_{ij} = 0$ otherwise. Since it is assumed that the communication networks are undirected and acyclic [36], only the entries in the upper triangle of the adjacency matrix are randomly generated, while the remaining entries are determined by setting $A_{ji} = A_{ij}$. In addition, a connectivity test is needed to check whether the nodes of interest in the specified set $K$ are connected to one another. If so, the network topology is added to the dataset; otherwise a new topology is generated. This procedure repeats until the required dataset size is reached.
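The generate-and-test loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not say how the upper-triangle entries are sampled, so the independent Bernoulli draw with parameter `link_prob`, the default values, and all function names are hypothetical.

```python
import random

def random_topology(n, link_prob, rng):
    """Sample the upper triangle of an n x n adjacency matrix and mirror it."""
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):          # upper triangle only, no self-loops
            if rng.random() < link_prob:
                adjacency[i][j] = 1
                adjacency[j][i] = 1        # mirror to keep the matrix symmetric
    return adjacency

def terminals_connected(adjacency, terminals):
    """Depth-first search from one terminal; all others must be reachable."""
    terminals = list(terminals)
    seen, stack = {terminals[0]}, [terminals[0]]
    while stack:
        u = stack.pop()
        for v, linked in enumerate(adjacency[u]):
            if linked and v not in seen:
                seen.add(v)
                stack.append(v)
    return all(t in seen for t in terminals)

def generate_topologies(n, terminals, size, link_prob=0.3, seed=0):
    """Keep sampling until `size` topologies pass the connectivity test."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < size:
        candidate = random_topology(n, link_prob, rng)
        if terminals_connected(candidate, terminals):
            samples.append(candidate)
    return samples

samples = generate_topologies(n=10, terminals=[0, 9], size=5)
```

Rejected candidates are simply discarded, so the accepted set is a sample from the chosen random-graph model conditioned on the terminals being connected.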

Next, for each network topology, a link reliability vector $p$ is generated. If a link is present in a network topology, a reliability value in the range (0, 1) is assigned to the corresponding entry in $p$. For homogeneous networks, the entries in $p$ are always equal to each other, while in heterogeneous networks the entries in $p$ may take different values.
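A possible way to assign link reliabilities for the two dataset types is sketched below. The candidate values in `levels`, the dictionary representation keyed by link, and the function name are assumptions made for illustration; the source only requires values in (0, 1).

```python
import random

def link_reliabilities(adjacency, homogeneous, rng, levels=(0.8, 0.85, 0.9, 0.95)):
    """Map each link (i, j) with i < j to a reliability value in (0, 1)."""
    n = len(adjacency)
    links = [(i, j) for i in range(n) for j in range(i + 1, n) if adjacency[i][j]]
    if homogeneous:
        shared = rng.choice(levels)        # one value shared by every link
        return {link: shared for link in links}
    return {link: rng.choice(levels) for link in links}  # independent per link

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
rng = random.Random(0)
homo = link_reliabilities(triangle, True, rng)
hetero = link_reliabilities(triangle, False, rng)
```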

With the network topology and link reliability vector, the k-terminal reliability can be computed. To improve the precision of the estimation model, instead of the Monte Carlo simulation method used in [33–37], in this paper we use an exact method to calculate the target reliability value. One of the most widely used exact methods for network k-terminal reliability is the contraction-deletion method [54–57]. It stems from a simple yet elegant observation that for an undirected network $G$, the reliability is given by

$$R(G) = p_e \, R(G \cdot e) + (1 - p_e) \, R(G - e),$$

where $e$ is a random edge in $G$, $p_e$ is the reliability of edge $e$, and $G \cdot e$ and $G - e$ are the new networks derived from $G$ by contracting and deleting edge $e$, respectively. In this way, the network topology can be recursively simplified until it is reduced to a singleton or a self-loop.
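As a small worked example (the triangle network and the notation are assumptions introduced here for illustration), consider three nodes $s$, $a$, $t$ with links $e_1 = (s, t)$, $e_2 = (s, a)$, $e_3 = (a, t)$, each with reliability $p$, and let the specified nodes be $s$ and $t$. Write $R(\cdot)$ for the 2-terminal reliability, $G \cdot e_1$ for the network with $e_1$ contracted, and $G - e_1$ for the network with $e_1$ deleted. Contracting $e_1$ merges $s$ and $t$ into one node, so the reliability of the contracted network is 1; deleting $e_1$ leaves the series path $s$, $a$, $t$ with reliability $p^2$.

```latex
\begin{aligned}
R(G) &= p \, R(G \cdot e_1) + (1 - p) \, R(G - e_1) \\
     &= p \cdot 1 + (1 - p) \, p^{2} \\
     &= p + p^{2} - p^{3}, \\
R(G)\big|_{p = 0.9} &= 0.9 + 0.81 - 0.729 = 0.981 .
\end{aligned}
```

The same value follows from the complementary argument that the direct link and the two-hop path must both fail: $1 - (1 - p)(1 - p^2) = 1 - 0.1 \times 0.19 = 0.981$.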

The pseudocode of the contraction-deletion method [55] is presented in Table 1. The contraction-deletion method requires the network topology, the link reliability vector, and a specified set of nodes to compute the k-terminal reliability. For simplicity, here we denote the graph by a triplet $G = (V, E, K)$, where $K$ is the specified set of nodes. To begin with, the algorithm randomly selects an edge $e = (u, v)$ from the edge set $E$ (line 4). Denote the graphs after contraction and deletion as $G_c$ and $G_d$, respectively (line 13 and line 16). $G_c$ is obtained by first replacing node $v$ with node $u$ in all the other edges incident to node $v$ (line 6 to line 9), then removing node $v$ from the set $V$ (line 10). If node $v$ is in $K$, then node $u$ also needs to be in $K$ in the contracted graph (line 11 to line 12). $G_d$ is obtained by directly deleting edge $e$ (line 15). Next, a connectivity test is performed to determine whether the nodes in $K$ are joined by a path of edges in $G_d$. If so, the network reliability is calculated as $R(G) = p_e R(G_c) + (1 - p_e) R(G_d)$ (line 17 to line 18); otherwise $R(G) = p_e R(G_c)$ (line 19 to line 20). This procedure repeats recursively until only one node remains in $K$, at which point the k-terminal reliability of the reduced network is naturally 1 (line 1 to line 2).
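The recursion can be sketched in Python as follows. This is an illustrative reading of the description above rather than a transcription of Table 1: the edge-list representation, the function names, the choice of the first edge instead of a random one, and the extra connectivity guard at the top of each call are all assumptions. Contraction may create parallel edges and self-loops, which the edge list handles naturally.

```python
def connected(edges, terminals):
    """True if all terminals are joined by paths over the given edges."""
    terminals = set(terminals)
    adj = {}
    for u, v, _ in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = next(iter(terminals))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return terminals <= seen

def k_terminal_reliability(edges, terminals):
    """Contraction-deletion on a list of (u, v, p) edges and a set of terminals."""
    terminals = frozenset(terminals)
    if len(terminals) <= 1:
        return 1.0                      # a single terminal is trivially connected
    edges = [(u, v, p) for u, v, p in edges if u != v]   # self-loops are irrelevant
    if not connected(edges, terminals):
        return 0.0                      # guard (assumption): also covers empty edge lists
    (u, v, p), rest = edges[0], edges[1:]
    # Contraction: merge v into u in every remaining edge and in the terminal set.
    contracted = [(u if a == v else a, u if b == v else b, q) for a, b, q in rest]
    merged_terminals = frozenset(u if t == v else t for t in terminals)
    r_contract = k_terminal_reliability(contracted, merged_terminals)
    # Deletion: only contributes if the terminals are still connected without the edge.
    if connected(rest, terminals):
        return p * r_contract + (1.0 - p) * k_terminal_reliability(rest, terminals)
    return p * r_contract

# Hypothetical triangle with link reliability 0.9 on every link.
triangle = [(0, 1, 0.9), (1, 2, 0.9), (0, 2, 0.9)]
r_two = k_terminal_reliability(triangle, {0, 2})
r_all = k_terminal_reliability(triangle, {0, 1, 2})
```

Each call branches at most twice and removes one edge, so the worst-case running time is still exponential in the number of links, consistent with the generation times reported later.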

##### 3.2. GCN Model

With the network reliability dataset generated via the contraction-deletion method, the next step in our proposed framework is to convert it into (input, target) pairs for the GCN model. As mentioned in Section 2, the input to the GCN model contains three parts: node features, edge features, and graphs. In our proposed framework, the dimensions of the node feature and edge feature are both set to 1. More specifically, the node feature is the degree centrality of each node, the edge feature is the reliability value of each edge, and the graph is the network topology. The target of the GCN model is the k-terminal network reliability obtained via the contraction-deletion method.
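The conversion step might look as follows. This is a minimal sketch with assumed details: degree centrality is normalised by $n - 1$ here (the source does not say whether the raw or normalised degree is used), each undirected link is listed in both directions, and the function and variable names are hypothetical.

```python
def to_gcn_inputs(adjacency, reliabilities):
    """Build 1-dimensional node and edge features from a topology.

    adjacency     : symmetric 0/1 matrix as a list of lists
    reliabilities : dict mapping (i, j) with i < j to the link reliability
    """
    n = len(adjacency)
    # Node feature: degree centrality, normalised by n - 1 (assumption).
    node_features = [[sum(row) / (n - 1)] for row in adjacency]
    # Graph: directed edge list, one entry per direction of each link.
    edge_index = [(i, j) for i in range(n) for j in range(n) if adjacency[i][j]]
    # Edge feature: reliability of the underlying undirected link.
    edge_features = [[reliabilities[(min(i, j), max(i, j))]] for i, j in edge_index]
    return node_features, edge_index, edge_features

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
rel = {(0, 1): 0.9, (1, 2): 0.85}
x, edge_index, edge_attr = to_gcn_inputs(path, rel)
```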

The proposed GCN model for network reliability estimation is shown in Figure 3. After the network information has been converted into input form, several GCN layers update the node features by passing messages among the nodes’ neighbours. To be more specific, the GCN operator in [58] is used in this paper, because it is easy to use and achieves state-of-the-art performance on many learning tasks. Given the node feature associated with each node and the edge feature associated with each edge at a given layer, a GCN layer of this type computes the updated node and edge feature vectors by applying trainable weight matrices followed by an activation function; the exact update rule follows [58].

Several GCN layers are stacked to learn the local structures of the network. Afterwards, the final node features are concatenated to form a graph-level representation vector, which is then fed to a fully-connected layer and an output layer to estimate the network reliability.
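The readout and prediction stage can be illustrated in plain Python. The layer sizes, the weights, the ReLU hidden activation, and the sigmoid output (used here so that the estimate lies in (0, 1)) are assumptions for illustration only; the source does not specify them.

```python
import math

def concat_readout(node_features):
    """Concatenate per-node feature vectors into one graph-level vector."""
    return [value for feature in node_features for value in feature]

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def mlp_head(x, w1, b1, w2, b2):
    """Hidden layer with ReLU, then a single sigmoid output (assumption)."""
    hidden = [max(0.0, z) for z in dense(x, w1, b1)]
    (logit,) = dense(hidden, w2, b2)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical final node features: 3 nodes, 2 dimensions each.
final_nodes = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
graph_vector = concat_readout(final_nodes)        # length 3 * 2 = 6
w1 = [[0.5] * 6, [-0.25] * 6]                     # 6 inputs -> 2 hidden units
b1 = [0.0, 0.1]
w2 = [[1.0, -1.0]]                                # 2 hidden units -> 1 output
b2 = [0.0]
estimate = mlp_head(graph_vector, w1, b1, w2, b2)
```

Note that concatenation fixes the length of the graph-level vector to the number of nodes times the feature dimension, so a model built this way is tied to one network size.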

#### 4. Experiment

To demonstrate the practicality of our proposed model, a case study of communication network reliability is carried out. A comparative analysis is also undertaken to demonstrate our proposed model’s advantages over existing approaches. All experiments are run on an Intel Core i7-10510U CPU @ 1.80 GHz processor with an NVIDIA GeForce MX250 GPU.

##### 4.1. Settings

As described in Section 3.1, several network reliability datasets are generated via the contraction-deletion method, the details of which are listed in Table 2. Based on previous studies [33–37], for networks with node set sizes of 10, 20, and 30, the dataset sizes are set to 3000, 5000, and 8000, respectively. Some sample network topologies are illustrated in Figure 4, Figure 5, and Figure 6.

[Figures 4, 5, and 6: sample network topologies. In each figure, panel (a) shows a homogeneous sample and panel (b) a heterogeneous sample.]

Based on link reliability, the datasets can be categorized into two types. For homogeneous datasets, the entries in the reliability vector may differ between samples but are always equal to each other within the same sample. For example, the reliability of every link in the sample network depicted in Figure 4(a) is 0.9; the same holds for the sample networks in Figure 5(a) and Figure 6(a). For heterogeneous datasets, the entries of the reliability vector of one sample may take various values, as shown in Figure 4(b), Figure 5(b), and Figure 6(b). In Figure 4(b), for instance, the reliability of one link is 0.85, whereas that of another is 0.9.

Since 2-terminal and all-terminal network reliability problems can be viewed as special cases of k-terminal reliability, our proposed framework can be applied to them as well. Therefore, the size of the specified node set $K$ in the generated datasets ranges from 2 to $n$, the total number of nodes. For the datasets in which $K$ does not contain every node, the nodes in $K$ are represented by orange squares in Figures 4, 5, and 6, while the other nodes are represented by green circles. Take Figure 4(a) for instance: in dataset 1, $K$ consists of two specified nodes, while in dataset 2 the network reliability is computed for all 10 nodes, i.e., $K$ is the entire node set.

To demonstrate the calculation process of the contraction-deletion method described in Section 3.1, the sample network in Figure 4(a) is taken as an illustrative example. Because the set $K$ of specified nodes contains only two nodes, the k-terminal reliability reduces to 2-terminal reliability, defined as the probability that a working communication path exists between these two nodes. As shown in Figure 7, one link $e$ is chosen first. If $e$ is contracted, one of its end nodes is deleted from the node set and the edge that was incident to the deleted node is reattached to the other end node; the resulting network topology is shown in Figure 7(b). If $e$ is deleted, it can be seen from Figure 7(c) that the two nodes in $K$ are no longer connected. Therefore, according to the contraction-deletion algorithm, the 2-terminal reliability can be calculated as $R(G) = p_e R(G_c)$, where $p_e$ is the reliability of link $e$ and $G_c$ denotes the network topology in Figure 7(b). This process repeats recursively, and the network topology is gradually simplified until only one node remains in the set $K$. For compactness of presentation, the rest of the calculation process is omitted.

With the network reliability datasets, the Root Mean Squared Error (RMSE) is selected as the loss function to measure the performance of the estimation models:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( R_i - \hat{R}_i \right)^2},$$

where $R_i$ is the network reliability of the dataset’s $i$-th sample obtained via the contraction-deletion method as described in Section 3.1, $\hat{R}_i$ is the estimation model’s prediction, and $N$ is the size of the dataset. To detect overfitting, the RMSE of the estimation model on both the training and testing sets is considered, denoted as “Train RMSE” and “Test RMSE” in the remainder of this paper.
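For reference, the RMSE metric can be computed with a few lines of Python. The function name and the sample values are hypothetical.

```python
import math

def rmse(targets, predictions):
    """Root mean squared error between exact and estimated reliabilities."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))

exact = [0.98, 0.95, 0.90]        # hypothetical contraction-deletion targets
estimated = [0.97, 0.96, 0.92]    # hypothetical model outputs
error = rmse(exact, estimated)
```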

Comparative experiments are undertaken for two ANN models, the CNN model, and our proposed model. To distinguish between the ANN models proposed in [33–36], for the rest of the paper they are referred to as “ANN1” and “ANN2”, respectively.

The same learning rate decay strategy is deployed for all estimation models. Training starts from a fixed initial learning rate. If the validation loss does not improve for 10 epochs, the learning rate is reduced by half. The training process is terminated if either of the following two conditions is met: the learning rate decays below a preset minimum, or the number of epochs reaches the maximum of 1000.
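The schedule and stopping rule can be expressed as a small helper class. The class name is hypothetical, and the initial and minimum learning rates used in the example (`1e-3` and `1e-5`) are placeholder assumptions, since the actual values are not recoverable from the text.

```python
class HalvingOnPlateau:
    """Halve the learning rate after `patience` epochs without improvement."""

    def __init__(self, initial_lr, min_lr, patience=10, max_epochs=1000):
        self.lr = initial_lr
        self.min_lr = min_lr
        self.patience = patience
        self.max_epochs = max_epochs
        self.best = float("inf")
        self.stale = 0      # epochs since the last improvement
        self.epoch = 0

    def step(self, val_loss):
        """Record one epoch; return True if training should continue."""
        self.epoch += 1
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= 0.5
                self.stale = 0
        return self.lr >= self.min_lr and self.epoch < self.max_epochs

# Placeholder values; a validation loss that never improves after epoch 1.
schedule = HalvingOnPlateau(initial_lr=1e-3, min_lr=1e-5)
while schedule.step(val_loss=1.0):
    pass
epochs_run = schedule.epoch
```

In a real training loop, `step` would be called once per epoch with the measured validation loss, and `schedule.lr` would be passed to the optimizer.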

To make a fair comparison with existing estimation models, the number of GCN layers and the in/out channel configuration of each layer are fixed in advance. The parameter configurations of the other estimation models are as reported in the literature. 10-fold cross-validation is used to train and evaluate the estimation models, and the mean values are reported as the experiment results.
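A 10-fold split of sample indices can be produced as sketched below. The shuffling seed and the interleaved fold assignment are assumptions; any partition into ten disjoint folds of near-equal size would serve the same purpose.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Example with the sample count reported for the 10-node datasets.
splits = list(k_fold_indices(3000, k=10))
```

The reported score is then the mean of the per-fold test metrics.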

##### 4.2. Result Analysis

Since some of the existing estimation models can only be applied to homogeneous networks, the experiment results on homogeneous datasets and heterogeneous datasets are reported separately.

###### 4.2.1. Homogeneous Network

The experiment results in Table 3 compare 2 ANN models, CNN model, and our proposed model on 6 homogeneous datasets (dataset 1 to 6), where the bold font indicates the optimal result and the italic indicates the suboptimal result.

It can be seen that our proposed model outperforms the three previous models on all of the homogeneous datasets. Compared to existing models, our proposed GCN model achieves an average improvement of 59.60% in RMSE. The least significant improvement is on dataset 1, where our GCN model reduces the RMSE by 26.44% compared to the best result among the other three models (ANN1), while the most noticeable improvement is on dataset 6, where the RMSE of our GCN model is 83.26% lower than the previous best result (ANN1).
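The percentages above read as relative reductions in RMSE. Assuming that interpretation, the computation is as follows; the function name and the numeric inputs are hypothetical and are not taken from Table 3.

```python
def relative_improvement(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0

# Hypothetical values: a baseline RMSE of 0.02 reduced to 0.01.
gain = relative_improvement(0.02, 0.01)
```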

To better visualize the estimation model performances, here we take the models trained in the first fold on all-terminal reliability datasets (dataset 2, 4, and 6) as showcases. All models are trained 5 times and the one with the lowest RMSE is selected. Their estimation results are compared on 50 sample networks, as illustrated in Figure 8, Figure 9, and Figure 10. Seeing that the contraction-deletion method is an exact method, the reliability calculated via contraction-deletion is also depicted as the actual reliability. It is easy to see that the deviation between the estimation results obtained via our model and the actual reliability values is the lowest among all estimation models.

Several factors may contribute to our proposed model’s advantage over previous estimation models. To begin with, the network topology is, by its nature, rich in spatially local patterns. Simply put, the network topology is composed of a set of nodes and a set of links that represent the pairwise relationships or interactions between these nodes; therefore the neighbouring nodes and edges are highly correlated. The estimation models based on ANN ignore this type of spatial locality, processing nodes close to each other in the same way as nodes far apart from one another. In contrast, the CNN model and our proposed GCN model can exploit this spatial locality by using certain types of convolution layers to extract local conjunctions of features [48]. By stacking several convolution layers, the lower-level local features can be composed into a high-dimensional graph-level representation feature.

Besides, as mentioned before, the network topology represented by the graph is intrinsically non-Euclidean. In the CNN model, the network topology is encoded and processed as a pixel image determined by its adjacency matrix. As a result, when the convolution layer of the CNN model extracts local structure patterns, the neighbouring nodes of node $v_i$ are defined as the nodes with the preceding and subsequent indices, i.e., $v_{i-1}$ and $v_{i+1}$, rather than the nodes with links connecting to $v_i$. This encoding scheme may be problematic because nodes with consecutive indices could be many hops away from each other in the actual network topology. For example, in Figure 4(a), one such pair of nodes is 4 hops apart and another is 3 hops apart. Moreover, if the node set is reindexed, the input image of the CNN model can change drastically. Since GCN is a generalization of CNN to non-Euclidean data structures, the representation feature learned via our proposed GCN model has a sounder rationale and is less sensitive to node reindexing than that of the CNN model, while maintaining the advantage over ANN models brought by the convolution operator.
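The sensitivity of an image-like encoding to node indexing can be seen on a four-node example. In this sketch (the path graph, the permutation, and the helper names are assumptions), relabelling the nodes changes the flattened adjacency matrix, whereas a neighbourhood-based quantity such as node degree simply moves with the node, so the multiset of degrees is unchanged.

```python
def permute(adjacency, perm):
    """Relabel node i as perm[i] and return the new adjacency matrix."""
    n = len(adjacency)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[perm[i]][perm[j]] = adjacency[i][j]
    return out

def flatten(adjacency):
    """Row-major flattening, as used when the matrix is treated as an image."""
    return [x for row in adjacency for x in row]

def degree_multiset(adjacency):
    """Sorted node degrees: unaffected by how the nodes are numbered."""
    return sorted(sum(row) for row in adjacency)

# Path graph 0 - 1 - 2 - 3 and the same graph with relabelled nodes.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
relabelled = permute(path, [2, 0, 3, 1])

image_changed = flatten(path) != flatten(relabelled)
degrees_kept = degree_multiset(path) == degree_multiset(relabelled)
```

In the relabelled matrix, the nodes numbered 0 and 1 are no longer adjacent even though the underlying graph is identical, which is the situation described above for consecutive indices.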

Although the parameter budget of our proposed GCN model is larger than that of most other models, it can be seen from Table 3 that the training time does not grow dramatically. Compared to the cumbersome computations of the exact methods, our estimation models are quite efficient. For instance, the total computation time of the contraction-deletion method for dataset 6 is 84911 seconds, an average of 10.61 seconds per 30-node network, whereas our proposed GCN model, once properly trained, can give a high-precision estimate immediately. In addition, it should be noted that the parameter budget of the CNN model grows nearly tenfold as the node set size grows from 10 to 20 and from 20 to 30, while the parameter budget of our proposed GCN model increases by only 54.57% and 35.30%, respectively. Thus, in terms of training parameter budget, our proposed model is more scalable to larger networks than the CNN model.
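The per-sample figure follows from the dataset size stated in Section 4.1 for 30-node networks (8000 samples), as the short calculation below shows.

```latex
\frac{84911\ \text{s}}{8000\ \text{samples}} = 10.613875\ \text{s} \approx 10.61\ \text{s per sample}
```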

###### 4.2.2. Heterogeneous Network

Comparative experiments are also undertaken for estimation models on 6 heterogeneous datasets (datasets 7 to 12), the results of which are listed in Table 4. Because the encoding methods of the CNN model and the ANN2 model can only be applied to homogeneous networks, only the ANN1 model is investigated and compared to our proposed GCN model here.

Like before, our proposed GCN model consistently outperforms the ANN1 model across all 6 heterogeneous datasets, with an average of 57.52% improvement on RMSE. The best case is dataset 12, where our proposed model reduces the RMSE by 71.77% compared to the ANN1 model, while the least significant case is dataset 7, where our proposed GCN model’s RMSE is 37.81% lower than that of the ANN1 model.

For heterogeneous datasets, we also take the models trained in the first fold on the all-terminal reliability datasets (datasets 8, 10, and 12) as illustrative examples. As before, the best-trained model of the 5 runs is selected for both the ANN1 model and our proposed GCN model. Figure 11, Figure 12, and Figure 13 compare their estimation results to the actual reliability values on 50 sample networks. Compared to the ANN1 model, our proposed GCN model captures the relationship between network topologies and reliability values more effectively and accurately.

#### 5. Concluding Remarks

This paper proposes a general framework for the k-terminal reliability estimation problem of communication networks. The analytic contraction-deletion method is used to compute the k-terminal reliability of sample networks. Once a dataset with sufficient sample size is obtained, an estimation model based on GCN can be trained. Because the graph convolution layers are capable of extracting high-level features from the network topology, the features learned with the GCN model are more effective in representing the network structure, which in turn results in higher precision. Comparative experiments on communication networks with node set sizes of 10, 20, and 30 are carried out. The results show that on the 6 homogeneous datasets, our proposed GCN model reduces the RMSE by an average of 59.60%, while on the 6 heterogeneous datasets an average decrease of 57.52% in RMSE is observed when comparing our proposed model to previous studies.

However, there are still several improvements that could be applied to our model, which will be the focus of our future work. To begin with, it has been recognized that the performance of a neural network-based estimation model depends heavily on the amount of data available. To this end, several topology reduction techniques, such as degree-1 or degree-2 node reduction, can be applied in combination with the contraction-deletion method to accelerate the dataset generation process. Besides, in our proposed model, the dimensions of the node and edge features are set to 1. For edge features, if other information about the communication links, for instance bandwidth or transmission delay, can also be incorporated, the model may be applied to the task of mission reliability estimation. For node features, other attributes besides degree centrality can also be taken into consideration, such as betweenness centrality, eigenvector centrality, or clustering coefficient, which may further improve the precision of the estimation model. In addition, the development of GCN is a well-studied topic in the field of machine learning. The application of more sophisticated and advanced GCN models [59–62] to the estimation of network reliability is also a direction of our future work.

#### Data Availability

The communication network reliability data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.