Abstract

Social networks (socialnets) have become an important component of real life, raising many security and safety research issues. Recently, owing to the graph structure of socialnets, adversarial attacks on node classification have been exposed, and automatic attack methods such as the fast gradient attack (FGA) and NETTACK have been developed for per-node attacks, which can be applied to multinode attacks in a sequential way. However, because the perturbation influence between different per-node attacks is overlooked, the sequential method does not guarantee a global attack success rate for all target nodes under a fixed perturbation budget. In this paper, we propose a parallel adversarial attack framework on node classification. We redesign the loss function and objective function for nonconstraint and constraint perturbations, respectively. By constructing intersection and supplement mechanisms for perturbations, we integrate node filtering-based P-FGA and P-NETTACK into a unified framework, finally realizing parallel adversarial attacks. Experiments on the politician socialnet dataset Polblogs, with detailed analysis, demonstrate the effectiveness of our approach.

1. Introduction

With the development of the Internet and IT technology, an emerging cyberspace [1], which refers to the global network of interdependent information technology infrastructures, telecommunications networks, and computer processing systems, now covers most aspects of our daily life. In such a space, as a highly important and detailed representation, various emerging social networks (e.g., Facebook, Twitter, WeChat, and TikTok) are greatly advancing the new revolution of network interconnection and interdependence, as well as social relations and information propagation [2, 3].

A social network is called a socialnet for short. Due to their popularity, billions of socialnet users share their personal data and connect with friends and family through various devices and applications. Since a socialnet can be abstracted to a simple kind of graph with node and edge features, many researchers have contributed their efforts to studying socialnets and corresponding services based on graph- and workflow-related approaches [4-8]. One of the most frequently applied tasks on graph data is node classification, whose goal is to predict the labels of the remaining nodes given a single large graph and the class labels of a few nodes [9]. For example, we can utilize node classification to predict the political labels of politicians, such as Liberal and Conservative, according to their socialnet interactions.

For node classification in recent years, the graph convolutional network (GCN) [10, 11], a kind of graph neural network (GNN) [12, 13] based on deep learning, has shown great potential. Unfortunately, the GCN also opens a new door for cyber attacks. Adversarial attacks against GCN have been discovered: through a few edge perturbations (addition or deletion), which are hard to notice, misclassification can be induced [14]. Furthermore, automatic attack methods have been developed to explore effective perturbations, including constraint and nonconstraint perturbations. A constraint perturbation refers to an edge perturbation satisfying specific requirements, such as preserving the node degree distribution of the graph [15]; a nonconstraint perturbation is a free perturbation. Accordingly, the fast gradient attack (FGA) [16] and NETTACK [17] are typical nonconstraint and constraint methods, respectively. These methods enable per-node attacks, as well as multinode attacks performed sequentially. However, because the perturbation influence between different per-node attacks is overlooked, the sequential method does not guarantee a global attack success rate for all target nodes under a fixed perturbation budget. Figure 1 shows the differences between sequential and parallel attacks in a motivating example. For nodes No. 1, 2, and 3, the attack goal is to change their class labels by changing the graph structure with perturbations, including edge addition and edge deletion. Because the attacks on nodes No. 2 and No. 3 remove edges added during the attack on node No. 1, the sequential attack wastes edge perturbations and causes the No. 1 node attack to fail, yielding a global attack success rate of 2/3, while the parallel attack considers the perturbation influence and achieves a higher attack success rate with a lower budget.

In this work, we are the first to perform a multinode attack in a parallel way by integrating two methods, P-FGA and P-NETTACK, in a unified attack framework. Based on the nonconstraint FGA, we redesign a new loss function in P-FGA, which employs the CW-loss [18] in place of the CE-loss. For P-NETTACK, we utilize the maximum sum of surrogate losses as the new objective function to support the parallel attack. Moreover, we apply a node filtering mechanism to P-NETTACK and P-FGA, which filters successfully attacked nodes out of the target node set. After extracting common perturbations, we also provide a random supplement of perturbations to fill the budget.

We experiment on the politician socialnet dataset Polblogs [19] of 1222 nodes and 16714 edges, showing the effectiveness of our approach. We find that our approach can achieve a high attack success rate (ASR) of 71.5% at the lowest perturbation budget of D/5 (where D is the sum of the degrees of all target nodes), which is over 15% higher than that of NETTACK or FGA, while still keeping a satisfactory test statistic of 0.005. The filtering mechanism can greatly improve the ASR, with nearly 20% average increment. We summarize our contributions as follows: (1) We give the very first attempt to propose a multinode parallel adversarial attack framework on node classification in socialnets of graph structure, based on considering the perturbation influence between per-node attacks. (2) Node filtering-based nonconstraint P-FGA and node filtering-based constraint P-NETTACK are proposed, and we integrate them into a unified multinode parallel attack framework by constructing intersection and supplement mechanisms of perturbation. (3) We evaluate our approach empirically on the real politician socialnet dataset Polblogs. Based on parallel attacks on this graph of 1222 nodes and 16714 edges, we verify the effectiveness of our approach compared to sequential attacks in terms of attack strength and attack stealthiness.

The rest of the paper is structured as follows: Section 2 introduces the preliminaries and problem definition. Section 3 proposes the multinode parallel attack framework. Section 4 describes our experimental setup on the politician socialnet dataset Polblogs, and Section 5 reports the evaluations. Section 6 discusses the related works. Finally, Section 7 concludes the paper.

2. Preliminaries and Problem Definition

2.1. Graph Structure of Socialnet

In a real socialnet, one person can interact with others through operations such as commenting and reposting. Such interaction can be quantified and qualified under different measurements. For simplicity, we use an undirected, unweighted edge to denote the existence of an interaction, constructing the graph structure of the socialnet (see Figure 2). Moreover, we simply assume that one node has only one classification label, and we do not consider multiple free-label user profiles or granularity-based hierarchical user profiles [20] in the attack scenarios of this paper. Thus, for a socialnet graph, we have a triple $G = (V, C, A)$ including node set $V = \{v_1, \dots, v_N\}$, label set $C = \{c_1, \dots, c_N\}$, and adjacency matrix $A$, which is defined as follows:

$$A_{ij} = \begin{cases} 1, & \text{if there is an edge between } v_i \text{ and } v_j, \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$

2.2. Graph Convolutional Network

As a kind of GNN, the GCN is an extremely powerful neural network architecture for deep learning on graphs, producing useful feature representations of nodes in networks. Given a graph $G = (V, C, A)$, we can partially delete some nodes' labels ($c_i = \text{null}$) to obtain a new $G' = (V, C', A)$. The goal of node classification is to learn a function $f$, which maps each node $v_i \in V$ to one class $c \in C$.

We use a two-layer GCN to approximate the function $f$:

$$Z = f(A, X) = \operatorname{softmax}\left(\hat{A}\, \sigma\left(\hat{A} X W^{(1)}\right) W^{(2)}\right), \tag{2}$$

where $\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, $\tilde{A} = A + I_N$ is the adjacency matrix of the input graph with self-loops added via the identity matrix $I_N$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and $X$ is a matrix of node feature vectors. For a graph whose nodes do not have feature attributes, $X$ can be set to an identity matrix $I_N$. $W^{(1)}$ and $W^{(2)}$ are the trainable weight matrices of the first and second layers, respectively, and $\sigma$ is a ReLU activation function. For semisupervised node classification, the optimal parameters $W^{(1)}$ and $W^{(2)}$ are learnt by minimizing the cross-entropy loss over all labeled examples:

$$L = -\sum_{v_i \in V_L} \ln Z_{i, c_i}, \tag{3}$$

where $V_L$ is the set of nodes with labels, namely, the training set, $c_i$ is the given true label of node $v_i$, and $Z_{i,c}$ is the probability of assigning class $c$ to node $v_i$.
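The two-layer forward pass of equation (2) can be sketched with NumPy; this is a minimal illustration with random weights and identity features, not the trained model used in our experiments, and all names below are our own.

```python
import numpy as np

def gcn_forward(A, X, W1, W2):
    """Two-layer GCN: softmax(A_hat @ ReLU(A_hat @ X @ W1) @ W2)."""
    N = A.shape[0]
    A_tilde = A + np.eye(N)                        # add self-loops (A + I_N)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt      # symmetric normalization
    H = np.maximum(A_hat @ X @ W1, 0.0)            # ReLU hidden layer
    logits = A_hat @ H @ W2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)        # row-wise softmax Z

# toy 4-node graph; identity features for a graph without node attributes
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
Z = gcn_forward(A, X, rng.normal(size=(4, 8)), rng.normal(size=(8, 2)))
```

Each row of `Z` is a probability distribution over the two classes for the corresponding node.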

2.3. Problem Definition

Given the attack target set $T \subseteq V$ and perturbation budget $\Delta$, the multinode attack on GCN can be regarded as the following optimization problem:

$$\max_{A'} \sum_{t \in T} \mathbb{1}\left[\arg\max_{c} Z'_{t,c} \ne c_t\right], \tag{4}$$

$$\text{s.t.} \quad \sum_{i<j} \left|A'_{ij} - A_{ij}\right| \le \Delta, \tag{5}$$

$$\forall (i, j):\ A'_{ij} \ne A_{ij} \Rightarrow i \in T \ \text{or}\ j \in T, \tag{6}$$

where $Z' = f(A', X)$, and $\mathbb{1}[x] = 1$ if the condition $x$ holds, else $\mathbb{1}[x] = 0$.

Formula (4) is the objective function, aiming to find the optimal adjacency matrix $A'$: when the number of misclassified target nodes is maximal, the multinode attack is most successful. Formulas (5) and (6) are the constraints that must be satisfied. Formula (5) requires the number of edge perturbations to be no more than $\Delta$ (a predefined constant). Formula (6) requires that any edge perturbation be linked to a target attack node.
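The objective and its two constraints can be made concrete with a short check routine. This is a sketch: the `predict` callable stands in for the trained GCN, and the degree-parity dummy classifier and all names are purely illustrative.

```python
import numpy as np

def attack_objective(A_pert, A, labels, targets, budget, predict):
    """Count misclassified targets under formulas (4)-(6).
    Returns -1 if the budget (5) or locality (6) constraint is violated."""
    diff = np.argwhere(np.triu(A_pert != A, k=1))   # perturbed edges (i < j)
    if len(diff) > budget:                          # formula (5): budget
        return -1
    tset = set(targets)
    if any(i not in tset and j not in tset for i, j in diff):
        return -1                                   # formula (6): locality
    pred = predict(A_pert)
    return sum(int(pred[t] != labels[t]) for t in targets)  # formula (4)

# toy check: labels equal the clean-graph prediction of a dummy classifier
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=int)
A_pert = A.copy()
A_pert[0, 2] = A_pert[2, 0] = 1          # one edge flip touching target 0
dummy = lambda M: M.sum(axis=1) % 2      # classify by degree parity
labels = dummy(A)                        # ground truth on the clean graph
score = attack_objective(A_pert, A, labels, [0], budget=1, predict=dummy)
```

Here one in-budget edge addition flips the dummy prediction of target node 0, so the objective counts one successful misclassification.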

3. Multinode Parallel Attack Framework

Our multinode parallel attack framework is shown in Figure 3. Firstly, given an original graph $G$ as defined in Section 2.1, we train a GCN for the node classification task and obtain predicted labels for all nodes, recording them as the ground-truth testing result. Then, given a target node set $T$, we utilize P-FGA and P-NETTACK to perturb the original graph, attacking the target nodes in $T$. In each iteration of the nonconstraint P-FGA method, based on the GCN gradient information of the adjacency matrix $A$, we select the pair of nodes with the maximum absolute gradient value to perform a perturbation (edge deletion or edge addition), generating a new adversarial graph via the generator. In each iteration of the constraint P-NETTACK, to keep the perturbations unnoticeable and preserve the important structural characteristics, we first compute the candidate perturbation set that maintains a similar node degree distribution after perturbation execution. Then, according to our redesigned objective function, we greedily select from the candidate perturbation set the optimal perturbation with the highest objective score, generating a new adversarial graph via the generator.

In the filtering mechanism, after each perturbation of P-FGA or P-NETTACK, the new predicted labels are compared with the testing result to determine the attack effect. For those nodes that are successfully attacked, the mechanism filters them out of the target node set to form a new target node set $T'$, ignoring those nodes in the next gradient/objective function computation and perturbation selection. This process is repeated until the perturbation budget $\Delta$ is reached. $P_{\mathrm{FGA}}$ and $P_{\mathrm{NET}}$ are the perturbation sets produced by P-FGA and P-NETTACK, respectively. To integrate $P_{\mathrm{FGA}}$ and $P_{\mathrm{NET}}$ and generate unified perturbations, we provide an intersection mechanism to extract their common perturbations and a perturbation supplement mechanism to fill the perturbation budget $\Delta$. Finally, the integrated perturbation set is used to realize effective multinode parallel adversarial attacks.

3.1. P-FGA Method

In our P-FGA, to adapt to the multinode attack, we redesign a new loss function for the attack target set $T$, which employs the CW-loss [18] in place of the CE-loss and takes all target nodes into consideration:

$$L_T = \sum_{t \in T} \left( \max_{c \ne c_t} Z_{t,c} - Z_{t, c_t} \right). \tag{7}$$

Following the gradient-based idea of the original FGA, based on the new loss function for the multinode attack, we first calculate the partial derivatives of $L_T$ with respect to the elements of the adjacency matrix $A$ and obtain the gradient matrix $g$, whose elements are calculated by

$$g_{ij} = \frac{\partial L_T}{\partial A_{ij}}. \tag{8}$$

Considering that the adjacency matrix is symmetric, its gradient matrix should also be symmetric; thus, we have

$$\hat{g}_{ij} = \frac{g_{ij} + g_{ji}}{2},$$

where the $\hat{g}_{ij}$ form the symmetric gradient matrix $\hat{g}$. A bigger value of the multinode loss function corresponds to worse prediction results for the target nodes in $T$, and edge perturbations along the direction of the gradient make the loss increase fastest locally. That is, for a positive gradient $\hat{g}_{ij} > 0$, adding the edge between the pair of nodes $(v_i, v_j)$ can increase the loss. Similarly, for a negative gradient $\hat{g}_{ij} < 0$, deleting the edge also increases the loss.

However, since the adjacency matrix is binary and discrete ($A_{ij} \in \{0, 1\}$), not all edges can be perturbed along the direction of the gradient. For example, for a pair of nodes with a positive/negative gradient ($\hat{g}_{ij} > 0$ / $\hat{g}_{ij} < 0$) that are meanwhile connected/disconnected ($A_{ij} = 1$ / $A_{ij} = 0$), we cannot further add/delete the edge along the direction of the gradient. Thus, we design

$$\hat{g}'_{ij} = (1 - 2A_{ij})\,\hat{g}_{ij}. \tag{9}$$

For a positive gradient $\hat{g}_{ij} > 0$, when $A_{ij} = 0$, $\hat{g}'_{ij}$ is positive; when $A_{ij} = 1$, $\hat{g}'_{ij}$ is negative. Similarly, for a negative gradient $\hat{g}_{ij} < 0$, when $A_{ij} = 1$, $\hat{g}'_{ij}$ is positive; when $A_{ij} = 0$, $\hat{g}'_{ij}$ is negative. Only a positive $\hat{g}'_{ij}$ enables the addition/deletion of the edge along the direction of the gradient. Then, for edge addition or deletion, we pick the optimal edge $e_{ij}$, with $i \in T$ or $j \in T$, having the maximum $\hat{g}'_{ij}$, and the adjacency matrix is updated to $A'$ by flipping the corresponding entries ($A_{ij}$ and $A_{ji}$) to the other binary value:

$$A'_{ij} = A'_{ji} = 1 - A_{ij}. \tag{10}$$

The pseudocode for P-FGA is given in Algorithm 1.

Input: graph $G = (V, C, A)$, attack target set $T$, perturbation budget $\Delta$
Output: perturbation set $P_{\mathrm{FGA}}$
(1) Train the GCN model $f$ on the original graph $G$
(2) Initialize $A^{(0)} = A$
(3) Initialize perturbation set $P_{\mathrm{FGA}} = \emptyset$
(4)for $t = 1$ to $\Delta$ do
(5)  //GCN-based Gradient Computation
(6)  Calculate the multi-node target loss function $L_T$ by equation (7)
(7)  Construct $\hat{g}'$ based on the $L_T$:
   $g_{ij} = \partial L_T / \partial A^{(t-1)}_{ij}$, $\hat{g}_{ij} = (g_{ij} + g_{ji})/2$,
   $\hat{g}'_{ij} = (1 - 2A^{(t-1)}_{ij})\,\hat{g}_{ij}$
(8)  //Perturbation Selection
(9)  Select $e_{ij}$ where $i \in T$ or $j \in T$, having the maximum $\hat{g}'_{ij}$
(10)  //Perturbation Execution
(11)  Obtain the adjacency matrix $A^{(t)}$ by flipping $A^{(t-1)}_{ij}$ and $A^{(t-1)}_{ji}$
(12)  Generate a new adversarial graph $G^{(t)}$
(13)  Add $e_{ij}$ to $P_{\mathrm{FGA}}$
(14)end
(15)return $P_{\mathrm{FGA}}$
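A single P-FGA perturbation step (equations (8)-(10)) can be sketched as follows. This is a minimal sketch under assumptions: the loss gradient is supplied from outside (e.g., by an autodiff framework), the target-set restriction of the selection step is omitted for brevity, and all names are ours, not from the paper's implementation.

```python
import numpy as np

def pfga_step(A, grad):
    """One P-FGA step: symmetrize the gradient (per the symmetry argument),
    mask infeasible directions with (1 - 2*A_ij) as in equation (9), then
    flip the best feasible edge as in equation (10)."""
    g = (grad + grad.T) / 2.0                  # symmetric gradient g_hat
    g_eff = (1.0 - 2.0 * A) * g                # only add-on-positive /
                                               # delete-on-negative stay positive
    np.fill_diagonal(g_eff, -np.inf)           # forbid self-loops
    i, j = np.unravel_index(np.argmax(g_eff), g_eff.shape)
    A_new = A.copy()
    A_new[i, j] = A_new[j, i] = 1 - A_new[i, j]   # flip the chosen edge
    return A_new, (i, j)

# toy case: a negative gradient on an existing edge -> deleting it raises the loss
A = np.array([[0.0, 1.0], [1.0, 0.0]])
grad = np.array([[0.0, -0.5], [-0.5, 0.0]])
A_new, edge = pfga_step(A, grad)
```

Because the gradient on the existing edge is negative, the masked score `(1 - 2*1) * (-0.5)` is positive, so the step deletes that edge.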
3.2. P-NETTACK Method

In the constraint P-NETTACK, we use a test statistic $\Lambda$ to determine whether our generated adversarial graph $G'$ and the original graph $G$ have similar node degree distributions under the power-law distribution $p(d) \propto d^{-\alpha}$, in which $p(d)$ denotes the probability of a certain degree $d$, and $\alpha$ refers to the scaling parameter. The test statistic can be calculated based on the following formulas:

$$\alpha_G \approx 1 + |D_G| \left[\sum_{d_i \in D_G} \log \frac{d_i}{d_{\min} - \frac{1}{2}}\right]^{-1}, \tag{11}$$

$$l(D_G) = |D_G| \log \alpha_G + |D_G|\, \alpha_G \log d_{\min} - (\alpha_G + 1) \sum_{d_i \in D_G} \log d_i, \tag{12}$$

$$\Lambda(G, G') = -2\, l(D_{\mathrm{comb}}) + 2\left[l(D_G) + l(D_{G'})\right]. \tag{13}$$

In equation (11), $d_{\min}$ is the minimum degree that a node must have to be considered in the power-law test, and $D_G$ is the multiset containing the list of node degrees, where $d_i$ is the degree of node $v_i$ in $G$ [21]. Equation (12) evaluates the log-likelihood of the sample $D_G$ [22]. Then, we obtain the final test statistic $\Lambda$ by equation (13), where $D_{\mathrm{comb}}$ combines the degree samples of $G$ and $G'$. Similar to NETTACK, we only accept adversarial graphs whose degree distribution fulfils $\Lambda < \tau$ and thus obtain the candidate perturbation set $CP$. In our P-NETTACK, the edge perturbations in $CP$ must be linked to an attack target node.

To efficiently select the optimal perturbation from $CP$, NETTACK utilizes a linear surrogate model to approximate the nonlinear GCN model by removing the activation function $\sigma$. The surrogate output $Z'$ is calculated as follows:

$$Z' = \operatorname{softmax}\left(\hat{A}^2 X W\right), \tag{14}$$

where $W = W^{(1)} W^{(2)}$.

In our P-NETTACK, given an attack target set $T$, we utilize the sum of the single surrogate losses for each $t \in T$ as the new surrogate loss to support the multinode attack:

$$L_s(A, X; W, T) = \sum_{t \in T} \left[\max_{c \ne c_t} \left(\hat{A}^2 X W\right)_{t,c} - \left(\hat{A}^2 X W\right)_{t, c_t}\right], \tag{15}$$

where $(\hat{A}^2 X W)_{t,c}$ is the value of class $c$ given to the node $t$ by the surrogate model. The multinode scoring function that evaluates the multinode surrogate loss obtained after adding/deleting an edge $e$ is defined as

$$s(e; G, T) = L_s(A', X; W, T), \tag{16}$$

where $A$ is updated to $A'$ by flipping $e$. Following the greedy approximate scheme in NETTACK, during each iteration, we select from the candidate perturbation set $CP$ the optimal perturbation with the highest value of the multinode scoring function and execute it. The above processes, including candidate perturbation computation, optimal perturbation determination, and perturbation execution, are repeated until the perturbation budget $\Delta$ is reached. The pseudocode for P-NETTACK is given in Algorithm 2.

Input: graph $G = (V, C, A)$, attack target set $T$, perturbation budget $\Delta$
Output: perturbation set $P_{\mathrm{NET}}$
(1) Train the surrogate model on the original graph $G$ to obtain $W$
(2) Initialize $A^{(0)} = A$
(3) Initialize perturbation set $P_{\mathrm{NET}} = \emptyset$
(4)for $t = 1$ to $\Delta$ do
(5)  Construct the valid candidate perturbation set $CP$:
   every $e \in CP$ is linked to a node in $T$,
   and its resulting graph fulfils $\Lambda < \tau$ (equation (13))
(6)  Select $e^{\ast}$ with the maximum multi-node scoring function value in $CP$:
    $e^{\ast} = \arg\max_{e \in CP} s(e; G^{(t-1)}, T)$
(7)  Obtain the adjacency matrix $A^{(t)}$ by executing $e^{\ast}$
(8)  Generate a new adversarial graph $G^{(t)}$
(9)  Add $e^{\ast}$ to $P_{\mathrm{NET}}$
(10)end
(11)return $P_{\mathrm{NET}}$
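The greedy selection step of P-NETTACK (equations (15) and (16)) can be sketched as follows. This is a simplified sketch: it scores candidate edge flips under the linear surrogate but omits the degree-distribution constraint, and all names and the toy data are illustrative assumptions.

```python
import numpy as np

def surrogate_loss(A_hat, X, W, targets, labels):
    """Multinode surrogate loss (equation (15)): sum over targets of the
    best wrong-class score minus the true-class score under A_hat^2 @ X @ W."""
    S = A_hat @ A_hat @ X @ W
    return sum(np.delete(S[t], labels[t]).max() - S[t, labels[t]]
               for t in targets)

def greedy_pick(A, X, W, targets, labels, candidates):
    """Score each candidate edge flip (equation (16)) and return the best,
    re-normalizing the adjacency matrix after every flip."""
    def norm(M):
        Mt = M + np.eye(len(M))
        d_inv = np.diag(1.0 / np.sqrt(Mt.sum(axis=1)))
        return d_inv @ Mt @ d_inv
    best, best_score = None, -np.inf
    for (i, j) in candidates:
        A2 = A.copy()
        A2[i, j] = A2[j, i] = 1 - A2[i, j]       # flip candidate edge
        score = surrogate_loss(norm(A2), X, W, targets, labels)
        if score > best_score:
            best, best_score = (i, j), score
    return best, best_score

rng = np.random.default_rng(1)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X, W = np.eye(3), rng.normal(size=(3, 2))
edge, score = greedy_pick(A, X, W, [0], [0, 1, 0], [(0, 2), (0, 1)])
```

Both candidate flips touch target node 0, mirroring the locality constraint on $CP$; the flip with the larger surrogate loss is chosen.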
3.3. Filtering Mechanism

In this part, we propose a filtering mechanism that filters successfully attacked target nodes out of the target node set. After each perturbation, the filtering mechanism yields a filtered attack target set $T'$, which is used in the next iteration. If no nodes remain in $T'$, all target nodes have been attacked successfully, and we reset the attack target set to the original attack target set $T$. The pseudocode for the filtering mechanism is given in Algorithm 3.

Input: perturbed graph $G'$, attack target set $T$, node classification model $f$
Output: filtered attack target set $T'$
(1) Initialize $T' = T$
(2)for each $v \in T'$ do
(3)  Predict the label of $v$ in $G'$ by $f$
(4)  if the label of $v$ in the ground truth is not equal to the prediction result then
(5)  Remove $v$ from $T'$ //filtering
(6)end
(7)if $T' = \emptyset$ then
(8)   $T' = T$ //reset
(9)return $T'$
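Algorithm 3 reduces to a short list comprehension; the sketch below uses plain dictionaries for the predicted and ground-truth labels, with illustrative names of our own.

```python
def filter_targets(pred, ground_truth, targets):
    """Filtering mechanism (Algorithm 3): drop targets whose prediction on
    the perturbed graph already differs from the clean-graph ground truth;
    reset to the full target set once every target has been flipped."""
    remaining = [t for t in targets if pred[t] == ground_truth[t]]
    return remaining if remaining else list(targets)   # reset on full success

# node 2 is already misclassified, so it is filtered out of the target set
pred         = {1: 'A', 2: 'B', 3: 'A'}
ground_truth = {1: 'A', 2: 'A', 3: 'A'}
T_new = filter_targets(pred, ground_truth, [1, 2, 3])
```

Subsequent gradient and scoring computations then ignore node 2, which is what saves budget compared with the sequential attack.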
3.4. Intersection and Supplement Mechanism

In this section, we construct the intersection and supplement mechanisms of perturbations. Given the perturbation sets $P_{\mathrm{FGA}}$ and $P_{\mathrm{NET}}$ under a fixed perturbation budget $\Delta$, we first utilize the intersection mechanism to extract their common perturbations $P_\cap = P_{\mathrm{FGA}} \cap P_{\mathrm{NET}}$. In general, the number of common perturbations is less than the perturbation budget $\Delta$; thus, we provide a perturbation supplement mechanism to fill the budget.

We denote $P_F = P_{\mathrm{FGA}} \setminus P_{\mathrm{NET}}$ as the set consisting of the perturbations in $P_{\mathrm{FGA}}$ but not in $P_{\mathrm{NET}}$. Similarly, $P_N = P_{\mathrm{NET}} \setminus P_{\mathrm{FGA}}$ contains the perturbations in $P_{\mathrm{NET}}$ but not in $P_{\mathrm{FGA}}$. Let $n = \Delta - |P_\cap|$ be the difference between $\Delta$ and the number of common perturbations. Besides, we use a supplementary factor $k \in [0, 1]$ to control the proportion of supplementary perturbations taken from $P_N$. Specifically, we randomly select $k \cdot n$ and $(1-k) \cdot n$ perturbations from $P_N$ and $P_F$, respectively, and add them to $P_\cap$, forming the final unified perturbation set. The pseudocode for the intersection and supplement mechanism of perturbations is given in Algorithm 4.

Input: $P_{\mathrm{FGA}}$, $P_{\mathrm{NET}}$, supplementary factor $k$, perturbation budget $\Delta$
Output: unified perturbation set $P$
(1) Execute the intersection of $P_{\mathrm{FGA}}$ and $P_{\mathrm{NET}}$ to obtain $P = P_\cap$
(2)if $|P| < \Delta$ then
(3)  Obtain $P_F = P_{\mathrm{FGA}} \setminus P_{\mathrm{NET}}$
(4)  Obtain $P_N = P_{\mathrm{NET}} \setminus P_{\mathrm{FGA}}$
(5)  Calculate $n = \Delta - |P|$
(6)  Randomly add $k \cdot n$ perturbations from $P_N$ to $P$
(7)  Randomly add $(1-k) \cdot n$ perturbations from $P_F$ to $P$
(8)return $P$
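Algorithm 4 can be sketched as below. The rounding of $k \cdot n$ and the fixed seed are our own assumptions for a deterministic illustration; perturbations are represented as node-pair tuples.

```python
import random

def unify(p_fga, p_net, k, budget, seed=0):
    """Intersection-and-supplement (Algorithm 4 sketch): keep the common
    perturbations, then top up with a k-fraction drawn from the
    P-NETTACK-only set and the remainder from the P-FGA-only set.
    k=1 degenerates to P-NETTACK's choices, k=0 to P-FGA's."""
    rng = random.Random(seed)
    common = sorted(set(p_fga) & set(p_net))
    n = budget - len(common)
    if n <= 0:
        return common[:budget]
    only_net = sorted(set(p_net) - set(p_fga))
    only_fga = sorted(set(p_fga) - set(p_net))
    take_net = min(round(k * n), len(only_net))
    picks = (rng.sample(only_net, take_net)
             + rng.sample(only_fga, min(n - take_net, len(only_fga))))
    return common + picks

P = unify([(0, 1), (2, 3), (4, 5)], [(2, 3), (6, 7), (8, 9)], k=0.5, budget=4)
```

Here (2, 3) is the single common perturbation, and the remaining three slots of the budget are split between the two method-specific sets according to k.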

4. Experiments

4.1. Dataset and Environment

We use the well-known politician socialnet Polblogs [19] as our experimental dataset to evaluate our methods. The basic statistics are summarized in Table 1, and only the largest connected component is considered. We randomly choose 20% of the nodes in the dataset as labeled nodes for training. The testing set consists of the remaining unlabeled nodes.

We also give our experimental environment configuration in Table 2.

4.2. Target Parameters and Baselines

Our GCN attack target is constructed based on the reference implementation on GitHub (https://github.com/tkipf/gcn). We train all models for a maximum of 200 epochs using Adam [23] with a learning rate of 0.01. We initialize weights using the initialization described by Glorot and Bengio [24] and accordingly (row-)normalize the input feature vectors.

We compare our proposed attack method with comprehensive state-of-the-art adversarial attack methods, including FGA and NETTACK. We use the baseline codes provided by their authors.
(i) FGA [16] extracts the gradient of pairwise nodes based on the adversarial network and then selects the pair of nodes with the maximum absolute link gradient to realize the attack and update the adversarial network.
(ii) NETTACK [17] designs adversarial attacks based on a static surrogate model and greedily selects the optimal perturbation while preserving the key structural features of the graph.
(iii) Random attack randomly perturbs the edges related to target nodes.

5. Evaluations

5.1. Evaluation Metric
5.1.1. Attack Success Rate (ASR)

ASR is the ratio of the number of successfully attacked nodes to the total number of target nodes, which can be calculated as follows:

$$\mathrm{ASR} = \frac{N_s}{|T|}, \tag{17}$$

where $N_s$ denotes the number of successfully attacked nodes and $T$ is the attack target set.

5.1.2. Average Attack Speed (AAS)

AAS refers to the average running time of each attack, and it can be calculated as follows:

$$\mathrm{AAS} = \frac{t_{\mathrm{total}}}{\Delta}, \tag{18}$$

where $t_{\mathrm{total}}$ denotes the total attack time on the target set $T$, and $\Delta$ is the perturbation budget.
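The two metrics above are simple ratios; the sketch below illustrates them with made-up counts (the numbers are illustrative, not the paper's raw measurements).

```python
def asr(num_success, num_targets):
    """Attack success rate: successfully attacked nodes / total targets."""
    return num_success / num_targets

def aas(total_time_s, budget):
    """Average attack speed: total attack time divided by the number of
    perturbations (the budget)."""
    return total_time_s / budget

rate = asr(143, 200)    # e.g. 143 of 200 target nodes misclassified
speed = aas(55.85, 5)   # e.g. 55.85 s spent over 5 perturbations
```

With these illustrative counts, `rate` is 0.715 and `speed` is 11.17 s per perturbation.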

5.1.3. Test Statistic

The test statistic $\Lambda$ is used to evaluate attack stealthiness (see equation (13)); it measures the structural difference between the original graph and the adversarial graph. A smaller $\Lambda$ means that the degree distribution of the adversarial graph is more similar to the original graph's, and thus the perturbations are more unnoticeable.

5.2. ASR Analysis

In our experiments, each attack target set $T$ consists of five target nodes, all of which are from the test set and correctly classified in the original graph. We divide the perturbation budgets into five levels according to the sum of degrees $D$ of all target nodes in the attack target set $T$, i.e., $\Delta \in \{D/5, 2D/5, 3D/5, 4D/5, D\}$.

As we can see from Table 3, for each $\Delta$, we compare the ASR among P*, P-NETTACK (k = 1), P-FGA (k = 0), NETTACK, FGA, and Random Attack, in which P* is the best value of our unified approach. From Algorithm 4, we know that if k = 1, our unified method simplifies to P-NETTACK, and if k = 0, it simplifies to P-FGA. P* has the highest ASR values of 0.715, 0.880, 1, and 1 at increasing budget levels. At the quite low budget $\Delta = D/5$, the ASR of P* is over 15% higher than that of NETTACK or FGA. P-NETTACK (k = 1) and P-FGA (k = 0) have extremely close values for all budgets $\Delta$. Figure 4 visualizes the comparison in Table 3.

In Table 4, we can see that, at the three lowest budget levels, our approach achieves the highest ASR values of 0.715 (k = 0.5), 0.880 (k = 0.7), and 0.987 (k = 0.8), respectively. At higher budgets, the ASR reaches 1 for many k settings, e.g., when k = 0.1, 0.2, 0.3, 0.4, 0.5, 0.7. Figure 5 shows the detailed ASR variation along with the k increment.

5.3. Test Statistics Analysis

As we can see from Table 5, P-NETTACK (k = 1) has the lowest $\Lambda$ values of 0.003, 0.005, 0.005, and 0.004 at increasing budget levels. Although P-NETTACK (k = 1) and NETTACK have the same constraint mechanism, the $\Lambda$ values of P-NETTACK (k = 1) are always lower than those of NETTACK. For P-FGA (k = 0) and FGA, which do not enforce the constraint, the $\Lambda$ values are far higher and keep increasing with the increment of $\Delta$. Figure 6 visualizes the comparison in Table 5.

In Table 6, we can see that, for all $\Delta$, the test statistic keeps decreasing as k increases, indicating better stealthiness. Figure 7 clearly shows the $\Lambda$ variation along with the k increment.

5.4. AAS Analysis

As we can see from Table 7, P-NETTACK is the most time-consuming adversarial attack method, with an average of 11.17 s per attack. Since the candidate perturbation set of P-NETTACK is larger than that of NETTACK, the AAS of P-NETTACK is much higher than that of NETTACK. In contrast, P-FGA and FGA have extremely close AAS values, 0.17 s and 0.14 s, respectively.

5.5. Filtering Mechanism Analysis

In Table 8, we can see that, for all $\Delta$, the filtering mechanism can greatly improve the ASR, with nearly 20% average increment. Moreover, for P-FGA, the ASR values with filtering at a given budget even exceed those of P-FGA without filtering at a larger budget. Thus, the filtering mechanism plays a quite important role for both P-NETTACK and P-FGA.

6. Related Works

6.1. Politician Socialnet Analysis

In the last few years, social media has become an important political communication channel, attracting many studies. Adamic and Glance [19] analyzed the political blogosphere over the two months preceding the 2004 US Presidential Election, measuring the degree of interaction between liberal and conservative blogs and revealing many interesting differences between the two communities, such as linking patterns and discussion topics. Caton et al. [25] presented a Social Observatory focused on the public Facebook profiles of 187 German politicians from five federal parties, observing how they interacted with constituents, measuring the sentiment difference between the politicians and their followers, and analyzing the online speech patterns of different parties. Stieglitz and Dang-Xuan [26] proposed a social media analytics framework in the political context, aiming to continuously collect, store, monitor, analyze, and summarize politically relevant user-generated content from different social media to gain a deeper insight into political discourse in social media.

However, few studies focus on the security analysis of politician socialnets, including politician label classification, from the perspective of adversarial graph attack. In comparison, we study the security issues of politician socialnets based on their graph structure, targeting a GCN model for politician label classification. Interestingly, the politician socialnet is highly vulnerable, and the attack cost is quite low: it suffices to delete a few existing interactions or add a few new ones. As an important communication bridge between politicians and citizens, the security of politician socialnets should be highly valued.

6.2. Adversarial Attack on Graphs

Recently, some studies have investigated adversarial attacks on neural networks for graph structure. Zügner et al. [17] first revealed the existence of adversarial attacks against GCN in the node classification task, slightly modifying the graph structure or node attributes to lead to misclassification of a target node. Dai et al. [27] studied test-time nontargeted adversarial attacks on both the node classification task and the graph classification [28] task based on reinforcement learning. In addition to the white-box attack scenario, they also extended their attack method to practical black-box and restricted black-box attack scenarios. Zhang et al. [29] systematically investigated the vulnerability of knowledge graph embedding for the first time. By adding or deleting facts in the knowledge graph, they destroyed the relation prediction model based on representative knowledge graph embedding methods including TransE [30] and RESCAL [31], which is also the first investigation of adversarial attacks on heterogeneous graphs. Chen et al. [16] explored adversarial attacks on both the node classification task and the community detection task [32] based on GCN gradient information.

However, most works on adversarial attacks for node classification only focus on the per-node attack, aiming to achieve misclassification for a single target node. Although, with those per-node attack methods, a multinode attack can be performed sequentially, the perturbation influence between different per-node attacks is overlooked. In comparison, our parallel attack method, which considers all target nodes and the perturbation influence at the same time, is better suited for multinode attacks. In addition, as the first parallel attack on graph structure, our work can inspire parallel adversarial attacks on other tasks, such as the parallel adversarial attack on the prediction of multiple links.

In addition to the benefits mentioned above, the main drawback of our method is that it is time-consuming, especially P-NETTACK (see Table 7), because, at each iteration, more candidate perturbations are taken into computation compared with the sequential per-node attack. One solution is to develop more computationally efficient test statistic and scoring functions; another effective way is to propose a perturbation filtering mechanism that reduces the size of the multinode candidate perturbation set. In addition, our method does not consider the constraints of attributed graphs [33], such as the attribute-based node similarity constraint [34] and the attribute co-occurrence constraint [17]. Parallel multinode adversarial attacks on attributed graphs and Heterogeneous Information Networks (HINs) [35] still need further exploration.

7. Conclusions

In this paper, we propose a multinode parallel adversarial attack framework on node classification in socialnets of graph structure, based on considering the perturbation influence between per-node attacks. By redesigning the loss function and objective function for nonconstraint and constraint perturbations, respectively, and constructing intersection and supplement mechanisms of perturbation, we integrate the nonconstraint P-FGA and the constraint P-NETTACK into a unified attack framework. Based on the politician socialnet Polblogs of 1222 nodes and 16714 edges, we evaluate the attack success rate, test statistic, and average attack speed of our approach. Our approach shows a high attack success rate of 71.5% at the lowest perturbation budget of D/5, while keeping a satisfactory test statistic of 0.005.

This work serves as a first step toward security analysis of multinode parallel adversarial attacks in politician socialnets. It is expected to inspire a series of follow-up studies, including but not limited to (1) adversarial attacks on the prediction of multiple links and (2) more concrete defense design and implementation.

Data Availability

The dataset of Polblogs can be obtained from http://networkrepository.com/polblogs.php.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (61972025, 61802389, 61672092, U1811264, and 61966009), the National Key R&D Program of China (2020YFB1005604 and 2020YFB2103800), Fundamental Research Funds for the Central Universities of China (2018JBZ103 and 2019RC008), and Guangxi Key Laboratory of Trusted Software (KX201902).