Abstract

As one of the next-generation network architectures, Named Data Networking (NDN) performs well for content distribution. However, it is vulnerable to a new type of denial-of-service (DoS) attack, the Interest flooding attack (IFA), one of the most serious threats to NDN. Attackers request nonexistent content to occupy the Pending Interest Table (PIT), which degrades network performance. Because this attack is both highly damaging and well concealed, detecting and throttling it is urgent. This paper proposes a detection mechanism based on Long Short-Term Memory (LSTM) with an attention mechanism, which weights the steps of the input sequence differently. Once an IFA is detected, the Hellinger distance is used to recognize malicious Interest prefixes. The simulation results show that the proposed scheme resists IFA effectively compared with state-of-the-art schemes.

1. Introduction

The purpose of the traditional TCP/IP network architecture is end-to-end data transmission, which cannot meet today's diversified needs. Researchers have therefore begun to study new network architectures. Information-Centric Networking (ICN) [1] aims to build a new content-centric future network architecture, transforming the current host-centric communication mode into a content-centric one. Typical representative ICN projects include the Network of Information (NetInf) [2], the Publish/Subscribe Internet Routing Paradigm and Publish/Subscribe Internet Technology (PSIRP/PURSUIT) [3], the Data-Oriented Network Architecture (DONA) [4], Content-Centric Networking (CCN) [5], and Named Data Networking (NDN) [6]. The most representative ICN architecture is NDN, proposed by Lixia Zhang of UCLA (University of California, Los Angeles) and Van Jacobson of Xerox PARC (Palo Alto Research Center) in 2010. The architecture of NDN is shown in Figure 1.

In an NDN network, there are two types of packets: Interest packets and Data packets [6]. Users send Interest packets to request content, and the returned content is carried in Data packets. Each NDN router maintains three data structures: the Content Store (CS), the Pending Interest Table (PIT), and the Forwarding Information Base (FIB) [6]. NDN implements routing and forwarding via these three data structures:
(i) FIB: stores the interface information pointing to the specified content; Interest packets are forwarded according to the FIB.
(ii) PIT: records unsatisfied Interest packets and their incoming interfaces, aggregates duplicate Interest packets, and lets Data packets be returned along the reverse path according to the recorded interface information.
(iii) CS: caches received Data packets to realize in-network caching and reduce the delay for users to obtain data.

The NDN forwarding process of Interest packet and Data packet is shown in Figure 2.

When an NDN router receives an Interest packet, it first checks whether the CS has matching data. If so, the router returns the Data packet. Otherwise, the router checks whether the PIT has a matching entry. If one exists, the router adds the incoming interface of the Interest packet to the entry. Otherwise, the router forwards the Interest packet based on the FIB. When receiving a Data packet, the router first checks whether the PIT has a matching entry. If one exists, the router returns the Data packet based on the interface information in the PIT and caches the Data packet. Otherwise, the router drops the Data packet.
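The forwarding pipeline above can be sketched as follows. This is a minimal illustrative model, not the NFD implementation; the class and method names, the dict-based tables, and the one-component prefix matching are all simplifying assumptions.

```python
class NdnRouter:
    """Toy model of the CS / PIT / FIB lookup order described above."""

    def __init__(self, fib):
        self.cs = {}     # Content Store: name -> cached data
        self.pit = {}    # PIT: name -> set of incoming faces
        self.fib = fib   # FIB: name prefix -> outgoing face

    def on_interest(self, name, in_face):
        """Process an incoming Interest; return the action taken."""
        if name in self.cs:                   # 1. CS hit: reply from cache
            return ("data", self.cs[name])
        if name in self.pit:                  # 2. PIT hit: aggregate interface
            self.pit[name].add(in_face)
            return ("aggregated", None)
        out_face = self.fib.get(self._prefix(name))
        if out_face is None:                  # no route: drop
            return ("dropped", None)
        self.pit[name] = {in_face}            # 3. new PIT entry, forward via FIB
        return ("forwarded", out_face)

    def on_data(self, name, data):
        """Process an incoming Data packet."""
        faces = self.pit.pop(name, None)
        if faces is None:                     # no PIT entry: unsolicited, drop
            return ("dropped", set())
        self.cs[name] = data                  # cache, then return downstream
        return ("returned", faces)

    @staticmethod
    def _prefix(name):
        # Illustrative longest-prefix stand-in: keep the first name component.
        return "/".join(name.split("/")[:2])  # "/video/a" -> "/video"
```

A short walk-through: a first Interest is forwarded, a duplicate is aggregated in the PIT, the returning Data satisfies both faces, and a later Interest is answered from the CS.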

Denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks are rampant in the traditional TCP/IP architecture [8]. NDN can mitigate the impact of the DDoS attacks seen in TCP/IP. However, researchers have discovered a new type of DDoS attack against NDN, the IFA [8]. As shown in Figure 3, the attacker forges a large number of fake Interest packets to consume the memory resources of routers, which degrades network performance.

The IFA is both highly damaging and well concealed, and researchers have tried various defense mechanisms, mainly based on machine learning and statistical methods. Because of the characteristics of network traffic, it is difficult to accurately identify an attack from a single time interval, which results in low detection accuracy. This paper exploits past data through a sliding window and proposes an attention-based Long Short-Term Memory (LSTM) [9] model for IFA detection. Once an IFA is detected, the Hellinger distance [10] is used to identify the malicious prefixes.

The contributions of this paper are summarized as follows:
(1) This paper uses an LSTM model with an attention mechanism to detect IFA, exploiting the past data sequence and weighting its steps differently.
(2) This paper proposes a Hellinger distance-based mechanism for identifying malicious Interest prefixes.
(3) The simulation results show that the proposed scheme can detect IFA effectively.

The rest of the paper is organized as follows: Section 2 reviews related work. Section 3 presents the detection and mitigation mechanisms in detail. Section 4 evaluates the proposed mechanism and compares it with state-of-the-art mechanisms. Finally, Section 5 concludes the paper.

2. Related Work

Various works have been proposed on detecting and mitigating the IFA. Some approaches use machine learning to detect IFA. In paper [11], a linear SVM and an SVM with a Gaussian radial basis kernel function were used to detect IFA; the approach consisted of two phases, a training phase and a test phase. In paper [12], the Isolation Forest was used to calculate an anomaly score for each Interest prefix at the end of each fixed time interval to detect abnormal Interest packet prefixes. In paper [13], deep reinforcement learning was used to detect IFA. In paper [14], naïve Bayes (NB), the J48 decision tree, a multilayer perceptron with backpropagation (BP), and a radial basis function (RBF) network were used to detect IFA. In paper [15], the authors used a multilayer perceptron (MLP) with backpropagation (BP), an RBF network with particle swarm optimization (PSO), JAYA and teaching-learning-based optimization (TLBO), a linear support vector machine (SVM), and fine k-nearest neighbours (KNN) to detect the attack. In paper [16], the authors used an association rule algorithm to find correlations between features and a decision tree algorithm to detect the attack.

Some approaches use mathematical models to detect IFA. In paper [17], every NDN router computed the Gini impurity over the Interest names it observed to detect IFA. In paper [18], the Theil index was used to detect IFA: the Interest packets were divided into groups, and the Theil entropy evaluated the intragroup and intergroup differences of the Interest name distribution. In paper [19], two traffic features were used to establish confidence intervals, respectively, to detect IFA. In paper [20], the authors used the mean and variance of packet hop counts to distinguish legitimate users from malicious users. In paper [21], the authors used a hash-based security label to identify malicious prefixes. In paper [22], the authors used wavelet analysis to detect IFA. In paper [23], the routers used active queue management (AQM) to defend against IFA. In paper [24], each edge router used a token-based router monitoring policy (TRM) to mitigate the IFA by controlling the data requestors. The detection methods used in the related work are shown in Table 1. The main drawback of existing IFA detection methods is that they count traffic information over a fixed time interval, which ignores the temporal relationship of traffic.

3. Detection Mechanism Based on Attention Mechanism with LSTM

This section gives an overview of the proposed defense mechanism and then presents the detection mechanism and the mitigation mechanism in detail.

3.1. Overview

The defense mechanism mainly consists of five parts, the data collection module, the data preprocessing module, the detection module, the response module, and the mitigation module, as shown in Figure 4.

In the data collection module, the traffic data is collected and then input to the preprocessing module, where the traffic characteristics are extracted. These characteristics are used to detect IFA in the detection module. Once an IFA is detected, the response module starts identifying the malicious prefixes. Finally, the mitigation module uses the malicious prefixes to limit the malicious Interest packets.

3.2. Long Short-Term Memory

Deep learning is popular and used in many applications. The recurrent neural network (RNN) [27] is a type of deep learning model that can be used for anomaly detection. However, RNNs suffer from the vanishing gradient problem [28]. Long Short-Term Memory (LSTM) [9] is an improved RNN that solves this problem. The LSTM structure is shown in Figure 5.

It mainly includes three gates, the input gate, forget gate, and output gate, which are used to update the LSTM cell as follows [9]:

$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$h_t = o_t \odot \tanh(C_t)$

where $W$ is the weight matrix, $b$ is the bias, $h_{t-1}$ is the hidden state at time step $t-1$, and $x_t$ is the input at time step $t$.

3.3. Attention Mechanism

The attention mechanism is inspired by human attention behaviour and has been successfully applied in deep learning.

In paper [29], the attention mechanism was proposed. Given an input $X = (x_1, x_2, \ldots, x_T)$, where $T$ is the length of the input, $x_t \in \mathbb{R}^d$, and $d$ is the number of dimensions in each time step, the calculation of the attention mechanism is divided into two steps: first, calculate the attention probability of each input step; then, compute the weighted average of the input information according to the attention probabilities.
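The two steps can be sketched in plain Python. The dot-product scoring against a query vector is an illustrative choice (the scoring function is not specified above); the softmax then gives the attention probabilities, and the context vector is the weighted average.

```python
import math

def attention(xs, query):
    """Two-step attention: xs is a list of T d-dim vectors, query a d-dim vector.

    Step 1: softmax over per-step scores -> attention probabilities.
    Step 2: probability-weighted average of the inputs -> context vector.
    """
    # Illustrative scoring: dot product of each step with the query vector.
    scores = [sum(x_i * q_i for x_i, q_i in zip(x, query)) for x in xs]
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]          # step 1: attention probabilities
    d = len(xs[0])
    context = [sum(p * x[k] for p, x in zip(probs, xs)) for k in range(d)]
    return probs, context                  # step 2: weighted average
```

With a query close to the first input, almost all probability mass lands on that step, so the context vector is nearly the first input.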

3.4. Detection Mechanism

This section presents the detection mechanism in detail. First, some used notations are listed and some features are defined. The notations used are listed in Table 2.

Definition 1. (Router PIT Utilization Size). It denotes the number of entries in the PIT during one time slice.

Definition 2. (Router Interest Satisfaction Ratio). It denotes the ratio of the number of Data packets received to the number of Interest packets received in one time slice.

Definition 3. (Router Interest Request Frequency). It denotes the number of Interest packets received in one time slice.

Definition 4. (Router Data Reply Frequency). It denotes the number of Data packets replied in one time slice.

The feature calculation is shown in Algorithm 1.

Input:
t ▷ The time slice size
Output:
f_req ▷ The request frequency
f_rep ▷ The reply frequency
s ▷ The satisfaction ratio
(1) procedure IncomingInterest(slice)
(2)  f_req ← f_req + 1
(3) end procedure
(4) procedure IncomingData(slice)
(5)  f_rep ← f_rep + 1
(6) end procedure
(7) s ← f_rep / f_req
(8) return s
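The per-slice counters of Algorithm 1 can be sketched as a small class. The class and attribute names are illustrative; the three returned features correspond to Definitions 2-4 above (the satisfaction ratio defaults to 1.0 for an empty slice, an assumption to avoid division by zero).

```python
class SliceStats:
    """Per-time-slice traffic counters for one router interface."""

    def __init__(self):
        self.n_interest = 0   # Router Interest Request Frequency (Definition 3)
        self.n_data = 0       # Router Data Reply Frequency (Definition 4)

    def incoming_interest(self):
        """Called once for every Interest packet received in this slice."""
        self.n_interest += 1

    def incoming_data(self):
        """Called once for every Data packet replied in this slice."""
        self.n_data += 1

    def features(self):
        """Return (request frequency, reply frequency, satisfaction ratio)."""
        ratio = self.n_data / self.n_interest if self.n_interest else 1.0
        return self.n_interest, self.n_data, ratio
```

At the end of each time slice, `features()` is read off and the counters are reset for the next slice.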
The detection mechanism detects IFA through a sliding window, as shown in Figure 6.
Network traffic is formally defined as a time series $X = \{x_1, x_2, \ldots, x_n\}$, which consists of $n$ time steps; $x_i$ represents the $i$-th time step. For each sliding window, which consists of $w$ time steps, the detection model is used to classify the window as legitimate or malicious.
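The sliding-window segmentation can be expressed as a short generator (the function name is illustrative):

```python
def sliding_windows(series, w):
    """Yield every consecutive length-w window over the per-slice feature series."""
    for i in range(len(series) - w + 1):
        yield series[i:i + w]
```

Each yielded window is one input sequence for the detection model, so consecutive windows overlap in all but one time step.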
Figure 7 shows the LSTM with attention mechanism for IFA detection. The attention mechanism can improve the performance of the LSTM by discriminatively utilizing the hidden state information of each step [30]. Therefore, this paper uses a traditional LSTM with an attention mechanism to detect IFA. The hidden states of each step are multiplied by attention weights.
In the LSTM layer, the input of each step is mapped to a hidden state:

$h_t = \mathrm{LSTM}(h_{t-1}, x_t)$

where $h_t$ is the hidden state at time step $t$ and $x_t$ is the input at time step $t$.
In the attention layer, the hidden state of each step is input to a subsequent attention layer, which takes the following form [31]:

$e_t = F(h_t; \theta), \quad \alpha_t = \frac{\exp(e_t)}{\sum_{j=1}^{w} \exp(e_j)}, \quad c = \sum_{t=1}^{w} \alpha_t h_t$

where $\alpha_t$ is the attention weight for each time step and $F(\cdot; \theta)$ is a fully connected layer with ReLU activation and parameters $\theta$.
The illustration of attention mechanism is shown in Figure 8.
In the output layer, the attention layer result is input to a fully connected layer with sigmoid activation to obtain the final result. The detection mechanism is shown in Algorithm 2.
Input:
t ▷ The time slice size
w ▷ The sliding window size
δ ▷ Detection threshold
Output:
Detection result
(1) Compute the metrics during time slice t
(2) for each consecutive sequence of w time steps do
(3)  feed the sequence to the detection model
(4)  ŷ ← model output
(5)  if ŷ < δ then
(6)   return legitimate
(7)  else
(8)   return malicious
(9)  end if
(10) end for
The algorithm works as mentioned in the following steps:
Step (1): count the traffic information in each time slice using Algorithm 1.
Step (2): when the sliding window is full, feed it to the detection model and obtain the output.
Step (3): if the detection result is legitimate, advance the sliding window and return to Step (2).
Step (4): if the detection result is malicious, trigger the malicious prefix identification mechanism.
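A minimal PyTorch sketch of the Figure 7 detector, assuming PyTorch (the framework used in Section 4.2). The hidden size and the one-unit ReLU scoring layer for $F$ are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn as nn

class AttnLSTMDetector(nn.Module):
    """LSTM over a length-w feature window, attention over its hidden states,
    and a sigmoid output layer giving the probability the window is malicious."""

    def __init__(self, n_features=3, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        # F(h_t): fully connected layer with ReLU, scoring each time step.
        self.score = nn.Sequential(nn.Linear(hidden, 1), nn.ReLU())
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                    # x: (batch, w, n_features)
        h, _ = self.lstm(x)                  # hidden state at every step
        alpha = torch.softmax(self.score(h), dim=1)   # attention weights
        context = (alpha * h).sum(dim=1)     # weighted average of hidden states
        return torch.sigmoid(self.out(context)).squeeze(-1)
```

A window whose output exceeds the detection threshold would be labeled malicious; trained with binary cross entropy as described in Section 4.3.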

3.5. Response Mechanism

This paper recognizes the malicious Interest prefixes based on the Hellinger distance [10], which measures the deviation between two probability distributions independently of parameters:

$H(P, Q) = \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{n} \left(\sqrt{p_i} - \sqrt{q_i}\right)^2}$

where $P$ and $Q$ are two probability distributions, $P = (p_1, \ldots, p_n)$ and $Q = (q_1, \ldots, q_n)$ are $n$-tuples with $\sum_{i} p_i = 1$ and $\sum_{i} q_i = 1$, and $0 \le H(P, Q) \le 1$.
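The distance itself is a one-liner; here $P$ and $Q$ would be the per-prefix Interest distributions during and before the detected attack.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p, q: sequences of probabilities over the same support, each summing to 1.
    Returns a value in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    return math.sqrt(0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                               for pi, qi in zip(p, q)))
```

Identical distributions give 0; distributions with disjoint support give the maximum value 1.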

The malicious prefix recognition process is shown in Algorithm 3.

Input:
Interest prefix distribution when IFA is detected: P
Interest prefix distribution before IFA is detected: Q
Interest prefix set: S
Output:
Malicious prefix set
(1) malicious prefix set ← ∅
(2) for each prefix in S do
(3)  take the probabilities of the prefix in P and Q
(4)  calculate the Hellinger distance contribution of the prefix
(5)  if the contribution exceeds the threshold then
(6)   add the prefix to the malicious prefix set
(7)  end if
(8) end for
(9) return malicious prefix set
3.6. Mitigation Mechanism

When malicious prefixes are recognized, the router sends a notification packet that includes the malicious prefixes to the downstream routers, as shown in Figure 9. When a downstream router receives the notification packet, it extracts the malicious prefixes and limits their sending rate.

4. Performance Evaluation

To evaluate the performance of the proposed scheme, this paper conducts a set of simulations in ndnSIM [32] and compares the proposed scheme with a state-of-the-art defense scheme. The simulation parameters are shown in Table 3.

This paper considers the tree topology shown in Figure 10. The tree topology, one of the topologies most severely affected by the IFA, is widely used in evaluating IFA detection mechanisms.

In the tree topology, the nodes denote NDN routers, legitimate users, the data provider, and malicious users. The red lines denote connections between malicious users and NDN routers, the green lines denote connections between legitimate users and NDN routers, the black lines denote connections between NDN routers, and the blue lines denote connections between the data provider and NDN routers.

In the tree topology, there are 9 legitimate users and 7 malicious users. The simulation lasts 800 s. The legitimate users issue Interests following the Zipf-Mandelbrot distribution [33], and the malicious users issue Interests following a uniform distribution. In the Zipf-Mandelbrot distribution, the content item with the $k$-th rank in the content popularity ranking list is requested with probability

$p(k) = \frac{c}{(k + q)^s}, \quad c = \left(\sum_{i=1}^{N} \frac{1}{(i + q)^s}\right)^{-1}$

where $N$ is the size of the popularity list and $s$ and $q$ are parameters.
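For illustration, content ranks can be sampled from a Zipf-Mandelbrot popularity law $p(k) \propto 1/(k+q)^s$ as follows; the particular values of $N$, $s$, and $q$ used here are arbitrary, not the paper's simulation parameters.

```python
import random

def zipf_mandelbrot_weights(N, s, q):
    """Normalized request probabilities p(1..N) with p(k) proportional to 1/(k+q)^s."""
    raw = [1.0 / (k + q) ** s for k in range(1, N + 1)]
    z = sum(raw)                       # normalization constant c^{-1}
    return [w / z for w in raw]

def sample_rank(weights, rng=random):
    """Draw one content rank (1-based) according to the popularity weights."""
    return rng.choices(range(1, len(weights) + 1), weights=weights, k=1)[0]
```

Lower ranks (more popular content) receive strictly larger probability, which is what distinguishes legitimate request traffic from the attackers' uniform requests over nonexistent names.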

4.1. Performance Metrics

The performance of the detection mechanism is evaluated with the confusion matrix, as shown in Figure 11, where TP is the number of abnormal traffic samples classified as abnormal, TN is the number of normal samples classified as normal, FP is the number of normal samples classified as abnormal, and FN is the number of abnormal samples classified as normal. This paper compares the detection mechanism with SVM and LSTM using the following metrics:
(i) Interest satisfaction ratio: the ratio between the number of Data packets received and the number of Interest packets sent.
(ii) PIT size: the number of entries in the PIT.
(iii) Accuracy: the overall performance of the model, calculated as
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
(iv) Recall: the proportion of attack samples correctly identified as attacks, calculated as
$\mathrm{Recall} = \frac{TP}{TP + FN}$
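The two classification metrics follow directly from the confusion-matrix counts:

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples (normal and abnormal) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Fraction of abnormal (attack) samples that were classified as abnormal."""
    return tp / (tp + fn)
```

For example, with TP = 40, TN = 50, FP = 5, FN = 5, the accuracy is 0.9 and the recall is 40/45.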

4.2. Hyperparameter Tuning

The detection model architectures are built using PyTorch in Python on a machine with 32 GB of RAM. This paper trains the detection model for 50 epochs with the Adam optimizer [34] at a learning rate of 0.001.

4.3. Loss Function

Binary cross entropy is a loss function used in binary classification problems. The objective of the detection mechanism is to label each time window as normal or abnormal; therefore, this paper uses the binary cross entropy loss for training the LSTM and the LSTM with attention mechanism, computed as follows:

$L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$

where $y_i$ is the binary label, $\hat{y}_i$ is the predicted probability, and $N$ is the total number of samples in the training set.
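The loss can be computed directly from labels and predicted probabilities; the small epsilon guarding the logarithms against zero is a standard numerical safeguard added here, not part of the formula above.

```python
import math

def bce_loss(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy over N samples.

    y_true: binary labels (0 or 1); y_pred: predicted probabilities in (0, 1).
    """
    n = len(y_true)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred)) / n
```

A confident correct prediction (e.g. label 1 with probability 0.9) contributes a small loss of $-\log 0.9$; the loss grows without bound as predictions approach the wrong extreme.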

4.4. Impact of the IFA

Attack intensity is defined as the ratio of the malicious users' sending rate to the legitimate users' sending rate. In this section, this paper evaluates the impact of the IFA on two types of routers: routers connected only to legitimate users and routers connected to both legitimate and malicious users.

In Figure 10, this paper evaluates the PIT size of the routers under IFA and the Interest satisfaction ratio of normal users under IFA.

Figure 12 shows the PIT size under IFA with different attack intensities. When there is no attack, the routers have a constant PIT size. When the IFA is launched at the 400th second, the PIT size begins to increase, and the greater the attack intensity, the greater the PIT size. The impact on PIT size also differs for routers in different locations: router R11 is least affected because it is not connected to any malicious user, while router R10 is most affected because it has the most connections with malicious users.

Figure 13 shows the Interest satisfaction ratio of normal users under IFA with different attack intensities. Without IFA, the Interest satisfaction ratio is stable, and the Interest packets sent by users receive the corresponding Data packets. At the 400th second, the IFA is launched, the Interest packets sent by users can hardly receive the corresponding Data packets, and the Interest satisfaction ratio drops instantaneously. Moreover, as the attack intensity increases, more malicious Interest packets are sent, and the impact on the Interest satisfaction ratio of normal users grows.

4.5. Performance of Detection Mechanism

In this section, this paper compares the detection mechanism with SVM and LSTM in terms of detection accuracy and recall. Then, it evaluates the defense mechanism in terms of Interest satisfaction ratio and PIT size against the expired-PIT-based defense mechanism [35].

First, the learning rate and batch size used in this paper are introduced. This paper selects the learning rate and batch size by comparing detection accuracy. The candidate learning rates are 0.001, 0.005, and 0.01, and the candidate batch sizes are 512, 256, and 128. The simulation results for the different learning rates and batch sizes are shown in Figures 14-16, respectively.

Figure 14 shows the detection accuracy under different attack intensities with different learning rates when the batch size is 512. When the learning rate is 0.001, the accuracy is the highest.

Figure 15 shows the detection accuracy under different attack intensities with different learning rates when the batch size is 256. When the learning rate is 0.001, the accuracy is the highest.

Figure 16 shows the detection accuracy under different attack intensities with different learning rates when the batch size is 128.

Finally, this paper sets the batch size to 512 and the learning rate to 0.001. As shown in Figures 17 and 18, as the number of epochs increases, the accuracy increases and the loss decreases. When the number of epochs reaches 50, the model tends to be stable.

Next, this paper compares the accuracy and recall of the detection mechanism with SVM and LSTM, and the results are shown in Figures 19 and 20.

Figure 19 shows the detection accuracy of the proposed detection mechanism under different attack intensities. Compared with LSTM and SVM, the detection mechanism proposed in this paper has the highest accuracy.

Figure 20 shows the recall of the proposed detection mechanism under different attack intensities. Compared with LSTM and SVM, the detection mechanism proposed in this paper has the highest recall.

4.6. Performance of Mitigation Mechanism

This section evaluates the mitigation mechanism in terms of the Interest satisfaction ratio and PIT size.

Figure 21 shows the Interest satisfaction ratio with the proposed defense mechanism and the expired-PIT-based defense mechanism under attack. When the malicious users launch the IFA at the 400th second, the Interest satisfaction ratio drops rapidly. Under high attack intensity, the proposed mechanism quickly detects the attack and limits the sending of malicious packets, and the Interest satisfaction ratio returns to the normal level. This paper also tests the impact of the detection mechanism on burst traffic from normal users; the proposed mechanism does not misjudge such burst traffic.

Figures 22 and 23 show the PIT size with the proposed defense mechanism and the expired-PIT-based defense mechanism under attack. When the attacker starts the attack at the 400th second, the PIT size rises rapidly. The detection mechanism quickly detects attacks of different intensities and limits the sending of malicious packets, and the PIT size returns to the normal level.

5. Conclusions

This paper proposes a defense mechanism against the Interest flooding attack in NDN. The defense consists of three parts: detection, response, and mitigation. An LSTM with attention mechanism is used to detect IFA; once an IFA is detected, the Hellinger distance is used to identify the malicious Interest packet prefixes. Finally, the malicious prefixes are sent to the downstream routers, which cooperate to limit the attack. The experimental results show that the LSTM with attention mechanism performs better than the plain LSTM and SVM. In future work, this paper will consider other attacks in NDN, such as collusive attacks and low-rate IFA, and will evaluate the mechanism on large-scale topologies.

Data Availability

The data used to support the findings of this study have not been made available because the data also form part of an ongoing study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61862046); Huhhot Science & Technology Plan (2021-KJXM-TZJW-04); and the Science and Technology Program of Inner Mongolia Autonomous Region (2020GG0188).