Abstract

We propose using multiple observed features of network traffic to identify new highly distributed, low-rate quality of service (QoS) violations so that detection accuracy may be further improved. As the observed features, we choose the F feature in the TCP packet header as a microscopic feature and the P feature and D feature of network traffic as macroscopic features. Based on these features, we establish a multistream fused hidden Markov model (MF-HMM) to detect stealthy low-rate denial of service (LDoS) attacks hidden in legitimate network background traffic. In addition, the threshold value is dynamically adjusted using the Kaufman algorithm. Our experiments show that the additive effect of combining multiple features effectively reduces the false-positive rate. The average detection rate of MF-HMM is 23.39 and 44.64 percentage points higher than that of the typical power spectrum density (PSD) algorithm and the nonparametric cumulative sum (CUSUM) algorithm, respectively.

1. Introduction

In recent years, malicious quality of service (QoS) violation attacks have become one of the most serious security threats to the Internet. New QoS attacks increasingly tend to be highly distributed and low rate. In the literature, this kind of attack has been called a shrew attack [1], a pulsing denial of service (DoS) attack [2], or a reduction of quality (RoQ) attack [3]. For simplicity, we call all of them LDoS (low-rate denial of service) attacks in the sequel.

LDoS attacks are stealthy, periodic, pulsing, and low rate in attack volume, which makes them very different from early flooding attacks. A traditional detection system against flooding attacks is based on traffic volume analysis in the time domain. However, it is almost ineffective against the new LDoS attacks [4], because the average bandwidth consumption differs very little between normal and attack streams.

In this paper, we present a new approach to identify LDoS attacks by combining multiple observed features at the micro- and macrolevel. Multidimensional features are extremely valuable for describing slight changes of network properties and help us accurately differentiate attack flows. Our new approach can therefore complement existing detection mechanisms based on a one-dimensional feature and overcome the bottleneck in detection accuracy for LDoS violations.

For the microscopic feature, we calculate the weighted summation of flag bits (WSFB) in the TCP packet header to reflect slight internal changes of packets with and without LDoS attacks. Macroscopically, the best distinguishing characteristic between LDoS and normal flows is their different periodicity in the frequency domain [5]. Based on this fact, we choose the weighted average size of packet in queue (WASPQ) in the router as an observed sequence. Then, we convert the WASPQ sequence into a frequency-domain spectrum using the discrete Fourier transform (DFT) and obtain the power spectrum density (PSD) of WASPQ as a macroscopic feature. Moreover, we calculate the difference between request/response flows (DRRF) as another macroscopic feature.

Based on the above three features, we develop a multistream fused hidden Markov model (MF-HMM) to detect LDoS violations hidden in legitimate TCP/IP traffic. In addition, we adjust the decision threshold value dynamically based on the Kaufman algorithm to improve the detection accuracy. Notations, symbols, and abbreviations used in this paper are summarized in the Notations section. Only brief definitions are given here; details are given in subsequent sections.

The rest of this paper is organized as follows. In Section 2, we present the related work. Section 3 describes MF-HMM, its advantages, and its training algorithm. Section 4 presents an overview of the three-stream fused HMM (TF-HMM) procedure and explains how to extract multiple observed features of network traffic to establish the corresponding component HMMs of TF-HMM. Furthermore, we introduce the dynamic threshold adjustment based on the Kaufman algorithm. In Section 5, we compare our work with those of other researchers and discuss the training and recognition time of TF-HMM. Finally, we conclude our paper in Section 6.

2. Related Work

Some scholars have studied mathematical models of LDoS attacks. By simulating various LDoS attacks, they discussed the properties of these attacks and gave suggestions for further research. Maciá-Fernández et al. [6] summarized the behavior of LDoS and proposed a mathematical model for the LDoS attack. They also discussed the development trend and made some recommendations for building defense techniques against this attack. He et al. [7] presented theoretical analyses, modeling, and simulations of various LDoS attacks and discussed the difficulties of defense and current solutions. Zhu et al. [8] discussed the vulnerabilities of TCP and the principle of low-rate attacks. They also investigated the simulation of attacks and suggested directions for further research.

Most current LDoS-related studies focus on using frequency-domain methods to detect LDoS attacks and have made clear progress. A research group [9] proposed an approach to detecting LDoS attacks based on the small-signal model. Furthermore, in paper [10], they presented the method of multiple sampling averaging based on missing sampling (MSABMS) to detect LDoS attacks. An eigenvalue-estimating matrix was established to estimate the attack period after the detection of LDoS attacks. In addition, they proposed a scheme [11] that detects LDoS attacks based on time window sampling in the time domain and captures the periodicity by statistical analysis in the frequency domain. Zhang et al. [12] proposed a detection method similar to that of Yu et al. [13]. In this method, the sum of the power spectrum is computed within 1–50 Hz, and the intersection of the two fitting curves is taken as the judging threshold. Luo and Chang [2] proposed a two-stage scheme to detect LDoS attacks on a victim network. The first stage is a discrete wavelet transform (DWT) analysis of the network traffic. The second stage detects change points by using a nonparametric cumulative sum (CUSUM) algorithm. Liu [14] proposed an LDoS attack detection method that calculates the Hölder exponent based on binary discrete wavelet analysis. Shevtekar et al. [15] presented an approach to detecting the periodicity of attack flows based on the autocorrelation of the flow.

Some detection methods based on traditional traffic characteristics have been proposed in recent years. These methods detect LDoS attacks by searching for and identifying the abnormal network traffic that the attacks cause. For example, the exponentially weighted moving average (EWMA) method was presented in papers [16, 17]. However, the EWMA algorithm may smooth not only the normal traffic but also the abnormal traffic, which affects the detection accuracy for LDoS attacks. Therefore, paper [18] proposed an adaptive EWMA method that uses an adaptive weighting function instead of the constant weighting of the EWMA algorithm. The adaptive EWMA can smooth accidental errors while retaining exceptional mutations. Thus, it is more efficient than the EWMA method.

Unlike detection systems at the usual deployment location, paper [19] proposed an adaptive detection method for LDoS attacks in the source-end network. The method does not require a distribution assumption for the traffic samples. Moreover, the authors presented automatic adjustment of the detection threshold according to the traffic conditions.

In particular, Xiang et al. [20] proposed using two new information metrics to detect low-rate DDoS attacks by measuring the difference between legitimate traffic and attack traffic. The proposed generalized entropy metric and information distance metric outperform the existing popular approaches because they clearly enlarge the adjudication distance and thus obtain better detection sensitivity.

In summary, most studies use one-dimensional information about network traffic to establish algorithms for detecting LDoS attacks. Though some algorithms are sophisticated, one-dimensional information is not enough to accurately differentiate stealthy LDoS attacks hidden in legitimate traffic. Despite gratifying progress, the high false-positive rate is still a striking bottleneck.

3. Multistream Fused HMM

We first describe basic properties of multistream fused HMM and then give its mathematical description and training algorithm in detail.

3.1. Basic Properties

To accurately identify stealthy LDoS violation hidden in legitimate network traffic, the combination of multiobserved features is considered in our scheme by using multistream fused HMM [21]. According to the maximum entropy principle and the maximum mutual information (MMI) criterion, MF-HMM constructs a new structure linking multiple HMMs. MF-HMM is the generalization of two-stream fused HMM [22].

The main advantages of MF-HMM are as follows.

(1) Every observation feature can be modeled by a component HMM, so the performance of every feature can be analyzed individually. The set of features can be modified according to the performance analysis.

(2) Compared with other existing model fusion methods (e.g., CHMM [23], MHMM [24], etc.), MF-HMM reaches a better balance between model complexity and performance.

(3) MF-HMM has stronger robustness. If one component HMM fails due to some reason, the other component HMM can still work. Thus, the final result is still a valuable reference for the recognition judgment.

3.2. Mathematical Description

HMM is the basis of MF-HMM. For brevity, we discuss only MF-HMM here; the HMM definition and relevant algorithms are discussed in detail in [25]. The mathematical symbols in this paper are consistent with the standard HMM notation.

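Let $\{O^{i}\}_{i=1}^{n}$ represent $n$ tightly coupled observing sequences. Assume that they can be modeled by $n$ corresponding HMMs with hidden states $\{S^{i}\}_{i=1}^{n}$. In MF-HMM, an optimal solution for $p(O^{1}, \ldots, O^{n})$ is given according to the maximum entropy principle and the maximum mutual information criterion.

In order to calculate $\hat{p}(O^{1}, \ldots, O^{n})$, we first need to calculate every component $\hat{p}^{(i)}(O^{1}, \ldots, O^{n})$; here $i = 1, \ldots, n$. The $i$th component can be given through

$$\hat{p}^{(i)}(O^{1}, \ldots, O^{n}) = p(O^{i})\, p(O^{1}, \ldots, O^{i-1}, O^{i+1}, \ldots, O^{n} \mid \hat{S}^{i}), \tag{1}$$

where $\hat{S}^{i}$ is the best hidden state sequence of the $i$th HMM. And assume

$$p(O^{1}, \ldots, O^{i-1}, O^{i+1}, \ldots, O^{n} \mid \hat{S}^{i}) = \prod_{j \neq i} p(O^{j} \mid \hat{S}^{i}). \tag{2}$$

This assumption has a good record in recognizing and detecting LDoS attacks, though conditional independence is always violated in practice. Its success comes from the small number of parameters that have to be estimated under the assumption. Without it, more complicated algorithms require more training data and are more susceptible to local maxima during parameter estimation.

So, the estimate of $\hat{p}^{(i)}$ can be given by

$$\hat{p}^{(i)}(O^{1}, \ldots, O^{n}) = p(O^{i}) \prod_{j \neq i} p(O^{j} \mid \hat{S}^{i}). \tag{3}$$

There are different expressions for different $i$. For our three-stream fused HMM (TF-HMM), (3) corresponds to (4a), (4b), and (4c) as follows:

$$\hat{p}^{(1)}(O^{1}, O^{2}, O^{3}) = p(O^{1})\, p(O^{2} \mid \hat{S}^{1})\, p(O^{3} \mid \hat{S}^{1}), \tag{4a}$$

$$\hat{p}^{(2)}(O^{1}, O^{2}, O^{3}) = p(O^{2})\, p(O^{1} \mid \hat{S}^{2})\, p(O^{3} \mid \hat{S}^{2}), \tag{4b}$$

$$\hat{p}^{(3)}(O^{1}, O^{2}, O^{3}) = p(O^{3})\, p(O^{1} \mid \hat{S}^{3})\, p(O^{2} \mid \hat{S}^{3}). \tag{4c}$$

In practice, if the component HMMs have different reliabilities, they may be combined with different weights to get a better result:

$$\hat{p}(O^{1}, \ldots, O^{n}) = \sum_{i=1}^{n} \lambda_{i}\, \hat{p}^{(i)}(O^{1}, \ldots, O^{n}). \tag{5}$$

Here, $\sum_{i=1}^{n} \lambda_{i} = 1$.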
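The weighted combination of component estimates can be sketched in a few lines. The following is a minimal illustration rather than the authors' implementation: it assumes each component estimate is the product of that stream's own likelihood and the coupling likelihoods of the other streams given its best state sequence, and it works in the log domain to avoid underflow. The function name and argument layout are hypothetical.

```python
import math

def fused_log_likelihood(log_p_own, log_p_cross, weights=None):
    """Weighted multistream fusion in the log domain (illustrative sketch).

    log_p_own[i]      : log-likelihood of stream i under its own component HMM.
    log_p_cross[i][j] : log-likelihood of stream j given the best hidden state
                        sequence of component i (entries with i == j are ignored).
    weights[i]        : reliability weight of component i; weights sum to 1.
    """
    n = len(log_p_own)
    if weights is None:
        weights = [1.0 / n] * n          # assumption: equal reliability
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Component estimate i: own likelihood times the coupling terms.
    comps = [
        log_p_own[i] + sum(log_p_cross[i][j] for j in range(n) if j != i)
        for i in range(n)
    ]
    # log(sum_i w_i * exp(comps[i])) computed stably (log-sum-exp).
    terms = [math.log(w) + c for w, c in zip(weights, comps) if w > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

Setting one weight to 1 and the others to 0 reduces the fused score to a single component estimate, which is a convenient way to analyze each feature individually.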

3.3. Training Algorithm

The training algorithm of MF-HMM is a three-step process.

(1) The component HMMs are trained independently by a representative algorithm, such as the Baum-Welch algorithm, the segmented K-means algorithm, or the hybrid method EM algorithm.

(2) The best hidden state sequences of the component HMMs are estimated by the Viterbi algorithm.

(3) The coupling parameters between the HMMs are calculated.

For our three-stream fused HMM, step (1) is to calculate (6a), (6b), and (6c):

Step (2) is to calculate (7a), (7b), and (7c):

Step (3) is to estimate the coupling parameters between HMM1, HMM2, and HMM3:
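Steps (2) and (3) can be illustrated for discrete observations as follows. This is a sketch under stated assumptions, not the authors' code: the component HMM parameters (`pi`, `A`, `B`) are taken as already trained in step (1), the coupling parameters are estimated by smoothed co-occurrence counts between the Viterbi states of one stream and the observation symbols of another, and all function names are hypothetical.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path of a discrete HMM (log domain)."""
    T, N = len(obs), len(pi)
    with np.errstate(divide="ignore"):       # allow log(0) = -inf
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.empty((T, N))                 # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)        # back-pointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: move i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):           # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path

def coupling_matrix(states_i, obs_j, n_states, n_symbols, smooth=1.0):
    """Estimate P(symbol of stream j | Viterbi state of stream i) by counting.

    `smooth` is an additive (Laplace) smoothing constant, an assumption made
    here so that unseen pairs do not get zero probability.
    """
    counts = np.full((n_states, n_symbols), smooth, dtype=float)
    for s, o in zip(states_i, obs_j):
        counts[s, o] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)
```

For three streams, `coupling_matrix` would be called once for each ordered pair of distinct streams, giving six coupling matrices in total.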

4. Identifying LDoS Violation Using TF-HMM

In this section, we first present the procedure of identifying LDoS violations by using TF-HMM. Then, we explain how to establish the three component HMMs of TF-HMM, namely, F-HMM, P-HMM, and D-HMM. Finally, we introduce the dynamic threshold adjustment based on the Kaufman algorithm.

4.1. Procedure Overview

To make it easier to understand, we first introduce the procedure of TF-HMM, as illustrated in Figure 1.

(1) Split into Subsequences. Split the detected sequence into subsequences with a fixed-length splitting window; the number of subsequences is determined by the length of the detected sequence and the window length.

(2) Extract Three Observed Features. Extract the F feature, P feature, and D feature, and then form the three-dimensional observation state sequence.

(3) Calculate the Output Probability. Input the state sequences to TF-HMM, and calculate the output probability of every subsequence.

(4) Label a Questionable Subsequence. If the output probability of a subsequence is less than the threshold, it is labeled as a questionable subsequence (Q-s); otherwise it is marked as a normal subsequence (N-s).

(5) Count the Ratio of Questionable Subsequences. After computing and labeling all subsequences, count the ratio according to

$$\text{ratio} = \frac{\text{number of Q-s}}{\text{number of Q-s} + \text{number of N-s}}.$$

(6) Adjust Threshold Value by Kaufman Algorithm. While the detection system is running, the threshold value is adjusted by using the Kaufman algorithm. In practice, this effectively improves the average detection rate of TF-HMM.

(7) Determine the Violation. Finally, compare the ratio with the decision threshold value: if the ratio exceeds the decision threshold, an LDoS violation is determined; otherwise, there is no violation.
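The labeling and decision steps above can be sketched as follows. This is an illustrative outline only: the scoring function stands in for the TF-HMM output probability, the windows are assumed to be non-overlapping with an incomplete tail dropped, thresholds are fixed rather than adjusted dynamically, and all names are hypothetical.

```python
def split_subsequences(seq, window):
    """Split seq into consecutive non-overlapping windows (incomplete tail dropped)."""
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, window)]

def detect_violation(seq, window, score_fn, prob_threshold, ratio_threshold):
    """Label each subsequence and decide whether a violation is present.

    score_fn        : callable returning the model output (e.g., a
                      log-probability) for one subsequence; a stand-in here.
    prob_threshold  : subsequences scoring below this are questionable (Q-s).
    ratio_threshold : decision threshold on the share of Q-s subsequences.
    """
    subs = split_subsequences(seq, window)
    if not subs:
        raise ValueError("sequence shorter than one window")
    labels = ["Q-s" if score_fn(s) < prob_threshold else "N-s" for s in subs]
    ratio = labels.count("Q-s") / len(labels)
    return ratio > ratio_threshold, ratio, labels
```

Passing the scoring function as a parameter keeps the decision logic independent of how the output probability is computed, so the same outline applies to a single component HMM or to the fused model.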

4.2. Establishing Three-Component HMMs

To apply TF-HMM, we extract multiple observed features of network traffic: the WSFB feature, the PSD of WASPQ feature, and the DRRF feature. They constitute the three-dimensional observation state sequence. Each sequence is modeled by a component HMM, and the three component HMMs together make up TF-HMM.

4.2.1. F-HMM

In order to reduce network QoS, spoofed TCP/IP packets must be used. From a microscopic view, attackers usually fill the internal attribute fields of a forged packet with random numbers, which results in large differences from real data packets. We choose the flag bits in the TCP packet header as a microscopic feature to describe slight internal changes of the packet attribute fields.

To enlarge the differences in flag bits between forged packets and real ones, we assign different weights to different flag bits [26], as in Table 1.

Next, we obtain the weighted summation of flag bits (WSFB) by using

$$\mathrm{WSFB} = \sum_{i} w_{i} f_{i},$$

where $f_{i}$ is the value of the $i$th flag bit and $w_{i}$ is its weight from Table 1.

So we can construct a component HMM based on the observing sequence of WSFB; simply mark it as F-HMM.
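A weighted flag sum of this kind can be computed directly from the flag byte of a TCP header. The sketch below is illustrative: the bit masks are the standard positions of the six classic TCP flags, but the weights are hypothetical placeholders, since the values of Table 1 are not reproduced here.

```python
# Bit masks of the six classic TCP flags in the header's flag byte.
TCP_FLAG_MASKS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04,
    "PSH": 0x08, "ACK": 0x10, "URG": 0x20,
}

# Hypothetical weights for illustration only; NOT the values of Table 1.
EXAMPLE_WEIGHTS = {"FIN": 1, "SYN": 2, "RST": 3, "PSH": 4, "ACK": 5, "URG": 6}

def flags_from_byte(flag_byte):
    """Return the names of the flags that are set in flag_byte."""
    return {name for name, mask in TCP_FLAG_MASKS.items() if flag_byte & mask}

def wsfb(flag_byte, weights=EXAMPLE_WEIGHTS):
    """Weighted summation of the flag bits set in one packet."""
    return sum(weights[name] for name in flags_from_byte(flag_byte))
```

Applying `wsfb` to each packet in arrival order yields the observing sequence on which a component HMM such as F-HMM could be trained.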

4.2.2. P-HMM

Paper [27] indicates that attack data packets occupy a certain proportion of the router buffer queue during an LDoS attack, and the greater the damage is, the higher the proportion is. At the same time, papers [28, 29] conclude that attackers must use data packets that are as short as possible to achieve a good attack effect, so the average size of packets in the buffer queue is clearly smaller under attack than under normal conditions. We introduce the weighted average size of packet in queue (WASPQ) to describe this periodic change from a macroscopic view.

Let the number of packets in queue when at sampling time be and let each size of packet be , . In order to highlight the characteristic that the shorter the packet, the more important, we introduce weight , , and calculate the WASPQ value as follows:

In order to depict inherent periodic feature of LDoS attack, we take as the discrete signal series and sample it with a period of 0.1 sec. The change of value with and without attacks is modeled by a random process: , where is a constant time interval, which we assume 0.1 sec, and is a set of positive integers, and, at each time point , is a random variable, representing the total number of in .

To study the periodicity embedded in the sequence, we use its autocorrelation function in discrete time as follows:

The captures the correlation of the sequence and itself at interval . If there is any periodicity existing, autocorrelation function is capable of finding it.

To figure out the periodicity embedded in the sequence, we convert the autocorrelation time series by discrete Fourier transform (DFT) to generate the power spectrum density (PSD) as follows:where is the -point DFT, .

We note that we use the standard periodogram rather than Welch’s method of averaged periodogram [30]. This is because in our work we are interested in the detection and estimation of a single periodic feature, which is better achieved using the standard periodogram as discussed in [31].

Therefore, we can get the component HMM based on the PSD of WASPQ feature, simply referred to as P-HMM.
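The autocorrelation and periodogram steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the sequence is mean-centered before the autocorrelation is taken (a choice made here so that the zero-frequency bin does not dominate), the biased autocorrelation estimate is used, and the sampling period defaults to the 0.1 sec mentioned above. Function names are hypothetical.

```python
import numpy as np

def autocorrelation(x):
    """Biased autocorrelation estimate of a mean-centered sequence, lags 0..N-1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # assumption: remove the DC level
    n = len(x)
    # mode="full" has length 2n-1; index n-1 corresponds to lag 0.
    return np.correlate(x, x, mode="full")[n - 1:] / n

def psd_from_autocorrelation(x, sample_period=0.1):
    """Magnitude of the DFT of the autocorrelation, with its frequency axis in Hz."""
    r = autocorrelation(x)
    spectrum = np.abs(np.fft.rfft(r))
    freqs = np.fft.rfftfreq(len(r), d=sample_period)
    return freqs, spectrum
```

For a sequence that dips once per second, the dominant non-zero frequency returned by this sketch lies at about 1 Hz, which is the kind of single periodic feature the standard periodogram is meant to expose.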

4.2.3. D-HMM

In a normal TCP session with two-way communication, the request flow is limited by the response flow [32]. From a macroscopic view, the difference between them should normally remain relatively stable. In case of LDoS attacks, a huge number of forged request packets leads to a sharp increase of the difference. Therefore, we introduce the difference between request/response flows (DRRF) to represent this change.

Let the sequence be the difference value between request flow and response flow; where is a request flow and is a response flow.

Usually, is closely related to the network size, the number of hosts, and the sampling time. In order to counteract the influence of them, we convert it as follows:

In formula (15), could be expressed as a recurrence relation of , where is a custom constant, . Thus, by using formula (16), we can get , which will not be impacted by factors mentioned above. Instead, it is simply about current network traffic. We choose as another macroscopic feature to indicate the overall change of two-way communications caused by LDoS attacks.

So we can establish a component HMM based on DRRF feature, simply referred to as D-HMM.
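One plausible way to make such a difference independent of network size is to divide it by a running average of the response volume. The sketch below illustrates that idea only; it is an assumed reading and not necessarily the exact form of formulas (15) and (16). The smoothing constant `alpha`, the initialization of the average, and the function name are all hypothetical.

```python
def normalized_drrf(requests, responses, alpha=0.9):
    """Request/response difference divided by a running response average.

    requests, responses : per-interval packet counts of equal length.
    alpha               : smoothing constant in (0, 1); assumed, not from the paper.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    out = []
    avg = None
    for req, resp in zip(requests, responses):
        # Recurrence: new average = alpha * old average + (1 - alpha) * current.
        avg = float(resp) if avg is None else alpha * avg + (1.0 - alpha) * resp
        out.append((req - resp) / avg if avg > 0 else 0.0)
    return out
```

Because numerator and denominator scale together, multiplying all counts by a constant leaves the output unchanged, which is the property the normalization is meant to provide.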

4.3. Adjusting the Threshold Dynamically

Inspired by the load-shedding method and the Kaufman algorithm [33], we adjust the threshold value dynamically to improve the detection precision.

Let the denote the mapping variable of the system effective payload and our algorithm threshold in the th time span. Define . The range of values is in , where is a rather small but not constant. This is because if is , all data flows are not allowed to pass through it. Hypothesize that, right at the th time over, the actual payload in the system is , and is the maximum number of payload, so we get . could be presented in a recursive way as follows:

And since , we can get the final equation of ; that is,where .

In this way, threshold value could be computed out by .

5. Experiments and Performance Results

In this section, we first introduce the experimental environment setup. Then, we compare normal flow with attack flow with respect to the periodicity of WASPQ and the output of TF-HMM. Based on these comparisons, we validate that TF-HMM is sufficiently sensitive. Finally, we evaluate the performance of TF-HMM in terms of detection rate, false-positive rate, average detection rate, training time, and recognition time.

5.1. Experimental Environment Setup

Acquiring data from real LDoS attacks is very difficult. Following papers [34–36], we construct experimental data by fusing controlled attack flows into real network background traffic.

To generate attack data, we have built a controlled experimental platform. 60 VMware hosts running Windows XP are chosen as user hosts. The collector and analyzer of network traffic are installed on Ubuntu 12.04 with a quad-core 2.4 GHz CPU and 4 GB RAM. We install Zombie tools on part of the user hosts as bots. The controlled LDoS attack is launched by these bots, which gives us our experimental attack data.

Accordingly, we choose a day’s network traffic of a primary node in CERNET backbone networks as our experimental background traffic. There are 305985 records in the time window of 10 minutes. After the preprocessing, the background data contains 19877 hosts. Then, we fuse the attack data into the background traffic to evaluate TF-HMM performance.

5.2. Periodicity Analysis of WASPQ in P-HMM

The most obvious contrast between LDoS and normal flows is their different periodicity in the frequency domain. We first compare the normal WASPQ value with the attack WASPQ value.

As illustrated in Figure 2(a), under normal conditions the value of WASPQ is relatively high, almost 1100, because of the small proportion of short data packets in the cache queue. In case of LDoS attacks, attackers suddenly launch a massive number of very short data packets, and the value of WASPQ declines abruptly, from about 1100 to 50, as shown in Figure 2(b). This is because we use the weighted approach and highlight the importance of short packets in the WASPQ calculation. We go on to draw the corresponding periodograms of Figures 2(a) and 2(b). As shown in Figure 2(d), in case of LDoS attacks the change of WASPQ has obvious periodicity, while the normal flow in Figure 2(c) has none.

Next, we draw the corresponding PSD of WASPQ, as shown in Figure 3. We can see that there is a very wide frequency band under normal conditions, but under attack the PSD lies almost entirely below 51.5 Hz, with no distribution in higher frequency bands. We calculate the cumulative traffic spectrum (CTS) [5] of the PSD, as shown in Figure 4. 98.65% of the power of the attack flow lies below 51.5 Hz, whereas only 39.44% of the power of the normal flow lies below 51.5 Hz. This large difference gives P-HMM better detection sensitivity.
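The share of power below a cutoff frequency, as quoted above, can be read off a normalized cumulative spectrum. The following sketch assumes the CTS is the running sum of the PSD divided by its total; the function names are hypothetical and the inputs are any matching frequency and PSD arrays.

```python
import numpy as np

def cumulative_traffic_spectrum(psd):
    """Running sum of the PSD normalized by total power (ends at 1.0)."""
    psd = np.asarray(psd, dtype=float)
    total = psd.sum()
    if total <= 0:
        raise ValueError("PSD must have positive total power")
    return np.cumsum(psd) / total

def power_fraction_below(freqs, psd, cutoff_hz):
    """Fraction of total power at frequencies less than or equal to cutoff_hz."""
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)
    return psd[freqs <= cutoff_hz].sum() / psd.sum()
```

A flow whose power is concentrated at low frequencies yields a fraction close to 1 for a modest cutoff, while a flow with a wide frequency band yields a much smaller value.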

5.3. Comparison Output of TF-HMM in Normal and in Attack

To validate the sensitivity of TF-HMM, we first extract a 30-second fragment of normal flow. We then extract a 30-second fragment of LDoS violation and overlay both fragments on one time axis. As shown in Figure 5, under normal conditions the value fluctuates in the range of −40 to −984, while under attack the peak value can reach 2.4 to 55 times the normal value, or even more. The red curve in Figure 5 clearly shows the 5 low-rate impulse violations, so TF-HMM has enough detection sensitivity to identify LDoS attacks hidden in legitimate network traffic.

5.4. Detection Rate and False-Positive Rate

In this section, we compare TF-HMM with the representative nonparametric CUSUM algorithm [14] and the PSD method [12] in detail. We focus on the detection accuracy and false positives of the three algorithms under different network traffic. For an impartial evaluation, various kinds of network traffic are employed in the following experiments, with different network utilization rates and attack intensities, and with or without legitimate periodic flows. For simplicity, we call legitimate periodic flows the interference in the sequel.

First, define the detection rate as

$$\mathrm{DR} = \frac{N_{d}}{N_{r}}.$$

Here, $N_{d}$ is the number of attacks that have been detected correctly, and $N_{r}$ is the number of real attacks.

Next, define the false-positive rate as

$$\mathrm{FPR} = \frac{N_{a} - N_{d}}{N_{a}},$$

where $N_{a}$ is the number of alarms raised by the detection algorithm and the difference between $N_{a}$ and $N_{d}$ is the number of false positives.
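Both rates are simple ratios of counts and can be computed as below. The sketch assumes the false-positive rate is taken relative to the number of alarms, as the description of false positives as alarms that are not correct detections suggests; parameter names are hypothetical.

```python
def detection_rate(n_detected, n_real):
    """Share of real attacks that were detected correctly."""
    if n_real <= 0:
        raise ValueError("n_real must be positive")
    return n_detected / n_real

def false_positive_rate(n_alarms, n_detected):
    """Share of alarms that do not correspond to a correctly detected attack."""
    if n_alarms <= 0:
        return 0.0                      # no alarms, hence no false positives
    return (n_alarms - n_detected) / n_alarms
```

For example, an algorithm that raises 10 alarms of which 8 are correct detections, out of 16 real attacks, has a detection rate of 0.5 and a false-positive rate of 0.2.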

The experimental results are shown in Table 2.

(1) Without Attacks and without the Interference (See No. 1 Group). There are no periodic flows, so the periodicity-based algorithms (PSD and TF-HMM’s P-HMM) give no false positives. However, the CUSUM algorithm shows 3 false positives. This is because it is based on a traffic volume accumulation method in the time domain and has no frequency-domain analysis capability.

(2) Without Attacks and with the Interference (See No. 2 Group). When the interference flows are injected and the network utilization rate is increased, false positives start appearing in the PSD algorithm but not in TF-HMM. This is because the PSD algorithm only captures periodicity and cannot differentiate between the periodicity of the interference flows and that of pulse attacks. In contrast, TF-HMM’s P-HMM not only finds the periodicity of flows but also analyzes the WASPQ changes caused by LDoS attacks, which helps TF-HMM distinguish accurately between legitimate periodic flows and LDoS pulse flows.

(3) With Attacks and without the Interference (See No. 3 Group). Without the interference, the PSD algorithm and TF-HMM identify exactly the 2 attacks hidden in the background traffic, based on the obvious periodicity of pulse attacks, and show no false positives. But CUSUM still shows relatively many false positives because it is neither a learning-oriented algorithm nor a frequency-domain-based one.

(4) With Attacks and with the Interference (See No. 4 and No. 5 Group). In No. 4 group, the result from TF-HMM is closer to REAL than those of the other algorithms. Its false-positive rate is 12.12%. In contrast, the false-positive rate of CUSUM reaches 57.78%, and that of PSD is 37.50%. With growing intensity of attacks and interference, the false-positive rates of the other two methods become even higher.

In No. 5 group, we increased both the attack intensity and the network utilization rate, and the advantages of TF-HMM based on multiple observed features become apparent. The false-positive rate of CUSUM is 76.82% and that of PSD is 57.48%, while that of TF-HMM is 13.62%, far lower than those of the other algorithms. The reason for such a low false-positive rate of TF-HMM is that the two components F-HMM and P-HMM play an important role.

When massive numbers of packets from legitimate periodic flows and pulse attack flows arrive at the router, the PSD algorithm cannot accurately differentiate between them because it uses only the number of packet arrivals as a single periodic feature to find the periodicity in the data sequence. In contrast, P-HMM can identify them because pulse attacks cause an abnormal decrease of the WASPQ value (as illustrated in Figure 2(b)). Furthermore, F-HMM can detect that a packet’s internal attribute fields have been tampered with, because spoofed packets in pulse attacks result in abnormal fluctuations of WSFB.

As the additive effect of combining multidimensional features starts to dominate, we see a lower false-positive rate for TF-HMM. This gives TF-HMM an advantage in detection accuracy, with both a higher detection rate and a lower false-positive rate.

5.5. Average Detection Rate

To evaluate the three detection approaches objectively, we varied the attack intensity, network utilization rate, sampling time, and the interference, so that there are obvious differences between any two groups. From the 100 groups of data obtained, we calculated the average detection rates presented in Table 3.

In Table 3, the average detection rate of PSD is clearly higher than that of the CUSUM algorithm because PSD takes into account the inherent periodicity of LDoS violations. However, its false-positive rate is still not reduced to a reasonably low level, which limits the improvement of the detection accuracy. In contrast, since TF-HMM combines multiple observed features, its average detection rate reaches 92.78%, which is 1.93 times that of CUSUM and 23.39 percentage points higher than that of PSD. It efficiently overcomes the bottleneck that limits further increases in detection accuracy.
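As a quick consistency check, the figures quoted here and in the abstract fit together when the stated improvements are read as differences in percentage points. The rates for PSD and CUSUM below are implied by that reading and are not copied from Table 3.

```latex
\begin{align*}
\mathrm{DR}_{\mathrm{PSD}}   &\approx 92.78\% - 23.39\% = 69.39\%,\\
\mathrm{DR}_{\mathrm{CUSUM}} &\approx 92.78\% - 44.64\% = 48.14\%,\\
\frac{\mathrm{DR}_{\mathrm{TF\text{-}HMM}}}{\mathrm{DR}_{\mathrm{CUSUM}}}
  &\approx \frac{92.78}{48.14} \approx 1.93.
\end{align*}
```

Under a relative reading of 44.64%, the ratio to CUSUM would be about 1.45 rather than 1.93, so the percentage-point reading is the one that matches the text.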

5.6. Training Time and Recognition Time

The time complexity of the algorithms is vital to fast detection of and response to QoS violations. The results of the relevant experiments on the training time and recognition time of TF-HMM are shown in Figures 6 and 7.

As shown in Figure 6, the most time-consuming one is Baum-Welch algorithm; it is about 5 to 10 times of the other two algorithms; the second one is hybrid algorithm and then K-means algorithm. Furthermore, Baum-Welch algorithm is most sensitive to the length of segment. For example, using the same training sequence, the training time of is 1.68 times more than the one of . But K-means and hybrid algorithms are insensitive to the length of segment.

In contrast, the recognition time of TF-HMM is short, as shown in Figure 7. It is therefore suitable for fast detection of and response to malicious QoS violations. Our ultimate goal is to achieve automated intrusion detection and response in real time.

6. Conclusions

New LDoS violations are increasingly highly distributed and low rate. Fast detection of and response to stealthy LDoS streams hidden in massive legitimate network traffic is very difficult. The high false-positive rate is still the most striking bottleneck.

To overcome the bottleneck, our research contributions are summarized below in three technical aspects.

(1) Combining Multidimensional Features. Multiple micro- and macrofeatures, including WSFB, WASPQ, and DRRF, are combined by using MF-HMM. The additive effect of combining multidimensional features yields encouraging results: a high detection rate with a low false-positive rate.

(2) Synthesizing Methods in Frequency Domain and in Time Domain. Leveraging PSD analysis in the component P-HMM, we capture and identify the periodicity of LDoS pulse attacks in the frequency domain. Furthermore, we calculate the WSFB and DRRF features in the time domain with the components F-HMM and D-HMM. Together, these components enable accurate matching when detecting LDoS attacks at the traffic streaming level.

(3) Adjusting Threshold Value Dynamically. Inspired by the load-shedding method and the Kaufman algorithm, we adjust the threshold value dynamically to further reduce the false-positive rate.

In future work, we aim to improve the detection accuracy in complicated network traffic and ultimately to achieve a fully automated process of detection of and response to LDoS attacks in real time.

Notations

CTS:Cumulative traffic spectrum
CUSUM:Cumulative Sum
D feature:DRRF feature
DDoS:Distributed denial of service
DFT:Discrete Fourier transform
D-HMM:The component HMM based on D feature
DoS:Denial of service
DR:Detection rate
DRRF:Difference between request/response flows
DWT:Discrete wavelet transform
F feature:WSFB feature
F-HMM:The component HMM based on F feature
LDoS:Low-rate denial of service
:The output probability of TF-HMM
MF-HMM:Multistream fused hidden Markov model
N-s:Normal subsequence
P feature:PSD of WASPQ feature
P-HMM:The component HMM based on P feature
PSD:Power spectrum density
QoS:Quality of services
Q-s:Questionable subsequence
RoQ:Reduction of quality
TF-HMM:Three-stream fused hidden Markov model
WASPQ:Weighted average size of packet in queue
WSFB:Weighted summation of flag bits

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to acknowledge the support of this work by National Natural Science Foundation of China (Grants nos. 60703023, 90204014) and Technology Development Plan of Jilin Province of China (Grant no. 20090110).