Abstract

The aim of distributed denial-of-service (DDOS) flood attacks is to overwhelm the attacked site, or to degrade its service performance considerably, by sending flood packets to the target from machines distributed all over the world. Such an attack is a kind of local behavior of traffic at the protected site, because the attacked site can recover its normal service state sooner or later even though it is in fact overwhelmed during the attack. From a mathematical point of view, it can be taken as a kind of short-range phenomenon in computer networks. In this paper, we use the Hurst parameter (H) to measure the local irregularity or self-similarity of traffic under DDOS flood attack, provided that fractional Gaussian noise (fGn) is used as the traffic model. Since the flood attack packets of DDOS make the H value of the arrival traffic deviate significantly from that of the traffic normally arriving at the protected site, we discuss a method to statistically detect signs of DDOS flood attacks with predetermined detection probability and false alarm probability.

1. Introduction

IP networks are subject to electronic attacks [1]. An intrusion detection system (IDS) collects information from a variety of systems and network sources and analyzes it for signs of attack. A network-based IDS uses the traffic on its network as a data source [2]. In a distributed denial-of-service (DDOS) flood attack, an intruder bombards a site (the victim) with a huge amount of attack traffic whose sources are distributed all over the world [3]. Hence the pattern of traffic under DDOS flood attack may suddenly differ significantly from the normal pattern of the arrival traffic. From the perspective of dynamics over a limited time interval in physics [4], one may regard this sudden change as a specific “pulse.” Though a DDOS flood attack may not be the sole factor that makes the traffic pattern vary significantly, we assume that security officers can distinguish significant variations of the monitored traffic pattern caused by other known factors (e.g., normally heavy traffic) from those caused by a DDOS flood attack. Where no confusion arises, the term abnormal traffic in this paper specifically means a traffic series whose pattern varies significantly because of a DDOS flood attack.

In this research, we consider two fundamental issues in detection. One is feature extraction from the monitored traffic time series. The other is a detection scheme that can assure a predetermined detection probability and false alarm probability. The first issue is discussed in Section 2 from the viewpoint of feature extraction based on the self-similarity of traffic. The second is discussed in Section 3 based on statistical detection. Section 4 explains the performance analysis of the present detection system. A case study is demonstrated in Section 5. Discussions are given in Section 6, which is followed by conclusions.

2. Feature Extraction of Traffic

2.1. Self-Similar Traffic

Computer scientists in the last decade discovered that traffic is a type of fractal time series. It has the properties of self-similarity, long memory, and multiscale behavior (see, e.g., [5]). A commonly used model in traffic engineering is fractional Gaussian noise (fGn) (see, e.g., [6–8]).

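Let $B(t)$ be Wiener Brownian motion. Let $B_H(t)$ be fractional Brownian motion with the Hurst parameter $H \in (0, 1)$ [9]. Let $\Gamma(\cdot)$ be the Gamma function. Then, by using fractional calculus, $B_H(t)$ is expressed by

$$B_H(t) - B_H(0) = \frac{1}{\Gamma(H + 1/2)} \left\{ \int_{-\infty}^{0} \left[ (t - u)^{H - 1/2} - (-u)^{H - 1/2} \right] dB(u) + \int_{0}^{t} (t - u)^{H - 1/2}\, dB(u) \right\}. \tag{2.1}$$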

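Let $G(t)$ be the increment series of $B_H(t)$:

$$G(t) = B_H(t + a) - B_H(t), \tag{2.2}$$

where a is a real number. Then $G(t)$ is fGn [9]. The autocorrelation function (ACF) of fGn in the discrete case is given by

$$r(k) = \frac{\sigma^2}{2} \left[ (|k| + 1)^{2H} - 2|k|^{2H} + \big| |k| - 1 \big|^{2H} \right], \tag{2.3a}$$

where $\sigma^2$ is the intensity of fGn [10]. The normalized ACF of fGn is given by

$$r(k) = \frac{1}{2} \left[ (|k| + 1)^{2H} - 2|k|^{2H} + \big| |k| - 1 \big|^{2H} \right]. \tag{2.3b}$$

The relationship between the fractal dimension $D$ of fGn and H is given by

$$D = 2 - H. \tag{2.4}$$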

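Approximating the right side of (2.3b) with the second-order differential of $0.5\,k^{2H}$, see [9, H15, page 350], for $k > 0$, yields

$$\frac{1}{2} \left[ (k + 1)^{2H} - 2k^{2H} + (k - 1)^{2H} \right] \approx H(2H - 1)\,k^{2H - 2}. \tag{2.5}$$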

Let y and $R(k)$ be a traffic series and its ACF, respectively. Then, according to (2.5),

$$R(k) \sim c\,k^{2H - 2} \quad (k \to \infty), \tag{2.6}$$

where ~ implies asymptotic equivalence under the limit $k \to \infty$ and $c > 0$ is a constant [11].

The ACF (2.5) is nonsummable for $0.5 < H < 1$, implying long-range dependence (LRD). Hence H is a measure of the LRD of traffic. Note that the LRD of traffic does not mean that a DDOS attack is a long-range phenomenon. On the contrary, a DDOS attack and its detection are short-range phenomena, since both sides, namely, an attacker and its opponent, engage with each other during a short period of time. Such a battle makes the local irregularity of traffic vary dramatically [12].
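As a numerical illustration of the ACF expressions in this subsection, the following Python sketch evaluates the normalized ACF of fGn in its standard closed form, $0.5[(k+1)^{2H} - 2k^{2H} + |k-1|^{2H}]$, together with the power-law approximation $H(2H-1)k^{2H-2}$. The function names are hypothetical and are not taken from the paper; the partial sum is included only to show the slow decay that underlies nonsummability for H > 0.5.

```python
def fgn_acf(k: int, hurst: float) -> float:
    """Normalized ACF of fGn at integer lag k (standard closed form)."""
    k = abs(k)
    p = 2.0 * hurst
    return 0.5 * ((k + 1) ** p - 2.0 * k ** p + abs(k - 1) ** p)


def fgn_acf_power_law(k: int, hurst: float) -> float:
    """Large-lag approximation H(2H - 1) k^(2H - 2), defined for k > 0."""
    return hurst * (2.0 * hurst - 1.0) * k ** (2.0 * hurst - 2.0)


def partial_sum(hurst: float, max_lag: int) -> float:
    """Sum of the ACF over lags 1..max_lag; it keeps growing when H > 0.5."""
    return sum(fgn_acf(k, hurst) for k in range(1, max_lag + 1))
```

For H = 0.5 every term at nonzero lag vanishes, which corresponds to uncorrelated increments, whereas for H = 0.8 the partial sums keep increasing as the maximum lag grows.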

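Without loss of generality, we consider the traffic series y in the discrete case. By dividing y into nonoverlapping blocks of size L and averaging over each block, we obtain another series given by

$$y^{(L)}(i) = \frac{1}{L} \sum_{j = iL}^{(i + 1)L - 1} y(j). \tag{2.7}$$

According to the analysis in [5, 9, 11], in the fGn sense, one has

$$\operatorname{Var}\left[ y^{(L)} \right] \approx L^{2H - 2} \operatorname{Var}(y), \tag{2.8}$$

where Var implies the variance operator. Thus the self-similarity is measured by H.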
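The block-averaging step and the variance scaling law can be sketched in Python as follows. This is only an illustration under stated assumptions: the scaling law is taken as $\operatorname{Var}[y^{(L)}] \approx L^{2H-2}\operatorname{Var}(y)$, the function names are hypothetical, and the single-scale variance estimate shown here is not the ACF-based estimator used later in this section. White Gaussian noise is used as test data because its H is known to be 0.5.

```python
import math
import random
import statistics


def aggregate(y, block_size):
    """Average y over nonoverlapping blocks of length block_size."""
    n_blocks = len(y) // block_size
    return [sum(y[i * block_size:(i + 1) * block_size]) / block_size
            for i in range(n_blocks)]


def hurst_from_variance(y, block_size):
    """Solve Var(y^(L)) = L^(2H - 2) Var(y) for H at a single level L."""
    ratio = statistics.pvariance(aggregate(y, block_size)) / statistics.pvariance(y)
    return 1.0 + math.log(ratio) / (2.0 * math.log(block_size))


# Synthetic white noise (not real traffic): uncorrelated samples have H = 0.5.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_white = hurst_from_variance(white, 10)
```

For real traffic one would normally fit the slope over several aggregation levels rather than rely on a single level.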

A series encountered in engineering is usually of finite length. Let y be a series of length P. Divide it into N nonoverlapping sections. Each section is divided into M nonoverlapping segments. Divide each segment into K nonoverlapping blocks. Each block is of length L. Let $y_{m,n}^{(L)}$ be the series with aggregated level L in the mth segment of the nth section $(m = 0, \ldots, M - 1;\ n = 0, \ldots, N - 1)$. Let $H_m(n)$ be the H value of $y_{m,n}^{(L)}$. Let $r_m(k; n)$ be the measured ACF of $y_{m,n}^{(L)}$ in the normalized case. The corresponding theoretic ACF form in the fGn sense is given by

$$R(k; H_m(n)) = \frac{1}{2} \left[ (k + 1)^{2H_m(n)} - 2k^{2H_m(n)} + (k - 1)^{2H_m(n)} \right]. \tag{2.9}$$

The above expression exhibits the multifractal property of traffic, as can be seen from [13].

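Let

$$J(H_m(n)) = \sum_{k} \left[ R(k; H_m(n)) - r_m(k; n) \right]^2 \tag{2.10}$$

be the cost function. Then one has

$$H_m(n) = \arg\min J(H_m(n)). \tag{2.11}$$

Averaging in terms of index m yields

$$H(n) = \frac{1}{M} \sum_{m = 0}^{M - 1} H_m(n), \tag{2.12}$$

representing the H estimate of the series in the nth section.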
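The estimation procedure described above (fit the theoretic fGn ACF to the measured ACF of each segment, then average over the segments of a section) might be sketched in Python as follows. The sum-of-squares cost, the grid search over H in (0, 1), the lag range, and all function names are assumptions made for illustration; the paper does not prescribe a particular minimization routine.

```python
def fgn_acf(k, hurst):
    """Theoretic normalized fGn ACF at lag k >= 0."""
    p = 2.0 * hurst
    return 0.5 * ((k + 1) ** p - 2.0 * k ** p + abs(k - 1) ** p)


def sample_acf(y, max_lag):
    """Normalized sample ACF of y for lags 0..max_lag."""
    mean = sum(y) / len(y)
    d = [v - mean for v in y]
    c0 = sum(v * v for v in d)
    return [sum(d[i] * d[i + k] for i in range(len(d) - k)) / c0
            for k in range(max_lag + 1)]


def cost(hurst, measured):
    """Sum of squared differences between theoretic and measured ACF."""
    return sum((fgn_acf(k, hurst) - r) ** 2 for k, r in enumerate(measured))


def estimate_hurst(measured, step=0.001):
    """Grid search for the H in (0, 1) that minimizes the cost."""
    grid = [i * step for i in range(1, round(1.0 / step))]
    return min(grid, key=lambda h: cost(h, measured))


def section_estimate(segments, max_lag=10):
    """Average of the per-segment H estimates within one section."""
    estimates = [estimate_hurst(sample_acf(seg, max_lag)) for seg in segments]
    return sum(estimates) / len(estimates)
```

Feeding `estimate_hurst` the theoretic ACF of a known H returns that H up to the grid resolution, which is a convenient sanity check before applying the sketch to measured data.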

Usually, $H(n_1) \neq H(n_2)$ for $n_1 \neq n_2$. However, stationarity of the traffic time series implies that $H(n)$ at a specific site is a number falling within a certain confidence interval [5, Paragraph 5, Section 5, page 966]. In practical terms, a normality assumption for $H(n)$ is quite accurate in most cases, provided that M is sufficiently large, regardless of the probability distribution function of H [14]. Thus we take

$$H_x = E[H(n)] \tag{2.13}$$

as a mean estimate of H of x, where E is the mean operator. It can be taken as a template of H of x for the purpose of statistical detection. The appendix gives a case of the H estimation of a real-traffic series to clarify the reasonableness of using H to characterize traffic time series.

2.2. Characterizing Traffic Time Series with H

Let x be a normal traffic time series. Normally, the site serves x peacefully, though x may sometimes be unpleasantly delayed because of a normal traffic jam. The arrival traffic x is contributed by many connections distributed all over the world. Figure 1 shows x contributed by traffic from d connections. From the previous discussions, we see that x can be characterized by the Hurst parameter, and we denote it as $H_x$.

Assume that the site is under DDOS flood attack. Then the actual arrival traffic (abnormal traffic) y consists of normal traffic x and attack traffic a, see Figure 2, where a is contributed by e connections. We use $H_y$ as a feature of y.

3. Detection Method and System Structure

To explain our detection principle, we introduce three terms. Correctly recognizing an abnormal sign is termed detection; failing to recognize it, a miss; and mistakenly recognizing a normal sign as abnormal, a false alarm.

Let $\xi = \|H_x - H_y\|$. Then $\xi$ represents the deviation of H of the monitored traffic time series from the template. Let $V > 0$ be the threshold. Then the detection hypotheses are as follows: $\xi > V$ implies detection, while $\eta = \|H_x - H_{x'}\| > V$ represents a false alarm, where $H_{x'}$ stands for an H value which is not used as the template but is obtained when there is no attack. Clearly, $\xi$ and $\eta$ are random variables. Mathematically, there are many distance measures available [15–17], but the following works well:

According to the previous discussions, we give the system diagram in Figure 3. The measured arrival traffic first passes through an H estimator. The result of the H estimator goes to the template database to produce the template $H_x$. In addition, the estimator outputs an online estimate $H_y$. $H_x$ and $H_y$ are compared in the distance detector. The comparison result is fed into the threshold detector to be compared with a given threshold V. In the stage of decision analysis, the output of the threshold detector is analyzed, and the result gives a sign of detection according to the preset detection probability and false alarm probability.
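The distance detector and threshold detector stages of this structure can be sketched in a few lines of Python. The sketch assumes the absolute difference between the template and the online estimate as the distance measure, which is only one possible choice; the class name, field names, and numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HurstDetector:
    """Compare an online H estimate against a template of normal traffic."""
    template: float   # template H of normal traffic (assumed known)
    threshold: float  # decision threshold V

    def deviation(self, h_online: float) -> float:
        """Distance between the template and the online estimate."""
        return abs(self.template - h_online)

    def is_attack_sign(self, h_online: float) -> bool:
        """True when the deviation exceeds the threshold."""
        return self.deviation(h_online) > self.threshold


# Hypothetical values chosen only to exercise the two outcomes.
detector = HurstDetector(template=0.77, threshold=0.05)
flag_abnormal = detector.is_attack_sign(0.95)
flag_normal = detector.is_attack_sign(0.78)
```

Keeping the distance measure in its own method makes it easy to substitute another measure without touching the threshold logic.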

4. Performance Analysis

With the partition explained in Section 2, we see that there is a value of $\xi$ representing the deviation of H of y in each segment. Therefore, in each section, $\xi$ is a random sequence of length M. Denote $E(\xi)$ as the expectation of $\xi$ in each section. Then $E(\xi)$ is a random sequence of length N. In the case of sufficiently large N, $E(\xi)$ well obeys the Gaussian distribution [14]. For simplicity, we still denote $E(\xi)$ as $\xi$.

4.1. Detection Probability

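Let $\mu_\xi$ and $\sigma_\xi^2$ be the expectation and the variance of $\xi$, respectively. Then

$$\xi \sim N(\mu_\xi, \sigma_\xi^2).$$

Let

$$\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-u^2/2}\, du.$$

Then the detection probability is given by

$$P_{\mathrm{det}} = P(V < \xi < \infty) = 1 - \Phi\left( \frac{V - \mu_\xi}{\sigma_\xi} \right).$$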

4.2. False Alarm Probability

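Let $\mu_\eta$ and $\sigma_\eta^2$ be the mean and the variance of $\eta$. Then the false alarm probability is given by

$$P_{\mathrm{fal}} = P(V < \eta < \infty) = 1 - \Phi\left( \frac{V - \mu_\eta}{\sigma_\eta} \right),$$

where $\Phi$ is the standard normal distribution function.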

4.3. Miss Probability

Let $P_{\mathrm{miss}}$ be the miss probability. Then

$$P_{\mathrm{miss}} = P(-\infty < \xi \le V) = 1 - P_{\mathrm{det}}.$$

Generally, . Besides, the numeric computation in data processing can be arranged such that . In this case, three probabilities are given by Figures 4–6 show the curves of three distributions, respectively. As $P_{\mathrm{det}} + P_{\mathrm{miss}} = 1$, high $P_{\mathrm{det}}$ implies low $P_{\mathrm{miss}}$ and vice versa.
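The three probabilities of this section can be evaluated with the Python standard library, as in the sketch below. It assumes that both the deviation under attack and the deviation under normal conditions are Gaussian, that detection and false alarm correspond to the respective deviation exceeding the threshold, and that a miss is the complement of detection. The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def detection_probability(v, mu_xi, sigma_xi):
    """P(deviation under attack > V) for a Gaussian deviation."""
    return 1.0 - NormalDist(mu_xi, sigma_xi).cdf(v)


def false_alarm_probability(v, mu_eta, sigma_eta):
    """P(deviation without attack > V) for a Gaussian deviation."""
    return 1.0 - NormalDist(mu_eta, sigma_eta).cdf(v)


def miss_probability(v, mu_xi, sigma_xi):
    """P(deviation under attack <= V), the complement of detection."""
    return NormalDist(mu_xi, sigma_xi).cdf(v)
```

Raising the threshold lowers the false alarm probability and the detection probability together, which is why the threshold has to be chosen as a compromise between the two requirements.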

4.4. Threshold and Detection Region

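As can be seen from the previous discussions, the selection of the threshold value is crucial to our system. Given a false alarm probability f, we want to find the threshold $V_f$ such that $P_{\mathrm{fal}}(V_f) \le f$. Clearly, $V_f = \mu_\eta + \sigma_\eta \Phi^{-1}(1 - f)$, where $\Phi^{-1}$ is the inverse of the standard normal distribution function. If and when the selected precision is 4, we obtain Given a detection probability d, we want to find the threshold $V_d$ such that $P_{\mathrm{det}}(V_d) \ge d$. Clearly, $V_d = \mu_\xi + \sigma_\xi \Phi^{-1}(1 - d)$. In the case of , Therefore, when $V_f \le V \le V_d$, $P_{\mathrm{det}} \ge d$ and $P_{\mathrm{fal}} \le f$ are assured. That is, In the case of and , The constraint of (4.12) is given by .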

Obviously, the detection region is the intersection of three probability functions. Under the condition of and , the detection region is shown in Figure 7.
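A threshold interval meeting both requirements can be computed with the inverse normal distribution function, as in the following Python sketch. It assumes Gaussian deviations and the tail definitions of detection and false alarm probability; under these assumptions the lower bound comes from the false alarm requirement and the upper bound from the detection requirement. The function name and all numeric values are hypothetical and are not the values of the case study.

```python
from statistics import NormalDist


def threshold_region(f, d, mu_eta, sigma_eta, mu_xi, sigma_xi):
    """Return (v_low, v_high) such that any V in the interval gives
    false alarm probability <= f and detection probability >= d,
    or None when the two requirements cannot be met together."""
    z = NormalDist()
    v_low = mu_eta + sigma_eta * z.inv_cdf(1.0 - f)   # from P_fal <= f
    v_high = mu_xi + sigma_xi * z.inv_cdf(1.0 - d)    # from P_det >= d
    return (v_low, v_high) if v_low <= v_high else None


# Hypothetical statistics of the two deviations.
region = threshold_region(f=0.01, d=0.99,
                          mu_eta=0.0, sigma_eta=0.01,
                          mu_xi=0.2, sigma_xi=0.02)
```

When the two distributions overlap too much, the lower bound exceeds the upper bound and no threshold satisfies both requirements, which the sketch signals by returning `None`.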

5. A Case Study

Suppose the template $H_x = 0.7671$ as described in the appendix. Assume that the confidence level is 99.9999%. Thus we suppose $H_y < 0.7669$ or $H_y > 0.7673$ during the transition process of intrusion. In this case study, 1000 points of $H_y$ in or (0.7673,0.9900) are randomly selected to simulate the abnormal traffic deviating from the normal one. The error sequence $\xi$ is indicated in Figure 8. By the numeric computation, we obtain and . Therefore, we obtain the probability distributions for detection, false alarm, and miss as shown in Figure 9. Under the conditions of and , we obtain and . Hence when we select , we have 99.9999% confidence to say that and are assured, which can be easily observed from Figure 9.

6. Discussions

Since Yahoo servers were successfully attacked in 2000, the issue of detecting DDOS flood attacks has received much attention. Various methods and systems have been proposed, see, for example, [18–25]. As is known, traffic under DDOS flood attack must be significantly different from normal traffic [25]. Otherwise, a DDOS flood attack would have no effect. From this point of view, the value of H of traffic under DDOS flood attack is considerably different from that of normal traffic, see [12] for details.

For a stationary random time series of finite length, the ACF and the power spectrum density (PSD) function are commonly used in engineering for feature extraction in statistical classification [16, 17]. However, the PSD of traffic does not exist in the domain of ordinary functions since traffic has long memory [8]. To avoid this mathematical difficulty, the ACF of traffic was considered for feature extraction in our earlier work [25]. This paper focuses on the detection of local variations of traffic based on the self-similarity of traffic. It thus suggests a new method that substantially develops the work of [25] from the point of view of traffic pattern matching, because feature extraction of a traffic time series by using the single parameter H makes pattern matching more efficient.

7. Conclusions

We have discussed the characterization of the local irregularity of traffic by H. We have explained a principle of statistical detection to capture signs of DDOS flood attacks with predetermined detection probability and false alarm probability, based on the variation of the local irregularity of traffic.

Appendix

Demonstration of H Estimation of a Real-Traffic Series

This appendix gives a demonstration with a real-traffic series, named LBL-PKT-4 [26, 27]. Denote $x(i)$ as the series of LBL-PKT-4, indicating the number of bytes in the ith packet. The length of that series is 1.3 million. The first 1024 points of that series are plotted in Figure 10(a). Divide $x(i)$ into 32 nonoverlapping sections. Computing H in each section yields $H(n)$ as shown in Figure 10(b). Its histogram is indicated in Figure 10(c).

Figure 10: Verification of statistical invariable H. (a) A real-traffic time series; (b) estimate $H(n)$; (c) histogram of $H(n)$.

According to (2.13), we have $H_x = 0.7671$. The confidence interval with 95% confidence level is [0.7670, 0.7672]. Hence we have 95% confidence to say that the H estimate in each section of that series takes $H_x$ as its approximation with fluctuation not greater than $10^{-4}$. Further, it is easy to obtain that the confidence interval with 99.9999% confidence level is [0.7669, 0.7673]. Hence we have 99.9999% confidence to say that the H estimate in each section of that series takes $H_x$ as its approximation with fluctuation not greater than $2 \times 10^{-4}$.
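A confidence interval of this kind can be obtained from the per-section H estimates with a normal approximation for their mean, as sketched below in Python. The function name and the sample values are hypothetical; the values are not the LBL-PKT-4 estimates, and the normal approximation is an assumption of the sketch.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level):
    """Two-sided confidence interval for the mean of values at the given
    level (e.g. 0.95), using the normal approximation for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * stdev(values) / math.sqrt(len(values))
    center = mean(values)
    return center - half_width, center + half_width


# Hypothetical per-section H estimates, for illustration only.
h_sections = [0.76, 0.77, 0.78, 0.77]
ci_95 = mean_confidence_interval(h_sections, 0.95)
ci_high = mean_confidence_interval(h_sections, 0.999999)
```

A higher confidence level widens the interval around the same center, which matches the relationship between the two intervals quoted above.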

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under the project Grant no. 60573125. Wei Zhao’s work was also partially supported by the NSF (USA) under Contracts no. 0808419, 0324988, 0721571, and 0329181. Any opinions, findings, conclusions, and/or recommendations in this paper, either expressed or implied, are those of the authors and do not necessarily reflect the views of the agencies listed above.