Mathematical Problems in Engineering

Volume 2010 (2010), Article ID 962435, 14 pages

http://dx.doi.org/10.1155/2010/962435

## Note on Studying Change Point of LRD Traffic Based on Li's Detection of DDoS Flood Attacking

^{1}Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

^{2}School of Information Security Engineering, Key Laboratory of Information Security Integrated Management Research, Shanghai Jiao Tong University, Shanghai 200240, China

Received 7 February 2010; Accepted 11 March 2010

Academic Editor: Ming Li

Copyright © 2010 Zhengmin Xia et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Distributed denial-of-service (DDoS) flood attacks remain a great threat to the Internet. To ensure network usability and reliability, accurate detection of these attacks is critical. Based on Li's work on DDoS flood attack detection, we propose a DDoS detection method that monitors the Hurst variation of long-range dependent traffic. Specifically, we use an autoregressive system to estimate the Hurst parameter of normal traffic. If the actual Hurst parameter varies significantly from the estimate, we assume that a DDoS attack has occurred. We also propose two methods to determine the change point of the Hurst parameter that indicates the occurrence of DDoS attacks. The detection rate associated with one method and the false alarm rate for the other method are also derived. The test results on DARPA intrusion detection evaluation data show that the proposed approaches can achieve better detection performance than some well-known self-similarity-based detection methods.

#### 1. Introduction

DDoS flood attacks are among the most frequently occurring attacks and badly threaten the stability of the Internet. In a DDoS flood attack, an intruder undermines the availability of computer systems or services by exploiting the inherent weakness of the Internet system architecture and overwhelming the target with a huge amount of traffic launched through multiple zombies. The attack process is a relatively simple, yet very powerful, technique for attacking Internet resources. Therefore, accurate detection of these attacks is critical to the Internet community.

As shown by Leland et al. [1], and supported by a number of later studies [2–7], measurements of local and wide-area network traffic, and of wire-line and wireless network traffic, all demonstrate self-similarity and long-range dependence (LRD) at large time scales. The work in [8] points out that the self-similarity of Internet traffic is attributed to a mixture of the actions of a number of individual users and the hardware and software behaviors at their originating hosts, multiplexed through an interconnection network. In other words, this self-similarity always exists regardless of the network type, topology, size, protocol, or the type of services the network is carrying. On the other hand, it is reported in [9–15] that when a DDoS attack happens, the self-similarity of network traffic changes significantly. Thus, by monitoring the change of the Hurst parameter, the key parameter describing the self-similarity of a self-similar process, DDoS attacks may be detected.

Much work in the literature detects DDoS attacks by recognizing the pattern of self-similarity. In [16], Li deduced the statistical characteristics of the network traffic autocorrelation function under normal conditions and under DDoS attack, and gave the detection threshold based on the preselected detection rate and false alarm rate. In [11], Li quantitatively described the statistics of abnormal traffic and suggested that the Hurst parameter of network traffic under DDoS attack tends to be significantly smaller than that of normal traffic. Li also demonstrated in [11] that the average Hurst parameter of a fixed number of normal traffic pieces follows a Gaussian distribution at large time scales, and that this statistical property may in general change when an attack occurs.

Based on Li’s work, we propose a DDoS detection method that monitors the Hurst variation. Specifically, we use an autoregressive (AR) system to estimate the Hurst parameter of normal traffic. If the actual Hurst parameter deviates from the estimate by more than a threshold, we assume that a DDoS attack has occurred. We then propose two methods to determine the change point of the Hurst parameter, that is, the threshold of Hurst variation used to distinguish attack traffic from normal traffic. The detection rate associated with one method and the false alarm rate for the other method are also derived. The experimental results on Defense Advanced Research Projects Agency (DARPA) data sets indicate that the proposed detection methods are effective in detecting DDoS flood attacks and can achieve better detection performance than some well-known self-similarity-based detection methods.

The rest of this paper is organized as follows. Section 2 briefly introduces the concept of self-similarity and the Hurst parameter estimation. Section 3 explains the proposed detection process based on the Hurst variation. Section 4 discusses the two methods for determining the change point of LRD traffic. Section 5 presents the performance evaluation and analysis of the proposed detection methods with traffic data from DARPA, followed by a brief conclusion in Section 6.

#### 2. Preliminaries

##### 2.1. Self-Similar Network Traffic

Self-similarity means that the sample paths of the process $X(t)$ and those of its rescaled version $c^{H}X(t/c)$, obtained by simultaneously dilating the time axis *t* by a factor $c > 0$ and the amplitude axis by a factor $c^{H}$, cannot be statistically distinguished from each other. Equivalently, an affinely dilated subset of one sample path cannot be distinguished from the whole. *H* is called the Hurst parameter. For a general self-similar process, *H* measures the degree of self-similarity.

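The network traffic arrival process is a discrete-time process, so the discrete-time definition of self-similarity is given below. Let $X = \{X_i, i \in \mathbb{Z}_+\}$ be a wide-sense stationary discrete stochastic traffic time series with constant mean $\mu$, finite variance $\sigma^2$, and autocorrelation function $r(k)$, $k \in \mathbb{Z}_+$. Let $X^{(m)} = \{X^{(m)}_i, i \in \mathbb{Z}_+\}$ be the $m$-order aggregate process of $X$; then

$$X^{(m)}_i = \frac{1}{m}\left(X_{mi-m+1} + \cdots + X_{mi}\right). \tag{2.1}$$

For each $m$, $X^{(m)}$ defines a wide-sense stationary stochastic process with autocorrelation function $r^{(m)}(k)$.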
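The block averaging behind the aggregate process can be illustrated with a few lines of Python. This is only a sketch: the function name `aggregate` and the choice to drop a trailing partial block are assumptions made here, not details taken from the paper.

```python
from typing import List, Sequence


def aggregate(x: Sequence[float], m: int) -> List[float]:
    """Return the m-order aggregate series: means of non-overlapping blocks of length m."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    n_blocks = len(x) // m  # assumption: a trailing partial block is dropped
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


# Hypothetical byte counts per time slot, for illustration only.
series = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 5.0]
print(aggregate(series, 2))  # [2.0, 3.0, 7.0]
```

With `m = 1` the series is returned unchanged, which matches the definition.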

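*Definition 2.1.* A second-order stationary process *X* is called exact second-order self-similar (ESOSS) with Hurst parameter $H$ if the autocorrelation function satisfies

$$r(k) = \frac{1}{2}\left[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}\right], \tag{2.2}$$

where $k \geq 1$ and $0.5 < H < 1$.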

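*Definition 2.2.* A second-order stationary process *X* is called asymptotical second-order self-similar (ASOSS) with Hurst parameter $H$ if the autocorrelation function of its aggregate process satisfies

$$\lim_{m \to \infty} r^{(m)}(k) = \frac{1}{2}\left[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}\right], \tag{2.3}$$

where $k \geq 1$ and $0.5 < H < 1$.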

In the field of network traffic theory, it is more practical to use ASOSS.

##### 2.2. Hurst Parameter Estimation

To date, several methods have been proposed to estimate the Hurst parameter. Some of the most popular ones include the aggregated variance, local Whittle, and wavelet-based methods [17–21]. In this paper, we use the method proposed by Li [11] to estimate the Hurst parameter of network traffic. The estimation process is summarized as follows; for more information, please refer to [11].

Let $r(k)$ be the autocorrelation function of $X$. Then

$$r(k) \sim c\,k^{2H-2}, \tag{2.4}$$

where $\sim$ stands for asymptotic equivalence under the limit $k \to \infty$, $c > 0$, and $0.5 < H < 1$.
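To make the power-law decay concrete, the sketch below compares the closed-form autocorrelation of an exactly second-order self-similar process, written in terms of $H$, with a power-law tail $c\,k^{2H-2}$. The constant $c = H(2H-1)$ is the standard value for fractional Gaussian noise and is an assumption here, since this text does not specify $c$; the function names are illustrative only.

```python
def fgn_acf(k: int, hurst: float) -> float:
    """Autocorrelation at lag k of an exactly second-order self-similar process."""
    k = abs(k)
    if k == 0:
        return 1.0
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + (k - 1) ** e)


def lrd_tail(k: int, hurst: float) -> float:
    """Power-law tail c * k^(2H - 2), with c = H(2H - 1) assumed (fGn case)."""
    return hurst * (2.0 * hurst - 1.0) * k ** (2.0 * hurst - 2.0)


# The ratio of the exact form to the tail approaches 1 as the lag grows.
for lag in (1, 10, 100, 1000):
    print(lag, fgn_acf(lag, 0.8) / lrd_tail(lag, 0.8))
```

For $H = 0.5$ the closed form is zero at every nonzero lag, which is the uncorrelated case. For $0.5 < H < 1$ the correlations decay so slowly that they are not summable, which is the LRD property.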

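By taking fractional Gaussian noise as an approximate model of $X$, one has

$$\operatorname{Var}\left(X^{(m)}\right) \approx \sigma^2 m^{2H-2}, \tag{2.5}$$

where $\operatorname{Var}(X^{(m)})$ and $\sigma^2$ are the variances of the $m$-order aggregate process and of $X$, respectively.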

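Divide the traffic series into $N$ nonoverlapping sections, and divide each section further into $W$ nonoverlapping segments. Then the autocorrelation function of the $w$th segment in the $n$th section is given by

$$r\left(k; H_w(n)\right) = \frac{1}{2}\left[(k+1)^{2H_w(n)} - 2k^{2H_w(n)} + (k-1)^{2H_w(n)}\right], \tag{2.6}$$

where $H_w(n)$ is the Hurst parameter of the $w$th segment in the $n$th section. Let $J(H) = \sum_{k}\left[r(k; H) - \hat{r}_{w,n}(k)\right]^2$ be the cost function, where $\hat{r}_{w,n}(k)$ denotes the autocorrelation function measured on that segment. Then one has

$$H_w(n) = \arg\min_{H} J(H). \tag{2.7}$$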

Averaging $H_w(n)$ in terms of $w$ yields

$$H(n) = \frac{1}{W}\sum_{w=1}^{W} H_w(n), \tag{2.8}$$

where $H(n)$ represents the Hurst parameter in the $n$th section.
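The estimation steps above can be sketched as follows. This is a minimal illustration and not the estimator of [11]: it assumes a squared-error cost between the measured autocorrelation and the fractional Gaussian noise form, and it minimizes that cost with a plain grid search over $H \in (0.5, 1)$. The names `sample_acf`, `fit_hurst`, `section_hurst`, the lag limit, and the grid step are all hypothetical.

```python
from typing import List, Sequence


def fgn_acf(k: int, hurst: float) -> float:
    """Model autocorrelation at lag k >= 1 for Hurst parameter `hurst`."""
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + (k - 1) ** e)


def sample_acf(x: Sequence[float], max_lag: int) -> List[float]:
    """Biased sample autocorrelation at lags 1..max_lag (x must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / (n * var)
        for k in range(1, max_lag + 1)
    ]


def fit_hurst(acf: Sequence[float], grid_step: float = 0.001) -> float:
    """Grid search over H in (0.5, 1) for the least-squares match to `acf`."""
    best_h, best_cost = 0.5, float("inf")
    for j in range(1, int(round(0.5 / grid_step))):
        h = 0.5 + j * grid_step
        cost = sum((fgn_acf(k, h) - r) ** 2 for k, r in enumerate(acf, start=1))
        if cost < best_cost:
            best_h, best_cost = h, cost
    return best_h


def section_hurst(section: Sequence[float], n_segments: int, max_lag: int = 32) -> float:
    """Average the per-segment estimates into one Hurst value for the section."""
    seg_len = len(section) // n_segments
    estimates = [
        fit_hurst(sample_acf(section[w * seg_len:(w + 1) * seg_len], max_lag))
        for w in range(n_segments)
    ]
    return sum(estimates) / n_segments
```

A grid search is used only because it is easy to verify; any one-dimensional minimizer could replace it. Feeding `fit_hurst` the exact model autocorrelation for a given $H$ recovers that $H$ up to the grid resolution.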

#### 3. DDoS Detection Based on Hurst Variation

Let $X = \{x(i)\}$, $Y = \{y(i)\}$, and $Z = \{z(i)\}$ be discrete network traffic time series, where $X$ and $Y$ are the normal traffic and the abnormal traffic, respectively, and $Z$ is the DDoS flood attack traffic during the transition process of attacking. $X$ and $Z$ are uncorrelated [11], so *Y* can be expressed as $Y = X + Z$.

Figure 1 illustrates the components of normal traffic, attack traffic, and abnormal traffic. $x_j(i)$ represents the number of bytes sent out by node $j$ at time $i$ for normal network services, $z_j(i)$ stands for the number of bytes sent out by node $j$ at time $i$ for the DDoS flood attack, and $y(i)$ is the total traffic the target received at time $i$.

Based on the theorems in [22], we understand that no matter whether $Z$ is a self-similar process or not, as long as $X$ is a second-order stationary self-similar process, $Y$ will be a self-similar process, but the degree of self-similarity may change. Let $r_x$, $r_y$, and $r_z$ be the autocorrelation functions of $X$, $Y$, and $Z$, respectively. Li in [11] proved that during the transition process of attacking, $\|r_y - r_x\|$ is significant. For each value of the Hurst parameter in the range $(0.5, 1)$, there is exactly one corresponding autocorrelation function [23]. Therefore, a significant $\|r_y - r_x\|$ means that $|H_y - H_x|$ changes significantly when an attack occurs, where $H_x$ and $H_y$ are the Hurst parameters of $X$ and $Y$, respectively. Based on this observation, we propose a DDoS detection method that monitors the Hurst variation. The details of the detection process are explained as follows.

After estimating the Hurst parameter of each section using (2.7), we apply an autoregressive (AR) model to determine the self-similarity of traffic without attacks. That is,

$$\hat{H}(n) = \sum_{i=1}^{p} a_i H(n-i), \tag{3.1}$$

where $\hat{H}(n)$ is the estimated Hurst parameter of normal traffic in section $n$, $p$ is the order of the AR model, and $a_i$ are the coefficients of the AR model, which can be obtained by using the least-squares method [24]. Other models, such as the moving average (MA) model and the autoregressive moving average (ARMA) model, can also be used in our method in the same way.
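A least-squares AR fit of this kind can be sketched in plain Python. The helper names (`solve_linear`, `fit_ar`, `predict`) are hypothetical, and solving the normal equations by Gaussian elimination is only one of several ways to obtain the least-squares coefficients; the paper does not say which numerical procedure was used.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a lower AR order")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ar(h: Sequence[float], p: int) -> List[float]:
    """Least-squares coefficients a_1..a_p for h[n] ~ sum_i a_i * h[n - i]."""
    rows = [[h[n - i] for i in range(1, p + 1)] for n in range(p, len(h))]
    targets = [h[n] for n in range(p, len(h))]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve_linear(ata, atb)


def predict(h: Sequence[float], coeffs: Sequence[float], n: int) -> float:
    """One-step AR estimate of h[n] from the p preceding values (requires n >= p)."""
    return sum(a * h[n - i] for i, a in enumerate(coeffs, start=1))
```

In use, the coefficients would be fitted on the per-section Hurst values of traffic known to be attack-free and then applied to later sections.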

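Since the Hurst parameter without any attack follows a Gaussian distribution in most cases when enough segments are averaged in each section [11], the probability density function of $H(n)$ is given by

$$f(H) = \frac{1}{\sqrt{2\pi}\,\sigma_H}\exp\left[-\frac{(H - \mu_H)^2}{2\sigma_H^2}\right], \tag{3.2}$$

where $\mu_H$ and $\sigma_H^2$ are the mean and variance of the Hurst parameter $H(n)$, respectively, computed over the $N$ traffic sections.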

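Using linear estimation, the change of self-similarity is given by

$$\Delta H(n) = H(n) - \hat{H}(n) = H(n) - \sum_{i=1}^{p} a_i H(n-i), \tag{3.4}$$

which can be regarded as a weighted sum of independent Gaussian variables. So $\Delta H(n)$ also follows a Gaussian distribution. With $\mu_H$ and $\sigma_H^2$ the mean and variance of $H(n)$, the mean and variance of $\Delta H(n)$ are obtained by

$$\mu_{\Delta} = \left(1 - \sum_{i=1}^{p} a_i\right)\mu_H, \qquad \sigma_{\Delta}^2 = \left(1 + \sum_{i=1}^{p} a_i^2\right)\sigma_H^2. \tag{3.5}$$

So the probability density function of $\Delta H(n)$ is expressed by

$$f(\Delta H) = \frac{1}{\sqrt{2\pi}\,\sigma_{\Delta}}\exp\left[-\frac{(\Delta H - \mu_{\Delta})^2}{2\sigma_{\Delta}^2}\right]. \tag{3.6}$$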
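The moments of the Hurst variation follow from linearity of expectation and from the independence assumption stated above. The sketch below computes them for a given set of AR coefficients; the function names and the numbers in the usage lines are hypothetical and are not results from the paper.

```python
from typing import Sequence, Tuple


def hurst_variation(h: Sequence[float], coeffs: Sequence[float], n: int) -> float:
    """Difference between the measured H(n) and its AR estimate (requires n >= p)."""
    return h[n] - sum(a * h[n - i] for i, a in enumerate(coeffs, start=1))


def variation_moments(mu_h: float, var_h: float,
                      coeffs: Sequence[float]) -> Tuple[float, float]:
    """Mean and variance of the variation, assuming independent Gaussian H(n)."""
    mean = (1.0 - sum(coeffs)) * mu_h
    var = (1.0 + sum(a * a for a in coeffs)) * var_h
    return mean, var


# Illustrative values only: mean Hurst 0.8, variance 0.01, AR(2) coefficients.
mean, var = variation_moments(0.8, 0.01, [0.5, 0.3])
print(round(mean, 4), round(var, 4))  # 0.16 0.0134
```

Note that the variance of the variation is never smaller than the variance of $H(n)$ itself, because every coefficient contributes a nonnegative term.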

The attack detection can be formulated as the following hypothesis testing problem.

- (A0) The change of self-similarity is within a threshold, indicating normal network traffic.
- (A1) The change of self-similarity is outside the threshold, indicating abnormal network traffic caused by DDoS attacks.

It can be seen that a proper threshold on the Hurst variation is the key to successfully detecting DDoS attacks. The threshold is also the change point of the Hurst parameter: a Hurst variation beyond this point implies a DDoS attack. In the next section, we propose two methods for change point detection, one based on order statistics and the other based on maximum likelihood estimation.

#### 4. Determining Change Point of LRD Traffic

In the following discussion, the change point of self-similarity is equivalent to the threshold that is used to distinguish attack traffic from normal traffic. We propose two methods to determine the change point and calculate the associated detection rate for one method and false alarm rate for the other method.

##### 4.1. Order Statistic-Based Detection

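For order statistic-based detection, the Hurst variations $\Delta H(n)$ are first sorted in increasing order into reference cells as

$$\Delta H_{(1)} \leq \Delta H_{(2)} \leq \cdots \leq \Delta H_{(N)}. \tag{4.1}$$

The detection threshold is obtained by selecting the $k$th-order-ranked cell $\Delta H_{(k)}$ to represent the normal traffic plus measured noise. That cell is multiplied by a scale factor $\alpha$, and the threshold is expressed by

$$T = \alpha\,\Delta H_{(k)}. \tag{4.2}$$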

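The traffic in section $n$ is considered normal if the change of self-similarity satisfies $\Delta H(n) \leq T$; otherwise, the traffic is considered abnormal, indicating possible attacks in that section. The threshold $T$ is a random variable, and for $\alpha > 0$ its probability density function is expressed by

$$f_T(t) = \frac{k}{\alpha}\binom{N}{k}\left[F\!\left(\frac{t}{\alpha}\right)\right]^{k-1}\left[1 - F\!\left(\frac{t}{\alpha}\right)\right]^{N-k} f\!\left(\frac{t}{\alpha}\right), \tag{4.3}$$

where $f$ is the probability density function of $\Delta H$, and $F$ is the distribution function of $\Delta H$.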
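The order statistic rule can be sketched as below. The names `os_threshold` and `flag_sections`, the one-sided comparison, and the sample values are assumptions for illustration; the rank `k` is 1-based, matching the ordering of the sorted reference cells.

```python
from typing import List, Sequence


def os_threshold(variations: Sequence[float], k: int, alpha: float) -> float:
    """Scale the k-th smallest Hurst variation (1-based rank) by alpha."""
    ranked = sorted(variations)
    if not 1 <= k <= len(ranked):
        raise ValueError("k must be between 1 and the number of variations")
    return alpha * ranked[k - 1]


def flag_sections(variations: Sequence[float], threshold: float) -> List[bool]:
    """True marks a section whose variation exceeds the threshold."""
    return [v > threshold for v in variations]


# Hypothetical Hurst variations for five sections.
dh = [0.03, 0.01, 0.02, 0.25, 0.04]
t = os_threshold(dh, k=3, alpha=2.0)
print(flag_sections(dh, t))  # [False, False, False, True, False]
```

Because the threshold is taken from the data themselves, a lower rank `k` gives a lower threshold and therefore flags more sections.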

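We define the term detection as correctly recognizing an abnormal sign. The detection rate $P_d$ is obtained by averaging the conditional probability of detection under a given threshold over all possible values of the threshold. That is,

$$P_d = \int_{-\infty}^{\infty} P\left(\Delta H > t \mid T = t\right) f_T(t)\,dt, \tag{4.4}$$

where $f_T$ is the probability density function of the threshold $T$.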

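Substituting (3.6) and (4.3) into (4.4) yields

$$P_d = \int_{-\infty}^{\infty}\left[1 - F(t)\right]\frac{k}{\alpha}\binom{N}{k}\left[F\!\left(\frac{t}{\alpha}\right)\right]^{k-1}\left[1 - F\!\left(\frac{t}{\alpha}\right)\right]^{N-k} f\!\left(\frac{t}{\alpha}\right)dt, \tag{4.5}$$

where $f$ is the Gaussian density of $\Delta H$ in (3.6), $F$ is the corresponding distribution function, $k$ is the selected rank among the $N$ sorted variations, and $\alpha > 0$ is the scale factor.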
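The averaging over the random threshold can be evaluated numerically. The sketch below assumes that the threshold is $\alpha$ times the $k$th order statistic of $N$ independent Gaussian variations with $\alpha > 0$, so that its density takes the standard order statistic form; that form, the trapezoidal rule, the integration range of eight standard deviations, and all names are assumptions made for illustration.

```python
from math import comb
from statistics import NormalDist


def threshold_pdf(t: float, k: int, n: int, alpha: float, dist: NormalDist) -> float:
    """Density of alpha times the k-th order statistic of n i.i.d. draws from dist."""
    u = t / alpha
    cdf = dist.cdf(u)
    return (k / alpha) * comb(n, k) * cdf ** (k - 1) * (1.0 - cdf) ** (n - k) * dist.pdf(u)


def detection_rate(k: int, n: int, alpha: float, dist: NormalDist,
                   grid: int = 4000) -> float:
    """Trapezoidal estimate of the integral of P(variation > t) * f_T(t) over t."""
    lo = alpha * (dist.mean - 8.0 * dist.stdev)
    hi = alpha * (dist.mean + 8.0 * dist.stdev)
    step = (hi - lo) / grid
    total = 0.0
    for j in range(grid + 1):
        t = lo + j * step
        weight = 0.5 if j in (0, grid) else 1.0
        total += weight * (1.0 - dist.cdf(t)) * threshold_pdf(t, k, n, alpha, dist)
    return total * step
```

A simple sanity check is the degenerate case `n = 1`, `k = 1`, `alpha = 1.0`: the threshold is then an independent copy of the variation, so the result should be close to one half.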

##### 4.2. Maximum Likelihood Estimate-Based Detection

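Considering the independence between $\Delta H(i)$ and $\Delta H(j)$, $i \neq j$, the joint probability density function of $\Delta H(1), \ldots, \Delta H(N)$ is obtained by

$$L(\mu, \sigma^2) = \prod_{n=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(\Delta H(n) - \mu)^2}{2\sigma^2}\right]. \tag{4.6}$$

Taking the natural logarithm on both sides of (4.6), we have

$$\ln L = -\frac{N}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{n=1}^{N}\left(\Delta H(n) - \mu\right)^2. \tag{4.7}$$

In order to get the maximum likelihood estimate (MLE) of $\mu$ and $\sigma^2$, we have

$$\frac{\partial \ln L}{\partial \mu} = 0, \qquad \frac{\partial \ln L}{\partial \sigma^2} = 0. \tag{4.8}$$

By solving (4.8), one has

$$\hat{\mu} = \frac{1}{N}\sum_{n=1}^{N}\Delta H(n), \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N}\left(\Delta H(n) - \hat{\mu}\right)^2. \tag{4.9}$$

So the probability density function of $\Delta H$ is expressed by

$$f(\Delta H) = \frac{1}{\sqrt{2\pi}\,\hat{\sigma}}\exp\left[-\frac{(\Delta H - \hat{\mu})^2}{2\hat{\sigma}^2}\right]. \tag{4.10}$$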

Let the detection threshold be $T$. The traffic in section $n$ is considered normal if the change of self-similarity satisfies $\Delta H(n) \leq T$; otherwise, the traffic is considered abnormal, indicating possible attacks in that section.

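Define a false alarm as mistakenly recognizing normal traffic as abnormal traffic. With $\hat{\mu}$ and $\hat{\sigma}$ the estimated mean and standard deviation of $\Delta H$, the false alarm rate of the proposed detection system is expressed by

$$P_f = P\left(\Delta H > T \mid \text{A0}\right) = \int_{T}^{\infty} f(x)\,dx = 1 - \Phi\!\left(\frac{T - \hat{\mu}}{\hat{\sigma}}\right). \tag{4.11}$$

So, given the preselected false alarm rate $P_f$, the detection threshold is given by

$$T = \hat{\mu} + \hat{\sigma}\,\Phi^{-1}\left(1 - P_f\right), \tag{4.12}$$

where $\Phi$ is the standard normal distribution function.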
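The maximum likelihood threshold can be sketched with the Python standard library. The sketch assumes a one-sided test, so the threshold is the upper quantile of the fitted Gaussian at level one minus the false alarm rate; the function names and sample values are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence, Tuple


def mle_fit(variations: Sequence[float]) -> Tuple[float, float]:
    """Gaussian maximum likelihood estimates: sample mean and 1/N variance."""
    n = len(variations)
    mu = sum(variations) / n
    var = sum((v - mu) ** 2 for v in variations) / n
    return mu, var


def mle_threshold(variations: Sequence[float], false_alarm_rate: float) -> float:
    """Upper quantile of the fitted Gaussian at level 1 - false_alarm_rate."""
    if not 0.0 < false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie strictly between 0 and 1")
    mu, var = mle_fit(variations)
    return mu + var ** 0.5 * NormalDist().inv_cdf(1.0 - false_alarm_rate)


# Hypothetical Hurst variations from attack-free sections.
normal_dh = [-0.02, 0.01, 0.03, -0.01, 0.00, 0.02]
for pf in (0.01, 0.05, 0.10):
    print(pf, mle_threshold(normal_dh, pf))
```

The loop shows the expected ordering: a larger tolerated false alarm rate gives a lower threshold.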

#### 5. Experiments and Analysis

##### 5.1. Data Preparation

To evaluate the proposed detection methods, we use two traffic data sets from DARPA 1999 [25]. The DARPA 1999 data sets are from the Information Systems Technology Group, MIT Lincoln Laboratory, under DARPA ITO and Air Force Research Laboratory. These traffic data sets are the first standard for the evaluation of computer network intrusion detection systems. The first traffic set, collected from 8:20:00.0 to 11:10:39 on 1 March (Monday), 1999, and named DARPA1999-week1-Monday-inside, is an attack-free series. The second traffic set, collected from 8:20:00.0 to 16:24:41.5 on 8 March (Monday), 1999, and named DARPA1999-week2-Monday-inside, contains attacks. Three types of DDoS attacks are contained in this data set, namely pod, back, and land. For short, we rename the first, attack-free traffic set D99-W1-1-i and the second, attack-containing traffic set D99-W2-1-i. The traffic traces for these two data sets are displayed in Figure 2. The merging time scale is 100 ms.

##### 5.2. Test Results and Analysis

After the 100 ms merging, the number of data points in D99-W1-1-i is 102400 and the number in D99-W2-1-i is 290816. We combine these two traffic sets into one and name it D99. D99 is divided into 64 sections ($N = 64$), and each section is further divided into 12 segments ($W = 12$), so the length of each traffic segment is 512. We use (2.7) to estimate the Hurst parameter $H_w(n)$ of the *w*th traffic segment in the *n*th section, for $n = 1, \ldots, 64$ and $w = 1, \ldots, 12$, and then average $H_w(n)$ in terms of $w$. After that, we obtain the Hurst parameter $H(n)$ in the $n$th section, as shown in Figure 3.
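The segment length follows from the sample counts: $102400 + 290816 = 393216 = 64 \times 12 \times 512$. Each section therefore holds $12 \times 512 = 6144$ samples, so the attack-free set covers the first 16 sections completely and about two thirds of the 17th. The sketch below performs the same split on a placeholder series; the function name `split_sections` is hypothetical.

```python
from typing import List, Sequence


def split_sections(series: Sequence[float], n_sections: int,
                   n_segments: int) -> List[List[Sequence[float]]]:
    """Split a series into n_sections sections of n_segments equal segments each."""
    seg_len = len(series) // (n_sections * n_segments)
    sec_len = seg_len * n_segments
    return [
        [series[s * sec_len + w * seg_len: s * sec_len + (w + 1) * seg_len]
         for w in range(n_segments)]
        for s in range(n_sections)
    ]


total = 102400 + 290816                 # samples after 100 ms merging
placeholder = list(range(total))        # stand-in for the real byte counts
sections = split_sections(placeholder, 64, 12)
print(len(sections), len(sections[0]), len(sections[0][0]))  # 64 12 512
print(102400 // (12 * 512))             # 16 full attack-free sections
```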

We apply an AR model of order $p$ to estimate the Hurst parameter of the traffic. The Hurst variation $\Delta H(n)$ of the *n*th traffic section is obtained using (3.4). The results are shown in Figure 4.

For the order statistic-based detection method, we first sort $\Delta H(n)$ in increasing order and then choose the scale factor $\alpha$. After selecting a value of $k$, the detection threshold is calculated according to (4.2). Figure 5 shows the thresholds when $k$ is 40, 45, and 50, respectively.

From Figure 5, we can see that when $k$ is smaller, the detection threshold is lower. In this case, more traffic sections will have Hurst variations above the threshold, and thus more attacks are declared. However, note that a smaller $k$ may also introduce more false alarms, mistakenly recognizing more normal traffic as attack traffic.

For the maximum likelihood estimate-based detection, we compute the detection threshold using (4.12). Figure 6 shows the resulting thresholds when the preselected false alarm rate is 1%, 5%, and 10%, respectively.

From Figure 6, we can see that when the preselected false alarm rate is higher, the resulting threshold is lower. This is in accordance with our expectation: when the preselected false alarm rate is high, some normal traffic is allowed to be mistakenly treated as attacks, so the detection threshold is low.

Figure 7 shows the detection rate $P_d$ versus the false alarm rate $P_f$ for both detection methods. We can see from the figure that both detection methods achieve a reasonable detection rate, but the maximum likelihood estimate-based method performs better than the order statistic-based method. Meanwhile, for both detection methods, a minor increase in $P_f$ results in a significant increase in $P_d$ when $P_f$ is lower than 0.1. This means that if we allow slightly more false alarms, the detection rate improves significantly. We can also observe from Figure 7 that when $P_d$ is higher than 0.9, a minor increase in $P_d$ requires a significant increase in $P_f$. That is, if we want to improve the detection rate in the range above 0.9, we have to tolerate many more false alarms.
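One point on such a detection rate versus false alarm rate curve can be computed from labeled sections as sketched below. The variations, the ground-truth labels, and the function name `rates` are hypothetical and are not the DARPA results; sweeping the threshold traces out the whole curve.

```python
from typing import Sequence, Tuple


def rates(variations: Sequence[float], is_attack: Sequence[bool],
          threshold: float) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate) for one threshold."""
    n_attack = sum(1 for a in is_attack if a)
    n_normal = len(is_attack) - n_attack
    if n_attack == 0 or n_normal == 0:
        raise ValueError("need at least one attack section and one normal section")
    flagged = [v > threshold for v in variations]
    hits = sum(1 for f, a in zip(flagged, is_attack) if f and a)
    false_alarms = sum(1 for f, a in zip(flagged, is_attack) if f and not a)
    return hits / n_attack, false_alarms / n_normal


# Hypothetical labeled sections: two normal, two under attack.
dh = [0.1, 0.2, 0.6, 0.7]
labels = [False, False, True, True]
for t in (0.65, 0.3, 0.15):
    print(t, rates(dh, labels, t))
```

Lowering the threshold in this toy example moves from (0.5, 0.0) to (1.0, 0.0) and then to (1.0, 0.5), which is the trade-off the curve describes.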

##### 5.3. Comparison with Existing Detection Methods

In this section, we compare our two proposed detection methods with Allen's method [26] and Ren's method [27], two well-known self-similarity-based detection methods in the literature. Both of these methods define a range of the Hurst parameter for normal traffic. For Allen's method, the Hurst range is from 0.5 to 0.99, and for Ren's method the range is from 0.65 to 0.85. A traffic section with a Hurst parameter outside the range is treated as abnormal traffic.
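The range-based rule shared by the two reference methods reduces to a single comparison, sketched below. Treating the range endpoints as inclusive is an assumption, as are the constant and function names; the Hurst values in the usage lines are illustrative.

```python
from typing import Tuple

ALLEN_RANGE = (0.5, 0.99)   # normal Hurst range used for Allen's method [26]
REN_RANGE = (0.65, 0.85)    # normal Hurst range used for Ren's method [27]


def is_abnormal(hurst: float, normal_range: Tuple[float, float]) -> bool:
    """Flag a section whose Hurst parameter falls outside the normal range."""
    low, high = normal_range  # assumption: both endpoints count as normal
    return not (low <= hurst <= high)


print(is_abnormal(0.90, ALLEN_RANGE), is_abnormal(0.90, REN_RANGE))  # False True
print(is_abnormal(0.45, ALLEN_RANGE), is_abnormal(0.45, REN_RANGE))  # True True
```

Because Ren's range is narrower, it flags every section that Allen's range flags and some additional ones, which is consistent with a higher detection rate at a higher false alarm rate.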

Table 1 compares the detection performance of Allen's method, Ren's method, and our proposed methods. Ren's detection method achieves a higher detection rate than Allen's method at the cost of a slightly higher false alarm rate. We first use Allen's false alarm rate, 34%, as the false alarm rate of the two proposed detection methods. The proposed order statistic-based detection method achieves a detection rate as high as 87%, and the maximum likelihood estimate-based detection method achieves a detection rate as high as 92%, both higher than the detection rate of Allen's method. Similarly, we use Ren's false alarm rate, 38%, as the false alarm rate of the two proposed detection methods. The detection rates of the proposed detection methods are also higher than that of Ren's method.

#### 6. Conclusion

In this paper, we have proposed a DDoS detection method that monitors the Hurst variation, based on Li's work on DDoS attack detection. We have also discussed two methods for determining the change point of LRD traffic, which can be used to distinguish attack traffic from normal traffic. Experiments on the DARPA 1999 data show that the proposed detection methods outperform the two existing self-similarity-based detection methods considered, and can enhance the reliability and robustness of DDoS flood attack detection.

#### Acknowledgments

This work was supported in part by the National High Technology Research and Development Program of China under Grant no. 2007AA01Z473 and the National Natural Science Foundation of China (NSFC) under Grants no. 60573125, no. 60873264, no. 60605019 and no. 60702047. The authors would also like to thank the reviewers for their constructive comments that have considerably increased the quality of this paper.

#### References

1. W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the self-similar nature of ethernet traffic (extended version),” *IEEE/ACM Transactions on Networking*, vol. 2, no. 1, pp. 1–15, 1994.
2. V. Paxson and S. Floyd, “Wide area traffic: the failure of Poisson modeling,” *IEEE/ACM Transactions on Networking*, vol. 3, no. 3, pp. 226–244, 1995.
3. O. Tickoo and B. Sikdar, “On the impact of IEEE 802.11 MAC on traffic characteristics,” *IEEE Journal on Selected Areas in Communications*, vol. 21, no. 2, pp. 189–203, 2003.
4. M. Li, “Fractal time series—a tutorial review,” *Mathematical Problems in Engineering*, vol. 2010, Article ID 157264, 26 pages, 2010.
5. M. Li and W. Zhao, “Representation of a stochastic traffic bound,” *IEEE Transactions on Parallel and Distributed Systems*. In press.
6. M. Li and S. C. Lim, “Modeling network traffic using generalized Cauchy process,” *Physica A*, vol. 387, no. 11, pp. 2584–2594, 2008.
7. M. Li and W. Zhao, “Variance bound of ACF estimation of one block of fGn with LRD,” *Mathematical Problems in Engineering*, vol. 2010, Article ID 60429, 14 pages, 2010.
8. W.-B. Gong, Y. Liu, V. Misra, and D. Towsley, “Self-similarity and long range dependence on the internet: a second look at the evidence, origins and implications,” *Computer Networks*, vol. 48, no. 3, pp. 377–399, 2005.
9. W. Schleifer and M. Männle, “Online error detection through observation of traffic self-similarity,” *IEE Proceedings: Communications*, vol. 148, no. 1, pp. 38–42, 2001.
10. J. T. Wang and G. Yang, “An intelligent method for real-time detection of DDoS attack based on fuzzy logic,” *Journal of Electronics*, vol. 25, no. 4, pp. 511–518, 2008.
11. M. Li, “Change trend of averaged Hurst parameter of traffic under DDOS flood attacks,” *Computers and Security*, vol. 25, no. 3, pp. 213–220, 2006.
12. C. S. Sastry, S. Rawat, A. K. Pujari, and V. P. Gulati, “Network traffic analysis using singular value decomposition and multiscale transforms,” *Information Sciences*, vol. 177, no. 23, pp. 5275–5291, 2007.
13. M. F. Rohani, M. A. Maarof, A. Selamat, and H. Kettani, “Continuous LoSS detection using iterative window based on SOSS model and MLS approach,” in *Proceedings of the International Conference on Computer and Communication Engineering (ICCCE '08)*, pp. 1005–1009, Kuala Lumpur, Malaysia, May 2008.
14. M. Li and W. Zhao, “Detection of variations of local irregularity of traffic under DDOS flood attack,” *Mathematical Problems in Engineering*, vol. 2008, Article ID 475878, 11 pages, 2008.
15. M. Li, J. Li, and W. Zhao, “Experimental study of DDOS attacking of flood type based on NS2,” *International Journal of Electronics and Computers*, vol. 1, no. 2, pp. 143–152, 2009.
16. M. Li, “An approach to reliably identifying signs of DDOS flood attacks based on LRD traffic pattern recognition,” *Computers and Security*, vol. 23, no. 7, pp. 549–558, 2004.
17. C. Cattani and A. Kudreyko, “On the discrete harmonic wavelet transform,” *Mathematical Problems in Engineering*, vol. 2008, Article ID 687318, 7 pages, 2008.
18. C. Cattani and A. Kudreyko, “Application of periodized harmonic wavelets towards solution of eigenvalue problems for integral equations,” *Mathematical Problems in Engineering*, vol. 2010, Article ID 570136, 8 pages, 2010.
19. C. Cattani, “Harmonic wavelet analysis of a localized fractal,” *International Journal of Engineering and Interdisciplinary Mathematics*, vol. 1, no. 1, pp. 35–44, 2009.
20. E. G. Bakhoum and C. Toma, “Mathematical transform of traveling-wave equations and phase aspects of quantum interaction,” *Mathematical Problems in Engineering*, vol. 2010, Article ID 695208, 15 pages, 2010.
21. G. Toma, “Specific differential equations for generating pulse sequences,” *Mathematical Problems in Engineering*, vol. 2010, Article ID 324818, 11 pages, 2010.
22. S. Song, J. K. Y. Ng, and B. Tang, “Some results on the self-similarity property in communication networks,” *IEEE Transactions on Communications*, vol. 52, no. 10, pp. 1636–1642, 2004.
23. J. Beran, *Statistics for Long-Memory Processes*, vol. 61 of *Monographs on Statistics and Applied Probability*, Chapman and Hall, New York, NY, USA, 1994.
24. D. He and H. Leung, “Network intrusion detection using CFAR abrupt-change detectors,” *IEEE Transactions on Instrumentation and Measurement*, vol. 57, no. 3, pp. 490–497, 2008.
25. http://www.ll.mit.edu/mission/communications/ist/index.html.
26. W. H. Allen and G. A. Marin, “The LoSS technique for detecting new denial of service attacks,” in *Proceedings of IEEE South East Conference*, pp. 302–309, Greensboro, NC, USA, March 2004.
27. X. X. Ren, R. C. Wang, and H. Y. Wang, “Wavelet analysis method for detection of DDoS attack on the basis of self-similarity,” *Frontiers of Electrical and Electronic Engineering in China*, vol. 2, no. 1, pp. 73–77, 2007.