Research Article  Open Access
Real-Time Detection of Application-Layer DDoS Attack Using Time Series Analysis
Abstract
Distributed denial of service (DDoS) attacks are one of the major threats to the current Internet, and application-layer DDoS attacks, which use legitimate HTTP requests to overwhelm victim resources, are harder to detect. Consequently, neither intrusion detection systems (IDS) nor the victim server can detect the malicious packets. In this paper, a novel approach to detecting application-layer DDoS attacks is proposed, based on the entropy of HTTP GET requests per source IP address (HRPI). By approximating the adaptive autoregressive (AAR) model, the HRPI time series is transformed into a multidimensional vector series. Then, a trained support vector machine (SVM) classifier is applied to identify the attacks. Experiments with several datasets are performed, and the results show that this approach can detect application-layer DDoS attacks effectively.
1. Introduction
DDoS attacks have caused severe damage to servers and pose an even greater threat to the development of new Internet services. DDoS attacks are categorized into two classes: network-layer DDoS attacks and application-layer DDoS attacks. In network-layer DDoS attacks, attackers send a large number of bogus packets towards the victim server, normally using IP spoofing. The victim server or IDS can easily distinguish legitimate packets from DDoS packets. In contrast, in application-layer DDoS attacks, attackers attack the victim server through a flood of legitimate requests. In this attack model, attackers attack the victim Web server by issuing HTTP GET requests and pulling large files from it in overwhelming numbers. Attackers can also run a massive number of queries through the victim’s search engine or database to bring the server down.
To circumvent detection, attackers increasingly move away from pure bandwidth floods to stealthy DDoS attacks that masquerade as flash crowds. A flash crowd [1, 2] is the situation in which a very large number of users simultaneously access a website, for example after the announcement of a new service or a free software download. Because bursty traffic and high volume are common characteristics of both application-layer DDoS attacks and flash crowds, it is not easy to distinguish them. Therefore, application-layer DDoS attacks may be stealthier and more dangerous for websites than general network-layer DDoS attacks.
Most well-known DDoS countermeasure techniques [3] target network-layer DDoS attacks and cannot handle application-layer DDoS attacks, so countering the latter remains a great challenge. In [4], statistical methods are used to detect characteristics of HTTP sessions, with rate limiting employed as the primary defense mechanism. In [5], constraint random request attacks based on statistical methods are used to defend against application-layer DDoS attacks. In [6], a CAPTCHA puzzle is used to ensure that the response is generated by a human and not by a machine. A semi-Markov model is proposed to describe the browsing behaviors of Web surfers in [7], and an improved semi-Markov model is proposed to describe the dynamic behavior process of aggregated traffic in [8]. Recently, trust-based methods [9, 10] were introduced for resisting application-layer DDoS attacks. The common feature of these methods is that the defense system establishes credit records for each user; the credit value given to a sender is based on its history of communication patterns.
In application-layer DDoS attacks, attack sources are programmed and operate according to their attack functions, so detection based on their patterns is possible. In this paper, the entropy of HTTP GET requests per source IP address (HRPI) is proposed, which reflects the essential features of application-layer DDoS attacks: the distribution of source IP addresses and the HTTP GET request frequency. To increase the detection accuracy under various conditions, the HRPI time series is transformed into a multidimensional vector series by estimating the adaptive autoregressive (AAR) model parameters with a Kalman filter. Furthermore, a support vector machine (SVM) classifier, trained on the AAR parameters of HRPI time series, is applied to classify the state of the current network traffic and identify application-layer DDoS attacks.
The rest of the paper is organized as follows. Section 2 discusses application-layer DDoS attacks and details their properties. Section 3 describes our approach to detecting application-layer DDoS attacks. In Section 4, experiments are presented to validate our detection model. Finally, Section 5 gives the conclusion and points out future work.
2. Application-Layer DDoS Attacks
Application-layer DDoS attacks can be clustered into two types: bandwidth exhausting (HTTP flooding) and resource exhausting [11]. In bandwidth exhausting DDoS attacks, attackers attack the victim server through a flood of legitimate requests. Every zombie machine has to establish a TCP connection with the victim server, which requires a genuine IP address. Attacks mainly focus on the homepage or a hot webpage, but may also target different web pages. In this case, the sources of the traffic converge to a group of points, and the HTTP GET request rate from the attackers is high.
Besides the flooding attack pattern, application-layer DDoS attacks may focus on exhausting server resources such as sockets, CPU, memory, disk/database bandwidth, and I/O bandwidth. With increasing computational complexity in Internet applications and larger network bandwidth, server resources may become the bottleneck of these applications. This type of attack can use fewer zombies yet cause even greater damage to the website. However, the traffic will be similar to that of bandwidth exhausting DDoS attacks. As a result, the sources of the traffic converge to a group of points, but the targets of the traffic become dispersed to some extent. At the same time, the frequency of HTTP GET requests from the attackers is very high.
On the Web, a flash crowd is the situation in which a very large number of users simultaneously access a popular website, which produces a surge in traffic to the website and might cause the site to be virtually unreachable. DDoS attacks differ fundamentally from flash crowds: DDoS attacks are due to an increase in the request rates of a small group of clients, while a flash crowd is due to an increase in the number of clients. The sources of a flash crowd are scattered; conversely, the sources of application-layer DDoS attacks converge to a group of points.
3. Our Approach
3.1. Definition of HRPI
For popular websites, the traffic targeted is a stream of successive HTTP Get requests.
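Definition 1. The HTTP GET requests in a certain time interval $\Delta t$ are given in the form $\{(s_i, n_i) \mid i = 1, 2, \ldots, N\}$. For the $i$th pair ($i = 1, 2, \ldots, N$), $s_i$ is the source IP address and $n_i$ is the number of HTTP GET requests from $s_i$.
Definition 2. The entropy of HTTP GET requests per source IP address (HRPI) is defined as
$$H = -\sum_{i=1}^{N} p_i \log p_i, \qquad (1)$$
where $p_i = n_i / \sum_{j=1}^{N} n_j$ is the probability of HTTP GET requests belonging to $s_i$, and $\sum_{i=1}^{N} p_i = 1$.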
HRPI is used as a summarization tool to quantify the degree of dispersal or concentration of the HTTP GET request distribution. According to the analysis in Section 2, we deduce the following conclusion (DDoS as 1, normal as 2, flash crowd as 3):
$$H_1 < H_2 < H_3. \qquad (2)$$
In most cases, the source IP addresses of legitimate users are scattered fairly uniformly across the Internet, whereas the source IP addresses of attackers are concentrated in a few places. In DDoS attacks, several clusters of source IP addresses issue a larger number of HTTP GET requests, so the HRPI value drops dramatically when attacks happen. Conversely, the sources of a flash crowd are scattered and form no such clusters, so a flash crowd results in an abnormal increase in the HRPI of the network.
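The behavior described above can be illustrated with a minimal sketch that computes the entropy of GET requests per source IP for one sampling interval. The function name `hrpi`, the base-2 logarithm, and the sample addresses are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

def hrpi(source_ips, base=2.0):
    """Entropy of HTTP GET requests per source IP for one interval.

    source_ips: one entry per observed GET request (its source IP).
    base: logarithm base; base 2 is an assumption of this sketch.
    """
    counts = Counter(source_ips)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        p = n / total                      # share of requests from this source
        entropy -= p * math.log(p, base)
    return entropy

# Hypothetical traffic: 8 requests from 8 clients vs. 8 requests from 2 clients.
scattered = ["10.0.0.%d" % i for i in range(8)]
concentrated = ["10.0.0.1"] * 7 + ["10.0.0.2"]

print(hrpi(scattered), hrpi(concentrated))
```

With the same number of requests, the concentrated sample yields a much lower entropy than the scattered one, which is the effect the detection scheme relies on.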
3.2. Generation of HRPI Time Series
The adaptive autoregressive model AAR($p$) [12] of degree $p$ is defined as
$$y_t = \sum_{i=1}^{p} a_i(t)\, y_{t-i} + v_t, \qquad (3)$$
where $y_t$ denotes the observation at instant $t$ and $a_i(t)$ denotes the time-varying model parameters. As the traffic collecting device may cause measurement errors, the stochastic variable $v_t$ is used to capture this error. The model uses a weighted sum of previous values to estimate the current observation value. The weights $a_i(t)$ ($i = 1, \ldots, p$) are time dependent, and the current value can be predicted as a linear combination of past values. By using a time-varying AAR model, we allow the model of normal behavior to adapt to changes in the monitored system.
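As a small illustration of the weighted-sum prediction, the following sketch computes a one-step prediction from the most recent observations. The function name `aar_predict` and the sample numbers are hypothetical; the noise term is omitted because only the predicted value is computed.

```python
def aar_predict(history, weights):
    """One-step AAR(p) prediction: weighted sum of the p most recent values.

    history: past observations, oldest first (at least p entries).
    weights: current parameter estimates; weights[0] multiplies the newest value.
    """
    p = len(weights)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    recent = history[-1:-p - 1:-1]         # newest first: y[t-1], ..., y[t-p]
    return sum(a * y for a, y in zip(weights, recent))

# Hypothetical HRPI history and weights for a degree-2 model.
print(aar_predict([9.1, 9.3, 9.2], [0.6, 0.4]))
```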
The Kalman filter is an adaptive and recursive data processing algorithm that is suited for online estimation [13, 14]. It can process the traffic matrix as a whole, so that all traffic can be estimated simultaneously. Because it is recursive, we do not have to consider all the previous data again to compute the optimal estimates; we only need the estimates from the previous time step and the new measurement.
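In our case, we estimate the AAR model parameters from the observed HRPI series $y_t$. The true parameters cannot be observed directly, and in state space terminology they are called the state $x_t$. Now, assume that we have an observation model giving the relation between the unobservable state and the observations, and an evolution model describing the time-varying nature of the state. The AAR model can then be put in vector form as follows:
$$y_t = H_t x_t + v_t, \qquad (4)$$
where $H_t$ denotes the observation matrix and $H_t = (y_{t-1}, y_{t-2}, \ldots, y_{t-p})$. $x_t$ denotes the state vector at instant $t$ and $x_t = (a_1(t), a_2(t), \ldots, a_p(t))^T$, and $v_t$ is the measurement noise.
Without prior information, the evolution of the state is often described with a random walk model [15]. A linear equation is constructed as follows to build a prediction model that relates $x_{t+1}$ and $x_t$:
$$x_{t+1} = x_t + w_t, \qquad (5)$$
where the state noise $w_t$ and the measurement noise $v_t$ are uncorrelated, zero-mean white-noise processes with covariance matrices $Q_t$ and $R_t$, respectively.
To state the Kalman filter equations, let $\hat{x}_{t|t-1}$ denote the estimate of $x_t$ and $P_{t|t-1}$ the error covariance matrix of the state estimate at instant $t$ using the observations accumulated up to instant $t-1$. Given the initial conditions $\hat{x}_{0|0}$ and the error covariance matrix $P_{0|0}$, the system state can be estimated iteratively by the following equations:
$$\begin{aligned}
\hat{x}_{t|t-1} &= \hat{x}_{t-1|t-1}, \\
P_{t|t-1} &= P_{t-1|t-1} + Q_{t-1}, \\
K_t &= P_{t|t-1} H_t^T \left(H_t P_{t|t-1} H_t^T + R_t\right)^{-1}, \\
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + K_t \left(y_t - H_t \hat{x}_{t|t-1}\right), \\
P_{t|t} &= \left(I - K_t H_t\right) P_{t|t-1},
\end{aligned} \qquad (6)$$
where $K_t$ is the Kalman gain and $I$ is the identity matrix.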
Using the Kalman filter in practice requires initial values for the state ($\hat{x}_{0|0}$), the error covariance ($P_{0|0}$), the state noise covariance ($Q_t$), and the observation noise covariance ($R_t$). A common approach is to start from simple provisional values for the state and the error covariance and run the algorithm backwards on a short segment of the observation data. The values obtained in this way for the state and the error covariance are then used to initialize them in the actual processing run. The adaptation speed increases with the state noise covariance, while the noise covariances also determine the variance of the state estimates. They should therefore be chosen for the desired balance between state estimate variance and filter adaptation speed, and set according to the application.
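The recursive estimation can be sketched as follows for a scalar observation series, where the filter state is the vector of AAR weights and evolves as a random walk. This is a minimal illustration under stated assumptions, not the authors' implementation: the name `kalman_aar`, the scalar stand-ins `q` and `r` for the state and observation noise covariances, the zero initial state, the identity initial covariance, and the synthetic sinusoidal input are all chosen here for demonstration.

```python
import math

def kalman_aar(series, p=2, q=1e-3, r=1.0):
    """Track AAR(p) weights with a random-walk Kalman filter.

    q, r: scalar stand-ins for the state and observation noise covariances
    (state covariance q*I, observation variance r); illustrative values only.
    Returns one weight vector per time step t >= p.
    """
    x = [0.0] * p                                    # state estimate (AAR weights)
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    estimates = []
    for t in range(p, len(series)):
        h = [series[t - 1 - i] for i in range(p)]    # y[t-1], ..., y[t-p]
        # Prediction: a random walk keeps the state; covariance grows by q*I.
        for i in range(p):
            P[i][i] += q
        # Update with the scalar innovation y[t] - h.x
        Ph = [sum(P[i][j] * h[j] for j in range(p)) for i in range(p)]
        s = sum(h[i] * Ph[i] for i in range(p)) + r  # innovation variance
        k = [v / s for v in Ph]                      # Kalman gain
        innovation = series[t] - sum(h[i] * x[i] for i in range(p))
        x = [x[i] + k[i] * innovation for i in range(p)]
        # P <- (I - k h^T) P, using the symmetry of P
        P = [[P[i][j] - k[i] * Ph[j] for j in range(p)] for i in range(p)]
        estimates.append(list(x))
    return estimates

# Synthetic check: a sinusoid obeys y[t] = 2*cos(w)*y[t-1] - y[t-2] exactly.
w = 1.3
demo = [math.sin(w * t) for t in range(600)]
print(kalman_aar(demo)[-1], [2 * math.cos(w), -1.0])
```

Because the observation is scalar, the innovation variance is a single number and no matrix inversion is needed. Each returned weight vector is one point of the multidimensional vector series that a classifier could later consume.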
3.3. Kalman Filter Smoothing
There are three classical smoothing algorithms: the fixed-point smoother, the fixed-interval smoother, and the fixed-lag smoother. We use the fixed-lag smoother, since it is suitable for online processing when a small, fixed delay of observations is allowed [16].
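To estimate the state at instant $t$ with a fixed-lag smoother, we wait until observations up to instant $t + q$ are available, where $q > 0$ is the lag. The state and observation equations now have extended variables: the extended state $\tilde{x}_t = (x_t^T, x_{t-1}^T, \ldots, x_{t-q}^T)^T$ stacks the current state and the $q$ previous states. The Kalman filter equations remain the same, and the observation equation (4) becomes
$$y_t = (H_t, 0, \ldots, 0)\, \tilde{x}_t + v_t. \qquad (7)$$
The simplified state equation (5) can be written as
$$\tilde{x}_{t+1} = \begin{pmatrix} I & 0 & \cdots & 0 & 0 \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I & 0 \end{pmatrix} \tilde{x}_t + \begin{pmatrix} I \\ 0 \\ \vdots \\ 0 \end{pmatrix} w_t, \qquad (8)$$
where $H_t$, $v_t$, and $w_t$ are the observation matrix, measurement noise, and state noise of the original model.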
3.4. SVM Classifier
By sampling the network traffic with time interval $\Delta t$ and calculating the HRPI of every sample, we obtain the HRPI sample series $y_1, y_2, \ldots, y_N$, where $N$ is the length of the series. Based on (6)–(8), the multidimensional parameter vector of degree $p$ can be used to describe the state features of the network traffic. As a result, detecting DDoS attacks amounts to classifying these vector series.
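The sampling step can be sketched as follows: requests are grouped into fixed intervals, one entropy value is computed per interval, and the resulting series is cut into fixed-length segments. The names `hrpi_series` and `split_windows`, the integer millisecond timestamps, the base-2 logarithm, and the non-overlapping segmentation are assumptions of this sketch rather than details given in the paper.

```python
import math
from collections import Counter, defaultdict

def entropy_bits(counts):
    """Entropy (base 2, an assumption) of a Counter of requests per source IP."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def hrpi_series(requests, interval_ms=100):
    """Turn (timestamp_ms, source_ip) pairs into one HRPI value per interval.

    Integer millisecond timestamps avoid floating-point bucket errors.
    Intervals without requests get an HRPI of 0.0.
    """
    buckets = defaultdict(Counter)
    for ts_ms, ip in requests:
        buckets[ts_ms // interval_ms][ip] += 1
    if not buckets:
        return []
    return [entropy_bits(buckets.get(b, Counter()))
            for b in range(min(buckets), max(buckets) + 1)]

def split_windows(series, length=100):
    """Cut a series into consecutive, non-overlapping segments of equal length."""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, length)]

# Hypothetical log: two sources in the first interval, one source afterwards.
log = [(0, "a"), (10, "b"), (120, "a"), (130, "a"), (350, "c")]
print(hrpi_series(log))
```

Each segment produced by `split_windows` would then be passed through the parameter estimation step to obtain the feature vectors to classify.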
A support vector machine (SVM), a well-known data classification technique, is applied here to classify the AAR parameter vectors. The SVM method can obtain the optimal solution whether the sample size is finite or tends to infinity. Through a nonlinear kernel mapping, it constructs the optimal hyperplane so that the problem is converted into a linearly separable one in the high-dimensional feature space. Besides, it sidesteps the dimensionality problem, as its complexity does not depend on the dimension of the samples.
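Since traffic is only considered as legitimate or attack, this is naturally a binary classification problem. The SVM classifier can be described as
$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{l} \alpha_i c_i K(x_i, x) + b\right),$$
where $f(x)$ is the classification result for the sample $x$, $\alpha_i$ are the Lagrange multipliers, $c_i$ is the category, and $c_i \in \{-1, +1\}$. $K(x_i, x)$ is the kernel function and $b$ is the deviation factor.
The optimal hyperplane that the SVM classifier creates in the high-dimensional feature space is
$$\sum_{x_i \in SV} \alpha_i c_i K(x_i, x) + b = 0, \qquad b = -\frac{1}{2} \sum_{x_i \in SV} \alpha_i c_i \left[K(x_i, x^{+}) + K(x_i, x^{-})\right],$$
where SV denotes the set of support vectors, $x^{+}$ means a positive support vector, and $x^{-}$ means a negative support vector. The coefficients $\alpha_i$ can be obtained by the following quadratic programming:
$$\max_{\alpha} \; \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j c_i c_j K(x_i, x_j) \quad \text{subject to} \quad \sum_{i=1}^{l} \alpha_i c_i = 0, \quad 0 \le \alpha_i \le C,$$
where $C$ is the parameter that prices misclassification. Before the SVM can classify traffic, it must undergo a training process to develop a classification model. We use the LibSVM library [17] to implement the SVM.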
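To make the decision rule concrete, the sketch below evaluates a kernel decision function for an already trained model. It does not train anything and does not use LibSVM; the support vectors, multipliers, labels, bias, and `gamma` are hypothetical values picked for illustration, and the RBF kernel is used because the experiments section names it.

```python
import math

def rbf_kernel(u, v, gamma):
    """Radial basis function kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decide(x, support_vectors, alphas, labels, bias, gamma=0.5):
    """Sign of the kernel expansion over the support vectors plus the bias.

    Returns +1 or -1; which class each sign denotes is a modelling choice.
    """
    score = sum(a * c * rbf_kernel(sv, x, gamma)
                for sv, a, c in zip(support_vectors, alphas, labels))
    return 1 if score + bias >= 0 else -1

# Hypothetical trained model over 2-dimensional AAR parameter vectors.
svs = [[0.2, -0.1], [0.9, -0.8]]
alphas = [1.0, 1.0]
labels = [1, -1]          # assumption: +1 marks attack, -1 marks legitimate
bias = 0.0

print(svm_decide([0.25, -0.15], svs, alphas, labels, bias))
print(svm_decide([0.85, -0.75], svs, alphas, labels, bias))
```

In practice the multipliers and bias come out of the training step, so only the evaluation side is shown here.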
4. Experiments
To evaluate the performance of our scheme, we divided our study into two groups of experiments: detecting application-layer DDoS attacks in normal traffic and in flash crowd.
4.1. Dataset
Normal traffic consists of real-life Internet traces collected from the traffic archive of the Changzhou University WWW server. The traces contain two weeks' worth of all HTTP requests to the web server. We implemented the application-layer DDoS attack in a simulator. Simulations are carried out using the NS2 network simulator on a Linux platform. To generate attack traffic, there are 50 zombie machines and a web server. Attack rates are 20 HTTP GET requests/s, 30 HTTP GET requests/s, …, 60 HTTP GET requests/s, which simulate the attack rates of the worm “Mydoom”, and every attack lasts 1800 s. Flash crowd traffic is collected from the World Cup 98 website [18]. As this is a high arrival rate, we expect our approach to detect this traffic as flash crowd.
We obtained the HRPI time series by repeated sampling and calculation with a sampling interval of 0.1 s. As shown in Figure 1(a), the HRPI of normal traffic varies with time and its mathematical expectation is 9.26. Figure 1(b) shows the HRPI of a DDoS attack, whose mathematical expectation is 3.58, and the HRPI of flash crowd is shown in Figure 1(c), with mathematical expectation 11.57. The HRPI time series are sensitive to both DDoS attack and flash crowd, so HRPI can clearly distinguish the three types of traffic.
Figure 1: (a) HRPI of normal traffic; (b) HRPI of DDoS attack; (c) HRPI of flash crowd.
4.2. Model Parameters
There are three parameters that may affect the performance of the HRPI time series model. The first one is the degree of the AAR model. In practice, the model degree is often fixed using some prior knowledge or guidelines. To optimize the ratio of goodness of fit to model complexity, and also to ease the computational load, we settled on a degree that allowed the model to capture the normal traffic behavior sufficiently well.
The second one is the state noise covariance. The adaptation speed of the Kalman filter is determined by the state noise covariance factor. It controls how fast the state adapts to changes in the observations and gives a suitable balance between adapting to normal behavior and avoiding incorporating anomalous behavior into the model. We experimented with different values before fixing this factor.
The third one is the lag of the Kalman smoother. We noticed a significant increase in model accuracy over the first lag values, and the increase in accuracy was slower when the lag was increased further. As a larger lag also means a longer delay in detection, we chose the lag as a compromise between accuracy and delay.
4.3. Experiments and Results
4.3.1. Evaluation Criteria
In this paper, a group of performance metrics for classification problems is used to evaluate the results, consisting of FPR, FNR, accuracy, precision, recall, and ROC. Attack samples are taken as the positive class. Let TP represent the attack test samples that have been correctly classified and let FN represent the attack test samples that have been wrongly classified as normal. Let TN represent the normal test samples that have been correctly classified and let FP represent the normal test samples that have been wrongly classified as attacks. Thus, the False-Positive Rate (FPR) and the False-Negative Rate (FNR) are the proportions of wrongly classified normal test samples and attack test samples, respectively (FPR = FP/(FP + TN), FNR = FN/(TP + FN)). Accuracy states the overall percentage of correctly classified test samples (accuracy = (TP + TN)/(TP + FP + TN + FN)). Precision, as the classifier’s safety, states the degree to which samples identified as attacks are indeed malicious (precision = TP/(TP + FP)). Recall, as the classifier’s effectiveness, states the percentage of attack test samples that the classifier manages to classify correctly (recall = TP/(TP + FN)). The Receiver Operating Characteristic (ROC), as the classifier’s ability to balance its FPR and its FNR, is a function of the varying classification threshold.
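The formulas above translate directly into code. The sketch below assumes that attack samples are the positive class, which is the reading under which the stated formulas match their descriptions; the function name `classification_metrics` and the counts in the example are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute FPR, FNR, accuracy, precision, and recall from confusion counts.

    Assumes attack samples are the positive class. A ratio with a zero
    denominator is reported as 0.0 to keep the sketch simple.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```

Note that recall and FNR always sum to 1, since both are computed over the attack samples.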
4.3.2. Experiment 1: Detect DDoS Attacks in Normal Traffic
We set the sampling interval to 0.1 s and the HRPI time series length to 100, so the detection time is 10 s. In this experiment, the normal traffic contains 600 series and the DDoS attack traffic contains 450 series. The obtained dataset is divided into two parts: the training data contains 60% of the data and the testing data contains the rest. The kernel function in the SVM classifier is the radial basis function (RBF), and the robustness of the classifier is evaluated using 10-fold cross-validation. To test the robustness of our method to the disturbance of normal traffic, we run five experiments named P20, P30, …, P60, in which attack traffic of 20 HTTP GET requests/s, …, 60 HTTP GET requests/s is mixed with normal traffic.
Table 1 shows the performance results; the detection ratio of our approach increases when the attack traffic volume increases. When normal traffic is much larger than attack traffic, the detection ratio still remains high. This means that our approach can identify DDoS attack traffic with high precision and is sensitive to DDoS attack traffic.

4.3.3. Experiment 2: Detect DDoS Attacks in Flash Crowd
In this experiment, the sampling interval is again 0.1 s and the HRPI time series length is again 100. We sample 500 series of flash crowd traffic and 350 series of DDoS attack traffic. The training and testing method of the SVM is the same as in Experiment 1. We run five experiments, mixing attack traffic of 20 HTTP GET requests/s, …, 60 HTTP GET requests/s with flash crowd.
Table 2 shows the performance results. As the flash crowd traffic increases, the detection ratio of our approach does not decline rapidly. The FPR and FNR decrease as the attack rate increases, while the accuracy, precision, recall, and ROC increase.

In the above two groups of experiments, the false negatives come mainly from two sources. First, an increase in normal traffic or flash crowd makes the HRPI states lean toward normal ones, which makes the difference too small for detection. Second, network state shifts caused by random network noise result in false negatives.
5. Conclusion
Detection of application-layer DDoS attacks is a hot and difficult research topic in the field of intrusion detection. Based on the characteristics of these attacks, this paper proposes a novel approach to detect them. The work provides two contributions: (1) HRPI is introduced to detect DDoS attacks, and it reflects the essential features of the attacks; (2) a detection scheme against DDoS attacks is proposed, which can achieve high detection efficiency and flexibility.
In our future work, we will study in detail how to set the various parameters adaptively in different application scenarios.
Acknowledgment
This work was supported by the National Natural Science Foundation of China under Contract (61070121).
References
[1] T. Thapngam, S. Yu, W. Zhou, and G. Beliakov, “Discriminating DDoS attack traffic from flash crowd through packet arrival patterns,” in Proceedings of the IEEE Conference on Computer Communications Workshops (INFOCOM '11), pp. 952–957, April 2011.
[2] G. Oikonomou and J. Mirkovic, “Modeling human behavior for defense against flash-crowd attacks,” in Proceedings of the IEEE International Conference on Communications (ICC '09), pp. 1–6, June 2009.
[3] H. Beitollahi and G. Deconinck, “Analyzing well-known countermeasures against distributed denial of service attacks,” Computer Communications, vol. 35, pp. 1312–1332, 2012.
[4] S. Ranjan, R. Swaminathan, M. Uysal, and E. Knightly, “DDoS-resilient scheduling to counter application layer attacks under imperfect detection,” in Proceedings of the 25th IEEE International Conference on Computer Communications (INFOCOM '06), pp. 1–13, April 2006.
[5] W. Yen and M.F. Lee, “Defending application DDoS with constraint random request attacks,” in Proceedings of the Asia-Pacific Conference on Communications, pp. 620–624, Perth, Australia, October 2005.
[6] L. Von Ahn, M. Blum, and J. Langford, “Telling humans and computers apart automatically,” Communications of the ACM, vol. 47, no. 2, pp. 56–60, 2004.
[7] Y. Xie and S.Z. Yu, “A large-scale hidden semi-Markov model for anomaly detection on user browsing behaviors,” IEEE/ACM Transactions on Networking, vol. 17, no. 1, pp. 54–65, 2009.
[8] Y. Xie, S. Tang, and X. Huang, “Detecting latent attack behavior from aggregated Web traffic,” Computer Communications, no. 5, pp. 895–907, 2013.
[9] J. Yu, C. Fang, L. Lu et al., “A lightweight mechanism to mitigate application layer DDoS attacks,” Scalable Information Systems, vol. 18, pp. 175–191, 2009.
[10] P. Du and A. Nakao, “OverCourt: DDoS mitigation through credit-based traffic segregation and path migration,” Computer Communications, vol. 33, no. 18, pp. 2164–2175, 2010.
[11] H. Beitollahi and G. Deconinck, “Tackling application-layer DDoS attacks,” Procedia Computer Science, vol. 10, pp. 432–441, 2012.
[12] Q.D. Sun, D.Y. Zhang, and P. Gao, “Detecting distributed denial of service attacks based on time series analysis,” Chinese Journal of Computers, vol. 28, no. 5, pp. 767–773, 2005.
[13] R. Yan, Q. Zheng, and H. Li, “Combining adaptive filtering and IF flows to detect DDOS attacks within a router,” KSII Transactions on Internet and Information Systems, vol. 4, no. 3, pp. 428–451, 2010.
[14] S. Wen, W. Jia, W. Zhou, W. Zhou, and C. Xu, “CALD: Surviving various application-layer DDoS attacks that mimic flash crowd,” in Proceedings of the 4th International Conference on Network and System Security (NSS '10), pp. 247–254, Victoria, Australia, September 2010.
[15] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Upper Saddle River, NJ, USA, 3rd edition, 1996.
[16] J. Viinikka, H. Debar, L. Mé, A. Lehikoinen, and M. Tarvainen, “Processing intrusion detection alert aggregates with time series modeling,” Information Fusion, vol. 10, no. 4, pp. 312–324, 2009.
[17] J. Platt, “Sequential minimal optimization: a fast algorithm for training support vector machines,” Tech. Rep. MSR-TR-98-14, Microsoft Research, 1998.
[18] M. Arlitt and T. Jin, “1998 World Cup Web Site Access Logs,” 1998, http://ita.ee.lbl.gov/html/contrib/WorldCup.html.
Copyright
Copyright © 2013 Tongguang Ni et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.