Abstract

Reliably distinguishing DDOS flood traffic from aggregated traffic is essential to reliably preventing DDOS attacks. By reliable distinguishing, we mean that flood traffic can be distinguished from aggregated traffic with a predetermined probability. The basis for reliably distinguishing flood traffic from aggregated traffic is the reliable detection of signs of DDOS flood attacks. As is known, reliably distinguishing DDOS flood traffic from aggregated traffic is a tough task, mainly because of the effects of flash-crowd traffic. For this reason, this paper studies reliable detection in a DiffServ network that uses static-priority schedulers. In this network environment, we present a method for reliably detecting signs of DDOS flood attacks for a given class with a given priority. Two assumptions are introduced in this study. One is that flash-crowd traffic has some priorities but not all. The other is that attack traffic has all priorities in all classes; otherwise, an attacker cannot completely achieve its DDOS goal. Further, we suppose that the protected site is equipped with a sensor that has a signature library of the legitimate traffic with the priorities that flash-crowd traffic does not have. Based on these, we are able to reliably distinguish attack traffic from aggregated traffic with the priorities that flash-crowd traffic does not have, according to a given detection probability.

1. Introduction

Attackers may take advantage of the principles [1] of distributed systems (i.e., the Internet), such as openness, resource sharing, accessibility, and so on, to launch distributed denial of service (DDOS) attacks. The threats that DDOS attacks pose to individuals are severe. For instance, any denial of service of a bank server implies a loss of money and disgruntled or lost customers.

According to the classification of the CERT Coordination Center (CERT/CC), DDOS attacks are divided into three categories [2]: (1) flood (i.e., bandwidth) attacks, (2) protocol attacks, and (3) logical attacks. This paper considers flood attacks. DDOS flood attacks consume resources (e.g., bandwidth) by sending flood packets in order to shut down the target or significantly degrade its performance. The flood packets may be generated by hundreds or thousands of machines distributed all over the world.

A network-based intrusion detection system (IDS) monitors the traffic on its network as a data source [3]. In this regard, there are two main approaches: misuse detection and anomaly detection. Solutions given by misuse detection are primarily based on a library of known signatures that are matched against network traffic. Hence, new variants of an attack whose signatures are unknown are missed entirely. As a matter of fact, the form in which an attack takes place is usually determined by a large number of details, many of which are unknown. This is particularly true for DDOS attacks [4]. Hence, anomaly detectors play a role in DDOS detection [2, 3, 5-12]. However, anomaly detectors cannot replace signature-based systems [2, 3]. From a practical view, therefore, the combination of a signature-based system and an anomaly detector is worth noting [2].

A traffic series is a packet flow. A packet consists of a number of fields, such as protocol, source IP, destination IP, ports, flag setting (in the case of TCP or UDP), message type (in the case of ICMP), timestamp, and length (packet size). Each may serve as a feature of a packet for statistical detection purposes; see, for example, [8, 13-15]. In addition, there are other available features of traffic, such as flow rate [16], the number of connections [17], and so on [6, 11, 12]. This paper takes the traffic series in packet size (traffic series for short) as the monitored object.

Usually, detection methods are expected to be adaptable to a wide range of network environments (e.g., [7, 8, 11-17]). Nevertheless, it is also worth studying detection methods that are environment dependent. This paper studies detecting signs of DDOS flood attacks in a network that uses static-priority schedulers.

As is known, two tough issues in detecting DDOS flood attacks are (1) reliable detection, as can be seen from [2, 3, 5, 7, 9, 10], and (2) distinguishing attack traffic from aggregated traffic [7, 9, 16]. The solution to the first issue is crucial to practical applications because false positives can lead to inappropriate responses that cause denial of service to legitimate traffic. In addition, it is the basis for finding the solution to the second.

It is noted that flash-crowd traffic and DDOS flood traffic may have similar statistics from a network view. A DDOS flood is malicious, whereas flash crowds are legitimate. Flash crowds happen when a huge number of users try to access the same server simultaneously for some specific event (e.g., the NASA Pathfinder mission) [16]. Because an attacker aims at attacking the target such that it denies service to all legitimate traffic, we assume that DDOS flood traffic has all priorities in all classes. On the other hand, according to the nature of differentiated services, we assume that flash-crowd traffic does not have all priorities. Further, we suppose that the protected site is equipped with a sensor that has a signature library of the legitimate traffic with the priorities that flash crowds do not have. In these cases, DDOS flood attack traffic can be distinguished, according to a given detection probability, from aggregated traffic with the priorities that flash crowds do not have.

The rest of the paper is organized as follows. Section 2 introduces the randomized traffic regulator for feature extraction of arrival traffic. Section 3 presents the principle. A case study is demonstrated in Section 4; discussions are given in Section 5 and conclusions in Section 6.

2. Traffic Regulator and Its Randomization

There are two major areas of traffic modeling. One is based on random processes; see, for example, [6, 8, 18-30]. The other is deterministic modeling, for example, the traffic regulator [18, 30-33]. We use the traffic regulator to characterize traffic in this research.

Definition 2.1 (see [31, 33]). Let be the instantaneous rate of arrival traffic at time . Then, the amount of traffic generated in the interval is upper bounded by where and are constants and . This property is written as that is called traffic regulator.
Practically, traffic is considered in the discrete case on an interval-by-interval basis. Thus, we generalize Definition 2.1 as follows.

Definition 2.2. Let be the instantaneous rate of arrival traffic at . Then, the amount of traffic generated in the th interval is upper bounded by where represents the traffic regulator in the th interval, and is a positive real number.
For simplicity, denote .

Definition 2.3. Let be the instantaneous rate of all flows of class with priority going through server from input link at . Then, the amount of generated in the th interval is upper bounded by . That is, .
Definition 2.3 provides a feature of arrival traffic on an interval-by-interval basis. Theoretically, can be any positive real number. In practice, however, is selected as a finite positive integer.
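The interval-by-interval feature of Definition 2.3 can be illustrated with a minimal sketch. It assumes one simple reading of the definition: the traffic series is a list of packet sizes in bytes, each interval holds a fixed number of consecutive packets, and the accumulated bytes in an interval are taken as the bound for that interval. The function name and the packet-count notion of interval length are hypothetical choices made for illustration, not the notation of the definitions.

```python
def interval_bounds(packet_sizes, interval_len):
    """Accumulated bytes in each consecutive interval of `interval_len` packets.

    Assumption: the interval length is a finite positive integer counted in
    packets. A trailing, incomplete interval is dropped.
    """
    if interval_len <= 0:
        raise ValueError("interval_len must be a positive integer")
    n_full = len(packet_sizes) // interval_len
    return [
        sum(packet_sizes[n * interval_len:(n + 1) * interval_len])
        for n in range(n_full)
    ]


if __name__ == "__main__":
    sizes = [100, 200, 300, 400, 500]   # bytes per packet (made-up values)
    print(interval_bounds(sizes, 2))    # one value per complete interval
```

Because the resulting values generally differ from one interval to the next, the sequence is naturally treated as a random series in what follows.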
Usually, for . Therefore, is a random process. Computing the sample mean of in terms of yields Usually, for . In practice, if , quite accurately follows Gaussian distribution regardless of the distribution of [34]. Denote and , where and are operators of variance and mean, respectively. Then, one can use the sample distribution of as follows: , where follows the standard Gaussian distribution. Thus,

3. Principle

3.1. Detection Probability and Miss Probability

Normally, a server serves a number of connections (clients) concurrently. Figure 1 illustrates a server that serves connections of normal traffic and connections of attack traffic. Aggregated traffic consists of normal traffic and attack traffic .

In the case of , one has where is called confidence coefficient. Let be the confidence interval with confidence coefficient. Then,

The above expression exhibits that is a template of . Thus, we have confidence to say that normally takes the value of as its approximation with the variation less than or equal to .

Denote that . Then,

On the other hand,

To facilitate the discussion, two terms are explained as follows. Correctly recognizing an abnormal sign is called detection, and failing to recognize it is called a miss. The following theorem gives the detection probability and the miss probability.

Theorem 3.1 (Detection probability). Let be the detection threshold. Denote as detection probability. Denote as miss probability. Then,

Proof. The probability of is . Accordingly, the probability of is . Therefore, the detection probability for is . Hence, (3.6) holds. Since [8], .

In the case of and the computation precision being 4, one has The diagram of our detection is indicated in Figure 2.
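As an illustration of how a threshold of this kind can be computed, the following sketch assumes that the monitored statistic is a Gaussian sample mean with known mean `mu` and standard deviation `sigma` over `n_samples` values, that the confidence interval is the usual two-sided one with significance level `alpha`, and that the upper end of the interval serves as the detection threshold. Under that reading, a normal value stays at or below the threshold with probability 1 - alpha/2. All names and the specific reading of the theorem are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(mu, sigma, n_samples, alpha):
    """Two-sided (1 - alpha) confidence interval for a Gaussian sample mean."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile
    half_width = z * sigma / sqrt(n_samples)
    return mu - half_width, mu + half_width


def one_sided_levels(alpha):
    """Assumed reading: (probability of not exceeding the upper bound,
    probability of exceeding it) for normal traffic."""
    return 1.0 - alpha / 2.0, alpha / 2.0


if __name__ == "__main__":
    low, high = confidence_interval(mu=100.0, sigma=10.0, n_samples=25, alpha=0.05)
    print(f"interval = [{low:.2f}, {high:.2f}], threshold = {high:.2f}")
    print(one_sided_levels(0.05))
```

The numbers in the usage example are made up; they are not the values of the case study.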

3.2. About False Alarm

False alarm means mistakenly recognizing a normal as abnormal. In this mechanism, detection criterion is with and . Therefore, if happens in the case that comes from normal traffic and an alert is fired, then this alert will be a false alarm, which has the probability . Therefore, In the case of , one has .
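A false-alarm probability of this kind can be checked empirically. The sketch below is a Monte Carlo illustration under stated assumptions: normal traffic produces a Gaussian sample mean, the threshold is the upper end of a two-sided (1 - alpha) confidence interval, and an alert fires whenever the sample mean exceeds the threshold. Under these assumptions the false-alarm rate should be close to alpha/2. The function names, seed, and trial count are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def upper_threshold(mu, sigma, n_samples, alpha):
    """Upper end of the two-sided (1 - alpha) interval for the sample mean."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return mu + z * sigma / sqrt(n_samples)


def empirical_false_alarm_rate(mu, sigma, n_samples, alpha, trials=20000, seed=1):
    """Fraction of attack-free trials in which the sample mean exceeds the threshold."""
    rng = random.Random(seed)
    threshold = upper_threshold(mu, sigma, n_samples, alpha)
    sd_of_mean = sigma / sqrt(n_samples)          # std. dev. of the sample mean
    alarms = sum(
        1 for _ in range(trials) if rng.gauss(mu, sd_of_mean) > threshold
    )
    return alarms / trials


if __name__ == "__main__":
    rate = empirical_false_alarm_rate(mu=100.0, sigma=10.0, n_samples=25, alpha=0.1)
    print(f"empirical false-alarm rate: {rate:.4f} (alpha/2 = 0.05)")
```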

3.3. Partly Distinguishing Attack Traffic

For simplicity, suppose that traffic has two priorities and . We further suppose that flash-crowd traffic has the priority but does not have . Non-flash-crowd normal traffic has both and , and DDOS flood traffic has both and . Then, implies a detection that the traffic contains attack traffic of class at the server from the link in the th interval. The detection probability is .

Denote , where and are normal traffic and attack traffic with , respectively. Note that does not have the components of flash-crowd traffic.

Usually, a signature-based sensor is designed with a library that contains signatures of attack traffic. In the present mechanism, however, we use a signature-based sensor whose library contains signatures of legitimate traffic with the priorities that flash-crowd traffic does not have. In this way, traffic whose signatures cannot be matched by this signature-based sensor may be taken as flood traffic or as suspicious. Thus, if occurs, the flows that are in and cannot be matched by the signature-based sensor are flood traffic of class with at the server from the link in the th interval. The reason for using a signature library of legitimate traffic instead of attack traffic is that attackers make efforts to create new variants of signatures, whereas legitimate users usually do not. Figure 3 indicates the process of distinguishing attack traffic from .
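The matching step described above can be sketched as follows. This is a minimal illustration, not the sensor itself: the `Flow` record, its fields, the string-valued signatures, and the function name are all hypothetical. The sketch assumes that an alarm has already been raised for the interval and that the library holds signatures of legitimate traffic for the priority that flash-crowd traffic does not have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    flow_id: str      # hypothetical identifier of a flow
    priority: int     # priority assigned to the flow
    signature: str    # hypothetical signature extracted by the sensor


def suspected_flood_flows(flows, legit_signatures, protected_priority, alarm_raised):
    """Flows of the protected priority that match no legitimate signature.

    Nothing is reported unless the anomaly detector has raised an alarm
    for the interval under consideration.
    """
    if not alarm_raised:
        return []
    return [
        f for f in flows
        if f.priority == protected_priority and f.signature not in legit_signatures
    ]


if __name__ == "__main__":
    library = {"sig-web", "sig-mail"}             # legitimate-traffic signatures
    observed = [
        Flow("a", 2, "sig-web"),
        Flow("b", 2, "sig-unknown"),
        Flow("c", 1, "sig-unknown"),              # other priority: not examined
    ]
    for flow in suspected_flood_flows(observed, library, 2, alarm_raised=True):
        print(flow.flow_id)
```

Matching against legitimate signatures rather than attack signatures means that a new attack variant is flagged simply because it is unknown to the library.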

4. A Case Study

We consider fractional Gaussian noise (FGN), which is an approximation model of traffic time series [18, 19, 21, 22, 35, 36]. The autocorrelation function of discrete FGN is given by where is the strength of FGN [37], is an integer, Γ(·) is the Gamma function, and the Hurst parameter.
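For readers who wish to experiment, the following sketch generates a short FGN series. It assumes the widely used form of the discrete FGN autocorrelation, r(k) = (sigma2/2)(|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), with the strength `sigma2` passed in as a parameter rather than computed from the closed form cited in [37]. The series is produced by a plain Cholesky factorization of the covariance matrix, which is exact but only practical for short series; this generator is an illustrative choice and is not claimed to be the simulation method used for the figures.

```python
import math
import random


def fgn_autocorrelation(k, hurst, sigma2=1.0):
    """Assumed standard autocorrelation of discrete FGN at integer lag k."""
    k = abs(k)
    h2 = 2.0 * hurst
    return 0.5 * sigma2 * ((k + 1) ** h2 - 2.0 * k ** h2 + abs(k - 1) ** h2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulate_fgn(n, hurst, sigma2=1.0, seed=0):
    """Zero-mean FGN sample of length n (O(n^3) time; keep n small)."""
    rng = random.Random(seed)
    cov = [[fgn_autocorrelation(i - j, hurst, sigma2) for j in range(n)]
           for i in range(n)]
    lower = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][k] * white[k] for k in range(i + 1)) for i in range(n)]


if __name__ == "__main__":
    series = simulate_fgn(n=64, hurst=0.75)
    print(len(series), fgn_autocorrelation(1, 0.75))
```

The generated values are zero-mean; mapping them to packet sizes in bytes would require an offset and scale, which are not specified here.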

In Figures 4, 5, 6, and 7, subscripts and superscripts of and are omitted. Consider TCP traffic series (Bytes), indicating the number of bytes in a packet at . By simulating FGN, we have a series with as shown in Figure 4. According to Definition 2.2, we obtain (Bytes) as shown in Figure 5. Figure 6 indicates (Bytes). The histogram of is given in Figure 7.

From Figure 7, we attain 3,105 and . Under the condition of , one has the interval [1720, 4467] and the threshold .

5. Discussions

5.1. DiffServ Architecture: A Flexible Foundation

The above explanations consider only the simple case of two priorities. In fact, there may be several priorities in a DiffServ domain, where applications are differentiated by their classes, and a certain portion of bandwidth is reserved for the traffic of each class [38]. Usually, all the flows in a class are assigned the same priority on each router. However, it is also possible that the flows in a class are assigned different priorities, and flows from different classes may have the same priority, as can be seen from [32, Paragraph 5, Section 1, page 327]. This paper considers a class that is assigned different priorities. On the other hand, the DiffServ architecture distinguishes two types of routers (edge routers and core routers) [32, Paragraph 2, Section 3, page 327]. Thus, a detector can be installed at either edge routers or core routers. Consequently, the DiffServ architecture provides a flexible foundation for designing an effective IDS that distinguishes flood traffic from aggregated traffic. This paper is simply a beginning on this track.

5.2. Applicability

Mathematical properties of conventionally aggregated traffic time series have been studied in depth; see, for example, [18-22, 35]. However, mathematical properties of aggregated traffic time series on a class-by-class basis for different priorities in the DiffServ domain are rarely reported. That is a main reason we use the traffic regulator proposed by [33], because it is a tool particularly applicable in a flow-unaware environment. In addition, the traffic regulator is simple. Let and be the time for recording data and data processing, respectively. Suppose that we record a packet every 10 microseconds. Then, (second), where is the length of the series involved in computations. In the above case study, . Thus, . On the other hand, for a series of length 256 on an average Pentium IV PC is negligible in comparison with . This shows that the detection time is short enough to meet real-time use in practice.
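The recording-time estimate is plain arithmetic, as the following sketch shows. It assumes only what is stated above, namely one packet recorded every 10 microseconds, so that the recording time is the series length multiplied by that period. The function name is hypothetical, and the length 256 in the usage example is borrowed from the processing-time remark purely for illustration; it is not claimed to be the series length of the case study.

```python
def recording_time_seconds(series_length, seconds_per_packet=10e-6):
    """Time needed to record `series_length` packets at a fixed recording period."""
    if series_length < 0:
        raise ValueError("series_length must be non-negative")
    return series_length * seconds_per_packet


if __name__ == "__main__":
    # Illustrative length only; see the remark in the lead-in.
    print(f"{recording_time_seconds(256):.5f} s")
```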

It is worth noting that is a traffic pattern. In the present method, signs of DDOS flood attacks are identified by , meaning that the traffic pattern under attack must be significantly different from that of normal traffic. As a matter of fact, if an attacker could overwhelm a target with floods that closely mimic or approximate normal traffic, the target would be overwhelmed in its normal state even without any flood packets. This is obviously impossible, even if the attacker knows the normal traffic pattern exactly before attacking.

5.3. Future Work

The preceding presentation is rather academic in the following sense. The detection mechanism was discussed on the basis of postulated traffic models, without analyzing real-traffic data. For this reason, we shall test the traffic models in this paper against real-traffic data for anomaly detection. In addition, we will derive a general mechanism to reliably identify and distinguish attack traffic from aggregated traffic for the flows of class with all priorities. Furthermore, we shall explore statistical learning methods discussed in other fields; see, for example, [39-49].

6. Conclusions

This paper suggests a reliable method to detect signs of DDOS flood attacks in the DiffServ environment with static-priority schedulers. In combination with a signature-based sensor, the present method can partly but reliably distinguish attack traffic from aggregated traffic at a given server for a given link in a given time interval, according to a predetermined detection probability. Given that static-priority schedulers are widely supported in current routers, it is our belief that this approach may be practical and effective in engineering.

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China (NSFC) under the project Grant nos. 60873264, 61070214, and the China national 973 plan under the project number 2011CB302801/2011CB302802.