A Universal High-Performance Correlation Analysis Detection Model and Algorithm for Network Intrusion Detection System

Zhu, Hongliang; Liu, Wenhan; Sun, Maohua; Xin, Yang

doi:https://doi.org/10.1155/2017/8439706

Mathematical Problems in Engineering


Special Issue

Security and Privacy Protection of Social Networks in Big Data Era


Research Article | Open Access

Volume 2017 | Article ID 8439706 | https://doi.org/10.1155/2017/8439706


Hongliang Zhu,¹ Wenhan Liu,¹ Maohua Sun,² and Yang Xin¹

Academic Editor: Zonghua Zhang

Received 02 Feb 2017

Accepted 03 May 2017

Published 23 May 2017

Abstract

In the big data era, single detection techniques no longer meet the demands posed by complex network attacks and advanced persistent threats, yet there is no uniform standard that allows different correlation analysis detection methods to be performed efficiently and accurately. In this paper, we put forward a universal correlation analysis detection model and algorithm by introducing a state transition diagram. After analyzing and comparing the current correlation detection modes, we formalize the correlation patterns, propose a framework based on data packet timing and behavior qualities, and then design a new universal algorithm to implement the method. Finally, an experiment that sets up a lightweight intrusion detection system using the KDD1999 dataset shows that the correlation detection model and algorithm can improve performance while maintaining high detection rates.

1. Introduction

(A) Background. Intrusion detection is a technology that recognizes intrusions by collecting and analyzing information from the protected system [1]. Its crucial functions are monitoring the Internet and computer systems, discovering and distinguishing intrusion behaviors or attempts, and generating intrusion alarms in real time [2]. Intrusion detection can be thought of as a binary classification technology that decides whether the system state is “normal” or “attack” [3]. The primary requirement of an intrusion detection system is the detection rate, that is, detection accuracy, followed by real-time performance. Only with high detection speed can the system process the massive data transmitted over the Internet in time, avoid the information loss caused by low speed [4], which leads to false negatives and false positives, and minimize the losses brought by intrusions. However, as network attacks diversify in kind and grow in number, a low detection rate has become a key issue for intrusion detection systems [5]. In addition, traditional intrusion detection systems detect slowly and consume large amounts of resources. As network speeds increase rapidly, they cannot process the massive data transmitted in real time, resulting in a large increase in the false positive and false negative rates [6]. These problems are becoming more and more serious. Detection rate and detection speed have become important indicators of the real-time requirements of intrusion detection systems [7]. How to build an intrusion detection system with both a high detection rate and a high detection speed has become the focus of current research. Figure 1 gives an overview of a universal network intrusion detection framework. A key point in this figure is the use of deep analysis modules to process the associated events.

The deep analysis module plays an important role in an intrusion detection system. The data of the deep analysis modules come from two parts: one part is the detection result of the upper layer, and the other part is the raw data. Various data packets or events are processed by correlation algorithms, such as correlation detection of event frequency and correlation detection for multiple parallel events. The performance of correlation detection directly influences the detection rate and detection speed of the intrusion detection system. However, many factors influence the correlation detection result, so it is difficult to extract a unified correlation detection algorithm. Therefore, a universal correlation detection algorithm is needed to increase detection rate and speed.

(B) Related Works. To detect anomalies in a network, correlated parameters from different layers should be combined [8]. Some papers focus on building a new hierarchical framework for intrusion detection as well as data processing based on feature classification and selection [9–11].

Intrusion detection systems have been studied by means of machine learning, and the detection rate has improved [12–19]. In addition, intrusion detection has been performed using feature association techniques, and the data set has been used for analysis [20–25].

(C) Contribution. In this paper, we propose a novel method to increase the detection rate of intrusion detection systems and improve the detection speed. This method is a correlation analysis detection model based on data packet timing and behavior quality, aiming to solve the problems of versatility, consistency, and integrity of packet detection. This method enables us to overcome the disadvantages of traditional intrusion detection systems.

The rest of this paper is organized as follows. In Section 2, we briefly analyze and compare the current common data packet correlation detection modes. Section 3 presents the generating process of the algorithm in detail. In Section 4, we present the detection process for the intrusion detection system and report experiments. Finally, we conclude the paper in Section 5.

2. System Overview

In an intrusion detection system, describing a threatening event by only a single feature of a single session leads to false positives, because a single feature cannot accurately capture the behavior features of an attack event. Moreover, some papers have pointed out that there is relevance among different attack events [26]. If every session is analyzed separately, we cannot identify the attack behavior exactly, whereas when the related sessions are considered together, we can identify an attack event completely. Nowadays, the majority of correlation detection methods in intrusion detection systems are as follows: correlation detection of event frequency, correlation detection for multiple parallel events, correlation detection for multiple serial events, correlation detection based on the source IP of the event, correlation detection based on the destination IP of the event, correlation detection based on both the source and destination IP of the event, and session correlation [27, 28]. Traditional correlation detection shows more and more weaknesses, such as low detection rate and poor accuracy. Thus, we put forward a unified correlation detection algorithm and build a data packet correlation detection model based on data packet timing and behavior quality, aiming to solve the problems of versatility, consistency, and integrity of intrusion detection. Figure 2 gives an overview of the data packet correlation detection mode.

According to the behavior features, there are two kinds of intrusion events: one is unconditional trigger and the other is conditional trigger.
(i) Unconditional trigger: correlation detection based on the order in which events occur, including correlation detection of event frequency, correlation detection for multiple parallel events, and correlation detection for multiple serial events.
(ii) Conditional trigger: correlation detection based on behavior features, including single behavior feature and complex behavior feature.
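The two-way classification above can be written down as a small lookup table. The following Python sketch is only an illustration: the names `TriggerType`, `DETECTION_MODES`, and `modes_for` are hypothetical and do not come from the paper; only the grouping of detection modes follows the text.

```python
from enum import Enum


class TriggerType(Enum):
    """The two kinds of intrusion events named in the text."""
    UNCONDITIONAL = "unconditional"  # based on the order in which events occur
    CONDITIONAL = "conditional"      # based on behavior features


# Hypothetical identifiers for the detection modes listed in the text,
# grouped by the trigger type they belong to.
DETECTION_MODES = {
    "event_frequency": TriggerType.UNCONDITIONAL,
    "multiple_parallel_events": TriggerType.UNCONDITIONAL,
    "multiple_serial_events": TriggerType.UNCONDITIONAL,
    "single_behavior_feature": TriggerType.CONDITIONAL,
    "complex_behavior_feature": TriggerType.CONDITIONAL,
}


def modes_for(trigger):
    """Return the sorted names of all detection modes of one trigger type."""
    return sorted(name for name, t in DETECTION_MODES.items() if t is trigger)
```

Calling `modes_for(TriggerType.UNCONDITIONAL)` lists the three order-based modes, while `modes_for(TriggerType.CONDITIONAL)` lists the two feature-based ones.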

3. Correlation Analysis Detection Model

In this section, we present the correlation analysis detection model as follows.

3.1. Concept Definition

Definition 1 (distributed packet flow). The time series is given as . The number of nodes is. A distributed packet flow is defined as ; each item of is , a single data packet, which is the original event collection on .

Definition 2. The primitive event is a two-tuple ().
(i) is the timestamp of the original event, that is, the time node of the event on the time series.
(ii) is the behavioral characteristics of the original event.
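As a rough illustration of Definition 2, a primitive event can be modeled as an immutable pair of a timestamp and a set of behavioral characteristics. The class name `PrimitiveEvent`, its field names, the helper `make_event`, and the choice to encode characteristics as a set of strings are all assumptions made for this sketch; the paper does not prescribe a concrete encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimitiveEvent:
    """A two-tuple: when the event happened and what it looked like."""
    timestamp: float     # position of the event on the time series
    behavior: frozenset  # behavioral characteristics (assumed encoding)


def make_event(timestamp, *features):
    """Build an event from a timestamp and any number of feature labels."""
    return PrimitiveEvent(timestamp, frozenset(features))


# A packet flow is then simply a collection of events ordered by time.
flow = sorted(
    [make_event(2.0, "syn"), make_event(1.0, "login_failed")],
    key=lambda e: e.timestamp,
)
```

Making the event immutable keeps it safe to share between several detectors that inspect the same flow.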

Definition 3. The LAMBDA syntax is used to define the different relationships between primitive events: .
(i) Existence: indicates whether the event exists or not.
(ii) Parallel: indicates that the events and are parallel relations.
(iii) Serial: ; indicates that the events and are serial relations.
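The three relationships in Definition 3 can be sketched as predicates over a list of `(timestamp, feature)` pairs. The exact semantics are an assumption here: "parallel" is read as both events being present regardless of order, and "serial" as the first event occurring before some occurrence of the second. The function names `exists`, `parallel`, and `serial` are hypothetical.

```python
def exists(events, feature):
    """Existence: the feature occurs in at least one observed event."""
    return any(f == feature for _, f in events)


def parallel(events, a, b):
    """Parallel (assumed meaning): both events occur, in any order."""
    return exists(events, a) and exists(events, b)


def serial(events, a, b):
    """Serial (assumed meaning): some occurrence of a precedes one of b."""
    times_a = [t for t, f in events if f == a]
    times_b = [t for t, f in events if f == b]
    return bool(times_a) and bool(times_b) and min(times_a) < max(times_b)


# Hypothetical event stream used only to exercise the predicates.
events = [(1.0, "scan"), (2.0, "login"), (3.0, "exfil")]
```

With this stream, `serial(events, "scan", "exfil")` holds while `serial(events, "exfil", "scan")` does not, whereas `parallel` holds in both argument orders.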

Definition 4. The initial state of the system in the state diagram is , the intermediate state is , and is the termination status. Each node in the following graph represents the current state of the event, and if there is only one transition condition, an arrow arc exists between the two nodes to indicate the transition of the event state. The mark on the arc represents the transition condition.
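A state diagram of the kind described in Definition 4 can be sketched as a table of arcs keyed by the current state and the transition condition. The state labels `"S0"`, `"S1"`, and `"END"`, the function name `run`, and the rule that an unmatched condition leaves the state unchanged are assumptions for illustration, not the paper's notation.

```python
def run(transitions, initial, terminal, conditions):
    """Walk the diagram; return True once the termination state is reached."""
    state = initial
    for cond in conditions:
        # Follow the arc labeled with this condition, or stay put if none.
        state = transitions.get((state, cond), state)
        if state == terminal:
            return True
    return state == terminal


# Hypothetical diagram: initial state S0, intermediate S1, termination END.
transitions = {
    ("S0", "a"): "S1",
    ("S1", "b"): "END",
}
```

For example, `run(transitions, "S0", "END", ["a", "x", "b"])` reaches the termination state, while the reversed order `["b", "a"]` stops in the intermediate state.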

3.2. Formal Expression of Behavior Detection

3.2.1. Unconditional Trigger Type

(i) A Correlation Detection of Event Frequency. This method directly detects the original event that contains the threat behavior feature and then performs response processing.
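One common way to realize event-frequency correlation is a sliding time window with a count threshold. The sketch below follows that approach as an assumption; the paper's own procedure is not reproduced here, and the names `frequency_alerts`, `threshold`, and `window` are hypothetical.

```python
from collections import deque


def frequency_alerts(events, feature, threshold, window):
    """Return the timestamps at which `feature` has occurred at least
    `threshold` times within the last `window` time units.

    `events` is a list of (timestamp, feature) pairs sorted by timestamp.
    """
    recent = deque()  # timestamps of matching events inside the window
    alerts = []
    for t, f in events:
        if f != feature:
            continue
        recent.append(t)
        # Drop occurrences that have fallen out of the time window.
        while recent and t - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            alerts.append(t)  # response processing would be triggered here
    return alerts


# Hypothetical stream: three failed logins in quick succession, one much later.
stream = [
    (0, "login_failed"),
    (1, "login_failed"),
    (2, "ok"),
    (3, "login_failed"),
    (20, "login_failed"),
]
alerts = frequency_alerts(stream, "login_failed", threshold=3, window=5)
```

The deque keeps only the occurrences still inside the window, so each event is appended and removed at most once.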