Abstract

An intrusion detection system (IDS) helps to identify different types of attacks, but its detection rate is usually higher for some specific categories of attack. This paper builds on the idea that each IDS is efficient in detecting a specific type of attack. The proposed Multiple IDS Unit (MIU) contains five IDS units, and each IDS follows a unique algorithm to detect attacks. Feature selection is done with the help of a genetic algorithm. The selected features of the input traffic are passed on to the MIU for processing. The decision from each IDS is termed a local decision. The fusion unit inside the MIU processes all the local decisions with the majority voting rule and makes the final decision. The proposed system shows a marked improvement in detection rate and a reduced false alarm rate.

1. Introduction

An intrusion detection system (IDS) monitors the behavior of a given environment and classifies activities as malicious or legitimate. There are two common approaches to intrusion detection: misuse detection and anomaly detection. Misuse detection via signature verification compares a user’s actions with the known signatures of attackers attempting to enter a system. It is useful for finding known intrusion types, but it cannot detect new attacks [1]. Anomaly detection identifies behavior that differs from well-known statistical patterns for users, systems, or networks. Machine learning techniques are used to capture the normal usage patterns and classify new behavior as either normal or anomalous. Although they are capable of detecting unknown attacks, anomaly detection systems produce a high false alarm rate [2]. Anomaly detection can be combined with signature verification to identify attacks.

Feature selection is the most crucial step in constructing any intrusion detection system [3]. The attributes or features identified as the most effective are extracted in order to construct a suitable IDS. Identifying the features that are relevant to the learning algorithm is a challenge. Redundant features can introduce noise that distracts the learning algorithm, degrades the accuracy of the IDS, and slows down the training and testing processes. Feature selection has been shown to have a high impact on the performance of classifiers, and experiments show that it can reduce the building and testing time of a classifier.

Multiclassifier Systems (MCSs) group classifiers with heterogeneous or homogeneous modeling backgrounds to give the final outcome. MCSs perform well when the data sample available for learning is very sparse. In such cases, MCSs can use bootstrapping methods such as bagging or boosting [4]. MCSs allow classifiers to be trained on partitions of a data set and their results to be combined using appropriate combination rules. Two canonical topologies are used in designing MCSs: parallel and serial. In the parallel topology, each classifier is supplied with the same input data, and the final decision is made on the basis of the outputs that the classifiers produce separately. In the serial (or conditional) topology, the classifiers are applied in a certain order, which implies some kind of grade or ordering over them.
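
The difference between the two topologies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy rule-based classifiers, the convention that a classifier returns `None` to abstain, the plurality vote used as the parallel combination rule, and the choice to let the first non-abstaining classifier decide in the serial case.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A classifier maps a record to a label, or returns None to abstain (assumed convention).
Classifier = Callable[[dict], Optional[str]]


def parallel_combine(classifiers: Sequence[Classifier], record: dict,
                     default: str = "normal") -> str:
    """Parallel topology: every classifier sees the same record; a plurality vote decides."""
    votes = [label for label in (clf(record) for clf in classifiers) if label is not None]
    if not votes:
        return default
    return Counter(votes).most_common(1)[0][0]


def serial_combine(classifiers: Sequence[Classifier], record: dict,
                   default: str = "normal") -> str:
    """Serial topology: classifiers run in a fixed order; the first one that answers decides."""
    for clf in classifiers:
        label = clf(record)
        if label is not None:
            return label
    return default


# Hypothetical toy rules, for illustration only.
def by_logins(record: dict) -> Optional[str]:
    return "attack" if record["failed_logins"] > 5 else None


def by_rate(record: dict) -> Optional[str]:
    return "attack" if record["conn_rate"] > 100 else "normal"


def always_normal(record: dict) -> Optional[str]:
    return "normal"


record = {"failed_logins": 7, "conn_rate": 10}
rules = [by_logins, by_rate, always_normal]
parallel_result = parallel_combine(rules, record)  # one "attack" vote against two "normal" votes
serial_result = serial_combine(rules, record)      # by_logins answers first
```

With the same three rules, the two topologies can disagree: the parallel vote is outnumbered by the "normal" answers, while the serial chain stops at the first rule that fires.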

The rest of the paper is organized as follows. Section 2 reviews related works. Section 3 describes the proposed methodology in detail, together with the algorithms for training and testing the multiple IDS. Section 4 discusses the performance evaluation of the experiments and their results. Section 5 concludes the study.

2. Related Works

Thomas and Balakrishnan [5] optimized the performance of IDS using the fusion of multiple IDSs. Their paper outlines the assignment of a weight to each IDS, and the weights are aggregated to reach a correct decision. The IDSs are evaluated on the DARPA 1999 data set, which is outdated and contains many redundant records that affect classifier accuracy. In their method, binary values are used to decide between attack and normal. Giacinto et al. [6] proposed a pattern-recognition approach based on the fusion of multiple classifiers for network intrusion detection. It provides a better tradeoff between generalization abilities and false alarm generation. Unfortunately, the performance of the fusion rules on unknown attacks shows no improvement over the results obtained by the individual networks. No fusion rule improves on the performance of the neural network trained on the overall feature set, which attains the same performance as the oracle. Siraj et al. [7] proposed the Decision Engine of an Intelligent Intrusion Detection System (IIDS), which fuses information from different intrusion detection sensors using an artificial intelligence technique. Like neural networks it cannot do self-learning and self-training. There is no functionality for customizing the standard attack. Parikh and Chen [8] proposed an ensemble of classifiers to combine data from various sources and reduce the cost of false alarms. The DLEARNIN and DCMS algorithms are used for this purpose. In their paper, sum and product rules are not used, and outputs are not directly compared. Giacinto et al. [9] proposed an unsupervised anomaly-based IDS. A combination of one-class classifiers is used in their work to design each module, with distinct features for training. For high values of false alarm rate, the system gives a low detection rate.

Li et al. [10] constructed a compact data set by clustering redundant data. Features are reduced from 41 to 19 using clustering, and the use of ant colony optimization improved the efficiency of intrusion detection. The combination of the critical features used in this method could not distinguish between attackers and normal users. Sung and Mukkamala [11] removed one feature at a time in experiments on SVM and neural networks. The KDDCup’99 data set was used to verify this technique. For five-class classification, only the 19 most significant of the 41 features are used. Li et al. [12] proposed a wrapper-based feature selection algorithm to construct a lightweight IDS. They applied a modified Random Mutation Hill Climbing (RMHC) as the search strategy and a modified linear SVM as the evaluation criterion. This method speeds up the process of selecting features and gives a high detection rate for the IDS. Since intruders are far more varied in today’s information era, there is considerable scope for designing improved IDSs, which motivates the proposed work.

3. The Proposed System

3.1. Motivation

With the advent of online business and social networks, the genuineness of the information available on the Internet has come into question. Many human and robot-based intruders act aggressively to take advantage of this information. Moreover, attacks on the Internet are nondeterministic in nature, which makes them very complex to detect and react to. Most present-day stand-alone intrusion detection systems cannot achieve a reasonably high detection rate together with a low false alarm rate. Most existing IDSs detect a certain class of attack with high accuracy while performing only moderately on the other classes. Combining the decisions of multiple intrusion detection systems makes it possible to obtain a more reliable and accurate decision for a wider class of attacks.

Processors are now fast enough that combining multiple IDSs is not a major issue from a computational point of view, and best-of-breed solutions have been achieved earlier. A better analysis of the existing data gathered by various individual IDSs can detect many attacks that currently go undetected. The literature survey shows that appropriate feature selection techniques simplify the models to make them easier to interpret, shorten training times, and enhance generalization by reducing overfitting. The challenges in designing and deploying IDSs are increasing due to the wider reach of Internet services and the nonavailability of a standard procedure for characterizing intruders.

3.2. The Proposed System Architecture

Anomaly-based IDSs identify abnormal, unusual behaviors on a network and tag them as attacks. They do not need any attack-specific knowledge. The disadvantage of this method is that it produces a larger number of false alarms. A signature-based IDS is well suited to detecting attacks that match a predefined pattern, and it produces very few false alarms. The fusion of signature-based and anomaly-based techniques is done for three main reasons. First, the false alarm rate should be minimal, which is possible only with signature-based IDS. Second, any IDS has to identify new attacks, which is possible through anomaly-based techniques. Third, every IDS is efficient in detecting specific types of attack. For example, anomaly-based IDS is suitable for detecting DOS and R2L attacks, and signature-based IDS is good at detecting U2R and PROBE attacks, as can be inferred from Table 6. The fusion of signature-based and anomaly-based techniques can detect more attacks with a lower false alarm rate. The proposed system consists of a Multiple IDS Unit (MIU), which contains five IDS units following five different algorithms.

The proposed system architecture is shown in Figure 1. It contains three phases of work. In the first phase, feature selection is done with the help of information gain (IG) and a genetic algorithm (GA). There are 41 features in total in the KDDCup’99 data set, and certain features are irrelevant or not needed for the IDS.

When all 41 features of the input traffic are taken for processing, processing is delayed and classification quality suffers. Experimenting with all combinations of the features is exponentially complex. Hence, only the relevant features are chosen with the help of a genetic algorithm (Algorithms 1 and 2), and the selected features are given as input to the next phase. This increases classifier accuracy and reduces computation time.

Algorithm 1: Information gain for each feature
Input: Feature set FS
Output: An array IG populated with the information gain value for each feature.
Initialize i = 0;
foreach (feature f in FS)
   IG[i] = IGR(f);
   i++;
endfor

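Algorithm 2: Fitness function
Input: Binary chromosome of length 41; array IG from Algorithm 1
Output: Information gain sum (igsum) with feature count (fcnt)
Initialize igsum = 0; fcnt = 0;
for (i = 0 to 40)
    if (chromosome[i] == 1) then
        igsum = igsum + IG[i];
        fcnt = fcnt + 1;
    endif
endfor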
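
The fitness evaluation of Algorithm 2 can be sketched in Python as follows. The chromosome is assumed to be a string of `0`/`1` characters, and the five information gain values in the usage example are hypothetical; the real run uses 41 values. How the genetic algorithm combines the two returned quantities into a single score is not specified in this text, so the sketch returns them separately.

```python
from typing import Sequence, Tuple


def fitness_terms(chromosome: str, ig: Sequence[float]) -> Tuple[float, int]:
    """Return (igsum, fcnt) for a binary chromosome, mirroring Algorithm 2.

    chromosome -- string of '0'/'1' characters, one per feature (assumed encoding)
    ig         -- information gain value of each feature, in the same order
    """
    if len(chromosome) != len(ig):
        raise ValueError("chromosome and IG array must have the same length")
    igsum = 0.0
    fcnt = 0
    for bit, gain in zip(chromosome, ig):
        if bit == "1":          # feature is selected
            igsum += gain
            fcnt += 1
    return igsum, fcnt


# Hypothetical 5-feature example (the paper uses 41 features).
toy_ig = [0.5, 0.25, 0.125, 0.0, 0.125]
igsum, fcnt = fitness_terms("11001", toy_ig)
```

In this toy case the first, second, and fifth features are selected, so `fcnt` is 3 and `igsum` is the sum of their three gain values.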

In the second phase, the output from the first phase (i.e., input traffic with the selected features alone) is given as input to the MIU, and the output is a local decision from each IDS, which assigns the input traffic to one of five categories (DOS, PROBE, U2R, R2L, and NORMAL). Five IDSs, each with a unique algorithm, are present in the MIU. The five IDS algorithms used are Support Vector Machines (SVM) [13], IBK, RandomForest, J48, and BayesNet. SVM, IBK, and RandomForest come under the category of anomaly-based IDS [1, 2]. J48 and BayesNet come under the category of signature-based IDS [1]. Every IDS algorithm in the MIU (Algorithm 3) receives the input traffic data record and classifies it, so five outputs (local decisions), one per IDS, are obtained for every input record.

Algorithm 3: MIU
Input: Input traffic data record set with all features
Output: Return whether each traffic data record is ATTACK or NOT_A_ATTACK
Process:
 (1) Find the information gain for each feature in the feature set and store it in IG, following Algorithm 1.
 (2) Select the features with the genetic algorithm, using Algorithm 2 as the fitness function.
 (3) Pass the input traffic data record with the selected features into a classification algorithm (SVM), which returns the
     attack category for each input traffic data record.
 (4) Repeat Step (3) with the other classification algorithms: IBK, J48, RandomForest, and BayesNet.
 (5) For each input traffic data record, there are now five local decisions from the five
     classification algorithms.
 (6) Label each local decision as ATTACK or NOT_A_ATTACK:
     If (decision == “DOS” ∥ decision == “PROBE” ∥ decision == “U2R” ∥ decision == “R2L”)
       Then
          label = ATTACK
       Else
          label = NOT_A_ATTACK
 (7) For each input traffic data record, the label from each of the five IDS units is either ATTACK or NOT_A_ATTACK; count
     the number of ATTACK labels.
       If (number of ATTACK labels ≥ 3)
           Final decision = ATTACK
       Else
           Final decision = NOT_A_ATTACK

In the third phase, the output from each IDS in the MIU, considered as a local decision, is passed on to the categorization unit. The input traffic categories are divided into two groups, ATTACK and NOT_A_ATTACK. The traffic categories DOS, PROBE, U2R, and R2L are labeled as the ATTACK group, and NORMAL is labeled as the NOT_A_ATTACK group. For example, if the output from IDS 2 is PROBE, then it falls under the ATTACK group. The fusion process is depicted in Figure 2. The output from the categorization unit for each local decision is taken to the decision unit, and the global decision is made based on the majority voting rule. If at least 3 out of the 5 outputs from the categorization unit indicate ATTACK, then the decision unit decides that the input traffic is of ATTACK type; otherwise it is NOT_A_ATTACK.
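
The categorization and decision units can be sketched as follows. The function names are hypothetical, and the threshold of 3 follows the "3 out of 5" majority rule stated in the text; the five local decisions in the usage example are invented inputs, not experimental output.

```python
from typing import Iterable

# Traffic categories that belong to the ATTACK group, as listed in the text.
ATTACK_CATEGORIES = {"DOS", "PROBE", "U2R", "R2L"}


def categorize(local_decision: str) -> str:
    """Categorization unit: map a five-class local decision to a binary label."""
    return "ATTACK" if local_decision.upper() in ATTACK_CATEGORIES else "NOT_A_ATTACK"


def fuse(local_decisions: Iterable[str], threshold: int = 3) -> str:
    """Decision unit: majority vote over the binary labels of the local decisions."""
    attack_votes = sum(1 for d in local_decisions if categorize(d) == "ATTACK")
    return "ATTACK" if attack_votes >= threshold else "NOT_A_ATTACK"


# Invented local decisions from the five IDS units, in an arbitrary order.
majority_attack = fuse(["DOS", "PROBE", "NORMAL", "R2L", "NORMAL"])
majority_normal = fuse(["NORMAL", "PROBE", "NORMAL", "DOS", "NORMAL"])
```

Note that the vote is taken on the binary labels, so three IDS units that disagree on the attack category (for example DOS, PROBE, and R2L) still form an ATTACK majority.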

3.3. Feature Selection
3.3.1. Information Gain Ratio (IGR)

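Let $S$ be a set of training samples with their corresponding labels. Suppose there are $m$ classes, the training set contains $s_i$ samples of class $i$, and $s$ is the total number of samples in the training set. The expected information needed to classify a given sample is calculated by using the equation

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s}.$$

Feature $F$ with values $\{f_1, f_2, \ldots, f_v\}$ can divide the training set into $v$ subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ is the subset which has the value $f_j$ for feature $F$. Furthermore, let $S_j$ contain $s_{ij}$ samples of class $i$. Entropy of the feature $F$ is

$$E(F) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} \, I(s_{1j}, \ldots, s_{mj}).$$

Information gain for $F$ can be calculated as

$$\mathrm{Gain}(F) = I(s_1, s_2, \ldots, s_m) - E(F).$$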
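
The three quantities described above (expected information, entropy of a feature, and information gain) can be computed for a discrete feature with the short sketch below. It assumes base-2 logarithms and categorical feature values; the four-record data set is invented for illustration. It computes plain information gain as the text describes, and any further ratio normalization is omitted because the text does not show it.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def expected_information(labels: Sequence[Hashable]) -> float:
    """Expected information needed to classify a sample, from the class counts."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Expected information of the whole set minus the entropy of the feature."""
    total = len(labels)
    subsets = defaultdict(list)            # one subset of labels per feature value
    for value, label in zip(feature_values, labels):
        subsets[value].append(label)
    feature_entropy = sum((len(subset) / total) * expected_information(subset)
                          for subset in subsets.values())
    return expected_information(labels) - feature_entropy


# Invented toy data: four records, two classes.
labels = ["attack", "attack", "normal", "normal"]
separating_feature = ["tcp", "tcp", "udp", "udp"]   # splits the classes perfectly
useless_feature = ["a", "b", "a", "b"]              # carries no class information

gain_separating = information_gain(separating_feature, labels)
gain_useless = information_gain(useless_feature, labels)
```

A feature that splits the two balanced classes perfectly recovers the full one bit of information, while a feature whose values are spread evenly across both classes has zero gain.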

3.3.2. GA-Based Feature Selection

To reduce the dimensionality and to get better accuracy, the relevant features have to be selected. Feature selection is done using a genetic algorithm. The fitness function of the genetic algorithm is designed so that the number of selected features is minimized while the sum of their information gain values is maximized. The genetic algorithm is designed to have a population size of 40. A binary chromosome of length 41 is constructed, with each bit representing a feature. This binary chromosome is given as input to the fitness function (Algorithm 2). The information gain values (IG) of the selected features (i.e., those whose bit is set to 1) are summed up to get the total information gain value (igsum). The total number of 1’s set in the chromosome gives the feature count (fcnt). For example, consider the following chromosome:

11011100011110101100111001110110011010001

Here bit 5 is set (i.e., its value is 1), which indicates that the 5th feature is selected for processing. In this chromosome, 24 bits are set in total, so the feature count (fcnt) is 24. The total information gain value (igsum) obtained by summing up the information gain (IG) of the 24 selected features is 0.37586. The genetic algorithm parameter values are listed in Table 1.
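
The example chromosome can be decoded with a few lines of Python to check the stated bit positions and feature count. Bits are numbered from 1, left to right, which is an assumption consistent with the statement that bit 5 is set. The igsum value cannot be reproduced here because the individual information gain values are not listed in this text.

```python
# Example chromosome from the text, one character per feature.
chromosome = "11011100011110101100111001110110011010001"

# 1-based positions of the selected features (bits set to 1).
selected_features = [position
                     for position, bit in enumerate(chromosome, start=1)
                     if bit == "1"]

chromosome_length = len(chromosome)
fcnt = len(selected_features)
print(f"length={chromosome_length}, fcnt={fcnt}")
```

The list `selected_features` is what would be used to project each traffic record onto the chosen feature subset before it is passed to the MIU.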

Table 2 gives the prominent feature combinations obtained for the different attack types using the genetic algorithm. The features that recur most often in the list are selected for the experiment.

The proposed implementation steps are given in Algorithm 3.

4. Performance Evaluation and Results

4.1. NSL-KDD Data Set

One of the main drawbacks of the KDDCup’99 data set is the repetition of records, which causes the learning algorithms to be biased towards the repeated records. This prevents them from learning the infrequent records, such as those of U2R and R2L attacks, which are usually more harmful to networks. In addition, the occurrence of these redundant records in the test set biases the performance results.

The NSL-KDD benchmark data set [14] has the following benefits over the KDDCup’99 data set:
(i) It does not include repeated records in the training set, and so the classifiers will not be partial towards more repeated records.
(ii) There is no replica record in the testing sets. Therefore, the performances of the learners are not biased.
(iii) The number of selected records from each group of difficulty level is inversely proportional to the percentage of records in the original KDDCup’99 data set and thus helps an accurate evaluation of different learning techniques.

As a result, the classification rates of various machine learning methods vary in a wider range, which makes it more efficient to detect different types of attacks. The sample distributions on the training and testing data sets with the corrected labels of NSL-KDD data set are shown in Table 3.

4.2. Performance Evaluation Metrics

The performance of the proposed intrusion detection system is evaluated with the help of a confusion matrix. The classification performance of the IDS is measured by false alarm rate, detection rate, and accuracy, which can be calculated using the confusion matrix in Table 4. In the confusion matrix, the rows represent the actual classes and the columns represent the predicted classes.
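
The formulas for the three measures are not reproduced in this text, so the sketch below assumes the standard definitions for a binary (attack versus normal) confusion matrix: detection rate = TP / (TP + FN), false alarm rate = FP / (FP + TN), and accuracy = (TP + TN) / (TP + FN + FP + TN). The counts in the usage example are invented.

```python
from typing import Dict


def ids_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute detection rate, false alarm rate, and accuracy from confusion-matrix counts.

    tp -- attacks predicted as attacks      fn -- attacks predicted as normal
    fp -- normal predicted as attacks       tn -- normal predicted as normal
    (Standard definitions, assumed here; ratios are returned as fractions in [0, 1].)
    """
    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "detection_rate": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fn + fp + tn),
    }


# Invented counts: 100 attack records and 100 normal records.
metrics = ids_metrics(tp=90, fn=10, fp=5, tn=95)
```

Multiplying the returned fractions by 100 gives percentages of the kind reported in the results tables.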

In this section, the performance of the proposed intrusion detection system is studied with the help of an experiment. In this experiment, only the relevant features are selected, using the information gain algorithm and the genetic algorithm. The selected features and the training data set are given as input to the MIU, and the performance measures accuracy, detection rate, and false alarm rate are considered for evaluation. The results are tabulated and plotted as graphs.

4.3. Experiment Results

All experiments were performed on a Windows platform with an Intel Core 2 Duo CPU at 2.49 GHz and 2 GB of RAM. Simulations and the analysis of experimental results were performed with the Weka machine learning tool [15] and Java.

In this experiment, the fusion IDS is trained on the selected features and tested on data in which 28.39% of the records are novel (new attack) data.

From Table 5 it is inferred that, for the J48 classifier, testing time is reduced by 57% when 28 features are used instead of all features.

From Table 6 it is inferred that the detection rate and false alarm rate of intrusion detection systems that use feature selection with a single classifier such as SVM, IBK, J48, RandomForest, or BayesNet are inferior to those of the fusion IDS unit. For example, for the U2R type of attack, the detection rate achieved by the SVM classifier is 86%, by IBK 83%, by J48 82.5%, and by BayesNet 80.5%. When a fusion IDS unit with multiple heterogeneous IDSs is used, a higher detection rate of 99% is achieved.

The false alarm rate (FAR) is reduced considerably when a fusion IDS unit with multiple heterogeneous IDSs is used. For example, the FAR for the DOS attack type is 0.7 using SVM, 0.3 using IBK, 0.1 using J48, 0.2 using RandomForest, and 0.3 using BayesNet. When the fusion IDS is used, the FAR drops to 0.0.

The detection rate (DTR) and false alarm rate (FAR) of the proposed system for the different attack types, using the selected features of the KDDCup’99 test data set, are tabulated in Table 7. On average, a detection rate of 98.4% is achieved, and the average false alarm rate is 0.68.

The experimental results of Thomas and Balakrishnan [5] are taken for a comparative study. Table 7 gives the detection rates of the proposed system and of the work of Thomas and Balakrishnan [5]. The detection rate for DOS is 64% in the previous work [5] and 99% for the proposed system. Similarly, for PROBE, U2R, and R2L, the detection rate improves considerably over the previous work [5], particularly for R2L. The false alarm rate for DOS is 36.20 in the work of Thomas and Balakrishnan [5], whereas in the proposed work it is reduced to 1.0; for PROBE, U2R, and R2L the false alarm rate has also decreased drastically.

Figures 3 and 4 present a comparative study of detection rate and false alarm rate of the proposed and existing fusion methods.

5. Conclusion

The key idea behind the study is that any IDS is efficient in detecting some specific attack category. Different IDSs that are good at detecting different attacks are combined to form an MIU. This paper uses only the relevant features of the input traffic data for processing, and a promising classification result is obtained from the MIU, which is a fusion of heterogeneous IDSs. In comparison with the work of Thomas and Balakrishnan [5], a good improvement in the detection rate and false alarm rate is achieved. When the detection rate and false alarm rate of a single IDS unit are compared with those of the fusion IDS unit, there is a vast improvement in performance. The feature selection done with the genetic algorithm has extracted the relevant features from the 41 features. As a result, training and testing speed improve and good accuracy is obtained. The binary interpretation of the anomaly score can be avoided in future work: the anomaly score can be normalized and multiplied by the respective weights, as in basic probability assignments.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.