Review Article

Data Fusion for Network Intrusion Detection: A Review

Table 2

The performance of different feature reduction algorithms.
(a)

Feature fusion techniques | Metrics
Article | Dataset | Number of training/testing data | Number of features | Classifier | Identified attack types | Validity | Efficiency | Data security | Scalability
Validity: ACC | PR | RR | F-Score | FPR | FNR | FAR; Efficiency: Training time (s) | Testing time (s)

NN | [8] | DARPA996819/367984/37 | MLF | All | × | ×
[9] | KDD99 | 7000/7000 | 41/13 | NN | Attack/Normal | 99.41% | ×
[10] | KDD99_10% | 41/34 | NN | Attack/Normal | 81.57% | 18.19% | 0.25% | 9.22% | × | ×

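PCA | [11] | KDD99_10% | 5000/5000 | 41/10 | SVM | Probe | 99.78% | 99.85% | 99.70% | 99.77% | 276 | × | ×
KDD99_10% | 5000/5000 | 41/10 | SVM | R2L | 99.70% | 99.50% | 99.39% | 99.53% | 237 | × | ×
[12] | Kyoto 2006+ | 31360/47040 | 18/5 | MLP | Attack/Normal | 97.12% | 4.29% | 1.44% | 2.87% | 22.14 | × | ×
[13] | NSL_KDD | 125971/22000 | 41/23 | NN | All | 86.49% | 83.95% | 83.78% | × | ×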

Fisher-Score | [14] | KDD99_10% | 1500/1500 | 41/13 | RBF-NN | Attack/Normal | 85.33% | 5.40% | 6.71 | 0.13 | × | ×
41/19 | 91.27% | 5.20% | 8.38 | 0.13 | × | ×
41/28 | 95.70% | 6.31% | 9.07 | 0.18 | × | ×
41/41 | 96.51% | 11.74% | 13.05 | 0.41 | × | ×

RF | [15] | KDD99_10% | 16919/49838 | 41/34 | RF | Attack/Normal | 94.20% | 1.10% | × | ×

GA-LR | [16] | KDD99_10% | 1000/1000 | 41/18 | RF | Attack/Normal | 99.90% | 99.81% | 0.11% | × | ×
UNSW-NB15 | 2000/2000 | 49/20 | C4.5 | 81.42% | 6.39% | × | ×

Filter-MISF | [17] | KDD99_10% | 15246/478775 | 41/6 | LS-SVM | Attack/Normal | 99.90% | 99.93% | 99.53% | 0.07% | 63.19 | 27.32 | × | ×
Filter | 41/19 | 99.75% | 99.43% | 99.34% | 0.17% | 87.83 | 30.64 | × | ×
MISF | 41/25 | 99.70% | 99.38% | 99.34% | 0.23% | × | ×

GFR | [18] | KDD99_10% | 550/115705 | 41/19 | Multi-class SVM | All | 98.62% | 0.124.63 | × | ×
FRM-SFM | 41/10 | 98.68% | 0.167.8 | × | ×

SVM | [10] | KDD99_10% | 41/30 | Multi-class SVM | All | 99.61% | 81.186.36 | × | ×
[19] | KDD99_10% | 424/106 | 41/17 | SVM | DoS and Probe | 99.30% | × | ×
[9] | KDD99 | 7000/7000 | 41/13 | SVM | Attack/Normal | 99.52% | 0.50% | 1631.06 | × | ×

SA-SVM | [20] | KDD99 | 93969/10441 | 41/23 | SA-DT | All | 99.96% | × | ×

Chi-Square | [21] | NSL_KDD | 8325/24975 | 41/31 | Multi-class SVM | All | 98.00% | 0.13%

LR: logistic regression; MISF: mutual information-based feature selection; GFR: gradually feature removal method; FRM: feature removal method; SFM: sole feature method; SA: simulated annealing; MLF: multilayer feed-forward; LS: least squares; and DT: decision tree. given. mentioned. Number of features: the two slash-separated values are the number of features before and after fusion, respectively.
(b)

Feature fusion techniques | Metrics
Article | Dataset | Number of training/testing data | Number of features | Classifier | Identified attack types | Validity | Efficiency | Data security | Scalability
Validity: ACC | PR | RR | F-Score | FPR | FNR | FAR; Efficiency: Training time (s) | Testing time (s)

GeFS-CFS | [19] | KDD99_10% | 424/106 | 41/4.5 | C4.5 | DoS and Probe | 99.20% | × | ×
GeFS-mRMR | 41/18 | 99.60% | × | ×
Markov-Blanket | 41/17 | BN | 98.70% | × | ×

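BN | [22] | KDD99 | 4700/4700 | 41/30 | BN | All | 83.13% | 37.44 | × | ×
[23] | KDD99 | 5092/6890 | 41/17 | BN | All | 91.06% | 112.1159.4 | × | ×

CART | [23] | KDD99 | 5092/6890 | 41/12 | CART | All | 88.52% | 3.86 | 0.19 | × | ×
[19] | KDD99_10% | 424/106 | 41/12 | CART | DoS and Probe | 94.30% | × | ×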

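FMIFS | [24] | KDD99 | 41/19 | LS-SVM | Attack/Normal | 99.79% | 99.46% | 0.13% | × | ×
NSL_KDD | 99.91% | 98.76% | 0.28% | × | ×
Kyoto 2006+ | 99.77% | 99.64% | 0.13% | × | ×

MIFS | [24] | KDD99 | 41/25 | LS-SVM | Attack/Normal | 99.70% | 99.38% | 0.23% | × | ×
NSL_KDD | 97.96% | 95.96% | 0.53% | × | ×
Kyoto 2006+ | 99.32% | 98.59% | 0.16% | × | ×

FLCFS | [24] | KDD99 | 41/17 | LS-SVM | Attack/Normal | 97.63% | 89.26% | 0.34% | × | ×
NSL_KDD | 96.75% | 93.26% | 0.47% | × | ×
Kyoto 2006+ | 99.12% | 98.10% | 0.58% | × | ×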

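CFS-GA | [25] | NSL_KDD | 125973/22544 | 41/4 | J48 | Attack/Normal | 91.86% | 0.22 | × | ×

BBAL-NB | [26] | NSL_KDD | 9566/4500 | 41/15 | NB | Attack/Normal | 91.62% | 5.73% | 0.76 | × | ×
BBAL-SVM | 41/16 | SVM | 95.87% | 2.89% | 68.77 | × | ×

FVBRM | [27] | NSL_KDD | 56687/6299 | 41/24 | NB | All | 97.78% | 9.42 | × | ×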

ML | [28] | KDD99 | 90000/10000 | 41/16 | SVM | All | 90.36% | × | ×
NSL_KDD | 90000/10000 | 41/16 | Attack/Normal | 89.35% | × | ×
Kyoto 2006+ | 90000/10000 | 23/8 | Attack/Normal | 87.12% | × | ×

ARM | [29] | KDD99 | 41/11 | NB | All | 62.02% | × | ×
UNSW-NB15 | 49/11 | 37.50% | × | ×

HVS | [12] | Kyoto 2006+ | 31360/47040 | 18/5 | MLP | Attack/Normal | 98.28% | 3.05% | 0.35% | 1.70% | 64.47 | 0.02 | × | ×
PLS | 18/5 | PLS | Attack/Normal | 94.72% | 6.52% | 4.02% | 5.27% | 0.02 | 0.03 | × | ×

CFS | [30] | NSL_KDD | 25192/11850 | 41/8 | RandomTree | All | 97.49% | 2.50%

GeFS: generic feature selection; mRMR: minimal redundancy maximal relevance; BN: Bayesian networks; CART: classification and regression tree; FMIFS: flexible mutual information feature selection; MIFS: mutual information feature selection; FLCFS: flexible linear correlation coefficient based feature selection; BBAL: binary bat algorithm with Lévy flights; FVBRM: feature vitality based reduction method; ML: feature vitality based reduction method; ARM: association rule mining; HVS: heuristic for variable selection; and PLS: partial least squares regression. given. mentioned. Number of features: the two slash-separated values are the number of features before and after fusion, respectively.
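
The validity metrics in Table 2 can be related to one another through a binary confusion matrix. The following is a minimal sketch rather than the procedure used by any of the cited articles. It assumes that PR and RR denote precision and recall, that attacks are the positive class, and that FAR is the mean of FPR and FNR. The last assumption is not stated in the table, but it agrees with the rows that report all three rates (for example, the [12] rows with 4.29%, 1.44%, and 2.87%). The names `ConfusionCounts`, `validity_metrics`, and `far_from_rates` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'attack' as the positive class."""
    tp: int  # attacks flagged as attacks
    tn: int  # normal traffic flagged as normal
    fp: int  # normal traffic flagged as attacks
    fn: int  # attacks flagged as normal


def validity_metrics(c: ConfusionCounts) -> dict:
    """Return the validity metrics as fractions in [0, 1].

    Assumes every denominator is nonzero, i.e. both classes occur
    and at least one sample is predicted as an attack.
    """
    total = c.tp + c.tn + c.fp + c.fn
    pr = c.tp / (c.tp + c.fp)   # assumed meaning of PR: precision
    rr = c.tp / (c.tp + c.fn)   # assumed meaning of RR: recall
    fpr = c.fp / (c.fp + c.tn)  # false positive rate
    fnr = c.fn / (c.fn + c.tp)  # false negative rate
    return {
        "ACC": (c.tp + c.tn) / total,
        "PR": pr,
        "RR": rr,
        "F-Score": 2 * pr * rr / (pr + rr),
        "FPR": fpr,
        "FNR": fnr,
        "FAR": (fpr + fnr) / 2,  # assumed definition, see lead-in
    }


def far_from_rates(fpr_percent: float, fnr_percent: float) -> float:
    """FAR in percent under the assumed definition (mean of FPR and FNR)."""
    return (fpr_percent + fnr_percent) / 2


if __name__ == "__main__":
    # Illustrative counts only; they do not come from any cited article.
    for name, value in validity_metrics(ConfusionCounts(90, 80, 20, 10)).items():
        print(f"{name}: {value:.4f}")
    # Consistency check against FPR and FNR values reported in Table 2.
    print(far_from_rates(4.29, 1.44))
```

Applying `far_from_rates` to the FPR and FNR values of a row gives a quick consistency check on its reported FAR, which is useful when comparing articles that may define the false alarm rate differently.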