Security and Communication Networks

Research Article

Empirical Evaluation of Noise Influence on Supervised Machine Learning Algorithms Using Intrusion Detection Datasets

Table 4

Distribution of network traffic across categories before and after noise filtering, in both the training and the testing portions of the UNSW-NB15 dataset.


Traffic categories	The training portion				The testing portion
	Before noise filtering		After noise filtering		Before noise filtering		After noise filtering
	No. of instances	Distribution (%)	No. of instances	Distribution (%)	No. of instances	Distribution (%)	No. of instances	Distribution (%)

Normal	56,000	31.9377	7,503	13.7093	37,000	44.9399	10,045	31.9213
Analysis	2,000	1.1406	1,275	2.3296	677	0.8222	534	1.6969
Reconnaissance	10,491	5.9831	7,313	13.3622	3,496	4.2462	2,801	8.9011
Shellcode	1,133	0.6461	801	1.4635	378	0.4591	292	0.9279
Fuzzers	18,184	10.3706	8,253	15.0797	6,062	7.3628	3,851	12.2378
Worm	130	0.0741	10	0.0182	44	0.0534	11	0.0349
Generic	40,000	22.8126	5,650	10.3235	18,871	22.9206	7,052	22.4100
DoS	12,264	6.9943	8,868	16.2034	4,089	4.9664	2,442	7.7602
Exploits	33,393	19.0446	13,672	24.9812	11,132	13.5208	3,981	12.6509
Backdoors	1,746	0.9957	1,384	2.5288	583	0.7081	459	1.4586
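
The "Distribution (%)" columns appear to be each category's instance count divided by the column total, expressed as a percentage. The following is a minimal sketch of that arithmetic for the two training-portion columns, assuming this is how the shares were derived. The counts are copied from the table; the names `train_before`, `train_after`, and `distribution` are illustrative and do not come from the article.

```python
# Instance counts copied from Table 4 (training portion of UNSW-NB15).
train_before = {
    "Normal": 56000, "Analysis": 2000, "Reconnaissance": 10491,
    "Shellcode": 1133, "Fuzzers": 18184, "Worm": 130,
    "Generic": 40000, "DoS": 12264, "Exploits": 33393, "Backdoors": 1746,
}
train_after = {
    "Normal": 7503, "Analysis": 1275, "Reconnaissance": 7313,
    "Shellcode": 801, "Fuzzers": 8253, "Worm": 10,
    "Generic": 5650, "DoS": 8868, "Exploits": 13672, "Backdoors": 1384,
}


def distribution(counts):
    """Return each category's share of the column total, in percent.

    Assumption: share = 100 * count / sum of all counts in the column.
    """
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


shares_before = distribution(train_before)
shares_after = distribution(train_after)

# Print the categories side by side to show how the class balance shifts.
print(f"{'Category':<16}{'Before (%)':>12}{'After (%)':>12}")
for name in train_before:
    print(f"{name:<16}{shares_before[name]:>12.4f}{shares_after[name]:>12.4f}")
```

Recomputing the shares this way is a quick consistency check on a table of this kind: within each column the percentages should sum to 100, up to rounding or truncation of the printed digits.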