Security and Communication Networks

Research Article

BLATTA: Early Exploit Detection on Network Traffic with Recurrent Neural Networks

Table 1

Related work in exploit detection. Unlike previous works, Blatta can detect exploit traffic without reading to the end of application layer messages.


Paper	Features	Detection method	Dataset(s)	Learning type	Protocol(s)	Early prediction

PAYL [7]	Relative frequency count of each 1-gram	Based on statistical model and Mahalanobis distance	D, SG	U	HTTP, SMTP, SSH	No
RePIDS [8]	Mahalanobis distance map derived from the relative frequency count of each 1-gram, filtered by PCA.	Based on statistical model and Mahalanobis distance	D, M	U	HTTP	No
McPAD [9]	n-grams	Multi one-class SVM classifier	D, M	U	HTTP	No
HMMPayl [10]	Byte sequences of the L7 payload.	Ensemble of HMMs	D, M, DI	U	HTTP	No
Oza et al. [11]	Relative frequency count of each 1-gram.	Based on statistical model	D, M, SG	U	HTTP	No
OCPAD [12]	High-order n-grams (n > 1).	Based on the occurrence probability of an n-gram in a packet	M, SG	U	HTTP	No
Bartos et al. [13]	Information from HTTP request headers and the lengths	SVM	SG	S	HTTP	No
Zhang et al. [14]	Packet header information and HTTP and DNS messages	Naïve Bayes, Bayesian network, SVM	D, SG	S	DNS, HTTP	No
Decanter [15]	HTTP messages	Clustering	SG	U	HTTP	No
Golait and Hubballi [16]	Byte sequence of the L7 payload	Probabilistic counting deterministic timed automata	SG	U	SIP	No
Duessel et al. [17]	Contextual n-grams of the L7 payload	One-class SVM	SG	U	HTTP, RPC	No
Min et al. [18]	Words of the L7 payload	CNN and random forest	I	S	HTTP	No
Jin et al. [19]	n-grams	Multi one-class SVM classifier	M	U	HTTP	No
Hao et al. [20]	Byte sequence of the L7 payload	Variant gated recurrent unit	I	S	HTTP	No
Schneider and Bottinger [21]	Byte sequence of the L7 payload	Stacked autoencoder	O	U	Modbus	No
Hamed et al. [22]	n-grams of base64-encoded payload	SVM	I	S	All protocols in the datasets	No
Pratomo et al. [23]	Byte frequency of application layer messages	Outlier detection with deep autoencoder	SW	U	HTTP, SMTP	No
Qin et al. [24]	Byte sequence of the L7 payload	Using a recurrent neural network	O	S	HTTP	No
Liu et al. [25]	Byte sequence of the L7 payload	Using a recurrent neural network with embedded vectors	D, O	S	HTTP	No
Zhao and Ahn [26]	Disassembled instructions of bytes in network traffic	Employing Markov chain-based model and SVM	SG	S	Not mentioned	No
Shabtai et al. [27]	n-grams of a file and n-grams of opcodes in a file, with TF/IDF calculated over those n-grams	Various ML algorithms, e.g., random forest, decision tree, Naïve Bayes, and a few others	SG	S	File classification	No
SigFree [28]	Disassembled instructions of bytes in application layer payload	Analyses of instruction sequences to determine if they are code	SG	Non-ML	HTTP	No
Proposed approach	High-order n-grams of application layer messages	Recurrent neural network for early prediction of exploit traffic	SW, SG	S	HTTP, FTP	Yes

D = DARPA99; M = McPAD attacks dataset [9]; I = ISCX 2012; SG = self-generated; DI = DIEE; SW = UNSW-NB15; O = others; U = unsupervised; S = supervised; Non-ML = non-machine learning approach.
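
Most features in Table 1 are byte n-grams of the application layer payload: either the relative frequency of each 1-gram or higher-order n-grams. The following is a minimal illustrative sketch of both, assuming the payload is available as a Python bytes object. The function names byte_ngrams and relative_frequencies are hypothetical and are not taken from any of the cited works, and the sketch does not reproduce the preprocessing of any specific paper.

```python
from collections import Counter


def byte_ngrams(payload: bytes, n: int) -> list[bytes]:
    """Return every overlapping window of n consecutive bytes (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A payload shorter than n yields an empty list.
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


def relative_frequencies(payload: bytes, n: int = 1) -> dict[bytes, float]:
    """Map each distinct n-gram to its count divided by the total number of n-grams."""
    grams = byte_ngrams(payload, n)
    total = len(grams)
    if total == 0:
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    message = b"GET /index.html HTTP/1.1"
    # 1-gram relative frequencies, the feature family listed for PAYL-style detectors.
    print(relative_frequencies(message, n=1))
    # High-order n-grams (here n = 3), the feature family listed for the proposed approach.
    print(byte_ngrams(message, 3)[:5])
```

With n = 1 the result is a byte-frequency distribution that discards ordering, whereas n > 1 keeps local byte order, at the cost of a much larger feature space.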