Review Article

Deep Learning Methods for Malware and Intrusion Detection: A Systematic Literature Review

Table 2

Summary of the metadata extracted from the literature on Windows-based malware detection.

| Ref. | Description: method and features used to train and evaluate model | DL algorithm used | Library/framework used | Platform | Dataset used | Accuracy/F1 score |
|---|---|---|---|---|---|---|
| [26] | Malware classification by extracting static features and converting to gray images | CNN | Not stated | Windows | Kaggle by Microsoft | 98.86% |
| [27] | Malware classification by converting malware binary file to gray image through code mapping, texture partitioning, and texture extraction | CNN | Not stated | Windows | BIG 2015 | 99% |
| [28] | Malware classification by extracting series of system calls having malicious behavior | Not stated | Not stated | Windows | Self-generated | 95.6% |
| [29] | Malware detection and classification by using the op-code and API calls data of malware and benign-ware | CNN, BPNN | Not stated | Windows | Self-generated | 95% |
| [30] | Multilevel deep learning system for malware detection using different static and dynamic features | Proposed MLDLS | Not stated | Windows | Self-generated | Not stated |
| [31] | Ransomware detection system based on n-gram op-code with deep learning | CNN | Keras, TensorFlow | Windows | Self-generated | 89.5% |
| [32] | Malware detection by transforming PE file to op-code sequences and representing the op-code as n-gram vector | DBN | Not stated | Windows | Self-generated | About 98% |
| [33] | Malware detection by visualizing the malware binary file as gray image | CNN | MatConvNet in MATLAB | Windows | Self-generated | |
| [34] | Malware detection using API calls of Windows’ executable files | DAE, RBM | Not stated | Windows | Comodo Cloud Security Center’s dataset | Around 98% |
| [35] | Malware detection based on API calls sequence and statistical features | LSTM, RNN | TensorFlow | Windows | Self-generated | 95.7% |
| [36] | Identifying executable files as malware or benign using static and dynamic analysis and categorizing the malware to the corresponding family | CNN, LSTM | TensorFlow, Keras | Windows | Malimg, EMBER, self-generated | 98.8% |
| [37] | Hybrid image-based technique for malware detection by converting malware binaries to gray images | CNN, LSTM | TensorFlow, Keras | Windows | BIG 2015 | 96–97% |
| [38] | Malware detection by extracting API call sequences of malware using dynamic analysis and generating feature images | CNN | Not stated | Windows | VirusShare dataset | About 99% |
| [39] | Predicting malicious behavior of executable program based on small amount of behavioral data within the first few seconds of execution | RNN | Keras, scikit-learn | Windows | Self-generated | 96% |
| [40] | DLMD: malware detection technique based on static features using byte and ASM files | CNN | PyTorch | Windows | BIG 2015 | 97.5% |
| [41] | Malware detection extracting control flow graph of the sample by lazy binding and transforming it into an image | CNN | Not stated | Windows | MALICIA, VirusShare, VXHeaven | 92%–97.7% |
| [42] | Deep learning system with two hidden layers for malware detection using dependency of malware sequence and avoiding back-propagation | TELM | Not stated | Windows | Kaggle, VXHeaven | Above 99% |
| [43] | Malware classification by transforming malware binary file to grayscale images | CNN, LSTM | TensorFlow, Keras | Windows | BIG 2015 | 98.2% |
| [44] | Zero-day malware detection by generating fake malware and learning to distinguish it from the real malware | DAE, DCGAN | Keras | Windows | Kaggle | About 99% |
| [45] | Malware detection by visualizing the malware as grayscale image | CNN | TensorFlow | Windows | Malimg, Microsoft dataset | 99.97% |
| [46] | Detecting threats in the cloud-assisted Internet of things by extracting API calls data from malware | DBN | Not stated | Windows | VXHeaven | Up to 99.78% |
| [47] | Malware classification by visualizing the malware as grayscale image | CNN | Not stated | Windows | Malimg, BIG 2015 | 97.5% |
| [48] | Malware variants detection by visualizing malware samples as grayscale images | CNN | Caffe NN framework | Windows | Dataset by Vision Research Lab | 94.5% |
| [49] | Malware detection by converting malware executable to grayscale image and using NSGA-II algorithm to deal with data imbalance | CNN | TensorFlow | Windows | Dataset by Vision Research Lab | 97.6% |
| [50] | Malware detection by visualizing the malware sample as a grayscale image | Deep transfer learning | Not stated | Windows | Not stated | 99.25% |
| [51] | Malware detection by using static analysis to extract features of the malware samples | LSTM | Keras, TensorFlow | Windows | Self-generated dataset named MC-dataset-multiclass | 90.63% |
| [52] | Malware detection by visualizing the malware sample as a grayscale image | CNN | TensorFlow | Windows | Malimg | 80.5% |
| [53] | Malware detection by extracting features, like file activity, registry activity, service activity, processes, runtime DLLs and network activities, etc., and applying big data analytics techniques | Not stated | Keras | Windows | Self-generated | 97% |
| [54] | Malware classification by converting malware binaries to Markov images | CNN | Keras, TensorFlow | Windows | Microsoft dataset, Drebin dataset | 97.3% for Drebin, 99.3% for Microsoft |
| [55] | A comparative study of CNN and ELM-based detection systems using malware represented as grayscale images | CNN | Keras | Windows | Malimg | 96.3% for CNN, 97.7% for ELM |
| [56] | Metamorphic malware detection using API calls made on the operating system | LSTM | Keras | Windows | Self-generated API sequence dataset | Up to 98.5% |
| [57] | Malware detection by extracting features of PE files, including import functions feature, general information feature, and bytes entropy feature | Not stated | Not stated | Windows | Self-generated | AUC up to 0.989 |
| [58] | Cryptomining malware detection by static and dynamic analysis of the op-code sequences of PE files | CNN, LSTM, ATT-LSTM | Not stated | Windows | Self-generated | 97% on average |
| [59] | Malware classification using malware samples represented as grayscale images | CNN | Keras, TensorFlow | Windows | Malimg | 99.72% |
| [60] | Malware classification by extracting features including API calls, sequence of assembly language instructions, and malware’s binary contents | CNN | TensorFlow | Windows | Kaggle | 99.7% |
| [61] | Image-based malware classification system using an ensemble of CNN | CNN | TensorFlow, Keras, scikit-learn | Windows | Malimg | 99.5% |
| [62] | Malware detection by black-and-white embedding of malware images rather than grayscale to avoid bit loss in byte | CNN | Keras, TensorFlow | Windows | KISA dataset | 92.8% |
| [63] | Malware classification by generating a low-dimensional vector and using op-codes and API function calls to train model | Bi-LSTM | Not stated | Windows | Microsoft dataset | 96.8% |
| [64] | Malware detection by extracting the API calls sequence and generating the API pixel vector and finally visualizing the malware | CNN | Not stated | Windows | Self-generated | 94.7% |
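
Many of the studies summarized in Table 2 share one preprocessing step: the raw bytes of a malware binary are read as 8-bit pixel intensities and laid out row by row as a grayscale image, which is then passed to a CNN. The following is a minimal sketch of that general idea, not the pipeline of any specific surveyed paper. The function name `bytes_to_grayscale`, the fixed default width of 256 pixels, and the zero-padding of the last row are all assumptions made for illustration; individual studies differ in how they choose the image width and handle trailing bytes.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay out raw bytes as rows of `width` pixels with values 0-255.

    Hypothetical helper for illustration: the fixed width and the
    zero-padding of the final row are assumptions, not taken from
    any paper in the table.
    """
    if width <= 0:
        raise ValueError("width must be positive")

    rows: List[List[int]] = []
    for start in range(0, len(data), width):
        # Each byte is already an integer in 0..255, i.e. one gray pixel.
        row = list(data[start:start + width])
        # Pad the last (possibly short) row so the image is rectangular.
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


# Tiny example: five bytes arranged into rows of two pixels each.
image = bytes_to_grayscale(bytes([0, 127, 255, 16, 32]), width=2)
print(image)  # [[0, 127], [255, 16], [32, 0]]
```

In practice the resulting rows would typically be converted to an array or image object and resized to the fixed input size the chosen CNN expects; that step is omitted here because it depends on the library and architecture used.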