Journal of Computer Networks and Communications
Volume 2019, Article ID 4612474, 9 pages
https://doi.org/10.1155/2019/4612474
Research Article

Malicious Domain Names Detection Algorithm Based on N-Gram

1School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
2Department of Mathematics and Computer Science, Fort Valley State University, Fort Valley, GA 31030, USA

Correspondence should be addressed to Hong Zhao; 594286500@qq.com

Received 21 November 2018; Accepted 15 January 2019; Published 3 February 2019

Guest Editor: Saman S. Chaeikar

Copyright © 2019 Hong Zhao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Malicious domain name attacks have become a serious threat to Internet security. This study proposes a malicious domain name detection algorithm based on N-Gram. The method is built from the top 100,000 domain names in Alexa 2013. Each domain name, excluding its top-level domain, is segmented level by level into substrings of lengths 3, 4, 5, 6, and 7. These substrings form the substring set of the 100,000 domain names, and the weight value of each substring is calculated from its number of occurrences in that set. To test a domain name, it is segmented by the same N-Gram method, and its reputation value is calculated from the weight values of its substrings. The domain name is then judged malicious or legitimate by comparing its reputation value against a threshold. In experiments on Alexa 2017 and the Malware Domain List, the proposed algorithm achieved an accuracy of 94.04%, a false negative rate of 7.42%, and a false positive rate of 6.14%. Its time complexity is lower than that of other popular malicious domain name detection algorithms.
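
The pipeline summarized in the abstract (segment, count, weight, score, threshold) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the weight formula, the reputation formula, or the threshold, so the logarithmic weight, the mean-weight reputation, and the rule "low reputation means malicious" are all assumptions made here for illustration. The function names (`segment`, `build_weights`, `reputation`, `is_malicious`) are hypothetical, and the top-level domain is removed naively by dropping the last label.

```python
from collections import Counter
from math import log2

NGRAM_LENGTHS = (3, 4, 5, 6, 7)  # substring lengths named in the abstract


def segment(domain):
    """Split each level of a domain name into substrings of length 3 to 7."""
    # Assumption: the top-level domain is the last dot-separated label.
    # Multi-label suffixes such as "co.uk" would need a public suffix list.
    labels = domain.lower().strip(".").split(".")[:-1]
    grams = []
    for label in labels:
        for n in NGRAM_LENGTHS:
            grams.extend(label[i:i + n] for i in range(len(label) - n + 1))
    return grams


def build_weights(legitimate_domains):
    """Count substrings over the legitimate set and turn counts into weights."""
    counts = Counter(g for d in legitimate_domains for g in segment(d))
    # Assumed weighting: log2(count) + 1. The abstract only says the weight
    # is calculated from the occurrence number.
    return {g: log2(c) + 1 for g, c in counts.items()}


def reputation(domain, weights):
    """Assumed reputation: mean weight of the substrings (unseen ones score 0)."""
    grams = segment(domain)
    if not grams:
        return 0.0  # labels shorter than 3 characters yield no substrings
    return sum(weights.get(g, 0.0) for g in grams) / len(grams)


def is_malicious(domain, weights, threshold):
    """Assumed decision rule: a reputation below the threshold is malicious."""
    return reputation(domain, weights) < threshold


if __name__ == "__main__":
    # Toy stand-in for the 100,000 Alexa domain names.
    weights = build_weights(["google.com", "googlemail.com", "mail.google.com"])
    for name in ("google.com", "xkqzvbw.com"):
        print(name, is_malicious(name, weights, threshold=0.5))
```

In this sketch a domain whose labels are all shorter than three characters produces no substrings and receives a reputation of 0, so a real system would need a separate rule for such names. The threshold of 0.5 is an arbitrary placeholder, not a value taken from the paper.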