Scientific Programming
Volume 2017, Article ID 3273891, 16 pages
https://doi.org/10.1155/2017/3273891
Research Article

An Efficient Platform for the Automatic Extraction of Patterns in Native Code

1 Computer Science Department, University of Oviedo, Calvo Sotelo s/n, 33007 Oviedo, Spain
2 Cork Institute of Technology, Computer Science Department, Rossa Avenue, Bishopstown, Cork, Ireland

Correspondence should be addressed to Francisco Ortin; ortin@uniovi.es

Received 30 September 2016; Revised 26 December 2016; Accepted 17 January 2017; Published 28 February 2017

Academic Editor: Raphaël Couturier

Copyright © 2017 Javier Escalada et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Many software tools, such as decompilers, code quality analyzers, recognizers of packed executable files, authorship analyzers, and malware detectors, search for patterns in binary code. Machine learning algorithms, trained with programs taken from the huge number of applications available in open source code repositories, can find patterns that the manual approach does not detect. To this end, we have created a versatile platform for the automatic extraction of patterns from native code, capable of processing large binary files. Its implementation has been parallelized, providing significant runtime performance benefits on multicore architectures. Compared to single-processor execution, the best configuration achieves an average speedup of 3.5, out of a maximum theoretical speedup of 4.
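
As a reading aid, the two performance figures reported in the abstract can be related through the usual definitions of speedup and parallel efficiency. This is only a sketch: it assumes that the maximum theoretical gain of 4 corresponds to execution on four cores (p = 4), which the abstract does not state explicitly, and the symbols T_1, T_p, S, and E are introduced here for illustration only.

```latex
% Assumption: p = 4 cores, inferred from the stated theoretical maximum of 4.
% T_1: single-processor execution time; T_p: execution time on p cores.
\[
  S = \frac{T_1}{T_p} \approx 3.5,
  \qquad
  E = \frac{S}{p} = \frac{3.5}{4} = 0.875
\]
```

Under that assumption, the best configuration would reach roughly 87.5% of the ideal linear speedup.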