Security and Communication Networks
Volume 2017, Article ID 9838169, 10 pages
https://doi.org/10.1155/2017/9838169
Research Article

New Hybrid Features Selection Method: A Case Study on Websites Phishing

College of Computer Science and Information System, Najran University, Najran, Saudi Arabia

Correspondence should be addressed to Khairan D. Rajab; khairanr@gmail.com

Received 4 November 2016; Revised 26 February 2017; Accepted 8 March 2017; Published 19 March 2017

Academic Editor: Muhammad Khurram Khan

Copyright © 2017 Khairan D. Rajab. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Phishing is a serious web threat in which attackers mimic legitimate websites to deceive users into revealing their financial information. It has caused financial damage to a range of online stakeholders, with the damage on the order of hundreds of millions, so minimizing this risk is essential. Classifying websites as “phishy” or legitimate is a primary data mining task, and one that security experts and decision makers hope to improve, particularly with respect to the detection rate and the reliability of the results. One way to improve both reliability and performance is to identify a set of relevant features early on, so that data dimensionality is reduced and irrelevant features are discarded. To increase the reliability of preprocessing, this article proposes a new feature selection method that combines the scores of multiple known methods to minimize discrepancies in feature selection results. The proposed method is applied to the problem of website phishing classification to show its pros and cons in identifying relevant features. Results on a security dataset reveal that the proposed preprocessing method derives new feature datasets which, when mined, produce classifiers that are highly competitive in detection rate with those obtained from other feature selection methods.
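
To make the idea of combining scores from several feature selection methods concrete, here is a minimal Python sketch. It is not the article's algorithm: the abstract does not say which methods are combined or how, so the sketch assumes each method's scores are min-max normalized to [0, 1], averaged per feature, and then cut at a threshold. The function names, feature names, score values, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one method's scores to [0, 1] so different methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All features scored identically; this method carries no ranking signal.
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def combine_scores(per_method: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the normalized scores of every method for each feature.

    Assumption: every method scored the same set of features.
    """
    normalized = [min_max_normalize(m) for m in per_method]
    return {
        feature: sum(m[feature] for m in normalized) / len(normalized)
        for feature in normalized[0]
    }


def select_features(per_method: List[Dict[str, float]], threshold: float) -> List[str]:
    """Return features whose combined score meets the threshold, best first."""
    combined = combine_scores(per_method)
    ranked = sorted(combined, key=combined.get, reverse=True)
    return [f for f in ranked if combined[f] >= threshold]


# Hypothetical scores from two filter methods that use different scales.
method_a = {"has_ip_address": 0.40, "url_length": 0.10, "uses_https": 0.30, "favicon": 0.00}
method_b = {"has_ip_address": 80.0, "url_length": 50.0, "uses_https": 100.0, "favicon": 0.0}

selected = select_features([method_a, method_b], threshold=0.5)
print(selected)
```

In this toy input the two methods disagree on which feature ranks first, and averaging the normalized scores keeps both of their top candidates while dropping the features that neither method rates highly. Normalizing first matters because the raw scores sit on different scales, so an unnormalized average would be dominated by the method with the larger numbers.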