Advances in Bioinformatics
Volume 2016, Article ID 5670851, 6 pages
http://dx.doi.org/10.1155/2016/5670851
Research Article

Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants

¹Computer Science, The College of Sakhnin, 30810 Sakhnin, Israel
²The Institute of Applied Research, The Galilee Society, P.O. Box 437, 20200 Shefa Amr, Israel
³Molecular Biology and Genetics, Izmir Institute of Technology, Urla, 35430 Izmir, Turkey
⁴Bionia Incorporated, IZTEKGEB A8, Urla, 35430 Izmir, Turkey

Received 31 October 2015; Accepted 16 March 2016

Academic Editor: Paul Harrison

Copyright © 2016 Malik Yousef et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and therefore needs to be supplemented with computational miRNA detection. Currently, computational miRNA detection is mainly performed using machine learning, in particular two-class classification. For machine learning, miRNAs need to be parametrized, and more than 700 features have been described. Positive training examples are readily available, but negative data are hard to come by. Therefore, it seems appropriate to use one-class classification, which trains on positive examples only, instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that accuracy differs by up to 36% among these feature selection methods. The best feature set allowed the training of a one-class classifier that achieved an average accuracy of ~95.6%, thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
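To illustrate why the choice of feature subset can matter so much for a one-class classifier, the following minimal sketch trains a simple centroid-distance model on positive examples only and evaluates it with two different feature subsets. It is not the classifier, feature set, or data used in this work: the synthetic data, the function names `fit_one_class`, `accuracy`, and `sample`, and the 95% acceptance threshold are all assumptions made for illustration.

```python
import random
import statistics

def fit_one_class(positives, feature_idx, quantile=0.95):
    """Fit a centroid-distance one-class model using positive examples only."""
    cols = [[row[j] for row in positives] for j in feature_idx]
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread

    def score(row):
        # Mean squared z-score over the selected features
        return sum(((row[j] - m) / s) ** 2
                   for j, m, s in zip(feature_idx, means, stds)) / len(feature_idx)

    train_scores = sorted(score(r) for r in positives)
    # Threshold chosen so that roughly `quantile` of the training positives are accepted
    cut = train_scores[min(len(train_scores) - 1, int(quantile * len(train_scores)))]
    return lambda row: score(row) <= cut

def accuracy(model, positives, negatives):
    """Fraction of correctly accepted positives and correctly rejected negatives."""
    correct = sum(model(r) for r in positives) + sum(not model(r) for r in negatives)
    return correct / (len(positives) + len(negatives))

rng = random.Random(0)

def sample(n, shift):
    # Hypothetical data: features 0-1 differ between classes, features 2-3 are pure noise
    return [[rng.gauss(shift, 1), rng.gauss(shift, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
            for _ in range(n)]

train_pos = sample(200, 0.0)
test_pos = sample(200, 0.0)
test_neg = sample(200, 4.0)  # negatives are used for evaluation only, never for training

results = {
    name: accuracy(fit_one_class(train_pos, idx), test_pos, test_neg)
    for name, idx in {"informative": [0, 1], "noise": [2, 3]}.items()
}
print(results)
```

In this toy setting the model built on the informative features separates the classes well, while the model built on the noise features accepts most negatives and performs close to chance. Negative examples appear only in the evaluation step, which mirrors the one-class setting described above.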