Applied Computational Intelligence and Soft Computing
Volume 2018, Article ID 4084850, 20 pages
https://doi.org/10.1155/2018/4084850
Research Article

A Comparison Study on Rule Extraction from Neural Network Ensembles, Boosted Shallow Trees, and SVMs

Guido Bologna1,2 and Yoichi Hayashi3

1Department of Computer Science, University of Applied Sciences and Arts Western Switzerland, Rue de la Prairie 4, 1202 Geneva, Switzerland
2Department of Computer Science, University of Geneva, Route de Drize 7, 1227 Carouge, Switzerland
3Department of Computer Science, Meiji University, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan

Correspondence should be addressed to Guido Bologna; guido.bologna@hesge.ch

Received 27 July 2017; Revised 17 November 2017; Accepted 4 December 2017; Published 9 January 2018

Academic Editor: Erich Peter Klement

Copyright © 2018 Guido Bologna and Yoichi Hayashi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

One way to make the knowledge stored in an artificial neural network more intelligible is to extract symbolic rules. However, producing rules from Multilayer Perceptrons (MLPs) is an NP-hard problem. Many techniques have been introduced to generate rules from single neural networks, but very few have been proposed for ensembles. Moreover, experiments have rarely been assessed by 10-fold cross-validation trials. In this work, based on the Discretized Interpretable Multilayer Perceptron (DIMLP), experiments were performed on 10 repetitions of stratified 10-fold cross-validation trials over 25 binary classification problems. The DIMLP architecture allowed us to produce rules from DIMLP ensembles, boosted shallow trees (BSTs), and Support Vector Machines (SVMs). The complexity of rulesets was measured by the average number of generated rules and the average number of antecedents per rule. Across the 25 classification problems, the most complex rulesets were generated from BSTs trained by "gentle boosting" and "real boosting." Moreover, we clearly observed that the less complex the rules were, the higher their fidelity was. In fact, rules generated from decision stumps trained by modest boosting were, for almost all 25 datasets, the simplest and had the highest fidelity. Finally, in terms of average predictive accuracy and average ruleset complexity, our results proved competitive with those reported in the literature.
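
The evaluation protocol described above (10 repetitions of stratified 10-fold cross-validation) can be sketched as follows. This is a minimal illustration using only the Python standard library; the `stratified_kfold` helper is hypothetical and is not the authors' implementation:

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Split sample indices into k folds, preserving class proportions.

    Each class's indices are shuffled, then dealt round-robin across the
    k folds, so every fold mirrors the overall class distribution.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for i, idx in enumerate(idxs):
            folds[i % k].append(idx)
    return folds

# 10 repetitions of stratified 10-fold CV over a toy binary problem
# (70 negatives, 30 positives); each repetition reshuffles the folds.
labels = [0] * 70 + [1] * 30
for rep in range(10):
    folds = stratified_kfold(labels, k=10, seed=rep)
    for test_fold in folds:
        test_set = set(test_fold)
        train = [i for i in range(len(labels)) if i not in test_set]
        # ... train a model on `train`, extract rules,
        #     then measure accuracy and fidelity on `test_fold`
```

Stratification matters here because several of the benchmark problems are class-imbalanced: plain random folds could leave a fold with almost no minority-class samples, distorting both accuracy and fidelity estimates.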