Mathematical Problems in Engineering
Volume 2013, Article ID 312067, 21 pages
http://dx.doi.org/10.1155/2013/312067
Research Article

Empirical Study of Homogeneous and Heterogeneous Ensemble Models for Software Development Effort Estimation

¹Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
²College of Engineering, Tanta University, Tanta, Egypt

Received 13 March 2013; Revised 3 June 2013; Accepted 9 June 2013

Academic Editor: Ren-Jieh Kuo

Copyright © 2013 Mahmoud O. Elish et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Accurate estimation of software development effort is essential for effective management and control of software development projects. Many software effort estimation methods have been proposed in the literature, including computational intelligence models. However, none of the existing models has proved suitable under all circumstances; that is, their performance varies from one dataset to another. The goal of an ensemble model is to manage the strengths and weaknesses of its individual models automatically, so that the best possible overall decision is taken. In this paper, we develop different homogeneous and heterogeneous ensembles of optimized hybrid computational intelligence models for software development effort estimation. Different linear and nonlinear combiners are used to combine the base hybrid learners. We conduct an empirical study to evaluate and compare the performance of these ensembles using five popular datasets. The results confirm that individual models are not reliable, as their performance is inconsistent and unstable across different datasets. Although none of the ensemble models was consistently the best, many of them were frequently among the best models for each dataset. The homogeneous ensemble of support vector regression (SVR) models, using an adaptive neuro-fuzzy inference system with subtractive clustering (ANFIS-SC) as the nonlinear combiner, was the best model when considering the average rank of each model across the five datasets.
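
To make two ideas from the abstract concrete, the following minimal Python sketch shows (a) a linear combiner that merges the effort predictions of several base learners and (b) the computation of a model's average rank across datasets. It is an illustration under stated assumptions, not the procedure used in the study. The function names `combine_linear` and `average_ranks`, the (optionally weighted) averaging rule, the tie handling, and all numbers are hypothetical. The abstract does not specify which linear combiners or which error measure were used.

```python
def combine_linear(predictions, weights=None):
    """Combine base-model predictions with a weighted average (a linear combiner).

    predictions: one list of predicted efforts per base model, all of equal
                 length (one entry per project).
    weights:     one non-negative weight per base model; uniform if omitted.
    """
    if weights is None:
        weights = [1.0] * len(predictions)  # assumption: plain averaging by default
    total = sum(weights)
    per_project = zip(*predictions)  # regroup predictions project by project
    return [sum(w * p for w, p in zip(weights, project)) / total
            for project in per_project]


def average_ranks(errors):
    """Average rank of each model across datasets (rank 1 = lowest error).

    errors: {model_name: [error on dataset 1, error on dataset 2, ...]}.
    Tied models share the mean of the ranks they occupy (an assumption).
    """
    models = list(errors)
    n_datasets = len(next(iter(errors.values())))
    totals = {m: 0.0 for m in models}
    for d in range(n_datasets):
        ordered = sorted(models, key=lambda m: errors[m][d])
        i = 0
        while i < len(ordered):
            # Extend j to cover every model tied with ordered[i] on this dataset.
            j = i
            while (j + 1 < len(ordered)
                   and errors[ordered[j + 1]][d] == errors[ordered[i]][d]):
                j += 1
            shared_rank = ((i + 1) + (j + 1)) / 2
            for m in ordered[i:j + 1]:
                totals[m] += shared_rank
            i = j + 1
    return {m: totals[m] / n_datasets for m in models}


# Hypothetical predictions of three base learners for two projects.
base_predictions = [[10, 20], [14, 22], [12, 30]]
ensemble_estimate = combine_linear(base_predictions)

# Hypothetical error values of three models on two datasets.
ranks = average_ranks({"A": [0.3, 0.5], "B": [0.2, 0.4], "C": [0.3, 0.6]})
```

A nonlinear combiner such as ANFIS-SC would replace the weighted average with a trained model that takes the base learners' predictions as inputs. That step is omitted here because it depends on details the abstract does not give.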