Research Article

Empirical Analysis of Machine Learning Algorithms for Multiclass Prediction

Table 1

Comparison of ML- and DL-based classifiers.

| Classifier | Underlying methodology | Applicability | Prediction/label type | Advantage(s) | Disadvantage(s) |
|---|---|---|---|---|---|
| Naïve Bayes [35] | Bayes' theorem | Classification | Categorical | Little parameter tuning; modest training-data requirements; computationally fast | Assumes conditional independence between attributes |
| Decision trees [36] | Iterative Dichotomiser 3 (ID3) | Classification, regression | Categorical, continuous | Simple to interpret; often high accuracy | Target attribute must take discrete values; struggles with complex, high-dimensional, or imbalanced datasets; greedy tree construction; prone to overfitting |
| Random forest [12] | Bagging (aggregation) of decision trees built with the C4.5 algorithm | Classification, regression | Categorical, continuous | Resistant to overfitting; reduces the error rate of individual DTs | Generates many trees in parallel; computationally slow on large, complex datasets |
| Gradient boosted trees [37, 38] | Adaptive boosting with the C4.5 algorithm | Classification, regression | Categorical, continuous | Boosting reduces bias and, to some extent, variance; sequential tree generation improves learning at each iteration; computationally faster than RF | Relies on shallow weak-learner trees; harder parameter tuning |
| Deep learning [13] | Convolutional neural networks | Classification, regression | Categorical, continuous | High accuracy, sometimes exceeding human-level performance; scales with data; CNNs require relatively little preprocessing | Requires large amounts of labeled data and substantial computing power |
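The comparison in Table 1 can be reproduced empirically. Below is a minimal sketch, assuming scikit-learn is available, that trains the four classical classifiers from the table on a small multiclass dataset and reports held-out accuracy. Note that scikit-learn's tree-based estimators use an optimized CART-style algorithm rather than ID3/C4.5 as listed in the table, so this is an approximation of the methodologies cited, not an exact reimplementation.

```python
# Sketch: compare the ML classifiers from Table 1 on a multiclass task.
# Assumes scikit-learn is installed; its trees are CART-based, not ID3/C4.5.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Three-class dataset (150 samples, 4 continuous features).
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient boosted trees": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)                      # train on the training split
    acc = accuracy_score(y_te, clf.predict(X_te))
    results[name] = acc
    print(f"{name}: {acc:.3f}")
```

On a dataset this small all four classifiers score similarly; the trade-offs in Table 1 (tuning effort, training cost, overfitting behavior) become visible on larger, noisier datasets.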