| ML techniques | Advantages | Disadvantages | Related articles |
| --- | --- | --- | --- |
| Polynomial regression (PR) | (i) Works on datasets of any size (ii) Gives information about the relevance of features | (i) The polynomial degree must be chosen carefully to balance bias and variance | [24, 25, 27, 30, 31, 33, 34, 40–42, 45, 47] |
| Neural networks (NN) | (i) Efficiency (ii) Continuous learning (iii) Data retrieval (iv) Multitasking | (i) Hardware dependent (ii) Complex algorithms (iii) Black-box nature (iv) Approximate results (v) Data dependency | [12, 35, 36, 42, 43, 46, 48, 49] |
| Support vector machine (SVM) | (i) Works very well on nonlinear problems (ii) Easily adaptable (iii) Robust to outliers | (i) Requires feature scaling | [38, 46] |
| Gradient boosting (GBoost) | (i) Performs very well on small and medium-sized datasets (ii) Easy to interpret (iii) Prevents overfitting (iv) A strong approach for both classification and regression | (i) Sensitive to outliers (ii) Scales poorly (iii) Poor results on unstructured data | [42] |
| Random forest (RF) | (i) Accurate (ii) Powerful (iii) Works very well on both linear and nonlinear problems | (i) The number of trees must be chosen (ii) Limited interpretability | [42] |
| K-nearest neighbor (KNN) | (i) Fast to train (ii) Easy to implement | (i) Requires feature scaling (ii) Requires clean data | [42] |
| Hierarchical clustering (HC) | (i) Easy to implement (ii) Does not require the number of clusters to be specified | (i) Long runtime on large datasets | [26, 30, 51] |
| K-means | (i) Easy to implement (ii) Scales to large datasets (iii) Guaranteed convergence (iv) Generalizes to clusters of different shapes and sizes | (i) Requires the number of clusters (ii) Requires initial centroid values (iii) Requires dimensionality reduction when the number of dimensions is high (iv) Sensitive to outliers | [37, 50] |
| Spatial clustering (SC) | (i) Does not require the number of clusters (ii) Finds clusters of different shapes (iii) Identifies and ignores outliers | (i) Poor results on datasets with varying densities (ii) Poor results on unstructured data (iii) Not deterministic | [32] |
| Principal component analysis (PCA) | (i) Eliminates correlated features (ii) Enhances algorithm performance (iii) Reduces overfitting (iv) Improves visualization | (i) Less interpretable components (ii) Data must be standardized (iii) Some loss of information | [26, 27, 44, 52] |
| Kernel density estimator (KDE) | (i) Smooth visualization (ii) Works with various shapes and sizes | (i) Biased at the boundaries (ii) Information loss through oversmoothing | [28] |
| Self-organizing maps (SOM) | (i) Interpretable (ii) Applicable to large datasets | (i) Requires initial weights | [29] |
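As a concrete illustration of the supervised rows above (PR, NN, SVM, GBoost, RF, KNN), the following is a minimal scikit-learn sketch on a synthetic regression task. The dataset, model choices, and hyperparameters are illustrative assumptions, not values taken from the surveyed articles.

```python
# Minimal sketch of the supervised techniques from the table on a synthetic
# regression task; all hyperparameters are illustrative defaults.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.1, size=200)

models = {
    # PR: the degree sets the bias-variance balance noted in the table.
    "PR": make_pipeline(PolynomialFeatures(degree=3), LinearRegression()),
    # NN: approximate and data-dependent, as the table warns.
    "NN": make_pipeline(StandardScaler(),
                        MLPRegressor(hidden_layer_sizes=(32,),
                                     max_iter=2000, random_state=0)),
    # SVM: requires feature scaling, hence the StandardScaler step.
    "SVM": make_pipeline(StandardScaler(), SVR(kernel="rbf")),
    "GBoost": GradientBoostingRegressor(n_estimators=100, random_state=0),
    # RF: the number of trees (n_estimators) must be chosen.
    "RF": RandomForestRegressor(n_estimators=100, random_state=0),
    # KNN: also benefits from feature scaling.
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```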
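The clustering rows (HC, K-means, SC) can be sketched in the same way. Reading "spatial clustering" as density-based spatial clustering, the sketch below uses DBSCAN as a stand-in; the data and parameters (eps, min_samples, distance_threshold) are illustrative, not from the surveyed articles.

```python
# Minimal sketch of the clustering techniques from the table on synthetic
# 2-D data; DBSCAN stands in for the "spatial clustering (SC)" row.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.7, random_state=0)

# HC: no cluster count needed if a distance threshold is given instead.
hc = AgglomerativeClustering(n_clusters=None, distance_threshold=5.0).fit(X)

# K-means: requires the number of clusters and initial centroids;
# repeated k-means++ initialization (n_init) mitigates the latter.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# DBSCAN: no cluster count needed, finds arbitrarily shaped clusters, and
# marks outliers with the label -1; struggles when densities vary.
db = DBSCAN(eps=0.8, min_samples=5).fit(X)

print("HC clusters found:", hc.n_clusters_)
print("K-means inertia:", round(km.inertia_, 1))
print("DBSCAN outliers:", int(np.sum(db.labels_ == -1)))
```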
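Finally, a sketch of the dimensionality-reduction and density-estimation rows (PCA, KDE). The standardization step and the bandwidth choice mirror the trade-offs listed in the table; the component count and bandwidth are illustrative. SOM is omitted here because scikit-learn has no implementation (third-party packages such as MiniSom would be needed).

```python
# Minimal sketch of PCA and KDE from the table on standardized synthetic data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=500)  # correlated feature pair

# PCA: data must be standardized first; correlated features collapse into
# fewer components at the cost of some information loss.
Z = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

# KDE: the bandwidth controls smoothing; too large oversmooths and loses
# information, too small produces a spiky, boundary-biased estimate.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(Z)
print("mean log-density:", kde.score_samples(Z).mean().round(3))
```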