Review Article
Complex Power System Status Monitoring and Evaluation Using Big Data Platform and Machine Learning Algorithms: A Review and a Case Study
Table 2
Comparisons of open-source machine learning tools/algorithms for Big Data.
| | | Open source/free software | | | | | | |
| Category | Algorithm | Weka | R | Shogun | Mahout | MLlib | Orange | Oryx |
| Classification | Logistic regression | | | | | | | |
| | (Complementary) naive Bayes | | | | | | | |
| | Decision tree | | | | | | | |
| | Neural networks | | | | | | | |
| | SVM | | | | | | | |
| | Random forest | | | | | | | |
| | Hidden Markov models | | | | | | | |
| Regression | Linear regression | | | | | | | |
| | Generalized linear models | | | | | | | |
| | Lasso/ridge regression | | | | | | | |
| | Decision tree regression | | | | | | | |
| Clustering | k-means | | | | | | | |
| | Fuzzy k-means | | | | | | | |
| | Gaussian mixture model (GMM) | | | | | | | |
| | Streaming k-means | | | | | | | |
| Collaborative filtering | Alternating least squares (ALS) | | | | | | | |
| | Matrix factorization-based | | | | | | | |
| Dimensionality reduction | Singular value decomposition (SVD) | | | | | | | |
| | Principal component analysis | | | | | | | |
| Optimization primitive | Stochastic gradient descent | | | | | | | |
| | Limited-memory BFGS (L-BFGS) | | | | | | | |
| Feature extraction | TF-IDF | | | | | | | |
| | Word2Vec | | | | | | | |
| Frequent pattern mining | FP-growth | | | | | | | |
| | Association rules | | | | | | | |