Review Article

Complex Power System Status Monitoring and Evaluation Using Big Data Platform and Machine Learning Algorithms: A Review and a Case Study

Table 1

Brief descriptions of open-source/free software for Big Data machine learning.

| Name | Date | Developer | Brief description |
|---|---|---|---|
| Octave | 1993 | James Rawlings, University of Wisconsin-Madison; John Ekerdt | A high-level language for numerical computations; suitable for solving linear and nonlinear problems; mostly compatible with Matlab; can also be used as a batch-oriented language [64]. |
| Weka | 1994 | University of Waikato | Can be applied directly or called from user-written Java code; well suited for developing new machine learning schemes [65]. |
| R | 1996 | Ross Ihaka, Robert Gentleman | A language and environment for statistical computing and graphics; provides more than 70 packages of statistical learning algorithms; highly extensible [66]. |
| Shogun | 1999 | Soeren Sonnenburg and Gunnar Raetsch | Provides a wide range of unified machine learning methods; easily combines multiple data representations, algorithm classes, and general-purpose tools; supports rapid prototyping of data pipelines and extension with new algorithms [67]. |
| AForge.NET | 2008 | Andrew Kirillov, Fabio Caversan | An open-source C# framework for computer vision and artificial intelligence, covering image processing, neural networks, genetic algorithms, fuzzy logic, machine learning, robotics, etc. [68]. |
| Mahout | 2009 | Grant Ingersoll, Apache Software Foundation | An environment for quickly creating scalable machine learning applications; a framework for building scalable algorithms; includes mature Hadoop MapReduce algorithms; suitable for Scala + Apache Spark, H2O, and Apache Flink [69]. |
| MLlib | 2009 | UC Berkeley AMPLab, The Apache Software Foundation | The Spark implementation of machine learning algorithms; makes parallel programs easy to write and has the potential to support new algorithms [70]. |
| scikit-learn | 2010 | David Cournapeau, Matthieu Brucher, et al. | Built on NumPy, SciPy, and matplotlib in the Python environment; accessible, reusable in various contexts, and offers simple and efficient tools [71]. |
| Orange | 2010 | Bioinformatics Lab, University of Ljubljana, Slovenia | Data visualization and data analysis software; offers interactive workflows with a large toolbox and visual process design based on a Qt graphical interface [72]. |
| CUDA-Convnet | 2012 | Alex Krizhevsky | A machine learning library with built-in GPU acceleration; written in C++ using NVIDIA's CUDA GPU processing technology [73]. |
| ConvNetJS | 2012 | Andrej Karpathy, Stanford University | A JavaScript library for training deep learning models in the browser; can specify and train convolutional networks; includes an experimental reinforcement learning module [74]. |
| Cloudera Oryx | 2013 | Sean Owen, Cloudera Hadoop Distribution | Provides simple, real-time, large-scale machine learning and predictive analytics infrastructure; can continuously build/update models from large-scale data streams and query models in real time [75]. |