Research Article

A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming

Table 1

Summary of the datasets used.

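| S. No. | Dataset | No. of attributes | Data type | No. of instances | File size (MB) | Year |
|---|---|---|---|---|---|---|
| 1 | Combined cycle power plant | 4 | Multivariate | 9568 | 1.93 | 2014 |
| 2 | Wave energy converters | 49 | Multivariate | 288000 | 123 | 2019 |
| 3 | Year prediction MSD (subset of the Million Song Dataset) | 90 | Multivariate | 515345 | 433 | 2011 |
| 4 | Superconductivity data | 81 | Multivariate | 21263 | 26.8 | 2018 |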