Research Article

A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming

Algorithm 1

Map function of MapReduce-I
Function MAP-1 (training dataset)
Begin
   Input: training dataset D with m instances and n attributes
   Partition the dataset D into s partitions p1, p2, p3, ..., ps
   For each partition, read x_train[] and y_train
      Compute the intercept and correlation coefficients for each block of instances
      Convert the result into a <key, value> pair of the form <Dataset_id, (intercept, coefficients[])>
   Output <Dataset_id, (intercept, coefficients[])>
end
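
The map step of Algorithm 1 can be illustrated with a minimal single-process Python sketch. It is not the authors' implementation. It assumes that the "intercept and coefficients" are obtained by an ordinary least-squares fit on each partition, that partitions are contiguous blocks of rows, and that Dataset_id is the 1-based partition index. The names partition, solve, fit_ols and map_1 are hypothetical helpers introduced only for this illustration, and no MapReduce runtime is involved.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Row = Sequence[float]
Model = Tuple[float, List[float]]  # (intercept, coefficients[])


def partition(rows: List[Row], targets: List[float],
              s: int) -> Iterator[Tuple[int, List[Row], List[float]]]:
    """Split D into at most s contiguous partitions (assumed scheme)."""
    size = -(-len(rows) // s)  # ceiling division
    for i in range(s):
        lo, hi = i * size, (i + 1) * size
        if rows[lo:hi]:
            # Dataset_id is assumed to be the 1-based partition index.
            yield i + 1, rows[lo:hi], targets[lo:hi]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few rows or collinear attributes")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(x_train: List[Row], y_train: List[float]) -> Model:
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    design = [[1.0, *row] for row in x_train]  # leading 1 models the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, y_train)) for i in range(k)]
    w = solve(xtx, xty)
    return w[0], w[1:]


def map_1(rows: List[Row], targets: List[float], s: int) -> Dict[int, Model]:
    """Emit one <Dataset_id, (intercept, coefficients[])> pair per partition."""
    return {pid: fit_ols(x, y) for pid, x, y in partition(rows, targets, s)}
```

Calling map_1(rows, targets, s) on a dataset with m rows and n attributes returns a dictionary with one entry per non-empty partition. In an actual MapReduce job each partition would be handled by a separate mapper, and the emitted pairs would be passed on to the reduce phase rather than collected in memory.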