Research Article

Distributed Nonparametric and Semiparametric Regression on SPARK for Big Data Forecasting

Table 2: Fragments of datasets.
(a) Synthetic dataset


 2.31    0.87    1.26
 1.45    1.27   −0.36
−0.5     0.47   −0.06
−1.9    −0.94    1.51
−1.51    0.33    1.72
−0.09   −0.1    −1.71
 0.17    0.24    1.64
 1.8     0.07    0.77
−0.5     0.45   −0.22
 0.76    0.94    1.32

(b) Hanover dataset

travel time   length    speed   stops   congestion   tr. lights   left turns
256           2107.51   30.30   2.10    42.43        225          0
284           2349.74   22.36   4.89    85.56        289          4
162           1248.51   19.33   9.27    85.91        81           1
448           2346.80   20.58   8.39    86.60        289          1
248           352.67    19.33   9.27    85.91        25           1
327           907.30    23.54   3.96    86.95        100          0
443.5         1093.29   22.01   5.44    88.66        169          0
294           348.35    23.68   3.81    89.33        25           0
125.5         1236.62   18.97   10.65   85.21        81           1
511.5         357.23    19.96   7.66    84.85        25           1

(c) Airlines dataset

DepDelay   DayOfWeek   Distance   MeanVis   MeanWind   Thunderstorm   Precipitationmm   WindDirDegrees   Num   Dest   DepTime
0          3           588        24        14         0              12.7              333              18    SNA    2150
63         7           546        22        13         0              1.78              153              2     GEG    2256
143        5           919        24        18         0              9.14              308              27    MCI    2203
−4         6           599        22        16         0              11.68             161              23    SFO    2147
4          6           368        24        14         0              7.62              151              22    LAS    2159
19         5           188        20        35         0              1.02              170              23    IDA    2204
25         7           291        23        19         0              0                 128              22    BOI    2200
1          6           585        17        10         0              6.6               144              25    SJC    2151
0          3           507        28        13         0              9.91              353              24    PHX    2150
38         2           590        28        23         0              0                 176              7     LAX    2243