Research Article
Distributed Nonparametric and Semiparametric Regression on SPARK for Big Data Forecasting
Table 2
Fragments of datasets.
(a) Synthetic dataset

| | | |
| --- | --- | --- |
| 2.31 | 0.87 | 1.26 |
| 1.45 | 1.27 | −0.36 |
| −0.5 | 0.47 | −0.06 |
| −1.9 | −0.94 | 1.51 |
| −1.51 | 0.33 | 1.72 |
| −0.09 | −0.1 | −1.71 |
| 0.17 | 0.24 | 1.64 |
| 1.8 | 0.07 | 0.77 |
| −0.5 | 0.45 | −0.22 |
| 0.76 | 0.94 | 1.32 |


(b) Hanover dataset

| travel time | length | speed | stops | congestion | tr. lights | left turns |
| --- | --- | --- | --- | --- | --- | --- |
| 256 | 2107.51 | 30.30 | 2.10 | 42.43 | 225 | 0 |
| 284 | 2349.74 | 22.36 | 4.89 | 85.56 | 289 | 4 |
| 162 | 1248.51 | 19.33 | 9.27 | 85.91 | 81 | 1 |
| 448 | 2346.80 | 20.58 | 8.39 | 86.60 | 289 | 1 |
| 248 | 352.67 | 19.33 | 9.27 | 85.91 | 25 | 1 |
| 327 | 907.30 | 23.54 | 3.96 | 86.95 | 100 | 0 |
| 443.5 | 1093.29 | 22.01 | 5.44 | 88.66 | 169 | 0 |
| 294 | 348.35 | 23.68 | 3.81 | 89.33 | 25 | 0 |
| 125.5 | 1236.62 | 18.97 | 10.65 | 85.21 | 81 | 1 |
| 511.5 | 357.23 | 19.96 | 7.66 | 84.85 | 25 | 1 |


(c) Airlines dataset

| DepDelay | DayOfWeek | Distance | MeanVis | MeanWind | Thunderstorm | Precipitationmm | WindDirDegrees | Num | Dest | DepTime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 3 | 588 | 24 | 14 | 0 | 12.7 | 333 | 18 | SNA | 2150 |
| 63 | 7 | 546 | 22 | 13 | 0 | 1.78 | 153 | 2 | GEG | 2256 |
| 143 | 5 | 919 | 24 | 18 | 0 | 9.14 | 308 | 27 | MCI | 2203 |
| −4 | 6 | 599 | 22 | 16 | 0 | 11.68 | 161 | 23 | SFO | 2147 |
| 4 | 6 | 368 | 24 | 14 | 0 | 7.62 | 151 | 22 | LAS | 2159 |
| 19 | 5 | 188 | 20 | 35 | 0 | 1.02 | 170 | 23 | IDA | 2204 |
| 25 | 7 | 291 | 23 | 19 | 0 | 0 | 128 | 22 | BOI | 2200 |
| 1 | 6 | 585 | 17 | 10 | 0 | 6.6 | 144 | 25 | SJC | 2151 |
| 0 | 3 | 507 | 28 | 13 | 0 | 9.91 | 353 | 24 | PHX | 2150 |
| 38 | 2 | 590 | 28 | 23 | 0 | 0 | 176 | 7 | LAX | 2243 |