Table 11: Factors influencing accuracy (RMSE) for each algorithm (standardized beta coefficients): K-MEANS_CLUSTERING.

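| Data characteristic | trees.J48 | BayesNet | SMO | Regression | Logistic | IBk |
|---|---|---|---|---|---|---|
| N_attributes | −.080** | −.078** | −.181** | −.068** | .117** | .009 |
| N_cases | −.079** | −.049** | .012 | −.017 | −.033 | −.047** |
| C_imbalance | .136** | .240** | .263** | .524** | .145** | .206** |
| R_missing | .057* | .079** | .041 | .084** | .079** | .057* |
| SE_HS | .236** | .289** | .183** | .271** | .315** | .264** |
| SE_VS | −.009 | −.013 | −.006 | −.013 | −.014 | −.011 |
| Spread | −.362** | −.439** | −.262** | −.440** | −.474** | −.363** |
| P_missing_dum1 | −.037 | −.042 | −.036 | −.032 | −.038 | −.046 |
| P_missing_dum2 | .002 | .013 | .001 | .014 | .009 | .004 |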

Note 1: N_attributes: number of attributes; N_cases: number of cases; C_imbalance: degree of class imbalance; R_missing: missing data ratio; SE_HS: horizontal scatteredness; SE_VS: vertical scatteredness; Spread: missing data spread. Missing patterns are dummy-coded: univariate (P_missing_dum1 = 1, P_missing_dum2 = 0), monotone (P_missing_dum1 = 0, P_missing_dum2 = 1), and arbitrary (P_missing_dum1 = 1, P_missing_dum2 = 1).
Note 2: RMSE indicates error; therefore, lower values are better.
Note 3: * p < 0.05, ** p < 0.01.
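
As a rough illustration of what a standardized beta coefficient in this table represents, the sketch below z-scores each predictor and the response and then runs an ordinary least-squares fit. It assumes NumPy is available. The function name `standardized_betas` and the synthetic data are hypothetical, chosen only for illustration; this is not the authors' procedure or data.

```python
import numpy as np

def standardized_betas(X, y):
    """Return standardized regression coefficients of y on the columns of X.

    Hypothetical helper for illustration: z-score predictors and response,
    then solve the least-squares problem on the standardized values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # z-score every predictor and the response (sample standard deviation)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # all variables are centered, so the intercept is zero and can be omitted
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas

# Synthetic example (made-up data): two predictors on very different scales
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * [1.0, 5.0]
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=200)
print(standardized_betas(X, y))
```

Because every variable is rescaled to unit variance, the resulting coefficients can be compared across predictors measured in different units, which is how the rows of the table can be read against each other.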