Research Article

Missing Values and Optimal Selection of an Imputation Method and Classification Algorithm to Improve the Accuracy of Ubiquitous Computing Applications

Table 9

Factors influencing accuracy (RMSE) for each algorithm (standardized beta coefficients): Hot_deck.

Data characteristic   trees.J48   BayesNet    SMO         Regression  Logistic    IBk
N_attributes          −.080**     −.073**     −.176**     −.071**     .115**      .007
N_cases               −.081**     −.049**     .012        −.018       −.034*      −.047**
C_imbalance           .135**      .237**      .261**      .524**      .133**      .211**
R_missing             .062**      .083**      .044        .084**      .075**      .070**
SE_HS                 .225**      .275**      .183**      .271**      .313**      .254**
SE_VS                 −.009       −.013       −.006       −.013       −.014       −.010
Spread                −.365**     −.428**     −.265**     −.427**     −.441**     −.361**
P_missing_dum1        −.035       −.037       −.034       −.033       −.048       −.038
P_missing_dum2        .012        .015        .004        .012        −.004       .009

Note 1: N_attributes: number of attributes; N_cases: number of cases; C_imbalance: degree of class imbalance; R_missing: missing data ratio; SE_HS: horizontal scatteredness; SE_VS: vertical scatteredness; Spread: missing data spread. Missing data patterns are dummy coded: univariate (P_missing_dum1 = 1, P_missing_dum2 = 0), monotone (P_missing_dum1 = 0, P_missing_dum2 = 1), and arbitrary (P_missing_dum1 = 1, P_missing_dum2 = 1).
Note 2: RMSE indicates error; therefore, lower values are better.
Note 3: * p < 0.05, ** p < 0.01.
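
The article does not show how the coefficients in this table were estimated. As a minimal sketch, assuming an ordinary least-squares regression of RMSE on the data characteristics, standardized beta coefficients can be obtained by z-scoring every variable before fitting, and the missing-pattern dummies can be coded as described in Note 1. The function names `standardized_betas` and `encode_missing_pattern` are hypothetical, and the data in the usage part are synthetic and purely illustrative.

```python
import numpy as np


def standardized_betas(X, y):
    """Standardized (beta) coefficients of an OLS fit of y on the columns of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # z-score each predictor and the outcome (sample standard deviation).
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # No intercept column is needed: every z-scored variable has mean zero.
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas


def encode_missing_pattern(pattern):
    """Return (P_missing_dum1, P_missing_dum2) for a pattern, following Note 1."""
    codes = {"univariate": (1, 0), "monotone": (0, 1), "arbitrary": (1, 1)}
    return codes[pattern]


if __name__ == "__main__":
    # Synthetic example only; these are not the article's data.
    rng = np.random.default_rng(0)
    patterns = rng.choice(["univariate", "monotone", "arbitrary"], size=300)
    dummies = np.array([encode_missing_pattern(p) for p in patterns])
    spread = rng.normal(size=300)
    X = np.column_stack([spread, dummies])
    rmse = 0.5 - 0.3 * spread + rng.normal(scale=0.1, size=300)
    print(standardized_betas(X, rmse))
```

Because all variables share a common scale after z-scoring, the magnitudes of the resulting coefficients can be compared across predictors, which is how the rows of the table are read.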