Research Article

Missing Values and Optimal Selection of an Imputation Method and Classification Algorithm to Improve the Accuracy of Ubiquitous Computing Applications

Table 1: The characteristics of missing data.

Variables | Meaning | Calculation
Missing data ratio | The number of missing values in the entire dataset as compared to the number of nonmissing values | The number of empty data cells / total cells
Patterns of missing data: univariate | Ratio of missing to complete values for an existing feature compared to the values for all features |
Patterns of missing data: monotone | |
Patterns of missing data: arbitrary | |
Horizontal scatteredness | Distribution of missing values within each data record | Determine the number of missing cells in each record and calculate the standard deviation
Vertical scatteredness | Distribution of missing values for each attribute | Determine the number of missing cells in each feature and calculate the standard deviation
Missing data spread | Larger standard deviations indicate stronger effects of missing data | Determine the weighted average of the standard deviations of features with missing data (weight: the ratio of missing to complete data for each feature)
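
The calculations listed in Table 1 can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the table does not settle: missing cells are encoded as None, all features are numeric, the population standard deviation is used, and the "standard deviation of a feature" in the missing data spread is taken over that feature's observed values. The function name missing_data_profile and the sample records are hypothetical.

```python
from statistics import pstdev


def missing_data_profile(rows):
    """Compute Table 1 style characteristics for a list of equal-length records.

    Assumption: None marks a missing cell and all observed values are numeric.
    """
    n_rows = len(rows)
    n_cols = len(rows[0])

    # Count missing cells per record (row) and per feature (column).
    per_record = [sum(v is None for v in row) for row in rows]
    per_feature = [sum(row[j] is None for row in rows) for j in range(n_cols)]

    # Missing data ratio: empty cells divided by total cells.
    ratio = sum(per_record) / (n_rows * n_cols)

    # Scatteredness: standard deviation of the missing-cell counts
    # (population standard deviation is an assumption of this sketch).
    horizontal = pstdev(per_record)
    vertical = pstdev(per_feature)

    # Missing data spread: weighted average of the standard deviations of the
    # features that contain missing data; the weight is the ratio of missing
    # to complete cells in each feature. Taking the standard deviation over
    # the observed values is an assumption.
    weighted_sum = 0.0
    weight_total = 0.0
    for j, n_missing in enumerate(per_feature):
        observed = [row[j] for row in rows if row[j] is not None]
        if n_missing == 0 or not observed:
            continue  # skip complete features and fully missing features
        weight = n_missing / len(observed)
        weighted_sum += weight * pstdev(observed)
        weight_total += weight
    spread = weighted_sum / weight_total if weight_total else 0.0

    return {
        "missing_ratio": ratio,
        "horizontal_scatteredness": horizontal,
        "vertical_scatteredness": vertical,
        "missing_data_spread": spread,
    }


# Hypothetical sample: four records, three numeric features.
rows = [
    [1.0, None, 3.0],
    [2.0, 5.0, None],
    [None, 7.0, None],
    [4.0, 9.0, 6.0],
]
profile = missing_data_profile(rows)
for name, value in profile.items():
    print(f"{name}: {value:.4f}")
```

In this sample, 4 of the 12 cells are empty, so the missing data ratio is 1/3. The missing-cell counts are 1, 1, 2, 0 per record and 1, 1, 2 per feature, and the two scatteredness values are the standard deviations of those counts. A different reading of the missing data spread, for example a standard deviation taken over a missingness indicator rather than over the observed values, would change only the loop that accumulates the weighted sum.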