Research Article

Distance Variance Score: An Efficient Feature Selection Method in Text Classification

Table 1

Details of the two UCI data sets.

Data set                   DBWorld data set     CNAE data set
Number of documents        64                   1080
Number of terms            3721                 856
Data preprocessing task    Stop word removal    -
Associated tasks           Classification       Classification
Date donated               2011-11-06           2012-08-03
Number of web hits         13438                17858