Research Article

A Novel Feature Selection Technique for Text Classification Using Naïve Bayes

Table 3

Characteristics of the datasets.

Dataset      Number of documents   Number of terms   Number of classes/categories
CNAE-9       1080                  856               9
Hotel        50                    3360              2
Gender       3232                  100               2
Prosncons    2000                  1493              2
CookWare     50                    2370              2
MyMail       194                   4466              2
Reuters*     279                   3170              3
Computers    50                    3358              2
Flipkart     400                   3043              2
SpamHam      5572                  6631              2
Books        50                    3300              2
DBWorld      64                    3723              2
NYdtm        3104                  5587              27

Most of the datasets can be found at [11, 12].
*Data for three classes have been used for Reuters.