Research Article

Semi-Supervised Predictive Clustering Trees for (Hierarchical) Multi-Label Classification

Table 3

HMLC datasets and their characteristics.

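| Dataset | Domain | Examples | Descriptive variables (nominal/continuous) | Hierarchy type | Hierarchy nodes | Max. depth | Avg. labels per example |
|---|---|---|---|---|---|---|---|
| Danish farms [42] | Ecology | 1944 | 132/5 | Tree | 72 | 3 | 7 |
| Enron [34] | Text | 1702 | 1001/0 | Tree | 53 | 2 | 3.38 |
| Slovenian rivers [40] | Ecology | 1060 | 0/16 | Tree | 724 | 4 | 25 |
| ImCLEF07A [43] | Images | 11006 | 0/80 | Tree | 96 | 3 | 1 |
| ImCLEF07D [43] | Images | 11006 | 0/80 | Tree | 46 | 3 | 1 |
| Diatoms [44] | Images | 3119 | 0/371 | Tree | 377 | 3 | 0.94 |
| Cellcycle-GO [25] | Genomics | 3766 | 0/77 | DAG | 4126 | 12 | 35.91 |
| Church-GO [25] | Genomics | 3764 | 1/26 | DAG | 4126 | 12 | 35.89 |
| Derisi-GO [25] | Genomics | 3733 | 0/63 | DAG | 4120 | 12 | 35.99 |
| Eisen-GO [25] | Genomics | 2425 | 0/79 | DAG | 3574 | 11.12 | 39.04 |
| Expr-GO [25] | Genomics | 3788 | 4/547 | DAG | 4132 | 12 | 35.87 |
| Pheno-GO [25] | Genomics | 1592 | 69/0 | DAG | 3128 | 12 | 36.43 |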

For each dataset, the columns after the domain give, in order: the number of examples, the number of descriptive variables (nominal/continuous), the type of the label hierarchy (tree or DAG), the number of nodes in the hierarchy, the maximal depth of the hierarchy, and the average number of labels per example.
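To make the hierarchy characteristics concrete, the following minimal sketch computes the number of nodes, the maximal depth, and the average number of labels per example for a toy label hierarchy. The hierarchy, the examples, and the function names are hypothetical. The table does not state how depth is measured in a DAG or whether ancestor labels count toward the average, so the sketch assumes the longest path from a top-level label (which has depth 1) and counts each label together with all of its ancestors.

```python
# Hypothetical toy hierarchy: each label maps to its list of parent labels.
# Labels with no parents are top-level; "c" has two parents, so this is a DAG.
PARENTS = {
    "a": [],
    "b": [],
    "a.1": ["a"],
    "a.2": ["a"],
    "c": ["a.1", "b"],
}

# Hypothetical examples, annotated with their most specific labels only.
EXAMPLES = [{"c"}, {"a.2"}, {"b"}, {"a.1", "b"}]


def max_depth(parents):
    """Maximal depth, assuming top-level labels have depth 1 and the
    depth of any other label is taken along its longest path."""
    memo = {}

    def depth(label):
        if label not in memo:
            ps = parents[label]
            memo[label] = 1 if not ps else 1 + max(depth(p) for p in ps)
        return memo[label]

    return max(depth(label) for label in parents)


def with_ancestors(labels, parents):
    """Close a label set under the hierarchy constraint:
    a label implies all of its ancestors (assumption of this sketch)."""
    closed, stack = set(), list(labels)
    while stack:
        label = stack.pop()
        if label not in closed:
            closed.add(label)
            stack.extend(parents[label])
    return closed


def hierarchy_stats(parents, examples):
    """Return (number of nodes, maximal depth, average labels per example)."""
    n_nodes = len(parents)
    avg_labels = sum(len(with_ancestors(e, parents)) for e in examples) / len(examples)
    return n_nodes, max_depth(parents), avg_labels


stats = hierarchy_stats(PARENTS, EXAMPLES)
print(stats)  # (5, 3, 2.5)
```

Under a different convention, for example counting only the most specific labels or taking the shortest path in a DAG, the same data would yield different values, so figures computed this way are comparable to the table only if the conventions match.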