The Scientific World Journal

Research Article

A Dynamic Ensemble Framework for Mining Textual Streams with Class Imbalance

Clustering forest.

Output: Class label of the testing instance
Input: : The data chunk at each time stamp
: The number of CTs in CF
: The maximum number of CTs selected in the clustering forest
: The current ensemble model
: Misclassified subset at the -th time stamp
: Rare-class subset at the -th time stamp.
: The Rare-class subset as a proportion of the training chunk; : The threshold of
(1) at the -th time stamp
(2) built the training set by Algorithm 2
(2) create a new CT using
(3) if
(4)
(5) Endif
(6) if
(7)
(8) Endif
(9) compute the accuracy weight of each clustering tree
(10) UPDATE() based on the adaptive selection method
(11) obtain the misclassified subset and the rare-class subset
(12) for each testing instance do:
(13) PREDICT() by the voting method.
(14) Endfor.