Research Article

An Approach to Data Reduction for Learning from Big Datasets: Integrating Stacking, Rotation, and Agent Population Learning Techniques

Algorithm 2

Stacked generalization with rotation.
Input: Dataset D with the feature set A; number of iterations q (i.e., the number of stacking folds); a natural number T (defined by the user); option – a Boolean parameter determining the type of transformation in the feature space (deterministic or nondeterministic)
Output: {hit} – the set of base classifiers
Begin
Allocate randomly the instances from D into q disjoint subsets D1, …, Dq.
For i = 1 to q do
 Let D’i = D \ Di.
 Partition randomly the feature set A into T subsets {Ait : t ≤ T}, obtaining subsets D’it, each with an identical number of features that is smaller than the number of features in the original dataset.
 For t = 1 to T do
  Generate the training set D’it with features Ait by bootstrapping, with a sample size equal to 75% of the original dataset.
  If option then
   Run PCA or ICA over D’it and, using the axis rotation, produce the new training dataset D”it with features A’it;
  Else
   Run the PLA for feature selection on D’it and produce the new training dataset D”it described by the feature set A’it.
  End If
  Partition D”it into clusters using the KFCM procedure or the SC procedure.
  Run the PLA for the prototype selection, obtaining Sit (i.e., the subsets of the selected prototypes).
  Induce the base classifier hit based on Sit, using Di with features A’it as the testing set.
 End for
End for
Return h11, …, hqT.
End.
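
For illustration, the following Python sketch mirrors the control flow of Algorithm 2. It is not the authors' implementation: scikit-learn's KMeans stands in for the KFCM/SC clustering procedures, PCA alone stands in for the PCA/ICA rotation pair, a decision tree is an assumed base learner, and both agent-based PLA steps are replaced by hypothetical stand-ins (a random feature subset for feature selection, and the instance nearest each cluster centre for prototype selection). The function name stacking_with_rotation and all parameter defaults are assumptions, not part of the source.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

def stacking_with_rotation(X, y, q=5, T=2, option=True, seed=0):
    """Return the base classifiers h_11, ..., h_qT with their transforms and fold scores."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    fold = rng.permutation(n) % q                  # D_1, ..., D_q: disjoint subsets
    models = []
    for i in range(q):
        train_idx = np.flatnonzero(fold != i)      # D'_i = D \ D_i
        test_idx = np.flatnonzero(fold == i)       # D_i, used as the testing set
        # Randomly partition the feature set A into T equal-sized subsets A_it.
        subsets = np.array_split(rng.permutation(m), T)
        for A_it in subsets:
            # Bootstrap sample with size 75% of the original dataset, drawn
            # (in this sketch) from the training part D'_i only.
            boot = rng.choice(train_idx, size=int(0.75 * n), replace=True)
            X_it, y_it = X[np.ix_(boot, A_it)], y[boot]
            if option:
                # Deterministic variant: axis rotation via PCA (ICA is analogous).
                pca = PCA(n_components=A_it.size).fit(X_it)
                transform = pca.transform
            else:
                # Hypothetical stand-in for the PLA feature-selection step:
                # keep a random half of this subset's features.
                cols = rng.choice(A_it.size, size=max(1, A_it.size // 2), replace=False)
                transform = lambda Z, c=cols: Z[:, c]
            X2 = transform(X_it)                   # D''_it with features A'_it
            # Stand-in for KFCM/SC: k-means clustering of D''_it.
            k = min(20, X2.shape[0])
            km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X2)
            # Stand-in for PLA prototype selection (S_it): the instance
            # closest to each cluster centre becomes a prototype.
            proto = np.unique(km.transform(X2).argmin(axis=0))
            clf = DecisionTreeClassifier(random_state=seed).fit(X2[proto], y_it[proto])
            # Evaluate h_it on D_i with features A'_it as the testing set.
            acc = clf.score(transform(X[np.ix_(test_idx, A_it)]), y[test_idx])
            models.append((A_it, transform, clf, acc))
    return models

X, y = load_iris(return_X_y=True)
for A_it, _, _, acc in stacking_with_rotation(X, y, q=5, T=2):
    print(f"features {A_it}: fold accuracy {acc:.2f}")

On the Iris data the sketch induces q × T = 10 base classifiers. Algorithm 2 stops at returning them; in the full stacked-generalization scheme their predictions on the held-out folds Di would subsequently be combined by a meta-classifier.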