Research Article

An Approach to Data Reduction for Learning from Big Datasets: Integrating Stacking, Rotation, and Agent Population Learning Techniques

Algorithm 2

Stacked generalization with rotation.
Input: Dataset D with the feature set A; number of iterations q (i.e., the number of stacking folds); a natural number T (defined by the user); option – a Boolean parameter determining the type of transformation in the feature space (deterministic or nondeterministic)
Output: {hit} – the set of base classifiers
Begin
Allocate randomly the instances from D into q disjoint subsets D1, …, Dq.
For i = 1 to q do
 Let D’i = D \ Di.
 Partition randomly the feature set A into T subsets {Ait : t ≤ T}, obtaining subsets D’it, each with an identical number of features that is smaller than the number of features in the original dataset.
 For t = 1 to T do
  Generate the training set D’it with features Ait by bootstrapping, with a sample size equal to 75% of the original dataset.
  If option then
   Run PCA or ICA over D’it and, using the axis rotation, produce the new training dataset D”it with features A’it;
  Else
   Run the PLA for feature selection on D’it and produce the new training dataset D”it described by the feature set A’it.
  End If
  Partition D”it into clusters using the KFCM procedure or the SC procedure.
  Run the PLA for the prototype selection, obtaining Sit (i.e., the subsets of the selected prototypes).
  Induce the base classifier hit based on Sit, using Di with features A’it as the testing set.
 End for
End for
Return h11, …, hqT.
End.
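
For illustration, the following Python sketch mirrors the control flow of Algorithm 2. It is not the authors' implementation: scikit-learn's KMeans stands in for the KFCM/SC clustering procedures, PCA alone stands in for the PCA/ICA rotation pair, a decision tree is an assumed base learner, and both agent-based PLA steps are replaced by hypothetical stand-ins (a random feature subset for feature selection, and the instance nearest each cluster centre for prototype selection). The function name stacking_with_rotation and all parameter defaults are assumptions, not part of the source.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

def stacking_with_rotation(X, y, q=5, T=2, option=True, seed=0):
    """Return the base classifiers h_11, ..., h_qT with their transforms and fold scores."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    fold = rng.permutation(n) % q                  # D_1, ..., D_q: disjoint subsets
    models = []
    for i in range(q):
        train_idx = np.flatnonzero(fold != i)      # D'_i = D \ D_i
        test_idx = np.flatnonzero(fold == i)       # D_i, used as the testing set
        # Randomly partition the feature set A into T equal-sized subsets A_it.
        subsets = np.array_split(rng.permutation(m), T)
        for A_it in subsets:
            # Bootstrap sample with size 75% of the original dataset, drawn
            # (in this sketch) from the training part D'_i only.
            boot = rng.choice(train_idx, size=int(0.75 * n), replace=True)
            X_it, y_it = X[np.ix_(boot, A_it)], y[boot]
            if option:
                # Deterministic variant: axis rotation via PCA (ICA is analogous).
                pca = PCA(n_components=A_it.size).fit(X_it)
                transform = pca.transform
            else:
                # Hypothetical stand-in for the PLA feature-selection step:
                # keep a random half of this subset's features.
                cols = rng.choice(A_it.size, size=max(1, A_it.size // 2), replace=False)
                transform = lambda Z, c=cols: Z[:, c]
            X2 = transform(X_it)                   # D''_it with features A'_it
            # Stand-in for KFCM/SC: k-means clustering of D''_it.
            k = min(20, X2.shape[0])
            km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X2)
            # Stand-in for PLA prototype selection (S_it): the instance
            # closest to each cluster centre becomes a prototype.
            proto = np.unique(km.transform(X2).argmin(axis=0))
            clf = DecisionTreeClassifier(random_state=seed).fit(X2[proto], y_it[proto])
            # Evaluate h_it on D_i with features A'_it as the testing set.
            acc = clf.score(transform(X[np.ix_(test_idx, A_it)]), y[test_idx])
            models.append((A_it, transform, clf, acc))
    return models

X, y = load_iris(return_X_y=True)
for A_it, _, _, acc in stacking_with_rotation(X, y, q=5, T=2):
    print(f"features {A_it}: fold accuracy {acc:.2f}")

On the Iris data the sketch induces q × T = 10 base classifiers. Algorithm 2 stops at returning them; in the full stacked-generalization scheme their predictions on the held-out folds Di would subsequently be combined by a meta-classifier.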