BioMed Research International

Research Article

enDNA-Prot: Identification of DNA-Binding Proteins by Applying Ensemble Learning

The pseudocode of Unbalanced-AdaBoost.

Input:
Positive train dataset;
Negative train dataset;
Base learning algorithm;
Number of learning rounds.
Output:
Ensemble classifier that combines the base learners according to their voting weights.
Process:
(1) //initialize the weight distribution on the negative train dataset
(2) For each learning round:
(3) ; //sample negative samples from the negative train dataset according to the current weight distribution
(4) ; //combine the positive train dataset and sampled dataset into a dataset
(5) ; //train the base learner on the dataset
(6) ; //test the base learner on the negative train dataset and calculate its prediction error
(7) ; //calculate the voting weight of the base learner
(8) //calculate the weight distribution for the next learning round, where size denotes the number of samples in the negative train dataset and a normalization factor is used to ensure that the result is a distribution.
(10) End
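
The steps of the listing can be sketched in Python as follows. The listing itself does not fix the exact expressions, so this sketch makes several assumptions that are not confirmed by the paper: it uses the standard AdaBoost formulas for the voting weight and the weight update, it draws as many negatives per round as there are positives (with replacement), and it uses a hypothetical one-feature threshold stump (`train_stump`) as the base learner. The names `unbalanced_adaboost`, `train_stump`, and `seed` are illustrative only.

```python
import math
import random


def train_stump(samples):
    """Hypothetical base learner: the single-feature threshold rule
    with the fewest training errors. samples: list of (features, label),
    label in {+1, -1}."""
    best = None
    for j in range(len(samples[0][0])):
        for thr in sorted({x[j] for x, _ in samples}):
            for sign in (1, -1):
                errors = sum(1 for x, y in samples
                             if (sign if x[j] >= thr else -sign) != y)
                if best is None or errors < best[0]:
                    best = (errors, j, thr, sign)
    _, j, thr, sign = best
    return lambda x: sign if x[j] >= thr else -sign


def unbalanced_adaboost(positives, negatives, rounds, seed=0):
    rng = random.Random(seed)
    size = len(negatives)
    # Step 1: uniform weight distribution over the negative set.
    weights = [1.0 / size] * size
    ensemble = []
    for _ in range(rounds):  # Step 2
        # Step 3: weighted sampling of negatives (assumed: as many as positives).
        idx = rng.choices(range(size), weights=weights, k=len(positives))
        # Step 4: combine all positives with the sampled negatives.
        train = [(x, 1) for x in positives] + [(negatives[i], -1) for i in idx]
        # Step 5: train the base learner.
        h = train_stump(train)
        # Step 6: weighted error of the learner on the full negative set.
        wrong = [h(x) != -1 for x in negatives]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        # Step 7: voting weight (assumed standard AdaBoost formula).
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Step 8: raise weights of misclassified negatives, then normalize.
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        z = sum(weights)
        weights = [w / z for w in weights]

    def predict(x):
        # Weighted vote of all base learners.
        score = sum(a * h(x) for a, h in ensemble)
        return 1 if score >= 0 else -1

    return predict, weights
```

Calling `unbalanced_adaboost(positives, negatives, rounds=5)` with feature tuples returns a prediction function and the final weight distribution over the negatives. Every positive is reused in each round while the negatives are resampled, which is how the sketch keeps each training set balanced.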