Research Article

enDNA-Prot: Identification of DNA-Binding Proteins by Applying Ensemble Learning

Algorithm 1: The pseudocode of Unbalanced-AdaBoost.
Input:
positive train dataset ;
negative train dataset ;
Base learning algorithm ;
Number of learning rounds .
Output:
        
Process:
(1)     //Initialize the weight distribution on
(2) For   :
(3)     ; //sampling negative samples from the negative train dataset with weight distribution
(4)   ; //combine the positive train dataset and sampled dataset into a dataset
(5)   ; //train the base learner on the dataset
(6)   ; //test on the negative dataset and calculate its predicted error
(7)   ; //calculate the voting weight of the base learner
(8)   //calculate the weight distribution for the next learning round, where size denotes the number of samples in and used to ensure that is a distribution.
(10) End
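
The steps of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the formulas of the algorithm are not reproduced here, so the sketch fills them with common AdaBoost choices, all of which are assumptions. These are a uniform initial weight distribution, a sample size equal to the number of positive samples, the weighted error on the negative set, the voting weight 0.5 * ln((1 - error) / error), and an exponential weight update followed by normalization. The names `train_stump`, `unbalanced_adaboost`, and `seed` are hypothetical, and the threshold stump only stands in for whatever base learning algorithm is supplied as input.

```python
import math
import random


def train_stump(X, y):
    """Stand-in base learner (assumption): the single-feature threshold
    rule with the fewest training errors. Returns a function x -> +1 or -1."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                errs = sum(1 for x, lab in zip(X, y)
                           if (sign if x[j] >= thr else -sign) != lab)
                if best is None or errs < best[0]:
                    best = (errs, j, thr, sign)
    _, j, thr, sign = best
    return lambda x: sign if x[j] >= thr else -sign


def unbalanced_adaboost(pos, neg, rounds, seed=0):
    """Sketch of steps (1)-(10); every formula below is an assumed choice."""
    rng = random.Random(seed)
    # Step 1: initialize the weight distribution on the negative set (uniform, assumed).
    weights = [1.0 / len(neg)] * len(neg)
    learners, alphas = [], []
    for _ in range(rounds):  # Step 2
        # Step 3: sample negatives according to the current weights.
        # Drawing as many negatives as there are positives is an assumption.
        sampled = rng.choices(neg, weights=weights, k=len(pos))
        # Step 4: combine the positive set and the sampled negatives.
        X = pos + sampled
        y = [1] * len(pos) + [-1] * len(sampled)
        # Step 5: train the base learner on the combined dataset.
        h = train_stump(X, y)
        # Step 6: weighted error of the learner on the full negative set.
        wrong = [h(x) != -1 for x in neg]
        err = sum(w for w, m in zip(weights, wrong) if m)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        # Step 7: voting weight (standard AdaBoost form, assumed).
        alpha = 0.5 * math.log((1 - err) / err)
        # Step 8: raise weights of misclassified negatives, lower the rest,
        # then normalize so that the weights remain a distribution.
        weights = [w * math.exp(alpha if m else -alpha)
                   for w, m in zip(weights, wrong)]
        z = sum(weights)
        weights = [w / z for w in weights]
        learners.append(h)
        alphas.append(alpha)

    def predict(x):
        # Output: sign of the weighted vote of all base learners (assumed form).
        score = sum(a * h(x) for a, h in zip(alphas, learners))
        return 1 if score >= 0 else -1

    return predict, weights


# Toy usage: 5 positive and 40 negative one-dimensional samples.
pos = [[5.0 + 0.1 * i] for i in range(5)]
neg = [[0.1 * i] for i in range(40)]
predict, final_weights = unbalanced_adaboost(pos, neg, rounds=3)
```

Because each round trains on a balanced subset, no single base learner sees the full class imbalance, while the reweighting steers later rounds toward the negative samples that earlier learners misclassified.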