Input:
–– Initial training dataset $\mathcal{D} = \{(\boldsymbol{x}_i, y_i)\}_{i=1}^{N}$;
–– Size of random subset at each iteration $N_s$;
–– Number of hidden nodes $L$;
–– Number of iterations $T$;
–– Loss function $\ell(\cdot, \cdot)$, which is twice differentiable;
–– Regularization factor $\lambda$.
Use $\mathcal{D}$ to train the initial learner, written as $F_0(\boldsymbol{x}) = \boldsymbol{h}(\boldsymbol{x})\boldsymbol{\beta}_0$, where the
input weights and hidden biases are randomly selected
within the range of $[-1, 1]$ and the output-layer weights are
determined analytically by $\boldsymbol{\beta}_0 = (\mathbf{H}^{\mathrm{T}}\mathbf{H} + \lambda\mathbf{I})^{-1}\mathbf{H}^{\mathrm{T}}\boldsymbol{y}$, and record the initial
base learner $F_0(\boldsymbol{x})$;
for $t = 1, 2, \dots, T$ do
Randomly generate a permutation $\{\pi(1), \pi(2), \dots, \pi(N)\}$ of the integers $\{1, 2, \dots, N\}$,
and then a stochastic subset of the whole training dataset is defined
as $\mathcal{D}_t = \{(\boldsymbol{x}_{\pi(i)}, y_{\pi(i)})\}_{i=1}^{N_s}$;
Calculate the first-order gradient statistics of the loss function
with respect to the predicted output of the current ensemble model
for each training instance $(\boldsymbol{x}_i, y_i) \in \mathcal{D}_t$ as
$g_i = \partial \ell\big(y_i, F_{t-1}(\boldsymbol{x}_i)\big) / \partial F_{t-1}(\boldsymbol{x}_i)$;
Calculate the second-order gradient statistics of the loss function
with respect to the predicted output of the current ensemble model for each
training instance $(\boldsymbol{x}_i, y_i) \in \mathcal{D}_t$ as
$h_i = \partial^2 \ell\big(y_i, F_{t-1}(\boldsymbol{x}_i)\big) / \partial F_{t-1}(\boldsymbol{x}_i)^2$;
For the training instances in the subset, compute the current
pseudo-residuals $\tilde{y}_i$, where
$\tilde{y}_i = -g_i / h_i$;
Determine the output weights $\tilde{\boldsymbol{\beta}}_t$ used as a heuristic item
for the derivation formula, based on the modified training
dataset $\tilde{\mathcal{D}}_t = \{(\boldsymbol{x}_i, \tilde{y}_i)\}_{i=1}^{N_s}$, as follows:
$\tilde{\boldsymbol{\beta}}_t = (\mathbf{H}_t^{\mathrm{T}}\mathbf{H}_t + \lambda\mathbf{I})^{-1}\mathbf{H}_t^{\mathrm{T}}\tilde{\boldsymbol{y}}_t$,
where $\mathbf{H}_t$ is calculated according to the randomly selected
input weights and hidden biases;
Use the derivation formula in Algorithm 1 to obtain the optimal
output-layer weights $\boldsymbol{\beta}_t$ of $f_t(\boldsymbol{x})$;
Add the $t$-th individual learner $f_t(\boldsymbol{x}) = \boldsymbol{h}_t(\boldsymbol{x})\boldsymbol{\beta}_t$ to
the current ensemble learning model as
$F_t(\boldsymbol{x}) = F_{t-1}(\boldsymbol{x}) + f_t(\boldsymbol{x})$;
end for
output:
The final ensemble model $F_T(\boldsymbol{x})$.
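
For concreteness, the following is a minimal Python sketch of the boosting loop above, assuming a squared-error loss $\ell(y, F) = \tfrac{1}{2}(y - F)^2$ (so $g_i = F_{t-1}(\boldsymbol{x}_i) - y_i$ and $h_i = 1$), sigmoid hidden nodes, and ridge-regression output weights. The function names, hyperparameter values, and the use of the heuristic ridge solution in place of the derivation formula of Algorithm 1 are illustrative assumptions, not the exact implementation.

import numpy as np

rng = np.random.default_rng(0)

def hidden_matrix(X, W, b):
    # Hidden-layer output matrix H for random input weights W and biases b
    # (sigmoid activation is an assumption made for this sketch).
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fit_base_learner(X, targets, n_hidden, reg):
    # Draw input weights and biases uniformly in [-1, 1], then solve the
    # output weights beta by ridge regression: (H^T H + reg*I)^{-1} H^T targets.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_matrix(X, W, b)
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ targets)
    return W, b, beta

def predict(learners, X):
    # Ensemble prediction: the sum of all base learners' outputs.
    return sum(hidden_matrix(X, W, b) @ beta for W, b, beta in learners)

def boost(X, y, n_iter=50, n_hidden=20, n_subset=64, reg=1e-2):
    learners = [fit_base_learner(X, y, n_hidden, reg)]   # initial learner F_0
    F = predict(learners, X)                             # current ensemble output
    for _ in range(n_iter):
        idx = rng.permutation(len(X))[:n_subset]         # stochastic subset
        g = F[idx] - y[idx]       # first-order gradients for squared-error loss
        h = np.ones_like(g)       # second-order gradients (constant for this loss)
        residuals = -g / h        # pseudo-residuals
        # NOTE: the heuristic ridge weights are used directly here; the paper
        # refines them further with the derivation formula of Algorithm 1.
        learners.append(fit_base_learner(X[idx], residuals, n_hidden, reg))
        F = F + predict(learners[-1:], X)                # F_t = F_{t-1} + f_t
    return learners

# Toy usage on synthetic regression data.
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
model = boost(X, y)
print("training MSE:", np.mean((predict(model, X) - y) ** 2))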