Research Article
Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies
Algorithm 1
Stochastic inverse-probability oversampling.
Input: Observed sample, IP weights
Output: Unbiased prediction for new unbiased data
(1) Perform IP oversampling, resulting in a reconstructed sample
(2) for each ensemble member do
      for each stratum do
        (a) Estimate the distribution of the stratum
        (b) Draw a noise vector for the stratum
        (c) Rebuild the original stratum using the noise vector
      end
      (a) Combine strata to sample
      (b) Fit classifier
    end
(3) Output the ensemble of learners
(4) Aggregate predictions on new data set by averaging
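The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under several assumptions that the listing does not fix: observations are replicated roughly in proportion to their IP weight, the noise is zero-mean Gaussian with a scaled version of the stratum's empirical covariance, each stratum has at least two observations, and the base classifier is a plain logistic regression. The names `ip_oversample`, `stochastic_ip_ensemble`, `predict_ensemble`, `n_learners`, and `noise_scale` are hypothetical and not taken from the article.

```python
import numpy as np


def ip_oversample(X, y, strata, weights, rng):
    """Step (1): replicate each row about `weight` times.

    Assumption: the fractional part of a weight is resolved at random.
    """
    base = np.floor(weights).astype(int)
    extra = (rng.random(len(weights)) < (weights - base)).astype(int)
    idx = np.repeat(np.arange(len(y)), base + extra)
    return X[idx], y[idx], strata[idx]


def _add_intercept(X):
    return np.hstack([np.ones((len(X), 1)), X])


def fit_logistic(X, y, n_iter=500, lr=0.1):
    """Stand-in base classifier: logistic regression by gradient descent."""
    Xb = _add_intercept(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        beta -= lr * Xb.T @ (p - y) / len(y)
    return beta


def predict_proba(beta, X):
    return 1.0 / (1.0 + np.exp(-_add_intercept(X) @ beta))


def stochastic_ip_ensemble(X, y, strata, weights,
                           n_learners=10, noise_scale=0.1, seed=0):
    """Steps (1) to (3): return a list of fitted coefficient vectors."""
    rng = np.random.default_rng(seed)
    Xo, yo, so = ip_oversample(X, y, strata, weights, rng)
    d = X.shape[1]
    learners = []
    for _ in range(n_learners):                      # loop over ensemble members
        Xb = Xo.copy()
        for s in np.unique(so):                      # loop over strata
            mask = so == s
            # (a) estimate the stratum covariance from the observed rows
            cov = np.atleast_2d(np.cov(X[strata == s], rowvar=False))
            # (b) draw one noise vector per replicated row (assumed Gaussian)
            noise = rng.multivariate_normal(
                np.zeros(d), noise_scale * cov, size=int(mask.sum()))
            # (c) rebuild the stratum by perturbing the replicated rows
            Xb[mask] = Xo[mask] + noise
        # combine strata (Xb already holds all of them) and fit the classifier
        learners.append(fit_logistic(Xb, yo))
    return learners


def predict_ensemble(learners, X_new):
    """Step (4): average the predicted probabilities over the ensemble."""
    return np.mean([predict_proba(b, X_new) for b in learners], axis=0)
```

Perturbing the replicated rows, rather than stacking exact copies, is what makes the oversampling stochastic in this sketch: each learner sees a different reconstruction of the same strata, and averaging their predictions smooths out the injected noise.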