Research Article
Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies
Figure 1
(a) Stratified random selection process of a two-phase case-control study. Characteristics known for the whole finite population are typically inexpensive to measure and are recorded in Phase 1. The expensive characteristics are recorded only in Phase 2, that is, for the final sample.
(b) Example cross table for the data before (left) and after (right) the selection process of a two-phase case-control study. In the population, there is a clear dependency between exposure and disease. After the sampling process, this dependency vanishes completely in the final sample.
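
The effect described in panel (b) can be reproduced with a small sketch. The counts below are hypothetical and are not taken from the figure. The sketch also assumes that the strata are the four exposure-by-disease cells and that the same number of units is drawn from each; the function names `odds_ratio` and `stratified_sample` are illustrative only.

```python
import random


def odds_ratio(table):
    """Odds ratio of a 2x2 table keyed by (exposure, disease)."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def stratified_sample(table, n_per_stratum, seed=0):
    """Draw up to n_per_stratum units at random from every cell (stratum)."""
    rng = random.Random(seed)
    sample = {}
    for cell, count in table.items():
        # Units within a cell are exchangeable, so we sample unit indices.
        drawn = rng.sample(range(count), min(n_per_stratum, count))
        sample[cell] = len(drawn)
    return sample


# Hypothetical Phase 1 counts (assumption, not the figure's data),
# keyed by (exposure, disease) with 1 = present and 0 = absent.
population = {(1, 1): 300, (1, 0): 700, (0, 1): 100, (0, 0): 8900}

# Phase 2: balanced stratified selection of 100 units per cell.
sample = stratified_sample(population, n_per_stratum=100)

print("population odds ratio:", odds_ratio(population))  # roughly 38.1
print("sample odds ratio:", odds_ratio(sample))          # 1.0
```

Because every cell contributes the same number of units, the cross table of the final sample is uniform and its odds ratio is 1, even though exposure and disease are strongly associated in the population. A classifier trained naively on such a sample would therefore not see the association unless the selection is corrected for.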