Research Article
[Retracted] ECG-ViT: A Transformer-Based ECG Classifier for Energy-Constraint Wearable Devices
Figure 3
(a) Traditional knowledge distillation. (b) Two-stage distillation, which first requires pretraining a large-scale teacher model. (c) Online distillation using either mutual learning or ensemble learning, which does not involve a pretrained teacher model.
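To make panel (c) concrete, the following is a minimal sketch of a mutual-learning objective for two peer models, assuming the common formulation in which each peer minimizes its own cross-entropy plus a KL term that pulls it toward the other peer's prediction. It is illustrative only and is not taken from the article: the function names, the `temperature` parameter, and the unweighted sum of the two terms are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_learning_losses(logits_a, logits_b, label, temperature=1.0):
    """Per-peer losses for one sample (hypothetical helper).

    Each peer is supervised by the label and by the other peer's softened
    prediction, so no pretrained teacher is involved.
    """
    soft_a = softmax(logits_a, temperature)
    soft_b = softmax(logits_b, temperature)
    loss_a = cross_entropy(softmax(logits_a), label) + kl_divergence(soft_b, soft_a)
    loss_b = cross_entropy(softmax(logits_b), label) + kl_divergence(soft_a, soft_b)
    return loss_a, loss_b


if __name__ == "__main__":
    # Two peers that disagree on a 3-class sample whose true class is 0.
    loss_a, loss_b = mutual_learning_losses([2.0, 0.5, 0.1], [0.3, 1.8, 0.2], label=0)
    print(loss_a, loss_b)
```

When both peers produce identical logits, the KL terms vanish and each loss reduces to plain cross-entropy. In training, the peer's prediction in the KL term is typically treated as a fixed target for the model being updated.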