Research Article

[Retracted] ECG-ViT: A Transformer-Based ECG Classifier for Energy-Constraint Wearable Devices

Figure 3

(a) Traditional knowledge distillation. (b) Two-stage optimization of distillation, which requires pretraining a large-scale teacher model first. (c) Online distillation using either mutual learning or ensemble learning, which does not involve a teacher model.
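The difference between the teacher-based schemes and online distillation in Figure 3 can be illustrated with a minimal sketch of the two kinds of loss. It is not taken from the article. It assumes the commonly used formulations: temperature-softened targets with a weighted sum of hard and soft losses for teacher-student distillation, and a symmetric pair of cross-entropy plus KL terms for mutual learning between two peer models. All function names, `temperature`, and `alpha` are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the last axis (numerically stabilized).
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # Batch mean of KL(p || q); probabilities are clipped to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def cross_entropy(logits, labels, eps=1e-12):
    # Batch mean cross-entropy against integer class labels.
    labels = np.asarray(labels)
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))

def teacher_student_loss(student_logits, teacher_logits, labels,
                         temperature=4.0, alpha=0.5):
    # Panels (a)/(b): a pretrained, fixed teacher supplies soft targets.
    # Assumed form: alpha * hard loss + (1 - alpha) * T^2 * soft loss.
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    hard = cross_entropy(student_logits, labels)
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft

def mutual_learning_losses(logits_a, logits_b, labels):
    # Panel (c), mutual-learning variant: two peers train together and each
    # one is pulled toward the other's predictions; no pretrained teacher.
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    loss_a = cross_entropy(logits_a, labels) + kl_divergence(p_b, p_a)
    loss_b = cross_entropy(logits_b, labels) + kl_divergence(p_a, p_b)
    return loss_a, loss_b
```

In the teacher-student case only the student's parameters would be updated with the returned loss, whereas in the mutual-learning case each peer would be updated with its own loss in the same training step. When both sets of logits are identical, the KL terms vanish and each loss reduces to its cross-entropy part.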