Research Article
Object Detection Based on Swin Deformable Transformer-BiPAFPN-YOLOX
Table 2
Results of the ablation experiment using Reconstructed Deformable Self-Attention at different stages, evaluated on the COCO 2017 validation set.
| Stage 1 (Block 1, Block 2) | Stage 2 (Block 1, Block 2) | Stage 3 (Block 1, Block 2) | Stage 4 (Block 1, Block 2) | AP (%) | AP_S (%) | Param (M) | GFLOPs | Infer time (ms) | FPS |
|---|---|---|---|---|---|---|---|---|---|
| ✓ ✓ | ✓ ✓ | ✓ ✓ | ✓ ✓ | 49.7 | 31.2 | 44.5 | 110.7 | 10.1 | 98.7 |
| | ✓ | ✓ | ✓ | 49.6 | 31.0 | 51.8 | 144.3 | 11.3 | 88.5 |
| | | ✓ | ✓ | 49.4 | 30.6 | 60.3 | 175.2 | 11.7 | 85.5 |
| | | | ✓ | 48.9 | 30.3 | 68.7 | 193.6 | 12.9 | 77.5 |
| Swin transformer-BiPAFPN-YOLOX | | | | 48.4 | 29.3 | 79.4 | 211.8 | 13.7 | 73.0 |
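The trade-offs in Table 2 are easier to compare as changes relative to the Swin transformer-BiPAFPN-YOLOX row. The following is a minimal sketch, not part of the original experiments: the row labels (`"stages 1-4"`, `"baseline"`, and so on) and the helper names `relative_change`, `fps_from_latency`, and `summarize` are hypothetical, and the FPS check assumes that FPS is simply 1000 divided by the inference time in milliseconds, which the table appears to follow up to rounding.

```python
# Values copied from Table 2 (COCO 2017 validation set).
# Tuple layout: (AP %, AP_S %, params M, GFLOPs, inference time ms, FPS).
# Row labels are illustrative names, not taken from the paper.
ROWS = {
    "stages 1-4": (49.7, 31.2, 44.5, 110.7, 10.1, 98.7),
    "stages 2-4": (49.6, 31.0, 51.8, 144.3, 11.3, 88.5),
    "stages 3-4": (49.4, 30.6, 60.3, 175.2, 11.7, 85.5),
    "stage 4": (48.9, 30.3, 68.7, 193.6, 12.9, 77.5),
    "baseline": (48.4, 29.3, 79.4, 211.8, 13.7, 73.0),
}


def fps_from_latency(ms):
    """Assumed relation: frames per second = 1000 / latency in ms."""
    return 1000.0 / ms


def relative_change(value, reference):
    """Percentage change of value with respect to reference."""
    return 100.0 * (value - reference) / reference


def summarize(rows, baseline_key="baseline"):
    """Compare every row against the baseline row."""
    base = rows[baseline_key]
    summary = {}
    for name, (ap, ap_s, params, gflops, ms, fps) in rows.items():
        summary[name] = {
            "ap_gain": round(ap - base[0], 1),
            "ap_s_gain": round(ap_s - base[1], 1),
            "param_change_pct": round(relative_change(params, base[2]), 1),
            "gflops_change_pct": round(relative_change(gflops, base[3]), 1),
            "fps_from_latency": round(fps_from_latency(ms), 1),
            "reported_fps": fps,
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(ROWS).items():
        print(name, stats)
```

Negative values of `param_change_pct` and `gflops_change_pct` indicate a reduction relative to the baseline row, so each configuration's accuracy gain can be read alongside its cost in a single pass.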