Research Article
Channel-Wise Spatiotemporal Aggregation Technology for Face Video Forensics
Table 5
Frame-level binary classification accuracy (%) of different backbones (higher is better).
| Model | DF | F2F | FS | NT | DFDC-P | Celeb-DF |
| --- | --- | --- | --- | --- | --- | --- |
| EfficientNet B0 [5] | 99.31 | 99.69 | 99.53 | 99.13 | 81.97 | 93.97 |
| Xception [22] | 99.22 | 99.62 | 99.56 | 99.00 | 80.75 | 94.84 |
| Inception V3 [23] | 98.84 | 99.78 | 99.47 | 98.24 | 79.72 | 66.19 |
| MobileNet V1 [24] | 99.16 | 98.75 | 99.53 | 98.47 | 79.09 | 66.69 |
| EfficientNet B0 (w/o skip) | 83.56 | 58.62 | 58.84 | 60.94 | 76.31 | 66.66 |
| Xception (w/o skip) | 94.91 | 58.80 | 64.62 | 53.91 | 65.44 | 67.50 |