Research Article

Channel-Wise Spatiotemporal Aggregation Technology for Face Video Forensics

Table 5

Binary classification accuracy (%) (higher is better) of different backbones on frames.

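| Model | DF | F2F | FS | NT | DFDC-P | Celeb-DF |
|---|---|---|---|---|---|---|
| EfficientNet B0 [5] | 99.31 | 99.69 | 99.53 | 99.13 | 81.97 | 93.97 |
| Xception [22] | 99.22 | 99.62 | 99.56 | 99.00 | 80.75 | 94.84 |
| Inception V3 [23] | 98.84 | 99.78 | 99.47 | 98.24 | 79.72 | 66.19 |
| MobileNet V1 [24] | 99.16 | 98.75 | 99.53 | 98.47 | 79.09 | 66.69 |
| EfficientNet B0 (w/o skip) | 83.56 | 58.62 | 58.84 | 60.94 | 76.31 | 66.66 |
| Xception (w/o skip) | 94.91 | 58.80 | 64.62 | 53.91 | 65.44 | 67.50 |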