Research Article
Channel-Wise Spatiotemporal Aggregation Technology for Face Video Forensics
Require: Training face video clips; corresponding labels.
1: for each training video clip do
2: Decompose the clip into its sequence of frames;
3: Detect and crop the face in each frame, and denote the results as the face frames;
4: end for
5: Feed the face frames into the backbone, producing a set of feature maps;
6: Decompose the feature maps into channel-wise components;
7: Combine the components by going through them and stacking those that have the same channel index, producing the channel-wise spatiotemporal features;
8: Feed the stacked features into the weights-sharing classifier, producing the predictions;
9: Calculate the binary classification error between the predictions and the labels;
10: Update the parameters of the model by backpropagation;
Ensure: Optimal model for fake face detection.
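Steps 5 to 9 of the procedure can be illustrated with a small NumPy sketch. It is not the authors' implementation: the backbone is replaced by random feature maps, the weights-sharing classifier is assumed to be a single linear layer applied to every channel stack, and averaging the per-channel logits is an assumption made only so that one prediction per clip can be compared with its label. All function names and shapes (`T` frames, `C` channels, `H` by `W` maps) are hypothetical.

```python
import numpy as np


def channel_wise_regroup(frame_features):
    """Regroup per-frame feature maps (T, C, H, W) into per-channel stacks (C, T, H, W).

    Stack c collects the c-th channel of every frame, in temporal order.
    """
    if frame_features.ndim != 4:
        raise ValueError("expected an array of shape (T, C, H, W)")
    return np.transpose(frame_features, (1, 0, 2, 3))


def shared_classifier_logit(channel_stacks, weights, bias):
    """Apply the same linear classifier to every channel stack.

    Averaging the per-channel logits is an assumption of this sketch.
    """
    num_channels = channel_stacks.shape[0]
    flat = channel_stacks.reshape(num_channels, -1)  # (C, T*H*W)
    logits = flat @ weights + bias                   # one logit per channel
    return float(logits.mean())


def binary_cross_entropy(logit, label):
    """Binary classification error between a logit and a 0/1 label."""
    prob = 1.0 / (1.0 + np.exp(-logit))
    eps = 1e-12  # guards against log(0)
    return float(-(label * np.log(prob + eps) + (1.0 - label) * np.log(1.0 - prob + eps)))


# Stand-in for backbone output on one clip (assumed shapes, random values).
rng = np.random.default_rng(0)
T, C, H, W = 4, 3, 2, 2
frame_features = rng.standard_normal((T, C, H, W))

stacks = channel_wise_regroup(frame_features)
weights = 0.1 * rng.standard_normal(T * H * W)  # shared across all channels
bias = 0.0

logit = shared_classifier_logit(stacks, weights, bias)
loss = binary_cross_entropy(logit, label=1.0)
print(stacks.shape, loss)
```

In a real training loop the loss would be backpropagated through the classifier and backbone (step 10); that part is omitted here because it depends on the chosen deep learning framework.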