Research Article

Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos

Figure 4

Overview of our proposed networks. Our method consists of two networks, ViTHash and localizator, both composed of two basic modules: upsampling and the transformer block. ViTHash trains hash centers from video triplets, each of which includes the original video and two randomly selected related fake videos. The trained hash centers are used to trace the source of fake videos. The localizator analyzes the differences between the traced video and the fake video; this analysis is not affected by video quality or cropping. The regions where the two videos differ are represented by generated masks.
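As a rough illustration of how trained hash centers could be used for source tracing, the sketch below assigns a fake video's hash code to the nearest center. It assumes binary hash codes compared by Hamming distance, which the caption does not specify. The names `hamming_distance`, `trace_source`, `original_A`, and `original_B`, and the 8-bit codes, are hypothetical and chosen only for the example; they are not taken from ViTHash.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def trace_source(
    query_code: List[int], hash_centers: Dict[str, List[int]]
) -> Tuple[str, int]:
    """Return (source_id, distance) for the hash center closest to query_code."""
    best_id = min(
        hash_centers,
        key=lambda source_id: hamming_distance(query_code, hash_centers[source_id]),
    )
    return best_id, hamming_distance(query_code, hash_centers[best_id])


# Hypothetical 8-bit hash centers, one per original video (assumed values).
hash_centers = {
    "original_A": [1, 0, 1, 1, 0, 0, 1, 0],
    "original_B": [0, 1, 0, 0, 1, 1, 0, 1],
}

# Hypothetical hash code of a fake video; it differs from original_A in one bit.
fake_code = [1, 0, 1, 0, 0, 0, 1, 0]

source_id, distance = trace_source(fake_code, hash_centers)
print(source_id, distance)  # → original_A 1
```

In this toy setting the fake video is traced to `original_A` because its code lies one bit away from that center and seven bits away from the other. A real system would obtain the codes from the trained network rather than from hand-written lists.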