Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos

<div>Distribution of the learned hash centers during training. (a) Raw: before training, the hash codes of the dataset are scattered. (b) Training: as training progresses, the hash codes of each group gradually cluster, so the hash centers of the dataset shift. (c) Final: the hash distribution becomes sparse, the Hamming distance among the hash codes within each group becomes very small, and the average Hamming distance between groups approaches half the number of hash bits. The hash centers (black points) are far apart from each other, and each group of data lies around its own hash center.</div>
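The "half of the hash bits" target in panel (c) can be checked numerically. The sketch below is illustrative only: it assumes the hash centers are taken from the rows of a Sylvester-type Hadamard matrix, a common way to get mutually distant binary codes, which the caption itself does not confirm as the method used here. The function names (`hadamard`, `hash_centers`, `mean_pairwise_hamming`) and the sizes (8 groups, 64 bits) are hypothetical.

```python
from itertools import combinations


def hadamard(k):
    """Build a k x k Hadamard matrix with entries +1/-1 (Sylvester construction).

    Assumption: k is a power of two, which this construction requires.
    """
    if k < 1 or k & (k - 1):
        raise ValueError("k must be a power of two")
    h = [[1]]
    while len(h) < k:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def hash_centers(num_groups, bits):
    """Use the first num_groups Hadamard rows as {0, 1} hash centers (illustrative choice)."""
    if num_groups > bits:
        raise ValueError("this sketch supports at most `bits` centers")
    return [[(x + 1) // 2 for x in row] for row in hadamard(bits)[:num_groups]]


def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(codes):
    """Average Hamming distance over all unordered pairs of codes."""
    pairs = list(combinations(codes, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


centers = hash_centers(num_groups=8, bits=64)  # hypothetical sizes
avg = mean_pairwise_hamming(centers)
print(avg)  # distinct Hadamard rows are orthogonal, so each pair differs in bits / 2 positions
```

Under this assumption every pair of centers differs in exactly half of the bits, which is the separation that panel (c) describes the learned centers as approaching on average.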

Security and Communication Networks



Figure 2: Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos