Research Article

Vision Transformer and Deep Sequence Learning for Human Activity Recognition in Surveillance Videos

Figure 5

Class-wise accuracy of the proposed ViT and multilayer LSTM model on the UCF50 dataset.
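
For readers unfamiliar with the metric, class-wise accuracy is commonly taken to be the fraction of samples of each activity class that the model labels correctly, which is the same as per-class recall. The following is a minimal sketch under that assumption and is not the authors' evaluation code. The function name `class_wise_accuracy` and the toy labels are hypothetical, chosen only for illustration, and the values are not results from the paper.

```python
from collections import defaultdict


def class_wise_accuracy(y_true, y_pred):
    """Return {class_label: fraction of that class's samples predicted correctly}.

    Hypothetical helper for illustration; assumes one ground-truth label and
    one predicted label per video clip.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    total = defaultdict(int)    # number of clips per ground-truth class
    correct = defaultdict(int)  # number of correctly predicted clips per class

    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Only classes that occur in y_true appear in the result.
    return {label: correct[label] / total[label] for label in total}


if __name__ == "__main__":
    # Toy labels (illustrative only, not data from the paper).
    y_true = ["Basketball", "Basketball", "Biking", "Biking", "Biking", "Diving"]
    y_pred = ["Basketball", "Biking", "Biking", "Biking", "Biking", "Diving"]
    for label, acc in class_wise_accuracy(y_true, y_pred).items():
        print(f"{label}: {acc:.2f}")
```

Plotting one bar per class from the returned dictionary would give a chart of the same kind as Figure 5.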