Research Article

Vision Transformer and Deep Sequence Learning for Human Activity Recognition in Surveillance Videos

Figure 6

Class-wise accuracy of the proposed ViT and multilayer LSTM model on the HMDB51 dataset.