Research Article
Is Vehicle Plate Corner Prediction by Vision Transformer Better than CNNs?
Table 1
Different configurations of the proposed ViT model for a given input image resolution, formed from combinations of four parameters: patch size, depth, number of attention heads, and embedding dimension. In our experiments, a total of 600 possible configurations can be generated under the configuration constraints.
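As an illustration of how such a configuration space might be enumerated, the following is a minimal sketch and not the procedure used in this article. The input resolution, the candidate values for each parameter, and the two constraints (the patch size must tile the image evenly, and the embedding dimension must be divisible by the number of heads) are all assumptions chosen for the example, so the resulting count does not correspond to the 600 configurations reported above.

```python
from itertools import product

# Hypothetical search grid: these values are placeholders for illustration,
# NOT the values used in the article.
IMAGE_SIZE = (224, 224)      # assumed input resolution (height, width)
PATCH_SIZES = [8, 16, 32]    # assumed candidate patch sizes
DEPTHS = [2, 4, 6]           # assumed candidate numbers of encoder layers
NUM_HEADS = [2, 4, 6]        # assumed candidate numbers of attention heads
EMBED_DIMS = [64, 96, 128]   # assumed candidate embedding dimensions


def is_valid(image_size, patch_size, num_heads, embed_dim):
    """Check two common ViT constraints (assumed here, not taken from the article)."""
    height, width = image_size
    # The image must split into a whole number of patches along each axis.
    patches_tile_image = height % patch_size == 0 and width % patch_size == 0
    # Multi-head attention splits the embedding evenly across the heads.
    heads_split_embedding = embed_dim % num_heads == 0
    return patches_tile_image and heads_split_embedding


def enumerate_configs(image_size, patch_sizes, depths, num_heads_options, embed_dims):
    """Return every parameter combination that satisfies the constraints."""
    configs = []
    for patch_size, depth, num_heads, embed_dim in product(
        patch_sizes, depths, num_heads_options, embed_dims
    ):
        if is_valid(image_size, patch_size, num_heads, embed_dim):
            configs.append(
                {
                    "patch_size": patch_size,
                    "depth": depth,
                    "num_heads": num_heads,
                    "embed_dim": embed_dim,
                }
            )
    return configs


configs = enumerate_configs(IMAGE_SIZE, PATCH_SIZES, DEPTHS, NUM_HEADS, EMBED_DIMS)
print(f"{len(configs)} valid configurations in this hypothetical grid")
```

Filtering the full Cartesian product keeps the constraints in one place, so a different grid or an additional constraint only requires changing the candidate lists or `is_valid`.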