Research Article
Fast Vehicle and Pedestrian Detection Using Improved Mask R-CNN
Figure 6
The RPN treats every pixel of each feature map as an anchor point and, at each anchor point, generates candidate frames of three sizes and three aspect ratios. All candidate frames are then screened by non-maximum suppression (NMS) according to their scores, and a certain number of the remaining candidate frames are selected and saved for subsequent training and prediction.
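The anchor generation and NMS screening described in the caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`generate_anchors`, `nms`), the stride, the concrete sizes and ratios, the IoU threshold, and the `max_keep` limit are all assumptions chosen for illustration. The sketch also assumes that every size is paired with every ratio at each anchor point and that a ratio means height divided by width.

```python
import numpy as np


def generate_anchors(feat_h, feat_w, stride, sizes, ratios):
    """Return candidate frames as an (N, 4) array of x1, y1, x2, y2.

    N = feat_h * feat_w * len(sizes) * len(ratios). All parameters are
    illustrative assumptions, not values taken from the paper.
    """
    # One anchor point per feature-map pixel, mapped back to image coordinates.
    ys = (np.arange(feat_h) + 0.5) * stride
    xs = (np.arange(feat_w) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    centres = np.stack([cx.ravel(), cy.ravel()], axis=1)  # (H*W, 2)

    # One (width, height) pair per size/ratio combination.
    # ratio is assumed to be height / width; the area stays size ** 2.
    shapes = np.array(
        [(s / np.sqrt(r), s * np.sqrt(r)) for s in sizes for r in ratios]
    )  # (A, 2)

    half = shapes[None, :, :] / 2.0   # (1, A, 2)
    c = centres[:, None, :]           # (H*W, 1, 2)
    boxes = np.concatenate([c - half, c + half], axis=2)  # (H*W, A, 4)
    return boxes.reshape(-1, 4)


def nms(boxes, scores, iou_threshold=0.7, max_keep=1000):
    """Greedy NMS: return indices of kept frames, highest score first."""
    order = np.argsort(-scores)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0 and len(keep) < max_keep:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best frame with all remaining frames.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop frames that overlap the kept frame too strongly.
        order = rest[iou <= iou_threshold]
    return keep
```

Under these assumptions, a 2 x 3 feature map with three sizes and three ratios yields 2 x 3 x 9 = 54 candidate frames. Passing those frames and their objectness scores to `nms` returns the indices of the frames that would be saved for the later stages.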