Research Article

Fast Vehicle and Pedestrian Detection Using Improved Mask R-CNN

Figure 6

The RPN treats every pixel of each feature map as an anchor point and, at each anchor, generates candidate frames at three sizes and three aspect ratios. All candidate frames are then filtered by non-maximum suppression (NMS) according to their scores, and a limited number of the surviving frames are retained for subsequent training and prediction.
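The two steps in this caption, anchor generation and score-based NMS, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `generate_anchors` and `nms`, the scale values (32, 64, 128), the ratios (0.5, 1, 2), the stride, the IoU threshold of 0.7, and the `top_k` limit are all assumptions chosen for illustration. In a real RPN the scores come from the network and the frames are first refined by predicted offsets, which this sketch omits.

```python
import numpy as np


def generate_anchors(feat_h, feat_w, stride,
                     scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2), one set per feature-map pixel.

    Scales, ratios and stride are illustrative assumptions, not values
    taken from the article.
    """
    # One (width, height) pair per scale/ratio combination; the ratio is
    # height / width and each shape keeps an area of scale * scale.
    shapes = np.array([(s / np.sqrt(r), s * np.sqrt(r))
                       for s in scales for r in ratios])       # (9, 2)

    # Map every feature-map pixel back to a center in image coordinates.
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)       # (H*W, 2)

    half = shapes[None, :, :] / 2.0                            # (1, 9, 2)
    c = centers[:, None, :]                                    # (H*W, 1, 2)
    boxes = np.concatenate([c - half, c + half], axis=2)       # (H*W, 9, 4)
    return boxes.reshape(-1, 4)


def nms(boxes, scores, iou_threshold=0.7, top_k=1000):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    order = np.argsort(scores)[::-1]        # indices, highest score first
    keep = []
    while order.size > 0 and len(keep) < top_k:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]

        # Intersection of box i with every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)

        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = ((boxes[rest, 2] - boxes[rest, 0])
                  * (boxes[rest, 3] - boxes[rest, 1]))
        iou = inter / (area_i + area_r - inter)

        # Only boxes that do not overlap box i too much stay in the queue.
        order = rest[iou <= iou_threshold]
    return keep
```

For example, `generate_anchors(4, 5, stride=16)` yields 4 × 5 × 9 = 180 frames, and passing them to `nms` with a score vector returns the indices of at most `top_k` retained frames. Capping the output with `top_k` mirrors the caption's "limited number" of frames kept after screening.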