Research Article

Cascaded Hierarchical CNN for RGB-Based 3D Hand Pose Estimation

Figure 1

Basic framework of 4CHNet. Cropped color images are the input to the network. The mask estimation stage estimates hand masks, and the 2D hand pose estimation stage estimates heatmaps of the hand. The hierarchical estimation stage then estimates 2D heatmaps of the fingers and the palm. Finally, the 3D hand pose estimation stage estimates the 3D hand pose, the 3D finger pose, and the 3D palm pose.
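The data flow described in the caption can be illustrated with a minimal sketch that tracks only array shapes through the four stages. It is not the authors' implementation. Every name here (`Blob`, `mask_stage`, `pose2d_stage`, `hierarchical_stage`, `pose3d_stage`, `run_cascade`) is hypothetical, the sizes (64 x 64 input, 21 hand keypoints split into 15 finger and 6 palm keypoints) are assumptions chosen for illustration, and the choice of which earlier outputs each stage consumes is also an assumption, since the caption gives only the stage order.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Illustrative sizes only; none of these values come from the caption.
H, W = 64, 64                 # assumed resolution of the cropped input
N_HAND = 21                   # assumed number of hand keypoints
N_PALM = 6                    # assumed number of palm keypoints
N_FINGER = N_HAND - N_PALM    # remaining keypoints assigned to the fingers


@dataclass(frozen=True)
class Blob:
    """Stand-in for an image, heatmap stack, or pose array (shape only)."""
    name: str
    shape: Tuple[int, ...]


def mask_stage(image: Blob) -> Blob:
    # Stage 1: one hand-mask channel at the input resolution.
    return Blob("hand_mask", (1,) + image.shape[1:])


def pose2d_stage(image: Blob, mask: Blob) -> Blob:
    # Stage 2: one 2D heatmap per hand keypoint.
    # Feeding the mask into this stage is an assumption of the sketch.
    return Blob("hand_heatmaps_2d", (N_HAND,) + image.shape[1:])


def hierarchical_stage(hand_hm: Blob) -> Tuple[Blob, Blob]:
    # Stage 3: separate 2D heatmap stacks for the fingers and the palm.
    hw = hand_hm.shape[1:]
    finger_hm = Blob("finger_heatmaps_2d", (N_FINGER,) + hw)
    palm_hm = Blob("palm_heatmaps_2d", (N_PALM,) + hw)
    return finger_hm, palm_hm


def pose3d_stage(hand_hm: Blob, finger_hm: Blob, palm_hm: Blob) -> Dict[str, Blob]:
    # Stage 4: one (x, y, z) coordinate per keypoint in each branch.
    return {
        "hand_pose_3d": Blob("hand_pose_3d", (hand_hm.shape[0], 3)),
        "finger_pose_3d": Blob("finger_pose_3d", (finger_hm.shape[0], 3)),
        "palm_pose_3d": Blob("palm_pose_3d", (palm_hm.shape[0], 3)),
    }


def run_cascade(image: Blob) -> Dict[str, Blob]:
    # Chain the stages in the order given in the caption.
    mask = mask_stage(image)
    hand_hm = pose2d_stage(image, mask)
    finger_hm, palm_hm = hierarchical_stage(hand_hm)
    return pose3d_stage(hand_hm, finger_hm, palm_hm)


if __name__ == "__main__":
    poses = run_cascade(Blob("cropped_rgb", (3, H, W)))
    for key, blob in poses.items():
        print(key, blob.shape)
```

In a real model each stub would be a trained sub-network; the sketch only makes explicit that the three 3D outputs come from the whole-hand, finger, and palm branches respectively.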