Research Article

Deep Metric Learning-Assisted 3D Audio-Visual Speaker Tracking via Two-Layer Particle Filter

Figure 3

(a–c) 3D tracking results of the VO, AO, and AV trackers in azimuth (rad), elevation (rad), and radius (m) on seq11c2; the green line represents the ground truth. (d) MAE in 3D (m). (e, f) 3D trajectories on seq08c1 and seq11c2; the green lines indicate the ground-truth trajectories.