Research Article
Multimodal Multiobject Tracking by Fusing Deep Appearance Features and Motion Information
Table 1
Architecture of the CNN used.
| Name | Patch size/stride | Output size |
| --- | --- | --- |
| Conv 1 | 3 × 3/1 | 32 × 128 × 64 |
| Conv 2 | 3 × 3/1 | 32 × 128 × 64 |
| Max pool 3 | 3 × 3/2 | 32 × 64 × 32 |
| Residual 4 | 3 × 3/1 | 32 × 64 × 32 |
| Residual 5 | 3 × 3/1 | 32 × 64 × 32 |
| Residual 6 | 3 × 3/2 | 64 × 32 × 16 |
| Residual 7 | 3 × 3/1 | 64 × 32 × 16 |
| Residual 8 | 3 × 3/2 | 128 × 16 × 8 |
| Residual 9 | 3 × 3/1 | 128 × 16 × 8 |
| Dense 10 | — | 128 |
| Batch and normalization | — | 128 |
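The output sizes in Table 1 can be checked with the standard convolution size formula. The following is a minimal sketch, not the authors' implementation. It assumes that the input patch is 128 × 64 (height × width), that sizes are listed as channels × height × width, and that every 3 × 3 layer uses a padding of 1; none of these is stated in the table. The names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def conv_out(n, kernel=3, stride=1, padding=1):
    """Spatial size after a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, output channels, stride) for the 3 x 3 layers listed in Table 1.
LAYERS = [
    ("Conv 1", 32, 1),
    ("Conv 2", 32, 1),
    ("Max pool 3", 32, 2),
    ("Residual 4", 32, 1),
    ("Residual 5", 32, 1),
    ("Residual 6", 64, 2),
    ("Residual 7", 64, 1),
    ("Residual 8", 128, 2),
    ("Residual 9", 128, 1),
]


def trace_shapes(height=128, width=64):
    """Return (name, (channels, height, width)) for each layer in order.

    The default 128 x 64 input size is an assumption inferred from the
    stride-1 output of Conv 1, not a value given in the table.
    """
    shapes = []
    for name, channels, stride in LAYERS:
        height = conv_out(height, stride=stride)
        width = conv_out(width, stride=stride)
        shapes.append((name, (channels, height, width)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(name, shape)
```

Under these assumptions, each stride-2 layer halves both spatial dimensions, and the final residual block yields a 128 × 16 × 8 map (16,384 values) that the dense layer reduces to the 128-dimensional feature vector.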