Research Article
Ensemble Convolution Neural Network for Robust Video Emotion Recognition Using Deep Semantics
Algorithm 3
Occlusion detection from patched image.
| Input: Keyframes |
| Output: Representation of the occluded face |
| Input the extracted keyframe as a face image |
| Generate a feature map (FM) from each keyframe |
| Return 24 local patches (, … ) |
| For each local patch |
| Decompose the feature map into 24 sub-feature maps (… ) |
| Encode a weighted vector (wv) of the local feature (lf) with a PG-Unit |
| The PG-Unit computes the weight with an attention net, based on the patch's obstructed-ness |
| Concatenate the weighted local features |
| Return the representation of the occluded face |
| End For |
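The steps of Algorithm 3 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several details are assumptions made only for illustration: the feature map is a single-channel 2-D grid, the 24 patches come from a hypothetical 4 × 6 grid (the algorithm states only the count), and the PG-Unit's attention net is reduced to one linear layer followed by a sigmoid. The names `split_into_patches`, `pg_unit`, and `occluded_face_representation` are hypothetical.

```python
import math
import random


def split_into_patches(feature_map, rows=4, cols=6):
    """Split a 2-D feature map into rows * cols equal patches (flattened).

    Assumption: a 4 x 6 grid gives the 24 local patches; the source only
    states the number of patches, not their layout.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % rows or width % cols:
        raise ValueError("feature map size must be divisible by the grid")
    ph, pw = height // rows, width // cols
    patches = []
    for r in range(rows):
        for c in range(cols):
            patch = [
                value
                for y in range(r * ph, (r + 1) * ph)
                for value in feature_map[y][c * pw:(c + 1) * pw]
            ]
            patches.append(patch)
    return patches


def pg_unit(local_feature, attn_weights, attn_bias):
    """Toy PG-Unit: scale a local feature by a scalar attention weight.

    Assumption: the attention net is a single linear layer plus a sigmoid,
    so the weight lies in (0, 1); a low weight stands for an obstructed patch.
    """
    score = sum(w * x for w, x in zip(attn_weights, local_feature)) + attn_bias
    weight = 1.0 / (1.0 + math.exp(-score))
    return [weight * x for x in local_feature], weight


def occluded_face_representation(feature_map, attn_weights, attn_bias):
    """Weight every local patch and concatenate the results."""
    representation, weights = [], []
    for patch in split_into_patches(feature_map):
        weighted_vector, weight = pg_unit(patch, attn_weights, attn_bias)
        representation.extend(weighted_vector)  # concatenation step
        weights.append(weight)
    return representation, weights


if __name__ == "__main__":
    random.seed(0)
    # Stand-in 8 x 12 feature map; a real one would come from a CNN backbone.
    fm = [[random.random() for _ in range(12)] for _ in range(8)]
    rep, attn = occluded_face_representation(fm, [0.5, -0.25, 0.1, 0.3], 0.0)
    print(len(attn), "patch weights,", len(rep), "values in the representation")
```

With the 8 × 12 stand-in feature map, each patch holds 2 × 2 values, so the sketch yields 24 attention weights and a concatenated representation of 96 values. In a trained model the attention parameters would be learned so that obstructed patches receive weights near zero; here they are fixed by hand.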