Research Article

Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System

Table 2

Extended MOS and MAE results with 8 samples.

Sample  MAE    Target MOS          Obtained MOS
               Valence  Arousal    Valence      Arousal

1       0.193  8.0      8.6        4.43 ± 1.99  5.14 ± 2.10
2       0.142  3.8      5.2        5.57 ± 1.84  5.57 ± 1.59
3       0.234  6.4      8.0        4.71 ± 1.58  5.00 ± 1.78
4       0.203  6.4      6.8        4.71 ± 1.39  3.71 ± 1.03
5       0.280  6.6      7.0        6.43 ± 1.29  6.57 ± 1.05
6       0.262  6.0      6.8        3.00 ± 0.53  4.29 ± 0.88
7       0.206  7.2      6.8        5.86 ± 1.46  4.71 ± 2.05
8       0.219  5.2      7.2        5.43 ± 1.18  5.86 ± 1.36