Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis
Table 7
Pairwise correlations among the prosodic and Laryngograph measures that appeared in the best models for the human rating.
Feature (context), in column order: DurNorm (WPW), DurNorm (W), F0Min (W), F0Mean (W), F0Onset (W), F0OffPos (W), EnNorm (WPW), EnNorm (W), MeanJitter (15 W), MeanShimmer (15 W), StandDevShimmer (15 W), #+Voiced (15 W), RelNum+/−Voiced (15 W), CFx, CQx
DurNorm (WPW): 0.02, 0.01, 0.93, 0.03, 0.07, 0.01, 0.20, 0.04
DurNorm (W): 0.10, −0.30, −0.30, 0.01, 0.78, 0.22, 0.00, 0.08, 0.13, 0.13, 0.05
F0Min (W): 0.02, −0.30, 0.53, 0.56, 0.34, −0.54, −0.31, −0.70, −0.58, −0.26
F0Mean (W): −0.31, 0.62, 0.68, 0.39, −0.32, 0.02, 0.12
F0Onset (W): 0.00, 0.62, 0.62, 0.29, 0.05, 0.07, 0.02, 0.06
F0OffPos (W): −0.32, 0.33, 0.20, 0.27, −0.26, 0.08, −0.32, −0.32, 0.01
EnNorm (WPW): 0.92, 0.07, 0.02, 0.07, 0.06, 0.14, 0.02, 0.08, 0.00, 0.14, 0.00
EnNorm (W): 0.19, 0.68, 0.24, 0.15, 0.07, 0.02
MeanJitter (15 W): 0.18, −0.30, 0.00, 0.08, 0.03, 0.40, 0.38, 0.62, 0.57, 0.35, 0.23
MeanShimmer (15 W): −0.27, −0.36, 0.21, 0.75, 0.43, 0.40, 0.15
StandDevShimmer (15 W): −0.28, 0.17, 0.75, 0.34, 0.37, 0.15, 0.04
#+Voiced (15 W): 0.13, −0.63, −0.29, −0.35, 0.00, 0.51, 0.34, 0.31, 0.89, 0.30, 0.15
RelNum+/−Voiced (15 W): 0.16, −0.56, −0.30, 0.51, 0.31, 0.31, 0.93, 0.20, 0.07
CFx: 0.08, 0.24, −0.35, 0.05, 0.10, 0.40, 0.12, 0.00, 0.45, 0.37, 0.65
CQx: 0.02, 0.18, 0.07, 0.15, 0.11, 0.08, 0.64
Upper right triangle: Pearson’s r; lower left triangle: Spearman’s ρ. Contexts: W: word, WPW: word-pause-word, 15 W: 15 words (“global” feature). All r and ρ correlations with an absolute value larger than 0.25 (0.33) are significant at the 0.05 (0.01) level.
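
The triangle layout described in the note can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pearson`, `average_ranks`, `spearman`, `combined_matrix`, `significance_marker`) and the toy data are hypothetical, and the 0.25 and 0.33 thresholds are simply copied from the note, so they apply only to the sample size used for Table 7.

```python
import math


def pearson(x, y):
    """Pearson's r for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as Pearson's r on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_matrix(columns):
    """Upper right triangle: Pearson's r; lower left: Spearman's rho."""
    n = len(columns)
    m = [[None] * n for _ in range(n)]  # diagonal stays None
    for i in range(n):
        for j in range(n):
            if i < j:
                m[i][j] = pearson(columns[i], columns[j])
            elif i > j:
                m[i][j] = spearman(columns[i], columns[j])
    return m


def significance_marker(value, p05=0.25, p01=0.33):
    """Mark a coefficient using the thresholds quoted in the table note."""
    if value is None:
        return ""
    if abs(value) > p01:
        return "**"
    if abs(value) > p05:
        return "*"
    return ""


# Hypothetical toy data, not the measurements behind Table 7.
toy = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, 4.0, 9.0, 16.0, 25.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
matrix = combined_matrix(toy)
```

In the toy data the second column is a monotone but nonlinear function of the first, so the Spearman entry `matrix[1][0]` reaches 1 while the Pearson entry `matrix[0][1]` stays slightly below it. This is the kind of difference the two triangles of the table make visible.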