Hiroki Mori
2016
Accuracy of Automatic Cross-Corpus Emotion Labeling for Conversational Speech Corpus Commonization
Hiroki Mori
|
Atsushi Nagaoka
|
Yoshiko Arimoto
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
There exists a major incompatibility in emotion labeling framework among emotional speech corpora, that is, category-based and dimension-based. Commonizing these requires inter-corpus emotion labeling according to both frameworks, but doing this by human annotators is too costly for most cases. This paper examines the possibility of automatic cross-corpus emotion labeling. In order to evaluate the effectiveness of the automatic labeling, a comprehensive emotion annotation for two conversational corpora, UUDB and OGVC, was performed. With a state-of-the-art machine learning technique, dimensional and categorical emotion estimation models were trained and tested against the two corpora. For the emotion dimension estimation, the automatic cross-corpus emotion labeling for the different corpus was effective for the dimensions of aroused-sleepy, dominant-submissive and interested-indifferent, showing only slight performance degradation against the result for the same corpus. On the other hand, the performance for the emotion category estimation was not sufficient.