Session-level Language Modeling for Conversational Speech

Wayne Xiong, Lingfeng Wu, Jun Zhang, Andreas Stolcke


Abstract
We propose to generalize language models for conversational speech recognition to allow them to operate across utterance boundaries and speaker changes, thereby capturing conversation-level phenomena such as adjacency pairs, lexical entrainment, and topical coherence. The model consists of a long short-term memory (LSTM) recurrent network that reads the entire word-level history of a conversation, as well as information about turn taking and speaker overlap, in order to predict each next word. The model is applied in a rescoring framework, where the word history prior to the current utterance is approximated with preliminary recognition results. In experiments in the conversational telephone speech domain (Switchboard), we find that such a model gives substantial perplexity reductions over a standard LSTM-LM with utterance scope, as well as improvements in word error rate.
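The rescoring setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The speaker-change token `<sc>`, the function name `rescore_session`, the `lm_logprob` callable, and the `lm_weight` parameter are all hypothetical names chosen for illustration, and speaker overlap is not modeled. The sketch only shows the idea that each utterance's N-best list is rescored against a history built from the first-pass best hypotheses of earlier utterances.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical marker for a speaker change. The abstract does not say how
# turn-taking or overlap information is actually encoded.
SPEAKER_CHANGE = "<sc>"

# One hypothesis: (words, first-pass score in the log domain).
Hypothesis = Tuple[List[str], float]


def rescore_session(
    nbest_lists: Sequence[Tuple[str, List[Hypothesis]]],
    lm_logprob: Callable[[List[str], List[str]], float],
    lm_weight: float = 1.0,
) -> List[List[str]]:
    """Rescore each utterance's N-best list with a session-level history.

    nbest_lists: (speaker_id, hypotheses) per utterance, in conversation order.
    lm_logprob(history, words): log P(words | history) from any language model.
    """
    history: List[str] = []
    prev_speaker: Optional[str] = None
    results: List[List[str]] = []

    for speaker, hyps in nbest_lists:
        # Mark the turn boundary so the model can condition on speaker changes.
        if prev_speaker is not None and speaker != prev_speaker:
            history.append(SPEAKER_CHANGE)

        # Combine the first-pass score with the history-conditioned LM score.
        best_words, _ = max(
            hyps, key=lambda h: h[1] + lm_weight * lm_logprob(history, h[0])
        )
        results.append(best_words)

        # Approximate the true word history with the preliminary
        # (first-pass) best hypothesis for this utterance.
        first_pass_words, _ = max(hyps, key=lambda h: h[1])
        history.extend(first_pass_words)
        prev_speaker = speaker

    return results
```

Any model that scores a word sequence given a word history can be passed as `lm_logprob`. A session-level LSTM would play that role in the setting the abstract describes, while a utterance-scope baseline would simply ignore the `history` argument.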
Anthology ID:
D18-1296
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2764–2768
URL:
https://aclanthology.org/D18-1296
DOI:
10.18653/v1/D18-1296
Cite (ACL):
Wayne Xiong, Lingfeng Wu, Jun Zhang, and Andreas Stolcke. 2018. Session-level Language Modeling for Conversational Speech. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2764–2768, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Session-level Language Modeling for Conversational Speech (Xiong et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/D18-1296.pdf
Video:
https://preview.aclanthology.org/teach-a-man-to-fish/D18-1296.mp4