Abstract
Contextualized language modeling using deep Transformer networks has been applied to a variety of natural language processing tasks with remarkable success. However, we find that these models are not a panacea for a question-answering dialogue agent corpus task, which has hundreds of classes in a long-tailed frequency distribution, with only thousands of data points. Instead, we find substantial improvements in recall and accuracy on rare classes from a simple one-layer RNN with multi-headed self-attention and static word embeddings as inputs. While much research has used attention weights to illustrate what input is important for a task, the complexities of our dialogue corpus offer a unique opportunity to examine how the model represents what it attends to, and we offer a detailed analysis of how that contributes to improved performance on rare classes. A particularly interesting phenomenon we observe is that the model picks up implicit meanings by splitting different aspects of the semantics of a single word across multiple attention heads.
- Anthology ID:
- 2020.sigdial-1.24
- Volume:
- Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month:
- July
- Year:
- 2020
- Address:
- 1st virtual meeting
- Editors:
- Olivier Pietquin, Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 196–202
- URL:
- https://aclanthology.org/2020.sigdial-1.24
- DOI:
- 10.18653/v1/2020.sigdial-1.24
- Cite (ACL):
- Adam Stiff, Qi Song, and Eric Fosler-Lussier. 2020. How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 196–202, 1st virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent (Stiff et al., SIGDIAL 2020)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2020.sigdial-1.24.pdf