Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System
Lifeng Jin, David King, Amad Hussein, Michael White, Douglas Danforth
Abstract
When interpreting questions in a virtual patient dialogue system one must inevitably tackle the challenge of a long tail of relatively infrequently asked questions. To make progress on this challenge, we investigate the use of paraphrasing for data augmentation and neural memory-based classification, finding that the two methods work best in combination. In particular, we find that the neural memory-based approach not only outperforms a straight CNN classifier on low frequency questions, but also takes better advantage of the augmented data created by paraphrasing, together yielding a nearly 10% absolute improvement in accuracy on the least frequently asked questions.
- Anthology ID:
- W18-0502
- Volume:
- Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venue:
- BEA
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 13–23
- URL:
- https://aclanthology.org/W18-0502
- DOI:
- 10.18653/v1/W18-0502
- Cite (ACL):
- Lifeng Jin, David King, Amad Hussein, Michael White, and Douglas Danforth. 2018. Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 13–23, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System (Jin et al., BEA 2018)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/W18-0502.pdf