Abstract
We study the degree to which neural sequence-to-sequence models exhibit fine-grained controllability when performing natural language generation from a meaning representation. Using two task-oriented dialogue generation benchmarks, we systematically compare the effect of four input linearization strategies on controllability and faithfulness. Additionally, we evaluate how a phrase-based data augmentation method can improve performance. We find that properly aligning input sequences during training leads to highly controllable generation, both when training from scratch and when fine-tuning a larger pre-trained model. Data augmentation further improves control on difficult, randomly generated utterance plans.
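For readers unfamiliar with the setup, below is a minimal, illustrative sketch (not code from the paper) of how a slot-value meaning representation might be linearized into a flat seq2seq input, with the slot ordering acting as the utterance plan; the dialogue act, slot names, and token format are assumptions chosen for illustration only.

```python
# Hypothetical sketch: linearize a dialogue-act MR into an input token sequence.
# The MR schema ({"da": ..., "slots": ...}) and special-token format are assumptions,
# not the paper's exact representation.

def linearize(mr, slot_order=None):
    """Flatten an MR into a space-separated token sequence.

    If slot_order is given, slots are emitted in that order, so the input
    sequence encodes the desired realization order (the "utterance plan").
    """
    slots = mr["slots"]
    keys = slot_order if slot_order is not None else sorted(slots)
    tokens = [f"<{mr['da']}>"]
    for key in keys:
        if key in slots:
            tokens += [f"<{key}>", str(slots[key])]
    return " ".join(tokens)

mr = {"da": "inform",
      "slots": {"name": "Aromi", "food": "Italian", "area": "city centre"}}
print(linearize(mr, slot_order=["name", "area", "food"]))
# <inform> <name> Aromi <area> city centre <food> Italian
```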
- Anthology ID:
- 2020.emnlp-main.419
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5160–5185
- URL:
- https://aclanthology.org/2020.emnlp-main.419
- DOI:
- 10.18653/v1/2020.emnlp-main.419
- Cite (ACL):
- Chris Kedzie and Kathleen McKeown. 2020. Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5160–5185, Online. Association for Computational Linguistics.
- Cite (Informal):
- Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies (Kedzie & McKeown, EMNLP 2020)
- PDF:
- https://aclanthology.org/2020.emnlp-main.419.pdf