Improved Natural Language Generation via Loss Truncation

Daniel Kang, Tatsunori B. Hashimoto


Abstract
Neural language models are usually trained to match the distributional properties of large-scale corpora by minimizing the log loss. While straightforward to optimize, this approach forces the model to reproduce all variations in the dataset, including noisy and invalid references (e.g., misannotations and hallucinated facts). Even a small fraction of noisy data can degrade the performance of log loss. As an alternative, prior work has shown that minimizing the distinguishability of generated samples is a principled and robust loss that can handle invalid references. However, distinguishability has not been used in practice due to challenges in optimization and estimation. We propose loss truncation: a simple and scalable procedure which adaptively removes high log loss examples as a way to optimize for distinguishability. Empirically, we demonstrate that loss truncation outperforms existing baselines on distinguishability on a summarization task. Furthermore, we show that samples generated by the loss truncation model have factual accuracy ratings that exceed those of baselines and match human references.
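The core idea named in the abstract, removing high log loss examples before optimizing, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `truncated_mean_loss`, the `drop_fraction` parameter, and the per-batch sorting are assumptions made for illustration, and the abstract does not specify how the truncation threshold is chosen or adapted during training.

```python
import math


def truncated_mean_loss(losses, drop_fraction):
    """Average per-example log losses after dropping the highest ones.

    `losses` is a list of per-example log losses for one batch.
    `drop_fraction` (hypothetical parameter) is the share of examples
    with the highest loss to discard before averaging.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    # Drop the floor(drop_fraction * n) highest-loss examples,
    # but always keep at least one example.
    n_drop = math.floor(drop_fraction * len(losses))
    n_keep = max(1, len(losses) - n_drop)
    kept = sorted(losses)[:n_keep]
    return sum(kept) / len(kept)


# Illustrative batch: three ordinary examples and one outlier that
# stands in for a noisy or invalid reference.
batch_losses = [0.5, 0.7, 0.6, 9.0]
plain = sum(batch_losses) / len(batch_losses)
truncated = truncated_mean_loss(batch_losses, drop_fraction=0.25)
```

In this toy batch the plain average is dominated by the single outlier, while the truncated average reflects only the three remaining examples. In a real training loop the kept losses would be the ones backpropagated; that wiring is omitted here.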
Anthology ID:
2020.acl-main.66
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
718–731
URL:
https://aclanthology.org/2020.acl-main.66
DOI:
10.18653/v1/2020.acl-main.66
Cite (ACL):
Daniel Kang and Tatsunori B. Hashimoto. 2020. Improved Natural Language Generation via Loss Truncation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 718–731, Online. Association for Computational Linguistics.
Cite (Informal):
Improved Natural Language Generation via Loss Truncation (Kang & Hashimoto, ACL 2020)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2020.acl-main.66.pdf
Video:
http://slideslive.com/38929231