Large Margin Neural Language Model

Jiaji Huang, Yi Li, Wei Ping, Liang Huang


Abstract
We propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the “good” and “bad” sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method achieves up to a 1.1 WER reduction for speech recognition and a 1.0 BLEU increase for machine translation.
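To make the idea concrete, below is a minimal sketch of a pairwise large-margin objective over sentence-level language-model scores. It is not the paper's exact formulation: the function name large_margin_loss, the hinge form, and the margin value are illustrative assumptions, with each hypothesis assumed to be scored by its summed token log-probability under the LM.

import torch

def large_margin_loss(score_good: torch.Tensor,
                      score_bad: torch.Tensor,
                      margin: float = 1.0) -> torch.Tensor:
    """Pairwise hinge loss over sentence-level LM scores.

    score_good: LM scores (e.g. summed token log-probabilities) of the
                task-preferred hypotheses, shape (batch,).
    score_bad:  scores of competing, task-dispreferred hypotheses, shape (batch,).
    The penalty is zero once every "good" hypothesis outscores its "bad"
    counterpart by at least `margin`.
    """
    return torch.clamp(margin - (score_good - score_bad), min=0.0).mean()

# Toy usage: random scores stand in for real LM outputs.
good = torch.randn(8, requires_grad=True)
bad = torch.randn(8, requires_grad=True)
loss = large_margin_loss(good, bad)
loss.backward()  # gradients flow back into the language model end-to-end

In practice, the "bad" hypotheses would come from the task's own decoder output (e.g. competing entries in an ASR or MT n-best list) that the language model is asked to re-score.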
Anthology ID:
D18-1150
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1183–1191
URL:
https://aclanthology.org/D18-1150
DOI:
10.18653/v1/D18-1150
Cite (ACL):
Jiaji Huang, Yi Li, Wei Ping, and Liang Huang. 2018. Large Margin Neural Language Model. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1183–1191, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Large Margin Neural Language Model (Huang et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/D18-1150.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-1/D18-1150.mp4