Direct Output Connection for a High-Rank Language Model

Sho Takase; Jun Suzuki; Masaaki Nagata

doi:10.18653/v1/D18-1489

Abstract

This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from the final RNN layer but also from middle layers. The method raises the expressive power of a language model under the matrix factorization interpretation of language modeling introduced by Yang et al. (2018). It improves on the current state-of-the-art language model and achieves the best score on the Penn Treebank and WikiText-2, the standard benchmark datasets. Moreover, we show that the proposed method also benefits two application tasks: machine translation and headline generation.
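
The core idea in the abstract, combining output distributions from several RNN layers, can be illustrated with a toy sketch. This is not the authors' implementation (their code is in the repository listed under Code). The function names, the toy weights, and the fixed mixture weights are all assumptions made for illustration. Each layer's hidden state is projected to vocabulary logits, passed through a softmax, and the resulting distributions are averaged with weights that sum to 1. Because the mixing happens in probability space rather than logit space, the log of the result is not limited to a single low-rank product, which is the intuition behind the matrix factorization view.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (vocab_size x hidden_size) matrix by a hidden vector."""
    return [sum(w * h for w, h in zip(row, vector)) for row in matrix]


def mix_layer_distributions(hidden_states, output_weights, mixture_weights):
    """Weighted sum of per-layer softmax distributions over the vocabulary.

    Assumption: mixture_weights are fixed constants here; a real model
    would typically learn or predict them.
    """
    if abs(sum(mixture_weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    vocab_size = len(output_weights[0])
    mixed = [0.0] * vocab_size
    for h, W, pi in zip(hidden_states, output_weights, mixture_weights):
        probs = softmax(matvec(W, h))
        for i, p in enumerate(probs):
            mixed[i] += pi * p
    return mixed


# Toy setup (hypothetical values): vocabulary of 3 words, hidden size 2,
# one middle layer and one final layer.
middle_hidden = [0.5, -1.0]
final_hidden = [1.0, 0.2]
W_middle = [[0.1, 0.3], [-0.2, 0.4], [0.0, -0.5]]
W_final = [[0.7, -0.1], [0.2, 0.2], [-0.3, 0.6]]

mixed = mix_layer_distributions(
    [middle_hidden, final_hidden],
    [W_middle, W_final],
    [0.3, 0.7],
)
print(mixed)
```

The mixed vector is itself a valid probability distribution over the vocabulary, since each per-layer softmax sums to 1 and the mixture weights sum to 1.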

Anthology ID: D18-1489
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 4599–4609
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/D18-1489/
DOI: 10.18653/v1/D18-1489
Cite (ACL): Sho Takase, Jun Suzuki, and Masaaki Nagata. 2018. Direct Output Connection for a High-Rank Language Model. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4599–4609, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Direct Output Connection for a High-Rank Language Model (Takase et al., EMNLP 2018)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/D18-1489.pdf
Code: nttcslab-nlp/doc_lm
Data: Penn Treebank, WikiText-2
