Abstract
In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions — including polysemy and the existence of multiword lexical items — into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
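The core computation described in the abstract is a marginalization over all paths through a lattice of candidate tokens (including multiword items) covering the sentence. As a rough illustration only, the Python sketch below runs a standard forward (log-sum-exp) recursion over a toy lattice with fixed edge log-probabilities; the function name and the example values are hypothetical and are not taken from the authors' released code (jbuckman/neural-lattice-language-models), where each edge score is instead produced by a neural model conditioned on the path history.

```python
import math

def lattice_log_prob(n_positions, edges):
    """Sum (marginalize) over all paths through a token lattice.

    `edges` maps a span (start, end) over word positions to the
    log-probability of the single- or multi-word token covering it.
    alpha[j] holds the log of the total probability of all partial
    paths that cover positions 0..j.
    """
    NEG_INF = float("-inf")
    alpha = [NEG_INF] * (n_positions + 1)
    alpha[0] = 0.0  # empty prefix has probability 1
    for j in range(1, n_positions + 1):
        # Collect every way of reaching position j via some edge (i, j).
        scores = [alpha[i] + lp
                  for (i, end), lp in edges.items()
                  if end == j and alpha[i] != NEG_INF]
        if scores:
            # Numerically stable log-sum-exp over incoming paths.
            m = max(scores)
            alpha[j] = m + math.log(sum(math.exp(s - m) for s in scores))
    return alpha[n_positions]


# Toy lattice over "new york city": single-word edges plus a
# multiword edge for "new_york"; the probabilities are placeholders.
edges = {
    (0, 1): math.log(0.20),  # "new"
    (1, 2): math.log(0.10),  # "york"
    (0, 2): math.log(0.15),  # "new_york"
    (2, 3): math.log(0.30),  # "city"
}
print(lattice_log_prob(3, edges))  # log(0.2*0.1*0.3 + 0.15*0.3)
```

Note that when edge probabilities depend on a neural encoding of the full path history, paths reaching the same lattice node carry different hidden states, so the exact merge performed above no longer applies directly; this sketch ignores that complication.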
- Anthology ID: Q18-1036
- Volume: Transactions of the Association for Computational Linguistics, Volume 6
- Year: 2018
- Address: Cambridge, MA
- Venue: TACL
- Publisher: MIT Press
- Pages: 529–541
- URL: https://aclanthology.org/Q18-1036
- DOI: 10.1162/tacl_a_00036
- Cite (ACL): Jacob Buckman and Graham Neubig. 2018. Neural Lattice Language Models. Transactions of the Association for Computational Linguistics, 6:529–541.
- Cite (Informal): Neural Lattice Language Models (Buckman & Neubig, TACL 2018)
- PDF: https://preview.aclanthology.org/starsem-semeval-split/Q18-1036.pdf
- Code: jbuckman/neural-lattice-language-models