Abstract
We introduce a novel neural easy-first decoder that learns to solve sequence tagging tasks in a flexible order. In contrast to previous easy-first decoders, our models are end-to-end differentiable. The decoder iteratively updates a “sketch” of the predictions over the sequence. At its core is an attention mechanism that controls which parts of the input are strategically the best to process next. We present a new constrained softmax transformation that ensures the same cumulative attention to every word, and show how to efficiently evaluate and backpropagate over it. Our models compare favourably to BILSTM taggers on three sequence tagging tasks.
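To make the constrained softmax mentioned in the abstract concrete: it maps a score vector z to a probability distribution p that respects elementwise upper bounds u (each word's remaining attention budget), so that attention clipped at one sketch step is redistributed to the other words. The snippet below is a minimal NumPy sketch of one way to evaluate it via iterative clipping (an active-set scheme); the function name and this particular procedure are illustrative assumptions, not the paper's reference implementation, and the paper's efficient backward pass is not reproduced here.

```python
import numpy as np

def constrained_softmax(z, u):
    """Illustrative sketch (not the paper's code): return the distribution
    p closest to softmax(z) in the max-entropy sense, subject to the
    elementwise upper bounds p <= u, with sum(p) == 1 (requires sum(u) >= 1).

    Saturated coordinates are clipped to their bounds; the remaining mass
    is shared among the free coordinates in proportion to exp(z)."""
    z = np.asarray(z, dtype=float)
    u = np.asarray(u, dtype=float)
    assert u.sum() >= 1.0, "bounds must leave room for a full distribution"

    saturated = np.zeros(len(z), dtype=bool)
    p = np.zeros(len(z))
    for _ in range(len(z)):  # each round saturates >= 1 coord, so <= n rounds
        free = ~saturated
        mass = 1.0 - u[saturated].sum()      # probability mass left to assign
        e = np.exp(z[free] - z[free].max())  # numerically stable exponentials
        p_free = mass * e / e.sum()          # rescaled softmax over free coords
        violators = p_free > u[free]
        if not violators.any():              # all bounds hold: we are done
            p[free] = p_free
            p[saturated] = u[saturated]
            return p
        # clip the violating coordinates to their bounds and re-solve
        saturated[np.where(free)[0][violators]] = True
    p[saturated] = u[saturated]
    return p
```

For example, with z = [2, 1, 0] and u = [0.4, 0.5, 0.6], a plain softmax would put roughly 0.67 of the attention on the first word; the constrained version clips it to its 0.4 budget and renormalizes the remainder over the other two words (yielding about [0.40, 0.44, 0.16]).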
- Anthology ID: D17-1036
- Volume: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month: September
- Year: 2017
- Address: Copenhagen, Denmark
- Editors: Martha Palmer, Rebecca Hwa, Sebastian Riedel
- Venue: EMNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 349–362
- URL: https://aclanthology.org/D17-1036
- DOI: 10.18653/v1/D17-1036
- Cite (ACL): André F. T. Martins and Julia Kreutzer. 2017. Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 349–362, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal): Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers (Martins & Kreutzer, EMNLP 2017)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/D17-1036.pdf