An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

Alessandro Raganato, Raúl Vázquez, Mathias Creutz, Jörg Tiedemann


Abstract
In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as a fixed-size sentence representation in different downstream tasks. We systematically study the impact of the size of the shared layer and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that the performance in translation does correlate with trainable downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. On the other hand, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. We hypothesize that the training procedure on the downstream task enables the model to identify the encoded information that is useful for the specific task, whereas non-trainable benchmarks can be confused by other types of information also encoded in the representation of a sentence.
Anthology ID:
W19-4304
Volume:
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
27–32
URL:
https://aclanthology.org/W19-4304
DOI:
10.18653/v1/W19-4304
Cite (ACL):
Alessandro Raganato, Raúl Vázquez, Mathias Creutz, and Jörg Tiedemann. 2019. An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 27–32, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation (Raganato et al., RepL4NLP 2019)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/W19-4304.pdf