Abstract
We inspect the multi-head self-attention in Transformer NMT encoders for three source languages, looking for patterns that could have a syntactic interpretation. In many of the attention heads, we frequently find sequences of consecutive states attending to the same position, which resemble syntactic phrases. We propose a transparent deterministic method of quantifying the amount of syntactic information present in the self-attentions, based on automatically building and evaluating phrase-structure trees from the phrase-like sequences. We compare the resulting trees to existing constituency treebanks, both manually and by computing precision and recall.
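The two core steps the abstract describes can be illustrated with a short sketch. This is a minimal illustration, not the authors' implementation: it assumes a hypothetical attention matrix `attn` (rows are query positions, columns are key positions), groups consecutive positions whose strongest attention points to the same key as a "phrase-like sequence", and scores the extracted spans against gold constituent spans with unlabeled precision and recall.

```python
import numpy as np

def phrase_like_spans(attn: np.ndarray) -> list[tuple[int, int]]:
    """Group consecutive query positions whose attention argmax points
    to the same key position -- a 'phrase-like sequence' (sketch only)."""
    targets = attn.argmax(axis=1)           # strongest key per query position
    spans, start = [], 0
    for i in range(1, len(targets)):
        if targets[i] != targets[start]:    # run of identical targets ends here
            if i - start > 1:               # keep multi-token runs only
                spans.append((start, i))    # half-open span [start, i)
            start = i
    if len(targets) - start > 1:
        spans.append((start, len(targets)))
    return spans

def span_precision_recall(predicted, gold):
    """Unlabeled precision/recall of predicted spans against gold constituents."""
    pred, gold = set(predicted), set(gold)
    tp = len(pred & gold)                   # spans found in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical usage: one head's attention over a 5-token sentence.
attn = np.array([[0.1, 0.7, 0.1, 0.05, 0.05],
                 [0.1, 0.6, 0.1, 0.10, 0.10],
                 [0.1, 0.1, 0.1, 0.60, 0.10],
                 [0.1, 0.1, 0.1, 0.50, 0.20],
                 [0.6, 0.1, 0.1, 0.10, 0.10]])
spans = phrase_like_spans(attn)             # [(0, 2), (2, 4)]
print(span_precision_recall(spans, [(0, 2), (2, 5)]))
```

The paper's actual tree-building procedure aggregates such evidence across heads into full phrase-structure trees; this sketch only shows the span extraction and the precision/recall comparison at the single-head level.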
- Anthology ID: W19-4827
- Volume: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
- Month: August
- Year: 2019
- Address: Florence, Italy
- Venue: BlackboxNLP
- Publisher: Association for Computational Linguistics
- Pages: 263–275
- URL: https://aclanthology.org/W19-4827
- DOI: 10.18653/v1/W19-4827
- Cite (ACL): David Mareček and Rudolf Rosa. 2019. From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 263–275, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal): From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions (Mareček & Rosa, BlackboxNLP 2019)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/W19-4827.pdf