Abstract
A recent line of work in NLP focuses on the (dis)ability of models to generalise compositionally for artificial languages. However, when considering natural language tasks, the data involved is not strictly, or locally, compositional. Quantifying the compositionality of data is a challenging task, which has been investigated primarily for short utterances. We use recursive neural models (Tree-LSTMs) with bottlenecks that limit the transfer of information between nodes. We illustrate that comparing data’s representations in models with and without the bottleneck can be used to produce a compositionality metric. The procedure is applied to the evaluation of arithmetic expressions using synthetic data, and sentiment classification using natural language data. We demonstrate that compression through a bottleneck impacts non-compositional examples disproportionately and then use the bottleneck compositionality metric (BCM) to distinguish compositional from non-compositional samples, yielding a compositionality ranking over a dataset.

- Anthology ID:
- 2022.findings-emnlp.320
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2022
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4361–4378
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2022.findings-emnlp.320/
- DOI:
- 10.18653/v1/2022.findings-emnlp.320
- Cite (ACL):
- Verna Dankers and Ivan Titov. 2022. Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4361–4378, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality (Dankers & Titov, Findings 2022)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2022.findings-emnlp.320.pdf