Abstract
To date, transformer-based models such as BERT have been less successful in predicting compositionality of noun compounds than static word embeddings. This is likely related to a suboptimal use of the encoded information, reflecting an incomplete grasp of how the models represent the meanings of complex linguistic structures. This paper investigates variants of semantic knowledge derived from pretrained BERT when predicting the degrees of compositionality for 280 English noun compounds associated with human compositionality ratings. Our performance strongly improves on earlier unsupervised implementations of pretrained BERT and highlights beneficial decisions in data preprocessing, embedding computation, and compositionality estimation. The distinct linguistic roles of heads and modifiers are reflected by differences in BERT-derived representations, with empirical properties such as frequency, productivity, and ambiguity affecting model performance. The most relevant representational information is concentrated in the initial layers of the model architecture.
- Anthology ID:
- 2023.eacl-main.110
- Volume:
- Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Andreas Vlachos, Isabelle Augenstein
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1499–1512
- URL:
- https://aclanthology.org/2023.eacl-main.110
- DOI:
- 10.18653/v1/2023.eacl-main.110
- Cite (ACL):
- Filip Miletic and Sabine Schulte im Walde. 2023. A Systematic Search for Compound Semantics in Pretrained BERT Architectures. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 1499–1512, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- A Systematic Search for Compound Semantics in Pretrained BERT Architectures (Miletic & Schulte im Walde, EACL 2023)
- PDF:
- https://preview.aclanthology.org/finnlp-2volume-ingestion/2023.eacl-main.110.pdf