Abstract
Word representations are an important aspect of Natural Language Processing (NLP). Representations are trained on large corpora, either as independent static embeddings or as part of a deep contextualized model. While word embeddings are useful, they struggle with rare and unknown words. As such, a large body of work has addressed estimating embeddings for rare and unknown words. However, most of these methods focus on static embeddings, with few targeting contextualized representations. In this work, we propose SPRUCE, a rare/unknown embedding architecture that focuses on contextualized representations. The architecture combines subword attention and embedding post-processing with the contextualized model to produce high-quality embeddings. We then demonstrate that these techniques lead to improved performance on most intrinsic and downstream tasks.

- Anthology ID: 2024.findings-naacl.88
- Volume: Findings of the Association for Computational Linguistics: NAACL 2024
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Kevin Duh, Helena Gomez, Steven Bethard
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 1383–1389
- URL: https://aclanthology.org/2024.findings-naacl.88
- Cite (ACL): Raj Patel and Carlotta Domeniconi. 2024. Subword Attention and Post-Processing for Rare and Unknown Contextualized Embeddings. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1383–1389, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): Subword Attention and Post-Processing for Rare and Unknown Contextualized Embeddings (Patel & Domeniconi, Findings 2024)
- PDF: https://preview.aclanthology.org/naacl24-info/2024.findings-naacl.88.pdf