Word class flexibility: A deep contextualized approach

Bai Li, Guillaume Thomas, Yang Xu, Frank Rudzicz


Abstract
Word class flexibility refers to the phenomenon whereby a single word form is used across different grammatical categories. Extensive work in linguistic typology has sought to characterize word class flexibility across languages, but quantifying this phenomenon accurately and at scale has been fraught with difficulties. We propose a principled methodology to explore regularity in word class flexibility. Our method builds on recent work in contextualized word embeddings to quantify semantic shift between word classes (e.g., noun-to-verb, verb-to-noun), and we apply this method to 37 languages. We find that contextualized embeddings not only capture human judgment of class variation within words in English, but also uncover shared tendencies in class flexibility across languages. Specifically, we find greater semantic variation when flexible lemmas are used in their dominant word class, supporting the view that word class flexibility is a directional process. Our work highlights the utility of deep contextualized models in linguistic typology.
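As a rough illustration of the idea in the abstract, the sketch below compares the noun usages and verb usages of a single lemma by their embedding vectors. Everything in it is an assumption made for illustration and is not taken from the paper: the toy two-dimensional vectors stand in for contextualized embeddings, the function names `semantic_shift` and `semantic_variation` are hypothetical, and the choices of cosine distance between class centroids (for shift) and mean Euclidean distance to the centroid (for variation) are one plausible operationalization, not necessarily the authors' definitions.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def semantic_shift(noun_vecs, verb_vecs):
    """Assumed measure: distance between the noun and verb centroids."""
    return cosine_distance(mean_vector(noun_vecs), mean_vector(verb_vecs))

def semantic_variation(vectors):
    """Assumed measure: mean distance of each usage from its class centroid."""
    centroid = mean_vector(vectors)
    return sum(math.dist(v, centroid) for v in vectors) / len(vectors)

# Toy stand-ins for contextualized embeddings of one flexible lemma
# (hypothetical values, e.g. "work" used as a noun vs. as a verb).
noun_vecs = [[1.0, 0.0], [0.8, 0.2]]
verb_vecs = [[0.0, 1.0], [0.2, 0.8]]

shift = semantic_shift(noun_vecs, verb_vecs)
noun_variation = semantic_variation(noun_vecs)
verb_variation = semantic_variation(verb_vecs)
print(shift, noun_variation, verb_variation)
```

In practice the vectors would come from a contextualized model run over part-of-speech-tagged corpus sentences; comparing `noun_variation` with `verb_variation` for a lemma's dominant and non-dominant class is the kind of contrast the abstract describes.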
Anthology ID:
2020.emnlp-main.71
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
983–994
URL:
https://aclanthology.org/2020.emnlp-main.71
DOI:
10.18653/v1/2020.emnlp-main.71
Cite (ACL):
Bai Li, Guillaume Thomas, Yang Xu, and Frank Rudzicz. 2020. Word class flexibility: A deep contextualized approach. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 983–994, Online. Association for Computational Linguistics.
Cite (Informal):
Word class flexibility: A deep contextualized approach (Li et al., EMNLP 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.emnlp-main.71.pdf
Video:
https://slideslive.com/38938777
Code:
SPOClab-ca/word-class-flexibility