Cross-lingual and Word-Independent Methods for Quantifying Degree of Grammaticalization
Ryo Nagata, Daichi Mochihashi, Misato Ido, Yusuke Kubota, Naoki Otani, Yoshifumi Kawasaki, Hiroya Takamura
Abstract
Grammaticalization denotes a diachronic change in grammatical category from content words to function words. One intensively explored direction in this area is quantifying the degree of grammaticalization. Only a limited number of automated methods exist for this task, and the best-performing existing method is heavily language- and word-dependent. In this paper, we explore three methods for quantifying the degree of grammaticalization that are applicable to a wider variety of words and languages. The difficulty is that no training data is available for this task. We overcome this difficulty by using Positive-Unlabeled learning (PU-learning) or Cross-Validation-like learning (hereafter, CV-learning). Experiments show that the CV-learning-based method achieves moderate to high correlations with human judgments for English deverbal prepositions and Japanese nouns undergoing grammaticalization. With this method, we further explore words that may be undergoing grammaticalization and counterexamples to the unidirectionality hypothesis.
- Anthology ID:
- 2026.eacl-long.221
- Volume:
- Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Vera Demberg, Kentaro Inui, Lluís Marquez
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4775–4787
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.221/
- Cite (ACL):
- Ryo Nagata, Daichi Mochihashi, Misato Ido, Yusuke Kubota, Naoki Otani, Yoshifumi Kawasaki, and Hiroya Takamura. 2026. Cross-lingual and Word-Independent Methods for Quantifying Degree of Grammaticalization. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4775–4787, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- Cross-lingual and Word-Independent Methods for Quantifying Degree of Grammaticalization (Nagata et al., EACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.221.pdf