Modeling Color Terminology Across Thousands of Languages
Arya D. McCarthy, Winston Wu, Aaron Mueller, William Watson, David Yarowsky
Abstract
There is an extensive history of scholarship into what constitutes a “basic” color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969). This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. Collectively, the 14 empirically-grounded computational linguistic metrics we design—as well as their aggregation—correlate strongly with both the Berlin and Kay basic/secondary color term partition (γ = 0.96) and their hypothesized universal acquisition sequence. The measures and result provide further empirical evidence from computational linguistics in support of their claims, as well as additional nuance: they suggest treating the partition as a spectrum instead of a dichotomy.- Anthology ID:
- D19-1229
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2241–2250
- Language:
- URL:
- https://aclanthology.org/D19-1229
- DOI:
- 10.18653/v1/D19-1229
- Cite (ACL):
- Arya D. McCarthy, Winston Wu, Aaron Mueller, William Watson, and David Yarowsky. 2019. Modeling Color Terminology Across Thousands of Languages. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2241–2250, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Modeling Color Terminology Across Thousands of Languages (McCarthy et al., EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/D19-1229.pdf
- Code
- aryamccarthy/basic-color-terms
- Data
- Panlex