Abstract
Previous work on noun classification implies that gender systems are inherently optimized to accommodate communicative pressures on human language learning and processing (Dye et al. 2017, 2018). These studies state that languages make use of either grammatical (e.g., gender) or probabilistic (pre-nominal modifiers) noun classification to smooth the entropy of nouns in context. We show that even languages that are considered genderless, like Mandarin Chinese, possess a noun classification device that plays the same functional role as gender markers. Based on close to 1M Mandarin noun phrases extracted from the Leipzig Corpora Collection (Goldhahn et al. 2012) and their corresponding fastText embeddings (Bojanowski et al. 2016), we show that noun-classifier combinations are sensitive to the same frequency, similarity, and co-occurrence interactions that structure gender systems. We also present the first study of the effects of the interaction between grammatical and probabilistic noun classification.
- Anthology ID:
- 2023.findings-emnlp.843
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 12664–12674
- URL:
- https://aclanthology.org/2023.findings-emnlp.843
- DOI:
- 10.18653/v1/2023.findings-emnlp.843
- Cite (ACL):
- Yamei Wang and Géraldine Walther. 2023. Mandarin classifier systems optimize to accommodate communicative pressures. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 12664–12674, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Mandarin classifier systems optimize to accommodate communicative pressures (Wang & Walther, Findings 2023)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2023.findings-emnlp.843.pdf