LingGym: How Far Are LLMs from Thinking Like Field Linguists?

Changbing Yang, Franklin Ma, Freda Shi, Jian Zhu


Abstract
This paper introduces LingGym, a new benchmark that evaluates LLMs’ capacity for meta-linguistic reasoning using Interlinear Glossed Text (IGT) and grammatical descriptions extracted from 18 typologically diverse reference grammars. Unlike previous work that focuses on specific downstream tasks, we assess whether LLMs can generalize linguistic inference across low-resource languages and structures not seen during training. We present a controlled evaluation task, Word-Gloss Inference, in which the model must infer a missing word and its gloss from context, given varying levels of linguistic information (e.g., glosses, grammatical explanations, translations). Our results show that incorporating structured linguistic cues consistently improves reasoning performance across all models. This work highlights both the promise and the current limitations of using LLMs for typologically informed linguistic analysis and low-resource language documentation.
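The Word-Gloss Inference setup described in the abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the `IGTExample` class, the `build_prompt` function, the prompt wording, and the Spanish toy sentence are hypothetical and are not taken from the LingGym data or code. The sketch only shows how one word and its gloss might be masked while the amount of supporting linguistic information is varied.

```python
from dataclasses import dataclass

MASK = "____"  # hypothetical placeholder for the hidden word and gloss


@dataclass
class IGTExample:
    """One interlinear glossed text item (illustrative structure only)."""
    words: list[str]        # source-language words, segmented into morphemes
    glosses: list[str]      # one gloss per word, aligned by position
    translation: str        # free translation
    grammar_note: str = ""  # optional grammatical explanation


def build_prompt(ex, masked_index, use_glosses=True,
                 use_translation=True, use_note=True):
    """Hide one word and its gloss, then assemble the remaining context."""
    if len(ex.words) != len(ex.glosses):
        raise ValueError("words and glosses must be aligned one-to-one")
    if not 0 <= masked_index < len(ex.words):
        raise IndexError("masked_index is out of range")

    words = [MASK if i == masked_index else w for i, w in enumerate(ex.words)]
    lines = ["Text: " + " ".join(words)]
    if use_glosses:
        glosses = [MASK if i == masked_index else g
                   for i, g in enumerate(ex.glosses)]
        lines.append("Gloss: " + " ".join(glosses))
    if use_translation:
        lines.append("Translation: " + ex.translation)
    if use_note and ex.grammar_note:
        lines.append("Grammar note: " + ex.grammar_note)
    lines.append("Fill in the masked word and its gloss.")
    return "\n".join(lines)


# Toy example (Spanish, chosen only because it is easy to verify).
example = IGTExample(
    words=["los", "gato-s", "duerm-en"],
    glosses=["DET.M.PL", "cat-PL", "sleep-3PL"],
    translation="The cats sleep.",
    grammar_note="Plural nouns take the suffix -s after a vowel.",
)

full_prompt = build_prompt(example, masked_index=1)
text_only_prompt = build_prompt(example, masked_index=1, use_glosses=False,
                                use_translation=False, use_note=False)
```

Comparing model answers on `full_prompt` and `text_only_prompt` is one way to probe how much each kind of cue contributes. The benchmark's actual prompt format and scoring are not specified on this page.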
Anthology ID:
2025.emnlp-main.69
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1314–1340
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.69/
Cite (ACL):
Changbing Yang, Franklin Ma, Freda Shi, and Jian Zhu. 2025. LingGym: How Far Are LLMs from Thinking Like Field Linguists? In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 1314–1340, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
LingGym: How Far Are LLMs from Thinking Like Field Linguists? (Yang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.69.pdf
Checklist:
 2025.emnlp-main.69.checklist.pdf