Shihui Li
2026
Examining Large Language Models’ form-meaning mappings of information structure constructions in Mandarin Chinese
Shihui Li | Xiaojuan Tan | Jelke Bloem
Proceedings of the 30th Conference on Computational Natural Language Learning
Construction Grammar (CxG) knowledge in language models has been extensively studied for English, but remains underexplored in other languages. In Mandarin Chinese, the ba (把, disposal) and bei (被, passive) constructions are widely used for managing information structure. They foreground topical elements (information structure) and encode systematic form-meaning mappings (CxG), particularly with respect to the semantic role of the object. We probe language models’ linguistic competence with these constructions using a new minimal-pair dataset comprising seven paradigms that target both syntactic constraints and verb–construction compatibility. Our results show that many models still struggle to capture the form-meaning mappings underlying the ba construction, even though they achieve high accuracy on paradigms driven by surface syntactic cues.
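
As a rough illustration of how minimal-pair probing is commonly scored, the sketch below counts a pair as correct when a model assigns a higher score to the acceptable sentence than to its unacceptable counterpart, and reports accuracy per paradigm. This is an assumption about the general technique, not the authors' implementation: the function name `paradigm_accuracy`, the paradigm labels, the example sentences, and the toy scores are all hypothetical, and the abstract does not state which scoring function the paper uses.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# (paradigm label, acceptable sentence, unacceptable sentence)
MinimalPair = Tuple[str, str, str]


def paradigm_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str], float],
) -> Dict[str, float]:
    """Return, per paradigm, the fraction of pairs where the acceptable
    sentence receives a strictly higher score than the unacceptable one."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for paradigm, good, bad in pairs:
        total[paradigm] += 1
        if score(good) > score(bad):
            correct[paradigm] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical pairs invented for illustration; they are NOT dataset items.
DEMO_PAIRS = [
    ("demo_ba", "他把书看完了", "他把书看"),
    ("demo_ba", "ba_good_2", "ba_bad_2"),
    ("demo_bei", "书被他看完了", "书被他看"),
]

# Made-up scores standing in for model log-probabilities (assumption:
# a real setup would query a language model here instead).
TOY_SCORES = {
    "他把书看完了": -10.0, "他把书看": -12.5,
    "ba_good_2": -9.0, "ba_bad_2": -8.0,   # toy "model" prefers the bad one
    "书被他看完了": -11.0, "书被他看": -14.0,
}

if __name__ == "__main__":
    for name, acc in paradigm_accuracy(DEMO_PAIRS, TOY_SCORES.__getitem__).items():
        print(f"{name}: {acc:.2f}")
```

Reporting accuracy per paradigm rather than as a single pooled number keeps paradigms driven by surface syntactic cues separate from those that depend on verb–construction compatibility, which is the contrast the results above turn on.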