ERAS: Evaluating the Robustness of Chinese NLP Models to Morphological Garden Path Errors

Qinchan Li; Sophie Hao

Abstract

In languages without orthographic word boundaries, NLP models perform _word segmentation_, either as an explicit preprocessing step or as an implicit step in an end-to-end computation. This paper shows that Chinese NLP models are vulnerable to _morphological garden path errors_—errors caused by a failure to resolve local word segmentation ambiguities using sentence-level morphosyntactic context. We propose a benchmark, _ERAS_, that tests a model’s vulnerability to morphological garden path errors by comparing its behavior on sentences with and without local segmentation ambiguities. Using ERAS, we show that word segmentation models make morphological garden path errors on locally ambiguous sentences, but do not make equivalent errors on unambiguous sentences. We further show that sentiment analysis models with character-level tokenization make implicit garden path errors, even without an explicit word segmentation step in the pipeline. Our results indicate that models’ segmentation of Chinese text often fails to account for morphosyntactic context.
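To make the notion of a morphological garden path error concrete, here is a minimal sketch. It is not taken from the paper or from the ERAS benchmark: the greedy forward-maximum-matching segmenter, the toy lexicon, and both sentences (a common textbook example of segmentation ambiguity and a hypothetical unambiguous control) are assumptions chosen for illustration. The sketch mirrors the comparison described above by running the same segmenter on a locally ambiguous sentence and on a control sentence.

```python
# Toy lexicon (assumed for illustration; not from ERAS).
LEXICON = {"研究", "研究生", "探索", "生命", "的", "起源"}


def fmm_segment(text, lexicon, max_len=3):
    """Greedy forward maximum matching: always take the longest lexicon word."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no lexicon word matches.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Locally ambiguous: "研究生" (graduate student) is a word, but the sentence
# context requires "研究 / 生命" (study / life).
ambiguous = "研究生命的起源"
gold_ambiguous = ["研究", "生命", "的", "起源"]

# Control: same continuation, but no competing word at the start.
control = "探索生命的起源"
gold_control = ["探索", "生命", "的", "起源"]

pred_ambiguous = fmm_segment(ambiguous, LEXICON)
pred_control = fmm_segment(control, LEXICON)

print(pred_ambiguous, pred_ambiguous == gold_ambiguous)
print(pred_control, pred_control == gold_control)
```

The greedy segmenter commits to the locally plausible word "研究生" and cannot recover, while the control sentence is segmented correctly. The models evaluated in the paper are neural rather than lexicon-based, so this only illustrates the error type, not their behavior.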

Anthology ID: 2025.naacl-long.159
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 3100–3111
URL: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.159/
Cite (ACL): Qinchan Li and Sophie Hao. 2025. ERAS: Evaluating the Robustness of Chinese NLP Models to Morphological Garden Path Errors. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 3100–3111, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): ERAS: Evaluating the Robustness of Chinese NLP Models to Morphological Garden Path Errors (Li & Hao, NAACL 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.159.pdf
