Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script

Jingting Zheng, Deyi Xiong


Abstract
Multilingual language models now cover more languages than ever, yet script-sharing low-resource languages remain vulnerable to failures driven by script and dominant-language priors. This dissertation studies one such failure mode, semantic interference, in Square Bai Script, where many forms resemble Chinese characters but often differ in meaning. We argue that current adaptation pipelines underperform not only because Bai is low-resource, but also because they treat visible overlap as safe transfer by default. Building on an expert-validated corpus of 28,382 Bai-Chinese sentence pairs, an out-of-domain epigraphic set, and a reproducible encoding pipeline, the dissertation will (1) diagnose semantic interference, (2) compare adaptation strategies under realistic compute constraints, and (3) estimate when shared-script transfer helps or harms adaptation. The long-term goal is Bai-capable understanding and generation; the dissertation itself addresses the prerequisite problem of safe and effective adaptation in a script-sharing low-resource setting.
Anthology ID:
2026.acl-srw.43
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
487–497
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.43/
Cite (ACL):
Jingting Zheng and Deyi Xiong. 2026. Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 487–497, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script (Zheng & Xiong, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.43.pdf