Evidence-Augmented Generation Reasoning for Extremely Low-Resource Language Decipherment

Xiaoyu Zhu, Long Yuan, Rui Qi, Jinan Xu


Abstract
Inspired by linguistic Olympiads, extremely low-resource language reasoning presents a unique challenge: models must solve problems without prior knowledge of the target language. This task mirrors the Rosetta Stone decipherment process, where the goal is to induce and apply linguistic rules from minimal context. Existing methods mainly rely on naive in-context learning, which fails to handle the complexity and diversity of language rules. To mitigate this issue, we propose a framework that combines dynamic knowledge construction with task-aware evidence augmentation. First, we use large language models (LLMs) to generate a diverse set of task-specific examples that instantiate potential linguistic rules for the target low-resource language. Second, we apply a semantic retrieval mechanism to select the most relevant examples as evidence for each test query, preventing context overload and ensuring focused, analogical reasoning. Our method shifts from learning language distributions to dynamically discovering and applying rules. Experimental results on the LINGOLY and Linguini benchmarks show that our approach achieves competitive performance across various LLMs, outperforming existing baselines. More importantly, our framework advances extremely low-resource reasoning and provides a generalizable approach to rule induction under knowledge constraints.
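The second step described in the abstract, selecting the most relevant generated examples as evidence for each query, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the retriever, so bag-of-words cosine similarity stands in for a semantic embedding model here, and the function names (`select_evidence`, `build_prompt`), the parameter `k`, and the prompt layout are all hypothetical.

```python
import math
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts; a crude stand-in for a sentence embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_evidence(query, examples, k=2):
    """Return the k generated examples most similar to the query, best first."""
    q = bow_vector(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, bow_vector(ex)), reverse=True)
    return ranked[:k]


def build_prompt(query, evidence):
    """Assemble a prompt holding only the retrieved evidence (hypothetical layout)."""
    lines = ["Evidence:"]
    lines += [f"- {ex}" for ex in evidence]
    lines += ["", f"Question: {query}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented toy strings, not data from the paper or from any real language.
    pool = [
        "plural noun rule: kato means dog and katoli means dogs",
        "past tense rule: miru means see and mirupa means saw",
        "word order rule: the verb comes last in the sentence",
    ]
    query = "how do you form the plural noun for dogs"
    print(build_prompt(query, select_evidence(query, pool, k=2)))
```

Passing only the top-`k` examples to the model, rather than the whole generated pool, is what keeps the context short. Replacing `bow_vector` with a real sentence encoder would bring the sketch closer to the semantic retrieval the abstract describes.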
Anthology ID:
2026.mellm-1.2
Volume:
Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM 2026)
Month:
July
Year:
2026
Address:
San Diego, United States
Editors:
Kaiyu Huang, Fengran Mo, Pinzhen Chen, Meng Jiang
Venues:
MeLLM | WS
Publisher:
Association for Computational Linguistics
Pages:
14–29
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.mellm-1.2/
Cite (ACL):
Xiaoyu Zhu, Long Yuan, Rui Qi, and Jinan Xu. 2026. Evidence-Augmented Generation Reasoning for Extremely Low-Resource Language Decipherment. In Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM 2026), pages 14–29, San Diego, United States. Association for Computational Linguistics.
Cite (Informal):
Evidence-Augmented Generation Reasoning for Extremely Low-Resource Language Decipherment (Zhu et al., MeLLM 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.mellm-1.2.pdf