ManCC: A Task-Anchored Benchmark for Manchu–Classical Chinese Cross-Lingual Modeling

Meiqi Wang; Xiaoxin Sun; Dongjie Wang; Ruixin Yu; Xiantao Heng; Shuo Wang; Zhen Huang; Peng Zhao; Suhua Wang; Minghao Yin

Abstract

Research in cross-lingual modeling for historical and extremely low-resource languages is hindered by the absence of standardized evaluation benchmarks. To address this, we present ManCC—the first task-anchored benchmark for Manchu–Classical Chinese translation. ManCC consists of a high-quality parallel corpus of 16,627 sentence pairs, derived from the Qing-dynasty historical text Manwen Laodang-Taizong, and a reproducible evaluation protocol that combines automatic metrics (BLEU and chrF) with a three-dimensional human assessment (fidelity, fluency, linguistic normativity). Through systematic evaluation across three model families (non-pretrained, multilingual pretrained, and large language models), we find that linguistic differences significantly influence performance, broader language coverage in multilingual pretraining facilitates low-resource transfer, and automatic metrics often fail to capture essential errors in historical translation—underscoring the necessity of human evaluation. ManCC not only provides foundational resources for Manchu–Classical Chinese translation but also establishes a diagnosable, reproducible platform for cross-lingual modeling of historical low-resource languages.

Anthology ID: 2026.findings-acl.1359
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 27271–27292
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1359/
Cite (ACL): Meiqi Wang, Xiaoxin Sun, Dongjie Wang, Ruixin Yu, Xiantao Heng, Shuo Wang, Zhen Huang, Peng Zhao, Suhua Wang, and Minghao Yin. 2026. ManCC: A Task-Anchored Benchmark for Manchu–Classical Chinese Cross-Lingual Modeling. In Findings of the Association for Computational Linguistics: ACL 2026, pages 27271–27292, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ManCC: A Task-Anchored Benchmark for Manchu–Classical Chinese Cross-Lingual Modeling (Wang et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1359.pdf
Checklist: 2026.findings-acl.1359.checklist.pdf
