Zero-shot Jianzi Recognition as Structured Visual Information Extraction in Open Compositional Symbolic Systems

Zehan Li, Fu Zhang, Zhijun Liu, Jingwei Cheng


Abstract
Guqin (古琴) Jianzi (減字) is an open and freely compositional tablature system that encodes performance actions rather than acoustic outcomes. Its automatic recognition remains largely unexplored, as conventional OCR assumes a closed and enumerable glyph set and struggles with Jianzi’s unbounded composition and manuscript-level variability. We introduce Zero-shot Jianzi Recognition, which formulates Jianzi recognition as vision-to-sequence prediction of canonical component sequences under a zero-shot split. To enable scalable supervision, we construct Synthetic-JZ from aligned online composition metadata. We then synthesize manuscript-like training images via component-wise style recomposition and manuscript-domain noise modeling, and fine-tune a vision–language model for end-to-end component sequence recognition. At inference time, a lightweight legality-guided correction module re-ranks decoding candidates, suppressing structural hallucinations without modifying the backbone. Experiments on two benchmarks show that our method achieves 63.02% sequence accuracy on Real-JZ, our manually annotated real-world Jianzi benchmark, surpassing Gemini-3-Pro by 35.11%. This result highlights the feasibility of reliable automated Jianzi recognition and its potential for large-scale digitization of historical Guqin Jianzi Pu manuscripts.
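The inference-time step described in the abstract, re-ranking decoding candidates by structural legality, can be illustrated with a minimal sketch. This is not the authors' implementation: the component labels, the toy legality rule, the fixed penalty, and the function names `toy_is_legal` and `rerank` are all assumptions made for illustration. The sketch only shows the general idea that the backbone's candidate scores are left untouched and structurally invalid sequences are demoted afterwards.

```python
from typing import Callable, List, Tuple

# A candidate is (component sequence, log-probability from the decoder).
Candidate = Tuple[Tuple[str, ...], float]

# Hypothetical toy vocabulary; real Jianzi component inventories differ.
TOY_VOCAB = {"lh:thumb", "hui:7", "rh:gou", "str:3"}


def toy_is_legal(seq: Tuple[str, ...]) -> bool:
    """Toy rule (an assumption): known components only, exactly one string token."""
    known = all(tok in TOY_VOCAB for tok in seq)
    one_string = sum(tok.startswith("str:") for tok in seq) == 1
    return known and one_string


def rerank(
    candidates: List[Candidate],
    is_legal: Callable[[Tuple[str, ...]], bool],
    penalty: float = 5.0,
) -> List[Candidate]:
    """Sort candidates by score, subtracting a fixed penalty from illegal ones."""
    def adjusted(cand: Candidate) -> float:
        seq, logp = cand
        return logp if is_legal(seq) else logp - penalty

    return sorted(candidates, key=adjusted, reverse=True)


candidates = [
    # Highest raw score, but it repeats the string component (illegal here).
    (("lh:thumb", "hui:7", "rh:gou", "str:3", "str:3"), -0.5),
    (("lh:thumb", "hui:7", "rh:gou", "str:3"), -0.9),
]
best_sequence = rerank(candidates, toy_is_legal)[0][0]
print(best_sequence)
```

In this toy case the structurally valid candidate is ranked first even though the decoder assigned it a lower raw score, which is the kind of hallucination suppression the abstract attributes to the correction module.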
Anthology ID:
2026.acl-long.1356
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
29414–29429
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1356/
Cite (ACL):
Zehan Li, Fu Zhang, Zhijun Liu, and Jingwei Cheng. 2026. Zero-shot Jianzi Recognition as Structured Visual Information Extraction in Open Compositional Symbolic Systems. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 29414–29429, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Zero-shot Jianzi Recognition as Structured Visual Information Extraction in Open Compositional Symbolic Systems (Li et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1356.pdf
Checklist:
 2026.acl-long.1356.checklist.pdf