Lan Yang


2025

pdf bib
V-Oracle: Making Progressive Reasoning in Deciphering Oracle Bones for You and Me
Runqi Qiao | Qiuna Tan | Guanting Dong | MinhuiWu MinhuiWu | Jiapeng Wang | YiFan Zhang | Zhuoma GongQue | Chong Sun | Yida Xu | Yadong Xue | Ye Tian | Zhimin Bao | Lan Yang | Chen Li | Honggang Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Oracle Bone Script (OBS) is a vital treasure of human civilization, rich in insights from ancient societies. However, the evolution of written language over millennia complicates its decipherment. In this paper, we propose V-Oracle, an innovative framework that utilizes Large Multi-modal Models (LMMs) for interpreting OBS. V-Oracle applies principles of pictographic character formation and frames the task as a visual question-answering (VQA) problem, establishing a multi-step reasoning chain. It proposes a multi-dimensional data augmentation for synthesizing high-quality OBS samples, and also implements a multi-phase oracle alignment tuning to improve LMMs’ visual reasoning capabilities. Moreover, to bridge the evaluation gap in the OBS field, we further introduce Oracle-Bench, a comprehensive benchmark that emphasizes process-oriented assessment and incorporates both standard and out-of-distribution setups for realistic evaluation. Extensive experimental results can demonstrate the effectiveness of our method in providing quantitative analyses and superior deciphering capability.