Chen-Yuan Chen


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Bringing Suzhou Numerals into the Digital Age: A Dataset and Recognition Study on Ancient Chinese Trade Records
Ting-Lin Wu | Zih-Ching Chen | Chen-Yuan Chen | Pi-Jhong Chen | Li-Chiao Wang
Proceedings of the Second Workshop on Ancient Language Processing

Suzhou numerals, a specialized numerical no-tation system historically used in Chinese com-merce and accounting, played a pivotal role in financial transactions from the Song Dynasty to the early 20th century. Despite their his-torical significance, they remain largely absent from modern OCR benchmarks, limiting com-putational access to archival trade documents. This paper presents a curated dataset of 773 expert-annotated Suzhou numeral samples ex-tracted from late Qing-era trade ledgers. We provide a statistical analysis of character distri-butions, offering insights into their real-world usage in historical bookkeeping. Additionally, we evaluate baseline performance with hand-written text recognition (HTR) model, high-lighting the challenges of recognizing low-resource brush-written numerals. By introduc-ing this dataset and initial benchmark results, we aim to facilitate research in historical doc-umentation in ancient Chinese characters, ad-vancing the digitization of early Chinese finan-cial records. The dataset is publicly available at our huggingface hub, and our codebase can be accessed at our github repository.