BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks

Tianyuan Huang, Zepeng Zhu, Hangdi Xing, Zirui Shao, Zhi Yu, Chaoxiong Yang, Jiaxian He, Xiaozhong Liu, Jiajun Bu


Abstract
Braille plays a vital role in education and information accessibility for visually impaired individuals. However, Braille information processing faces challenges such as data scarcity and ambiguities in mixed-text contexts. We construct English and Chinese Braille Mixed Datasets (EBMD/CBMD) with mathematical formulas to support diverse Braille domain research, and propose a syntax-tree-based augmentation method tailored for Braille data. To address the underperformance of traditional fine-tuning methods on Braille-related tasks, we investigate Braille Knowledge-Based Fine-Tuning (BKFT), which reduces the difficulty of learning Braille contextual features. BrailleLLM applies BKFT via instruction tuning to achieve unified Braille translation, formula-to-Braille conversion, and mixed-text translation. Experiments demonstrate that BKFT achieves significant performance improvements over conventional fine-tuning in Braille translation scenarios. Our open-sourced datasets and methodologies establish a foundation for low-resource multilingual Braille research.
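
The abstract frames BrailleLLM as instruction tuning over three unified tasks (text-to-Braille translation, formula-to-Braille conversion, and mixed-text translation). As a point of reference, below is a minimal Python sketch of what instruction-tuning records for these tasks could look like. The Alpaca-style field names ("instruction", "input", "output"), the output file name, and the placeholder math targets are illustrative assumptions, not the paper's released schema; the first record's target is simply "hello" rendered letter-for-letter in uncontracted English Braille using Unicode Braille patterns.

    import json

    def make_record(instruction: str, source: str, target: str) -> dict:
        # Wrap one (source, target) pair in an Alpaca-style record
        # (assumed format, not the paper's released schema).
        return {"instruction": instruction, "input": source, "output": target}

    records = [
        # Task 1: text-to-Braille translation. "hello" letter-for-letter
        # in uncontracted English Braille (Unicode Braille patterns).
        make_record(
            "Translate the following English text into Braille.",
            "hello",
            "⠓⠑⠇⠇⠕",
        ),
        # Task 2: formula-to-Braille conversion (hypothetical placeholder
        # target; real targets would use a Braille math code).
        make_record(
            "Convert the following LaTeX formula into Braille math notation.",
            "x^2 + 1",
            "<braille-math target>",
        ),
        # Task 3: mixed text with an embedded formula.
        make_record(
            "Translate this mixed text, including the formula, into Braille.",
            "Let $x^2 + 1$ be positive.",
            "<braille mixed-text target>",
        ),
    ]

    # One JSON object per line, as is typical for instruction-tuning corpora.
    with open("braille_instructions.jsonl", "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

Records in this shape can be consumed by any standard supervised fine-tuning pipeline; BKFT itself, as described in the abstract, additionally injects Braille knowledge during fine-tuning, which this sketch does not attempt to reproduce.
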
Anthology ID:
2025.emnlp-main.1454
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
28589–28600
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1454/
Cite (ACL):
Tianyuan Huang, Zepeng Zhu, Hangdi Xing, Zirui Shao, Zhi Yu, Chaoxiong Yang, Jiaxian He, Xiaozhong Liu, and Jiajun Bu. 2025. BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 28589–28600, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks (Huang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1454.pdf
Checklist:
2025.emnlp-main.1454.checklist.pdf