Kazuma Kobayashi
2026
MED-COREASONER: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning
Fan Gao | Sherry T. Tong | Jiwoong Sohn | Jiahao Huang | Junfeng Jiang | Ding Xia | Piyalitt Ittichaiwong | Kanyakorn Veerakanjana | Hyunjae Kim | Qingyu Chen | Edison Marrese-Taylor | Kazuma Kobayashi | Akiko Aizawa | Irene Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While reasoning-enhanced large language models perform strongly on English medical tasks, a persistent multilingual gap remains, with substantially weaker reasoning in local languages, limiting equitable global medical deployment. To bridge this gap, we introduce Med-CoReasoner, a language-informed co-reasoning framework that elicits parallel English and local-language reasoning, abstracts them into structured concepts, and integrates local clinical knowledge into an English logical scaffold via concept-level alignment and retrieval. This design combines the structural robustness of English reasoning with the practice-grounded expertise encoded in local languages. To evaluate multilingual medical reasoning beyond multiple-choice settings, we construct MultiMed-X, a benchmark covering seven languages with expert-annotated long-form question answering and natural language inference tasks, comprising 350 instances per language. Experiments across three benchmarks show that Med-CoReasoner improves multilingual reasoning performance by an average of 5%, with particularly substantial gains in low-resource languages. Moreover, model distillation and expert evaluation analysis further confirm that Med-CoReasoner produces clinically sound and culturally grounded reasoning traces.
2025
Leveraging High-Resource English Corpora for Cross-lingual Domain Adaptation in Low-Resource Japanese Medicine via Continued Pre-training
Kazuma Kobayashi | Zhen Wan | Fei Cheng | Yuma Tsuta | Xin Zhao | Junfeng Jiang | Jiahao Huang | Zhiyi Huang | Yusuke Oda | Rio Yokota | Yuki Arase | Daisuke Kawahara | Akiko Aizawa | Sadao Kurohashi
Findings of the Association for Computational Linguistics: EMNLP 2025
Limited low-resource language corpora in professional domains like medicine hinder cross-lingual domain adaptation of pre-trained large language models (PLMs). While abundant English medical corpora could complement this scarcity, the effective mixture of English and target language, including machine-translated content, remains underexplored. We examined how linguistic features (e.g., token sizes and language proportions) affect performance on a Japanese–English medical knowledge benchmark. Through continued pre-training of a bilingual PLM on multilingual corpora with varying proportions of English and Japanese texts (both original and machine-translated), we analyzed correlations between linguistic features and fine-grained task performance. Our findings suggest a practical approach to optimizing multilingual corpora for cross-lingual domain adaptation, which requires leveraging specialized knowledge from English corpora while ensuring sufficient coverage of language-specific expressions in a target language (Japanese). Such insights will contribute to the development of multilingual models that effectively leverage English-language resources in various professional domains with low-resource languages.