Jiwoong Sohn
2026
MED-COREASONER: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning
Fan Gao | Sherry T. Tong | Jiwoong Sohn | Jiahao Huang | Junfeng Jiang | Ding Xia | Piyalitt Ittichaiwong | Kanyakorn Veerakanjana | Hyunjae Kim | Qingyu Chen | Edison Marrese-Taylor | Kazuma Kobayashi | Akiko Aizawa | Irene Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While reasoning-enhanced large language models perform strongly on English medical tasks, a persistent multilingual gap remains, with substantially weaker reasoning in local languages, limiting equitable global medical deployment. To bridge this gap, we introduce Med-CoReasoner, a language-informed co-reasoning framework that elicits parallel English and local-language reasoning, abstracts them into structured concepts, and integrates local clinical knowledge into an English logical scaffold via concept-level alignment and retrieval. This design combines the structural robustness of English reasoning with the practice-grounded expertise encoded in local languages. To evaluate multilingual medical reasoning beyond multiple-choice settings, we construct MultiMed-X, a benchmark covering seven languages with expert-annotated long-form question answering and natural language inference tasks, comprising 350 instances per language. Experiments across three benchmarks show that Med-CoReasoner improves multilingual reasoning performance by an average of 5%, with particularly substantial gains in low-resource languages. Moreover, model distillation and expert evaluation analysis further confirm that Med-CoReasoner produces clinically sound and culturally grounded reasoning traces.
2025
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
Jiwoong Sohn | Yein Park | Chanwoong Yoon | Sihyeon Park | Hyeon Hwang | Mujeen Sung | Hyunjae Kim | Jaewoo Kang
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large language models (LLMs) hold significant potential for applications in biomedicine, but they struggle with hallucinations and outdated knowledge. While retrieval-augmented generation (RAG) is generally employed to address these issues, it also has its own set of challenges: (1) LLMs are vulnerable to irrelevant or unhelpful context, (2) medical queries are often not well-targeted for helpful information, and (3) retrievers are prone to bias toward the specific source corpus they were trained on. In this study, we present RAG2 (RAtionale-Guided RAG), a new framework for enhancing the reliability of RAG in biomedical contexts. RAG2 incorporates three key innovations: a small filtering model trained on perplexity-based labels of rationales, which selectively augments informative snippets of documents while filtering out distractors; LLM-generated rationales as queries to improve the utility of retrieved snippets; and a structure designed to retrieve snippets evenly from a comprehensive set of four biomedical corpora, effectively mitigating retriever bias. Our experiments demonstrate that RAG2 improves state-of-the-art LLMs of varying sizes by up to 6.1%, and that it outperforms the previous best medical RAG model by up to 5.6% across three medical question-answering benchmarks. Our code is available at https://github.com/dmis-lab/RAG2
Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards
Jaehoon Yun | Jiwoong Sohn | Jungwoo Park | Hyunjae Kim | Xiangru Tang | Daniel Shao | Yong Hoe Koo | Ko Minhyeok | Qingyu Chen | Mark Gerstein | Michael Moor | Jaewoo Kang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large language models have shown promise in clinical decision making, but current approaches struggle to localize and correct errors at specific steps of the reasoning process. This limitation is critical in medicine, where identifying and addressing reasoning errors is essential for accurate diagnosis and effective patient care. We introduce Med-PRM, a process reward modeling framework that leverages retrieval-augmented generation to verify each reasoning step against established medical knowledge bases. By verifying intermediate reasoning steps with evidence retrieved from clinical guidelines and literature, our model can precisely assess reasoning quality in a fine-grained manner. Evaluations on five medical QA benchmarks and two open-ended diagnostic tasks demonstrate that Med-PRM achieves state-of-the-art performance, improving the performance of base models by up to 13.50%. Moreover, we demonstrate the generality of Med-PRM by integrating it in a plug-and-play fashion with strong policy models such as Meerkat, achieving over 80% accuracy on MedQA for the first time using small-scale models of 8 billion parameters.
DMIS Lab at ArchEHR-QA 2025: Evidence-Grounded Answer Generation for EHR-based QA via a Multi-Agent Framework
Hyeon Hwang | Hyeongsoon Hwang | Jongmyung Jung | Jaehoon Yun | Minju Song | Yein Park | Dain Kim | Taewhoo Lee | Jiwoong Sohn | Chanwoong Yoon | Sihyeon Park | Jiwoo Lee | Heechul Yang | Jaewoo Kang
Proceedings of the 24th Workshop on Biomedical Language Processing (Shared Tasks)
Co-authors
- Jaewoo Kang 3
- Hyunjae Kim 3
- Qingyu Chen 2
- Hyeon Hwang 2
- Sihyeon Park 2
- Yein Park 2
- Chanwoong Yoon 2
- Jaehoon Yun 2
- Akiko Aizawa 1
- Fan Gao 1
- Mark Gerstein 1
- Jiahao Huang 1
- Hyeongsoon Hwang 1
- Piyalitt Ittichaiwong 1
- Junfeng Jiang 1
- Jongmyung Jung 1
- Dain Kim 1
- Kazuma Kobayashi 1
- Yong Hoe Koo 1
- Jiwoo Lee 1
- Taewhoo Lee 1
- Irene Li 1
- Edison Marrese-Taylor 1
- Ko Minhyeok 1
- Michael Moor 1
- Jungwoo Park 1
- Daniel Shao 1
- Minju Song 1
- Mujeen Sung 1
- Xiangru Tang 1
- Sherry T. Tong 1
- Kanyakorn Veerakanjana 1
- Ding Xia 1
- Heechul Yang 1