Junwen Wang
2026
DentalGPT: Incentivizing Multimodal Reasoning in Dentistry
Zhenyang Cai | Jiaming Zhang | Junjie Zhao | Ziyi Zeng | Yanchao Li | Liang Jingyi | Junying Chen | Yunjin Yang | Jiajun You | Shuzhi Deng | Xieruiqiii | Yuanting Chen | Xiangyi Feng | Jianquan Li | Liangyi Chen | Junwen Wang | Shan Jiang | Benyou Wang
Findings of the Association for Computational Linguistics: ACL 2026
Reliable interpretation of multimodal dental data is essential for automated oral healthcare, yet current multimodal large language models (MLLMs) show limited understanding of dental images. Although complex reasoning improves performance, its gains in dentistry are substantially smaller than in other medical domains, suggesting that complex reasoning is not yet sufficiently incentivized for dental diagnosis, likely due to insufficient domain knowledge and limited reinforcement learning on dental questions. We present DentalGPT, a dentistry-specialized MLLM trained via staged multimodal alignment and reinforcement learning. We construct the largest annotated multimodal dental dataset to date, with over 120k images, so that multimodal alignment provides the domain knowledge foundation needed to support and incentivize complex reasoning, which is further strengthened through reinforcement learning. Experiments on expert-annotated benchmarks and dental subsets of medical VQA benchmarks show that DentalGPT achieves superior performance on disease classification and dental VQA tasks, outperforming many state-of-the-art MLLMs despite its compact 7B parameter scale.