Phi Le Nguyen
2026
Region-Grounded Report Generation for 3D Medical Imaging: A Fine-Grained Dataset and Graph-Enhanced Framework
Cong Huy Nguyen | Son Dinh Nguyen | Guanlin Li | Tuan Dung Nguyen | Aditya Narayan Sankaran | Mai Huy Thong | Thanh Trung Nguyen | Mai Hong Son | Reza Farahbakhsh | Phi Le Nguyen | Noel Crespi
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Automated medical report generation for 3D PET/CT imaging is fundamentally challenged by the high-dimensional nature of volumetric data and a critical scarcity of annotated datasets, particularly for low-resource languages. Current black-box methods map whole volumes to reports, ignoring the clinical workflow of analyzing localized Regions of Interest (RoIs) to derive diagnostic conclusions. In this paper, we bridge this gap by introducing VietPET-RoI, the first large-scale 3D PET/CT dataset with fine-grained RoI annotation for a low-resource language, comprising 600 PET/CT samples and 1,960 manually annotated RoIs, paired with corresponding clinical reports. Furthermore, to demonstrate the utility of this dataset, we propose HiRRA, a novel framework that mimics the diagnostic workflow of professional radiologists by employing graph-based relational modules to capture dependencies between RoI attributes. This approach shifts from global pattern matching toward localized clinical findings. Additionally, we introduce new clinical evaluation metrics, namely RoI Coverage and RoI Quality Index, that measure both RoI localization accuracy and attribute description fidelity using LLM-based extraction. Extensive evaluation demonstrates that our framework achieves state-of-the-art performance, surpassing existing models by 19.7% in BLEU and 4.7% in ROUGE-L, while achieving a remarkable 45.8% improvement in clinical metrics, indicating enhanced clinical reliability and reduced hallucination. Our code and dataset are available on GitHub.
2025
MLAlgo-Bench: Can Machines Implement Machine Learning Algorithms?
Yunfei Wang | Yeqin Zhang | Yuyang Wu | Liang Lu | Phi Le Nguyen | Xiaoliang Wang | Cam-Tu Nguyen
Findings of the Association for Computational Linguistics: EMNLP 2025
As machine learning (ML) applications continue to expand across diverse fields, there is a rising demand for ML code generation. In this paper, we address a critical research question: Can machines autonomously generate ML code for sophisticated, human-designed algorithms or solutions? To answer this question, we introduce a novel benchmark, MLAlgo-Bench, which includes two challenging tasks: 1) generating code for ML algorithms, including both traditional ML and modern deep learning-based methods, and 2) given human solution sketches, writing ML code for solving practical tasks in Kaggle competitions. This benchmark is unique in its focus on the challenges of interpreting intricate human instructions and producing multi-step, high-complexity code, offering a rigorous test for current Large Language Model (LLM) capabilities. We introduce an automatic evaluation framework with comprehensive metrics such as task pass rate, relative performance metric, and time overhead. Currently, the top-performing model (Claude3.5-Sonnet) achieves a 48.8% task completion rate on realizing machine learning algorithms, and a 21.6% rate for completing Kaggle competitions. Further analysis suggests substantial room for improvement.
2024
CARER - ClinicAl Reasoning-Enhanced Representation for Temporal Health Risk Prediction
Tuan Dung Nguyen | Thanh Trung Huynh | Minh Hieu Phan | Quoc Viet Hung Nguyen | Phi Le Nguyen
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The increasing availability of multimodal data from electronic health records (EHR) has paved the way for deep learning methods to improve diagnostic accuracy. However, deep learning models are data-driven, requiring large-scale datasets to achieve high generalizability. Inspired by how human experts leverage reasoning for medical diagnosis, we propose CARER, a novel health risk prediction framework that enhances deep learning models with clinical rationales derived from medically proficient Large Language Models (LLMs). In addition, we provide a cross-view alignment loss which aligns the “local” view from the patient’s health status with the “global” view from the external LLM’s clinical reasoning to boost mutual feature learning. Through extensive experiments on two predictive tasks using two popular EHR datasets, CARER significantly exceeds the performance of state-of-the-art models by up to 11.2%, especially in improving data efficiency and generalizability. Our code is available at https://github.com/tuandung2812/CARER-EMNLP-2024