Thanh-Ha Do


2024

pdf
ViHealthNLI: A Dataset for Vietnamese Natural Language Inference in Healthcare
Huyen Nguyen | Quyen The Ngo | Thanh-Ha Do | Tuan-Anh Hoang
Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024

This paper introduces ViHealthNLI, a large dataset for the natural language inference problem for Vietnamese. Unlike the similar Vietnamese datasets, ours is specific to the healthcare domain. We conducted an exploratory analysis to characterize the dataset and evaluated the state-of-the-art methods on the dataset. Our findings indicate that the dataset poses significant challenges while also holding promise for further advanced research and the creation of practical applications.