Tho Quan
2026
HiGraAgent: Dual-Agent Adaptive Reasoning over Hierarchical Knowledge Graph for Open Domain Multi-hop Question Answering
Hung Luu | Long S. T. Nguyen | Trung Pham | Hieu Pham | Tho Quan
Findings of the Association for Computational Linguistics: EACL 2026
Open Domain Multi-hop Question Answering faces a dual compositionality challenge: reasoning over complex query structures and integrating evidence scattered across contexts. Despite recent advancements in Graph-based Retrieval-Augmented Generation (GraphRAG), persistent limitations in complex reasoning and retrieval inaccuracies continue to constrain the efficacy of multi-hop QA systems. We introduce HiGraAgent, a framework that unifies graph-based retrieval with adaptive reasoning. It constructs a Hierarchical Knowledge Graph (HiGra) with entity alignment, reducing redundancy by 34.5% while preserving expressiveness; employs HiGraRetriever, a hybrid graph-semantic retriever that consistently outperforms the strongest graph-based method across benchmarks; and integrates a dual-agent adaptive reasoning protocol where a Seeker and a Librarian dynamically coordinate retrieval and reasoning. Together, these innovations enable HiGraAgent to achieve 85.3% average accuracy on HotpotQA, 2WikiMultihopQA, and MuSiQue, surpassing the strongest prior system by 11.7%. Our results highlight the importance of reframing multi-hop QA as a problem of adaptive reasoning, offering a more robust and flexible paradigm for complex information seeking.
Which Works Best for Vietnamese? A Practical Study of Information Retrieval Methods across Domains
Long S. T. Nguyen | Tho Quan
Findings of the Association for Computational Linguistics: EACL 2026
Large Language Models (LLMs) have achieved remarkable progress, yet their reliance on parametric knowledge often leads to hallucinations. Retrieval-Augmented Generation (RAG) mitigates this issue by grounding outputs in external documents, where the quality of retrieval is critical. While retrieval methods have been widely benchmarked in English, it remains unclear which approaches are most effective for Vietnamese, a language characterized by informal queries, noisy documents, and limited resources. Prior studies are restricted to clean datasets or narrow domains, leaving fragmented insights. To the best of our knowledge, we present the first comprehensive benchmark of retrieval methods for Vietnamese across multiple real-world domains. We systematically compare lexical, dense, and hybrid methods on datasets spanning education, legal, healthcare, customer support, lifestyle, and Wikipedia, and introduce two new datasets capturing authentic educational counseling and customer service interactions. Beyond reporting benchmark numbers, we distill a set of empirical insights that clarify trade-offs, highlight domain-specific challenges, and provide practical guidance for building robust Vietnamese QA systems. Together, these contributions offer the first large-scale, practice-oriented perspective on Vietnamese retrieval and inform both academic research and real-world deployment in low-resource languages. All datasets and evaluation scripts are available at https://github.com/longstnguyen/ViRE.
2025
When in Doubt, Ask First: A Unified Retrieval Agent-Based System for Ambiguous and Unanswerable Question Answering
Long Nguyen | Quynh Vo | Hung Luu | Tho Quan
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Large Language Models (LLMs) have shown strong capabilities in Question Answering (QA), but their effectiveness in high-stakes, closed-domain settings is often constrained by hallucinations and limited handling of vague or underspecified queries. These challenges are especially pronounced in Vietnamese, a low-resource language with complex syntax and strong contextual dependence, where user questions are often short, informal, and ambiguous. We introduce the Unified Retrieval Agent-Based System (URASys), a QA framework that combines agent-based reasoning with dual retrieval under the Just Enough principle to address standard, ambiguous, and unanswerable questions in a unified manner. URASys performs lightweight query decomposition and integrates document retrieval with a question–answer layer via a two-phase indexing pipeline, engaging in interactive clarification when intent is uncertain and explicitly signaling unanswerable cases to avoid hallucination. We evaluate URASys on Vietnamese and English QA benchmarks spanning single-hop, multi-hop, and real-world academic advising tasks, and release new dual-language ambiguous subsets for benchmarking interactive clarification. Results show that URASys outperforms strong retrieval-based baselines in factual accuracy, improves unanswerable handling, and achieves statistically significant gains in human evaluations for clarity and trustworthiness.
Serving the Underserved: Leveraging BARTBahnar Language Model for Bahnaric-Vietnamese Translation
Long Nguyen | Tran Le | Huong Nguyen | Quynh Vo | Phong Nguyen | Tho Quan
Proceedings of the 1st Workshop on Language Models for Underserved Communities (LM4UC 2025)
The Bahnar people, one of Vietnam’s ethnic minorities, represent an underserved community with limited access to modern technologies. Developing an effective Bahnaric-Vietnamese translation system is essential for fostering linguistic exchange, preserving cultural heritage, and empowering local communities by bridging communication barriers. With advancements in Artificial Intelligence (AI), Neural Machine Translation (NMT) has achieved remarkable success across various language pairs. However, the low-resource nature of Bahnaric, characterized by data scarcity, vocabulary constraints, and the lack of parallel corpora, poses significant challenges to building an accurate and efficient translation system. To address these challenges, we propose a novel hybrid architecture for Bahnaric-Vietnamese translation, with BARTBahnar as its core language model. BARTBahnar is developed by continually training a pre-trained Vietnamese model, BARTPho, on augmented monolingual Bahnaric data, followed by fine-tuning on bilingual datasets. This transfer learning approach reduces training costs while effectively capturing linguistic similarities between the two languages. Additionally, we implement advanced data augmentation techniques to enrich and diversify training data, further enhancing BARTBahnar’s robustness and translation accuracy. Beyond leveraging the language model, our hybrid system integrates rule-based and statistical methods to improve translation quality. Experimental results show substantial improvements on bilingual Bahnaric-Vietnamese datasets, validating the effectiveness of our approach for low-resource translation. To support further research, we open-source our code and related materials at https://github.com/ura-hcmut/BARTBahnar.
2024
Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models
Sang Truong | Duc Nguyen | Toan Nguyen | Dong Le | Nhi Truong | Tho Quan | Sanmi Koyejo
Findings of the Association for Computational Linguistics: NAACL 2024
Recent advancements in large language models (LLMs) have underscored their importance in the evolution of artificial intelligence. However, despite extensive pretraining on multilingual datasets, available open-source LLMs exhibit limited effectiveness in processing Vietnamese. The challenge is exacerbated by the absence of systematic benchmark datasets and metrics tailored for Vietnamese LLM evaluation. To mitigate these issues, we have fine-tuned LLMs specifically for Vietnamese and developed a comprehensive evaluation framework encompassing 10 tasks and 31 metrics. We observe that fine-tuning can help LLMs transfer knowledge across languages, serving as an efficient way to bolster their capabilities in non-English languages. Moreover, our analysis indicates that larger models can introduce more biases and uncalibrated outputs, and that the key factor influencing LLM performance is the quality of the training or fine-tuning datasets. These insights underscore the significance of meticulous fine-tuning with high-quality datasets in enhancing LLM performance.
ViConsFormer: Constituting Meaningful Phrases of Scene Texts using Transformer-based Method in Vietnamese Text-based Visual Question Answering
Nghia Nguyen | Tho Quan | Ngan Nguyen
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation
2021
Enriching and Controlling Global Semantics for Text Summarization
Thong Nguyen | Anh Tuan Luu | Truc Lu | Tho Quan
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Recently, Transformer-based models have proven effective in the abstractive summarization task by creating fluent and informative summaries. Nevertheless, these models still suffer from the short-range dependency problem, causing them to produce summaries that miss the key points of the document. In this paper, we attempt to address this issue by introducing a neural topic model empowered with normalizing flow to capture the global semantics of the document, which are then integrated into the summarization model. In addition, to avoid the overwhelming effect of global semantics on the contextualized representation, we introduce a mechanism to control the amount of global semantics supplied to the text generation module. Our method outperforms state-of-the-art summarization models on five common text summarization datasets, namely CNN/DailyMail, XSum, Reddit TIFU, arXiv, and PubMed.