Ha Thanh Nguyen

Also published as: Ha-Thanh Nguyen

2026

We introduce JBE-QA, a Japanese Bar Exam Question–Answering dataset to evaluate large language models’ legal knowledge. Derived from the multiple-choice (tantō-shiki) section of the Japanese bar exam (2015–2024), JBE-QA provides the first comprehensive benchmark for Japanese legal-domain evaluation of LLMs. It covers the Civil Code, the Penal Code, and the Constitution, extending beyond the Civil Code focus of prior Japanese resources. Each question is decomposed into independent true/false judgments with structured contextual fields. The dataset contains 3,464 items with balanced labels. We evaluate 26 LLMs, including proprietary, open-weight, Japanese-specialised, and reasoning models. Our results show that proprietary models with reasoning enabled perform best, and the Constitution questions are generally easier than the Civil Code or the Penal Code questions.

We present BIS Reasoning 1.0, the first large-scale Japanese dataset of syllogistic reasoning problems explicitly designed to evaluate belief-inconsistent reasoning in large language models (LLMs). Unlike prior resources such as NeuBAROCO and JFLD, which emphasize general or belief-aligned logic, BIS Reasoning 1.0 systematically introduces logically valid yet belief-inconsistent syllogisms to expose belief bias—the tendency to accept believable conclusions irrespective of validity. We benchmark a representative suite of cutting-edge models—including OpenAI GPT-5 variants, GPT-4o, Qwen, and prominent Japanese LLMs—under a uniform, zero-shot protocol. Reasoning-centric models achieve near-perfect accuracy on BIS Reasoning 1.0 (e.g., Qwen3-32B ≈99% and GPT-5-mini up to ≈99.7%), while GPT-4o attains around 80%. Earlier Japanese-specialized models underperform, often well below 60%, whereas the latest llm-jp-3.1-13b-instruct4 markedly improves to the mid-80% range. These results indicate that robustness to belief-inconsistent inputs is driven more by explicit reasoning optimization than by language specialization or scale alone. Our analysis further shows that even top-tier systems falter when logical validity conflicts with intuitive or factual beliefs, and that performance is sensitive to prompt design and inference-time reasoning effort. We discuss implications for safety-critical domains—law, healthcare, and scientific literature—where strict logical fidelity must override intuitive belief to ensure reliability.

2025

DRILL Shared Task 2025: The Challenge of Deep Retrieval in the Expansive Legal Landscape
Thi-Hai-Yen Vuong | Tan-Minh Nguyen | Hoang-Trung Nguyen | Trong-Khoi Dao | Ha-Thanh Nguyen | Hoang-Quynh Le
Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing

2024

Enhancing Legal Violation Identification with LLMs and Deep Learning Techniques: Achievements in the LegalLens 2024 Competition
Nguyen Tan Minh | Duy Ngoc Mai | Le Xuan Bach | Nguyen Huu Dung | Pham Cong Minh | Ha Thanh Nguyen | Thi Hai Yen Vuong
Proceedings of the Natural Legal Language Processing Workshop 2024

LegalLens is a competition organized to encourage advancements in automatically detecting legal violations. This paper presents our solutions for two tasks: Legal Named Entity Recognition (L-NER) and Legal Natural Language Inference (L-NLI). Our approach involves fine-tuning BERT-based models, designing methods based on data characteristics, and a novel prompting template for data augmentation using LLMs. As a result, we secured first place in L-NER and third place in L-NLI among thirty-six participants. We also perform error analysis to provide valuable insights and pave the way for future enhancements in legal NLP. Our implementation is available at https://github.com/lxbach10012004/legal-lens/tree/main

2023

Joint Learning for Legal Text Retrieval and Textual Entailment: Leveraging the Relationship between Relevancy and Affirmation
Nguyen Hai Long | Thi Hai Yen Vuong | Ha Thanh Nguyen | Xuan-Hieu Phan
Proceedings of the Natural Legal Language Processing Workshop 2023

In legal text processing and reasoning, one normally performs information retrieval to find documents relevant to an input question, and then performs textual entailment to answer the question. The former is about relevancy whereas the latter is about affirmation (or conclusion). While relevancy and affirmation are two different concepts, there is obviously a connection between them. That is why performing retrieval and textual entailment sequentially and independently may not make the most of this mutually supportive relationship. This paper, therefore, proposes a multi-task learning model for these two tasks to improve their performance. Technically, in the COLIEE dataset, we use the information of Task 4 (conclusions) to improve the performance of Task 3 (searching for legal provisions related to the question). Our empirical findings indicate that this supportive relationship truly exists. This important insight sheds light on how leveraging the relationship between tasks can significantly enhance the effectiveness of our multi-task learning approach for legal text processing.

2020

Latent Topic Refinement based on Distance Metric Learning and Semantics-assisted Non-negative Matrix Factorization
Tran-Binh Dang | Ha-Thanh Nguyen | Le-Minh Nguyen
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

How State-Of-The-Art Models Can Deal With Long-Form Question Answering
Minh-Quan Bui | Vu Tran | Ha-Thanh Nguyen | Le-Minh Nguyen
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

Text representation plays a vital role in retrieval-based question answering, especially in the legal domain where documents are usually long and complicated. The better the question and the legal documents are represented, the more accurately they are matched. In this paper, we focus on the task of answering legal questions at the article level. Given a legal question, the goal is to retrieve all the correct and valid legal articles that can be used as the basis for answering the question. We present a retrieval-based model for the task by learning neural attentive text representation. Our text representation method first leverages convolutional neural networks to extract important information in a question and legal articles. Attention mechanisms are then used to represent the question and articles and select appropriate information to align them in a matching process. Experimental results on an annotated corpus consisting of 5,922 Vietnamese legal questions show that our model outperforms state-of-the-art retrieval-based methods for question answering by large margins in terms of both recall and NDCG.

Venues

VLSP (1)
