Chompakorn Chaksangchaichot


2025

NitiBench: Benchmarking LLM Frameworks on Thai Legal Question Answering Capabilities
Pawitsapak Akarajaradwong | Pirat Pothavorn | Chompakorn Chaksangchaichot | Panuthep Tasawong | Thitiwat Nopparatbundit | Keerakiat Pratai | Sarana Nutanong
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) show promise in legal question answering (QA), yet Thai legal QA systems face challenges due to limited data and complex legal structures. We introduce NitiBench, a novel benchmark featuring two datasets: (1) NitiBench-CCL, covering Thai financial laws, and (2) NitiBench-Tax, containing Thailand’s official tax rulings. The benchmark also includes specialized evaluation metrics suited for Thai legal QA. We evaluate retrieval-augmented generation (RAG) and long-context LLM (LCLM) approaches across three key dimensions: (1) the benefits of domain-specific techniques like hierarchy-aware chunking and cross-referencing, (2) the comparative performance of RAG components, e.g., retrievers and LLMs, and (3) the potential of long-context LLMs to replace traditional RAG systems. Our results reveal that domain-specific components improve slightly over naive methods. At the same time, existing retrieval models still struggle with complex legal queries, and long-context LLMs have limitations in consistent legal reasoning. Our study highlights current limitations in Thai legal NLP and lays a foundation for future research in this emerging domain.

Aligning LLMs for Thai Legal Question Answering with Efficient Semantic-Similarity Rewards
Pawitsapak Akarajaradwong | Chompakorn Chaksangchaichot | Pirat Pothavorn | Ekapol Chuangsuwanich | Attapol Rutherford | Sarana Nutanong
Proceedings of the Natural Legal Language Processing Workshop 2025

The performance of Retrieval-Augmented Generation (RAG) systems on Thai legal question answering is still limited, especially for questions requiring extensive, complex legal reasoning. To address these limitations, we introduce a resource-efficient approach that aligns Large Language Models (LLMs) for improved citation accuracy and response quality using Group-Relative Policy Optimization (GRPO). Our proposed method leverages BGE-M3 embeddings as a cost-efficient semantic-similarity reward, significantly reducing computational expenses by up to 2.5x compared to an LLM-based reward model. Experiments on the NitiBench benchmark demonstrate substantial improvements: GRPO achieves up to 90% citation-F1 gains relative to the base model and a 31% increase in joint quality metrics over instruction tuning. Crucially, our approach provides a practical and effective solution for enhancing legal LLMs in resource-constrained environments.