Ken Satoh


2026

We introduce JBE-QA, a Japanese Bar Exam Question–Answering dataset to evaluate large language models’ legal knowledge. Derived from the multiple-choice (tantō-shiki) section of the Japanese bar exam (2015–2024), JBE-QA provides the first comprehensive benchmark for Japanese legal-domain evaluation of LLMs. It covers the Civil Code, the Penal Code, and the Constitution, extending beyond the Civil Code focus of prior Japanese resources. Each question is decomposed into independent true/false judgments with structured contextual fields. The dataset contains 3,464 items with balanced labels. We evaluate 26 LLMs, including proprietary, open-weight, Japanese-specialised, and reasoning models. Our results show that proprietary models with reasoning enabled perform best, and the Constitution questions are generally easier than the Civil Code or the Penal Code questions.
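A minimal sketch of how one decomposed true/false item and the accuracy over such items might be represented. The record layout and every field name (`question_id`, `context`, `statement`, `label`) are hypothetical illustrations, not the dataset's actual schema, and the sample items are placeholders rather than real exam content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One independent true/false item. All field names are hypothetical."""
    question_id: str  # identifier of the parent multiple-choice question
    context: str      # shared instructions or facts for the parent question
    statement: str    # the single statement to be judged
    label: bool       # gold answer: True if the statement is correct


def accuracy(items, predictions):
    """Fraction of items whose predicted label matches the gold label."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        raise ValueError("no items to score")
    correct = sum(item.label == pred for item, pred in zip(items, predictions))
    return correct / len(items)


# Placeholder items, not taken from the dataset.
items = [
    Judgment("q1", "context A", "statement 1", True),
    Judgment("q1", "context A", "statement 2", False),
    Judgment("q2", "context B", "statement 3", True),
    Judgment("q2", "context B", "statement 4", False),
]
predictions = [True, True, True, False]  # hypothetical model outputs
print(accuracy(items, predictions))  # 0.75
```

With balanced labels, a model that always answers the same way would score about 0.5 under this metric, which is what makes plain accuracy a usable baseline comparison.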

2024

This paper investigates explainability in Natural Legal Language Processing (NLLP). We study the task of legal outcome prediction of the European Court of Human Rights cases in a ternary classification setup, where a language model is fine-tuned to predict whether an article has been claimed and violated (positive outcome), claimed but not violated (negative outcome) or not claimed at all (null outcome). Specifically, we experiment with three popular NLP explainability methods. Correlating the attribution scores of input-level methods (Integrated Gradients and Contrastive Explanations) with rationales from court rulings, we show that the correlations are very weak, with absolute values of Spearman and Kendall correlation coefficients ranging between 0.003 and 0.094. Furthermore, we use a concept-level interpretability method (Concept Erasure) with human expert annotations of legal reasoning, to show that obscuring legal concepts from the model representation has an insignificant effect on model performance (at most a decline of 0.26 F1). Therefore, our results indicate that automated legal outcome prediction models are not reliably grounded in legal reasoning.
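To make the correlation analysis concrete, here is a small standard-library sketch of Spearman's rho and Kendall's tau-b between per-token attribution scores and a rationale mask. The binary mask encoding, the variable names, and the sample numbers are assumptions for illustration only; the paper's actual rationale format and implementation are not described in the abstract.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, computed over all pairs (quadratic time)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from tau-b
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    if denom == 0:
        raise ValueError("correlation is undefined for constant input")
    return (concordant - discordant) / denom


# Hypothetical per-token attribution scores and a binary rationale mask
# (1 = token appears in the court's stated rationale).
attributions = [0.10, 0.40, 0.05, 0.30, 0.20]
rationale_mask = [0, 1, 0, 0, 1]
print(spearman(attributions, rationale_mask))
print(kendall_tau_b(attributions, rationale_mask))
```

Coefficients near zero, like the 0.003 to 0.094 range reported above, would mean the attribution ranking carries almost no information about which tokens the court's reasoning relied on.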

2023

In recent years, COVID-19 has impacted all aspects of human life, and numerous publications on the disease have appeared as a result. Because of this massive volume, several retrieval systems have been developed to provide researchers with useful information. These systems rely heavily on lexical search methods, which struggle with acronyms, synonyms, and rare keywords. In this paper, we present a hybrid relation retrieval system, CovRelex-SE, based on embeddings to provide high-quality search results. Our system can be accessed through the following URL: https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex-se/

2021

This paper presents CovRelex, a scientific paper retrieval system that targets entities and relations via relation extraction on COVID-19 scientific papers. The goal is to help users efficiently acquire knowledge from the large and rapidly growing body of COVID-19 scientific papers. Our system can be accessed via https://www.jaist.ac.jp/is/labs/nguyen-lab/systems/covrelex/.

1996