Romy A.N. van Drie
2026
QuALA-NL: Question & Answer with Legal Attribution in Dutch
Romy A.N. van Drie | Roos M. Bakker | Daan L. Di Scala | Maaike de Boer
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Ensuring trustworthy and traceable outputs from Large Language Models (LLMs) is crucial in high-stakes domains such as law. Retrieval-Augmented Generation (RAG) offers a way to enhance LLMs with domain-specific or updated information and provide attribution to the source, and recent work has focused on knowledge-based RAG (K-RAG) for improved factual grounding. However, proper evaluation of such systems requires high-quality datasets. To address this need, we introduce QuALA-NL: a dataset that provides attributions to legal formalizations, enabling experiments with K-RAG in the legal domain. The dataset contains 101 QA pairs on three Dutch laws, with attributions to the law text and a formalization of the interpretation of the legal text. To demonstrate the capabilities of the dataset, we perform experiments using four configurations: LLM-only, RAG using legal texts, K-RAG using a formalization of the legal texts, and RAG combining both legal texts and the formalizations. The results show that K-RAG has the highest retrieval scores, but that this method is outperformed by text-based RAG on generation. A qualitative analysis shows that the use of the knowledge graph for the generation of answers can be improved. QuALA-NL can be used in future work to experiment with knowledge-based Retrieval-Augmented Generation methods.
2022
Semantic Role Labelling for Dutch Law Texts
Roos Bakker | Romy A.N. van Drie | Maaike de Boer | Robert van Doesburg | Tom van Engers
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Legal texts are often difficult to interpret, and people who interpret them need to make choices about the interpretation. To improve transparency, the interpretation of a legal text can be made explicit by formalising it. However, creating formalised representations of legal texts manually is quite labour-intensive. In this paper, we describe a method to extract structured representations in the Flint language (van Doesburg and van Engers, 2019) from natural language. Automated extraction of knowledge representations not only makes the interpretation and modelling efforts more efficient, it also contributes to reducing inter-coder dependencies. The Flint language offers a formal model that enables the interpretation of legal text by describing the norms in these texts as acts, facts and duties. To extract the components of a Flint representation, we use a rule-based method and a transformer-based method. In the transformer-based method we fine-tune the last layer with annotated legal texts. The results show that the transformer-based method (80% accuracy) outperforms the rule-based method (42% accuracy) on the Dutch Aliens Act. This indicates that the transformer-based method is a promising approach for automatically extracting Flint frames.