RelEx-PT: A Portuguese Sentence-Level Relation Extraction Dataset

Tomás Pinto, Catarina Silva, Hugo Goncalo Oliveira


Abstract
We introduce RelEx-PT, a new sentence-level Relation Extraction dataset for Portuguese. Addressing the scarcity of high-quality, controlled resources for the language, RelEx-PT provides a balanced benchmark comprising 18 Wikidata-derived relation types across diverse domains. The dataset is built with a distant supervision pipeline that links Wikidata triples to Portuguese Wikipedia sentences, and it is then refined by a Natural Language Inference (NLI)-based filtering step, combining scalability with quality assurance. We also conduct baseline experiments to evaluate the dataset’s applicability across three extraction settings: Relation Classification (RC), Relation Triple Extraction, and Open Information Extraction. These experiments use both prompting and fine-tuning strategies with Large Language Models. The results show that RelEx-PT supports a range of extraction paradigms, yielding high performance in RC and competitive results in structured triple generation, while also highlighting key challenges in open-ended extraction.
Anthology ID:
2026.lrec-main.609
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
7681–7691
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.609/
Cite (ACL):
Tomás Pinto, Catarina Silva, and Hugo Goncalo Oliveira. 2026. RelEx-PT: A Portuguese Sentence-Level Relation Extraction Dataset. In Proceedings of the Fifteenth Language Resources and Evaluation Conference, pages 7681–7691, Palma de Mallorca, Spain. ELRA Language Resource Association.
Cite (Informal):
RelEx-PT: A Portuguese Sentence-Level Relation Extraction Dataset (Pinto et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.609.pdf