ReTaT: A Unified Benchmark for Relation Extraction across Text and Table

Mohamed Ettaleb, Thibault Ehrhart, Nathalie Aussenac-Gilles, Yoan Chabot, Mouna Kamel, Véronique Moriceau, Raphael Troncy, Fanfu Wei


Abstract
While prior work in Information Extraction (IE) has focused on extracting information from either textual content or tables in isolation, it misses critical information that emerges only from their interplay. Indeed, tables may summarize facts that are sparse in the text, while text can disambiguate or elaborate on table entries. This complementarity may take the form of relations expressed across text and tables. In this context, we are interested in the task of extracting relations whose expression spans the two modalities. This task is novel, and no reference evaluation corpus exists for it. We therefore created ReTaT, a corpus that can be used to train and evaluate systems for extracting such relations. The corpus is composed of (table, surrounding text) pairs extracted from Wikipedia pages and manually annotated with relation triples. ReTaT is organized into three datasets with distinct characteristics: domain (business, telecommunications, and female celebrities), size (from 50 to 255 pairs), language (English vs. French), type of relations (data vs. object properties), closed vs. open list of relations, and size of the surrounding text (paragraph vs. full page). We then assessed its quality and suitability for the joint table-text relation extraction task using Large Language Models (LLMs), at a time when LLMs have demonstrated their ability to extract relations from either text or tables in isolation.
Anthology ID:
2026.lrec-main.104
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Note:
Pages:
1341–1351
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.104/
Cite (ACL):
Mohamed Ettaleb, Thibault Ehrhart, Nathalie Aussenac-Gilles, Yoan Chabot, Mouna Kamel, Véronique Moriceau, Raphael Troncy, and Fanfu Wei. 2026. ReTaT: A Unified Benchmark for Relation Extraction across Text and Table. In Proceedings of the Fifteenth Language Resources and Evaluation Conference, pages 1341–1351, Palma de Mallorca, Spain. ELRA Language Resource Association.
Cite (Informal):
ReTaT: A Unified Benchmark for Relation Extraction across Text and Table (Ettaleb et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.104.pdf