Matteo Paganelli

2026

CLARIESG: An End-to-End System for ESG Analysis over Complex Tables in Corporate Reports
Marta Santacroce | Michele Luca Contalbo | Sara Pederzoli | Riccardo Benassi | Valeria Venturelli | Matteo Paganelli | Francesco Guerra
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Sustainability reports contain rich Environmental, Social and Governance (ESG) information, but their heterogeneous layouts and complex multi-table structures pose major challenges for LLMs, especially for unit normalization, cross-document reasoning, and precise numerical computation. We present CLARIESG, an end-to-end system that couples robust table extraction with a structured prompting framework for multi-table filtering, normalization, and program-of-thought reasoning. On ESG-focused multi-table benchmarks, CLARIESG consistently outperforms standard prompting and provides transparent, auditable reasoning, supporting more reliable ESG analysis and greenwashing detection in real-world settings.

2025

GRI-QA: a Comprehensive Benchmark for Table Question Answering over Environmental Data
Michele Luca Contalbo | Sara Pederzoli | Francesco Del Buono | Valeria Venturelli | Francesco Guerra | Matteo Paganelli
Findings of the Association for Computational Linguistics: ACL 2025

Assessing corporate environmental sustainability with Table Question Answering systems is challenging due to complex tables, specialized terminology, and the variety of questions they must handle. In this paper, we introduce GRI-QA, a test benchmark designed to evaluate Table QA approaches in the environmental domain. Using GRI standards, we extract and annotate tables from non-financial corporate reports, generating question-answer pairs through a hybrid LLM-human approach. The benchmark includes eight datasets, categorized by the types of operations required, including operations on multiple tables from multiple documents. Our evaluation reveals a significant gap between human and model performance, particularly in multi-step reasoning, highlighting the relevance of the benchmark and the need for further research in domain-specific Table QA. Code and benchmark datasets are available at https://github.com/softlab-unimore/gri_qa.

2024

Argument Relation Classification through Discourse Markers and Adversarial Training
Michele Luca Contalbo | Francesco Guerra | Matteo Paganelli
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Argument relation classification (ARC) identifies supportive, contrasting and neutral relations between argumentative units. Current approaches rely on transformer architectures, which have proven more effective than traditional methods based on hand-crafted linguistic features. In this paper, we introduce DISARM, which advances the state of the art with a training procedure combining multi-task and adversarial learning strategies. By jointly solving the ARC and discourse marker detection tasks and aligning their embedding spaces into a unified latent space, DISARM achieves higher accuracy than existing approaches.

Co-authors

Francesco Del Buono 1

Marta Santacroce 1
