Angelo Mozzillo




2025

OrQA – Open Data Retrieval for Question Answering dataset generation
Giovanni Malaguti | Angelo Mozzillo | Giovanni Simonini
Proceedings of the 4th Table Representation Learning Workshop

We present OrQA, a novel agentic framework to generate large-scale tabular question-answering (TQA) datasets based on real-world open data. Such datasets are needed to overcome the limitations of existing benchmark datasets, which rely on synthetic questions or limited web tables. OrQA employs LLM agents to retrieve related open data tables, generate natural questions, and synthesize executable SQL queries involving joins, unions, and other non-trivial operations. By leveraging hundreds of GPU hours on four NVIDIA A100 GPUs, we applied OrQA to Canadian and UK government open data to produce 1,000 question-tables-SQL triples, a representative sample of which has been human-validated. This open-source dataset is now publicly available to drive transparency, reproducibility, and progress in table-based question answering.