LLM Multi-Agent Systems for Long Triple Set Data-to-Text Generation
Chinonso Cynthia Osuji, Simon Mille, Mark Andrade, Jane Adkins, Ornait O’Connell, Elaine Uí Dhonnchadha, Bláithín Heffernan, Fírinne Nic an tSaoir, Anya Belz, Thiago Castro Ferreira, Brian Davis
Abstract
Generating coherent, semantically accurate text from large structured inputs remains a persistent challenge in data-to-text generation, as single-step LLM mappings from data to text limit control over discourse structuring and amplify hallucinations and omissions as input size grows. We introduce a new dataset of extended DBpedia triple sets (up to 199 triples per input), and a modular multi-agent framework: specialised LLM agents handle content ordering, text structuring, and surface realisation under the supervision of an orchestrator and guardrail control loop. The system generates multi-paragraph outputs in English and Irish (low-resource). We compare a three-worker multi-agent configuration against a single-worker multi-task variant and a strong end-to-end baseline. Quality is assessed via human evaluation and LLM-as-a-judge (with truncation-based sanity checks). Results show slightly superior coherence for the multi-agent approach in both languages, with statistically significant inter-rater correlation over all criteria for English and no statistically significant correlation for Irish. Human-LLM alignment is very weak overall, thus exposing key limits in scalable NLG evaluation.
- Anthology ID:
- 2026.findings-acl.1712
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 34261–34275
- URL:
- https://preview.aclanthology.org/ingestion-form-platform/2026.findings-acl.1712/
- Cite (ACL):
- Chinonso Cynthia Osuji, Simon Mille, Mark Andrade, Jane Adkins, Ornait O’Connell, Elaine Uí Dhonnchadha, Bláithín Heffernan, Fírinne Nic an tSaoir, Anya Belz, Thiago Castro Ferreira, and Brian Davis. 2026. LLM Multi-Agent Systems for Long Triple Set Data-to-Text Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 34261–34275, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- LLM Multi-Agent Systems for Long Triple Set Data-to-Text Generation (Osuji et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingestion-form-platform/2026.findings-acl.1712.pdf