Chinonso Cynthia Osuji

2026

Generating coherent, semantically accurate text from large structured inputs remains a persistent challenge in data-to-text generation, as single-step LLM mappings from data to text limit control over discourse structuring and amplify hallucinations and omissions as input size grows. We introduce a new dataset of extended DBpedia triple sets (up to 199 triples per input), and a modular multi-agent framework: specialised LLM agents handle content ordering, text structuring, and surface realisation under the supervision of an orchestrator and guardrail control loop. The system generates multi-paragraph outputs in English and Irish (low-resource). We compare a three-worker multi-agent configuration against a single-worker multi-task variant and a strong end-to-end baseline. Quality is assessed via human evaluation and LLM-as-a-judge (with truncation-based sanity checks). Results show slightly superior coherence for the multi-agent approach in both languages, with statistically significant inter-rater correlation over all criteria for English and no statistically significant correlation for Irish. Human-LLM alignment is very weak overall, thus exposing key limits in scalable NLG evaluation.

2025

Scaling Up Data-to-Text Generation to Longer Sequences: A New Dataset and Benchmark Results for Generation from Large Triple Sets
Chinonso Cynthia Osuji | Simon Mille | Ornait O’Connell | Thiago Castro Ferreira | Anya Belz | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference

The ability of LLMs to write coherent, faithful long texts from structured data inputs remains relatively uncharted, in part because nearly all public data-to-text datasets contain only short input-output pairs. To address this gap, we benchmark six LLMs, a rule-based system and human-written texts on a new long-input dataset in English and Irish via LLM-based evaluation. We find substantial differences between models and languages.

Are Multi-Agents the new Pipeline Architecture for Data-to-Text Systems?
Chinonso Cynthia Osuji | Brian Timoney | Mark Andrade | Thiago Castro Ferreira | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference

Large Language Models (LLMs) have achieved remarkable results in natural language generation, yet challenges remain in data-to-text (D2T) tasks, particularly in controlling output, ensuring transparency, and maintaining factual consistency with the input. We introduce the first LLM-based multi-agent framework for D2T generation, coordinating specialized agents to produce high-quality, interpretable outputs. Our system combines the reasoning and acting abilities of ReAct agents, the self-correction of Reflexion agents, and the quality assurance of Guardrail agents, all directed by an Orchestrator agent that assigns tasks to three specialists—content ordering, text structuring, and surface realization—and iteratively refines outputs based on Guardrail feedback. This closed-loop design enables precise control and dynamic optimization, yielding text that is coherent, accurate, and grounded in the input data. On a relatively simple dataset like WebNLG, our framework performs competitively with end-to-end systems, highlighting its promise for more complex D2T scenarios.

DCU-ADAPT-modPB at the GEM’24 Data-to-Text Task: Analysis of Human Evaluation Results
Rudali Huidrom | Chinonso Cynthia Osuji | Kolawole John Adebayo | Thiago Castro Ferreira | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference: Generation Challenges

2024

Pipeline Neural Data-to-text with Large Language Models
Chinonso Cynthia Osuji | Brian Timoney | Thiago Castro Ferreira | Brian Davis
Proceedings of the 17th International Natural Language Generation Conference

Previous studies have highlighted the advantages of pipeline neural architectures over end-to-end models, particularly in reducing text hallucination. In this study, we extend prior research by integrating pretrained language models (PLMs) into a pipeline framework, using both fine-tuning and prompting methods. Our findings show that fine-tuned PLMs consistently generate high-quality text, especially within end-to-end architectures and at intermediate stages of the pipeline across various domains. These models also outperform prompt-based ones on automatic evaluation metrics but lag in human evaluations. Compared to the standard five-stage pipeline architecture, a streamlined three-stage pipeline, which includes only ordering, structuring, and surface realization, achieves superior performance in fluency and semantic adequacy according to the human evaluation.

DCU-ADAPT-modPB at the GEM’24 Data-to-Text Generation Task: Model Hybridisation for Pipeline Data-to-Text Natural Language Generation
Chinonso Cynthia Osuji | Rudali Huidrom | Kolawole John Adebayo | Thiago Castro Ferreira | Brian Davis
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges

In this paper, we present our approach to the GEM Shared Task at the INLG’24 Generation Challenges, which focuses on data-to-text generation in multiple languages, including low-resource languages, from WebNLG triples. We employ a combination of end-to-end and pipeline neural architectures for English text generation. To extend our methodology to Hindi, Korean, Arabic, and Swahili, we leverage a neural machine translation model. Our results demonstrate that our approach achieves competitive performance in the given task.

Co-authors

Elaine Uí Dhonnchadha 1

Bláithín Heffernan 1

Fírinne Nic an tSaoir 1

Venues

INLG 5
Findings 1
