Chinonso Cynthia Osuji


2025

Are Multi-Agents the new Pipeline Architecture for Data-to-Text Systems?
Chinonso Cynthia Osuji | Brian Timoney | Mark Andrade | Thiago Castro Ferreira | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference

Large Language Models (LLMs) have achieved remarkable results in natural language generation, yet challenges remain in data-to-text (D2T) tasks, particularly in controlling output, ensuring transparency, and maintaining factual consistency with the input. We introduce the first LLM-based multi-agent framework for D2T generation, coordinating specialized agents to produce high-quality, interpretable outputs. Our system combines the reasoning and acting abilities of ReAct agents, the self-correction of Reflexion agents, and the quality assurance of Guardrail agents, all directed by an Orchestrator agent that assigns tasks to three specialists—content ordering, text structuring, and surface realization—and iteratively refines outputs based on Guardrail feedback. This closed-loop design enables precise control and dynamic optimization, yielding text that is coherent, accurate, and grounded in the input data. On a relatively simple dataset like WebNLG, our framework performs competitively with end-to-end systems, highlighting its promise for more complex D2T scenarios.
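
The closed-loop design described in the abstract can be pictured as a simple control loop: the orchestrator runs the three specialists in sequence, then revises the draft until the guardrail accepts it. The sketch below is illustrative only and is not the authors' implementation; every name in it (call_llm, run_specialist, guardrail_check, orchestrate) is hypothetical, and call_llm stands in for whatever LLM backend is used.

```python
# Hypothetical sketch of the closed-loop multi-agent D2T pipeline described
# in the abstract. Not the authors' code; `call_llm` stands in for any LLM API.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. any hosted or local model client)."""
    raise NotImplementedError

SPECIALISTS = ["content_ordering", "text_structuring", "surface_realization"]

def run_specialist(stage: str, payload: str) -> str:
    # Each specialist is a ReAct-style agent prompted for one pipeline stage.
    return call_llm(f"[{stage}] Transform the input:\n{payload}")

def guardrail_check(text: str, triples: str) -> tuple[bool, str]:
    # Guardrail agent: check that the draft stays grounded in the input data.
    verdict = call_llm(
        f"Is this text faithful to the data? Answer yes/no with reasons.\n"
        f"Data:\n{triples}\n\nText:\n{text}"
    )
    return verdict.strip().lower().startswith("yes"), verdict

def orchestrate(triples: str, max_rounds: int = 3) -> str:
    draft = triples
    for stage in SPECIALISTS:          # Orchestrator assigns tasks in order.
        draft = run_specialist(stage, draft)
    for _ in range(max_rounds):        # Reflexion-style refinement loop.
        ok, feedback = guardrail_check(draft, triples)
        if ok:
            break
        draft = call_llm(f"Revise using this feedback:\n{feedback}\nDraft:\n{draft}")
    return draft
```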

Scaling Up Data-to-Text Generation to Longer Sequences: A New Dataset and Benchmark Results for Generation from Large Triple Sets
Chinonso Cynthia Osuji | Simon Mille | Ornait O’Connell | Thiago Castro Ferreira | Anya Belz | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference

The ability of LLMs to write coherent, faithful long texts from structured data inputs remains relatively uncharted, in part because nearly all public data-to-text datasets contain only short input-output pairs. To address this gap, we benchmark six LLMs, a rule-based system, and human-written texts on a new long-input dataset in English and Irish via LLM-based evaluation. We find substantial differences across models and languages.
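
For intuition, LLM-based evaluation of this kind is typically run as an LLM-as-judge loop over a small set of criteria. The sketch below is an assumption for illustration, not the paper's actual protocol: the criteria names, rubric, scale, and call_llm helper are all hypothetical.

```python
# Hypothetical LLM-as-judge sketch for scoring long data-to-text outputs.
# The criteria and 1-5 scale are illustrative, not the paper's protocol.

def call_llm(prompt: str) -> str:
    """Placeholder for any LLM API call."""
    raise NotImplementedError

def judge(triples: str, text: str) -> dict[str, int]:
    scores = {}
    for criterion in ("coherence", "faithfulness"):
        reply = call_llm(
            f"Rate the {criterion} of this text on a 1-5 scale.\n"
            f"Input triples:\n{triples}\n\nText:\n{text}\n\n"
            f"Answer with a single digit."
        )
        scores[criterion] = int(reply.strip()[0])  # parse the leading digit
    return scores
```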

DCU-ADAPT-modPB at the GEM’24 Data-to-Text Task: Analysis of Human Evaluation Results
Rudali Huidrom | Chinonso Cynthia Osuji | Kolawole John Adebayo | Thiago Castro Ferreira | Brian Davis
Proceedings of the 18th International Natural Language Generation Conference: Generation Challenges

2024

Pipeline Neural Data-to-text with Large Language Models
Chinonso Cynthia Osuji | Brian Timoney | Thiago Castro Ferreira | Brian Davis
Proceedings of the 17th International Natural Language Generation Conference

Previous studies have highlighted the advantages of pipeline neural architectures over end-to-end models, particularly in reducing text hallucination. In this study, we extend prior research by integrating pretrained language models (PLMs) into a pipeline framework, using both fine-tuning and prompting methods. Our findings show that fine-tuned PLMs consistently generate high-quality text, especially within end-to-end architectures and at intermediate stages of the pipeline across various domains. These models also outperform prompt-based ones on automatic evaluation metrics but lag behind in human evaluations. Compared to the standard five-stage pipeline architecture, a streamlined three-stage pipeline, which includes only ordering, structuring, and surface realization, achieves superior performance in fluency and semantic adequacy according to the human evaluation.
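
As a rough illustration, the streamlined three-stage pipeline can be sketched as one model call per stage, each feeding the next. This is a hypothetical sketch, not the paper's code: the model names, prompts, and generate helper are placeholders for fine-tuned PLMs or prompted LLMs.

```python
# Hypothetical sketch of the streamlined three-stage D2T pipeline
# (ordering -> structuring -> surface realization). Illustrative only.

def generate(model: str, prompt: str) -> str:
    """Placeholder for a fine-tuned PLM or a prompted LLM call."""
    raise NotImplementedError

def three_stage_pipeline(triples: list[str]) -> str:
    # Stage 1: content ordering -- decide the order in which triples appear.
    ordered = generate("ordering-model", "Order these triples:\n" + "\n".join(triples))
    # Stage 2: text structuring -- group ordered triples into sentences.
    structured = generate("structuring-model", f"Group into sentences:\n{ordered}")
    # Stage 3: surface realization -- render the plan as fluent text.
    return generate("realization-model", f"Write the final text:\n{structured}")
```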

DCU-ADAPT-modPB at the GEM’24 Data-to-Text Generation Task: Model Hybridisation for Pipeline Data-to-Text Natural Language Generation
Chinonso Cynthia Osuji | Rudali Huidrom | Kolawole John Adebayo | Thiago Castro Ferreira | Brian Davis
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges

In this paper, we present our approach to the GEM Shared Task at the INLG’24 Generation Challenges, which focuses on data-to-text generation from WebNLG triples in multiple languages, including low-resource languages. We employ a combination of end-to-end and pipeline neural architectures for English text generation. To extend our methodology to Hindi, Korean, Arabic, and Swahili, we leverage a neural machine translation model. Our results demonstrate that our approach achieves competitive performance in the given task.
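
A minimal sketch of the translate-after-generate extension follows, assuming generic generation and translation interfaces; english_d2t and translate are hypothetical stand-ins, not the system's actual components.

```python
# Hypothetical sketch: generate English with the D2T system, then machine-translate.
# `english_d2t` and `translate` are stand-ins, not the system's real interfaces.

def english_d2t(triples: list[str]) -> str:
    """Placeholder: end-to-end or pipeline English generation from WebNLG triples."""
    raise NotImplementedError

def translate(text: str, target_lang: str) -> str:
    """Placeholder for a neural machine translation model."""
    raise NotImplementedError

def multilingual_d2t(triples: list[str], target_lang: str) -> str:
    english = english_d2t(triples)
    return english if target_lang == "en" else translate(english, target_lang)
```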