Pipeline Neural Data-to-text with Large Language Models
Chinonso Cynthia Osuji, Brian Timoney, Thiago Castro Ferreira, Brian Davis
Abstract
Previous studies have highlighted the advantages of pipeline neural architectures over end-to-end models, particularly in reducing text hallucination. In this study, we extend prior research by integrating pretrained language models (PLMs) into a pipeline framework, using both fine-tuning and prompting methods. Our findings show that fine-tuned PLMs consistently generate high-quality text, especially within end-to-end architectures and at intermediate stages of the pipeline across various domains. These models also outperform prompt-based ones on automatic evaluation metrics but lag in human evaluations. Compared to the standard five-stage pipeline architecture, a streamlined three-stage pipeline, which includes only ordering, structuring, and surface realization, achieves superior performance in fluency and semantic adequacy according to the human evaluation.
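The three-stage pipeline mentioned in the abstract can be pictured roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the `generate` callable, the prompts, and the function names are assumptions standing in for whichever fine-tuned PLM or prompted LLM backs each stage.

```python
# Illustrative sketch of a three-stage data-to-text pipeline:
# ordering -> structuring -> surface realization.
# `generate` stands in for any text-generation backend (fine-tuned PLM
# or prompted LLM); the prompts below are assumptions, not the paper's.
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def order(triples: List[Triple], generate: Callable[[str], str]) -> str:
    """Stage 1: decide the order in which the input triples are verbalized."""
    prompt = "Order these RDF triples for a fluent description:\n" + \
             "\n".join(" | ".join(t) for t in triples)
    return generate(prompt)


def structure(ordered: str, generate: Callable[[str], str]) -> str:
    """Stage 2: group the ordered triples into sentence-sized plans."""
    return generate("Group the ordered triples into sentences:\n" + ordered)


def realize(plan: str, generate: Callable[[str], str]) -> str:
    """Stage 3: realize the sentence plan as fluent surface text."""
    return generate("Verbalize this sentence plan as fluent English:\n" + plan)


def pipeline(triples: List[Triple], generate: Callable[[str], str]) -> str:
    """Run the three stages in sequence over a set of input triples."""
    return realize(structure(order(triples, generate), generate), generate)
```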
- Anthology ID: 2024.inlg-main.27
- Volume: Proceedings of the 17th International Natural Language Generation Conference
- Month: September
- Year: 2024
- Address: Tokyo, Japan
- Editors: Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
- Venue: INLG
- SIG: SIGGEN
- Publisher: Association for Computational Linguistics
- Pages: 320–329
- URL: https://preview.aclanthology.org/icon-24-ingestion/2024.inlg-main.27/
- Cite (ACL): Chinonso Cynthia Osuji, Brian Timoney, Thiago Castro Ferreira, and Brian Davis. 2024. Pipeline Neural Data-to-text with Large Language Models. In Proceedings of the 17th International Natural Language Generation Conference, pages 320–329, Tokyo, Japan. Association for Computational Linguistics.
- Cite (Informal): Pipeline Neural Data-to-text with Large Language Models (Osuji et al., INLG 2024)
- PDF: https://preview.aclanthology.org/icon-24-ingestion/2024.inlg-main.27.pdf