Abstract
We introduce a simple approach that uses a large language model (LLM) to automatically implement a fully interpretable rule-based data-to-text system in pure Python. Experimental evaluation on the WebNLG dataset showed that such a constructed system produces text of better quality (according to the BLEU and BLEURT metrics) than the same LLM prompted to directly produce outputs, and produces fewer hallucinations than a BART language model fine-tuned on the same data. Furthermore, at runtime, the approach generates text in a fraction of the processing time required by neural approaches, using only a single CPU.
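To make the idea concrete, here is a minimal sketch of the kind of interpretable rule-based data-to-text system the abstract describes an LLM writing. It assumes WebNLG-style subject-predicate-object triples and per-predicate string templates; all names (`RULES`, `render`, `clean`) and the specific rules are hypothetical illustrations, not the paper's actual generated code.

```python
# Hypothetical sketch: a rule-based data-to-text generator in pure Python,
# of the sort an LLM could be asked to write. Not the paper's actual output.

Triple = tuple[str, str, str]  # WebNLG-style (subject, predicate, object)

# One human-readable rule per predicate: fully inspectable and editable.
RULES = {
    "birthPlace": "{s} was born in {o}.",
    "country": "{s} is located in {o}.",
    "leaderName": "{s} is led by {o}.",
}

def clean(entity: str) -> str:
    """Turn a WebNLG entity like 'New_York_City' into plain text."""
    return entity.replace("_", " ")

def render(triples: list[Triple]) -> str:
    """Verbalise each triple with its predicate's rule, falling back
    to a generic pattern for predicates without a rule."""
    sentences = []
    for s, p, o in triples:
        template = RULES.get(p, "The {p} of {s} is {o}.")
        sentences.append(template.format(s=clean(s), p=clean(p), o=clean(o)))
    return " ".join(sentences)

if __name__ == "__main__":
    data = [("Alan_Bean", "birthPlace", "Wheeler,_Texas")]
    print(render(data))  # -> Alan Bean was born in Wheeler, Texas.
```

A system of this shape runs on a single CPU with no model inference at generation time, which is consistent with the runtime speed the abstract reports.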
- Anthology ID:
- 2024.inlg-main.48
- Volume:
- Proceedings of the 17th International Natural Language Generation Conference
- Month:
- September
- Year:
- 2024
- Address:
- Tokyo, Japan
- Editors:
- Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
- Venue:
- INLG
- SIG:
- SIGGEN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 622–630
- URL:
- https://aclanthology.org/2024.inlg-main.48
- Cite (ACL):
- Jędrzej Warczyński, Mateusz Lango, and Ondrej Dusek. 2024. Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems. In Proceedings of the 17th International Natural Language Generation Conference, pages 622–630, Tokyo, Japan. Association for Computational Linguistics.
- Cite (Informal):
- Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems (Warczyński et al., INLG 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.inlg-main.48.pdf