Neural data-to-text generation: A comparison between pipeline and end-to-end architectures
Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, Emiel Krahmer
Abstract
Traditionally, most data-to-text applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. By contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between. This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. Both architectures were implemented using encoder-decoder models based on Gated Recurrent Units (GRUs) and the Transformer, two state-of-the-art deep learning methods. Automatic and human evaluations, together with a qualitative analysis, suggest that having explicit intermediate steps in the generation process results in better texts than the ones generated by end-to-end approaches. Moreover, the pipeline models generalize better to unseen inputs. Data and code are publicly available.
- Anthology ID: D19-1052
- Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month: November
- Year: 2019
- Address: Hong Kong, China
- Venues: EMNLP | IJCNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 552–562
- URL: https://aclanthology.org/D19-1052
- DOI: 10.18653/v1/D19-1052
- Cite (ACL): Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, and Emiel Krahmer. 2019. Neural data-to-text generation: A comparison between pipeline and end-to-end architectures. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 552–562, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal): Neural data-to-text generation: A comparison between pipeline and end-to-end architectures (Castro Ferreira et al., EMNLP-IJCNLP 2019)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/D19-1052.pdf
- Code: ThiagoCF05/webnlg
- Data: E2E, WebNLG
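To make the contrast in the abstract concrete, the following is a toy sketch, not the paper's implementation, of the two architectures compared: a pipeline that verbalizes RDF triples through explicit intermediate steps (ordering, lexicalization, realization) versus an end-to-end mapping straight from triples to text. All function names, templates, and example triples here are hypothetical illustrations; in the paper, both architectures are neural (GRU or Transformer encoder-decoders).

```python
# Toy contrast between pipeline and end-to-end data-to-text generation.
# A triple is (subject, predicate, object), as in WebNLG RDF input.

def order(triples):
    # Discourse ordering: decide the order in which triples are expressed
    # (here, trivially by predicate name; the paper learns this step).
    return sorted(triples, key=lambda t: t[1])

def lexicalize(triple):
    # Lexicalization: map each predicate to a sentence template.
    # These templates are hypothetical examples.
    templates = {
        "birthPlace": "{s} was born in {o}.",
        "occupation": "{s} works as a {o}.",
    }
    s, p, o = triple
    return templates.get(p, "{s} {p} {o}.").format(s=s, p=p, o=o)

def realize(sentences):
    # Surface realization: join the sentences into the final text.
    return " ".join(sentences)

def pipeline(triples):
    # Pipeline architecture: explicit intermediate representations
    # between input and output.
    return realize([lexicalize(t) for t in order(triples)])

def end_to_end(triples):
    # End-to-end architecture: one direct input-to-text mapping.
    # In the paper this is a neural encoder-decoder; here it is
    # stubbed as a single flat verbalization with no intermediate steps.
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)

data = [("Alan_Bean", "occupation", "test pilot"),
        ("Alan_Bean", "birthPlace", "Wheeler")]
print(pipeline(data))    # ordered, templated output
print(end_to_end(data))  # direct verbalization
```

The pipeline's intermediate steps make each decision inspectable, which is one reason the paper finds such models generalize better to unseen inputs.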