Denoising Pre-Training and Data Augmentation Strategies for Enhanced RDF Verbalization with Transformers
Sebastien Montella, Betty Fabre, Tanguy Urvoy, Johannes Heinecke, Lina Rojas-Barahona
Abstract
The verbalization of RDF triples has grown in popularity with the rising ubiquity of Knowledge Bases (KBs). The RDF triple formalism is a simple and efficient way to store facts at a large scale. However, its abstract representation makes it difficult for humans to interpret. To address this, the WebNLG challenge aims at promoting automated RDF-to-text generation. We propose to leverage pre-training of the Transformer model on augmented data produced by a data augmentation strategy. Our experimental results show minimum relative increases of 3.73%, 126.05% and 88.16% in BLEU score for seen categories, unseen entities and unseen categories, respectively, over the standard training.
- Anthology ID:
- 2020.webnlg-1.9
- Volume:
- Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+)
- Month:
- 12
- Year:
- 2020
- Address:
- Dublin, Ireland (Virtual)
- Editors:
- Thiago Castro Ferreira, Claire Gardent, Nikolai Ilinykh, Chris van der Lee, Simon Mille, Diego Moussallem, Anastasia Shimorina
- Venue:
- WebNLG
- SIG:
- SIGGEN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 89–99
- URL:
- https://aclanthology.org/2020.webnlg-1.9
- Cite (ACL):
- Sebastien Montella, Betty Fabre, Tanguy Urvoy, Johannes Heinecke, and Lina Rojas-Barahona. 2020. Denoising Pre-Training and Data Augmentation Strategies for Enhanced RDF Verbalization with Transformers. In Proceedings of the 3rd International Workshop on Natural Language Generation from the Semantic Web (WebNLG+), pages 89–99, Dublin, Ireland (Virtual). Association for Computational Linguistics.
- Cite (Informal):
- Denoising Pre-Training and Data Augmentation Strategies for Enhanced RDF Verbalization with Transformers (Montella et al., WebNLG 2020)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2020.webnlg-1.9.pdf
- Data
- WebNLG