NILC at SR’20: Exploring Pre-Trained Models in Surface Realisation

Marco Antonio Sobrevilla Cabezudo, Thiago Pardo


Abstract
This paper describes the submission by the NILC Computational Linguistics research group of the University of São Paulo, Brazil to the English Track 2 (closed sub-track) at the Surface Realisation Shared Task 2020. Pre-trained models like BERT and GPT-2 are well known for their success in several tasks; however, the same does not hold for data-to-text generation, which only recently has attracted such initiatives. We therefore explore how a pre-trained model (GPT-2) performs on the UD-to-text generation task. In general, the results achieved were poor, but they point to some interesting ideas to explore. Among the lessons learned, we note that it is necessary to study strategies for representing UD inputs and for introducing structural knowledge into these pre-trained models.
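
To make the task setup concrete, here is a minimal sketch, assuming the Hugging Face transformers library, of one way to linearise a shallow UD input and condition GPT-2 on it. The (lemma, UPOS, deprel) triples and the separator-based linearisation scheme are illustrative assumptions, not the authors' actual representation; in the shared task a fine-tuned model would be used rather than the off-the-shelf checkpoint shown here.

```python
# Minimal sketch: linearise a UD dependency input into a token sequence
# and feed it to GPT-2 for conditional generation.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical UD input: (lemma, UPOS, deprel) triples from a shallow-track tree.
ud_nodes = [("John", "PROPN", "nsubj"), ("see", "VERB", "root"), ("dog", "NOUN", "obj")]

# One possible linearisation: concatenate node features, then a separator
# after which the model is expected to produce the realised sentence.
source = " ".join(f"{lemma} <{upos}> <{rel}>" for lemma, upos, rel in ud_nodes)
prompt = source + " = "

# Encode the prompt and generate the surface string greedily.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

As the abstract notes, a flat linearisation like this discards most of the tree structure, which is one reason the paper argues for studying better ways to represent UD inputs for such models.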
Anthology ID:
2020.msr-1.6
Volume:
Proceedings of the Third Workshop on Multilingual Surface Realisation
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Simon Mille, Leo Wanner
Venue:
MSR
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
50–56
URL:
https://aclanthology.org/2020.msr-1.6
Cite (ACL):
Marco Antonio Sobrevilla Cabezudo and Thiago Pardo. 2020. NILC at SR’20: Exploring Pre-Trained Models in Surface Realisation. In Proceedings of the Third Workshop on Multilingual Surface Realisation, pages 50–56, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
NILC at SR’20: Exploring Pre-Trained Models in Surface Realisation (Sobrevilla Cabezudo & Pardo, MSR 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2020.msr-1.6.pdf