Abstract
This paper describes the submission by the NILC Computational Linguistics research group of the University of São Paulo, Brazil, to the English Track 2 (closed sub-track) at the Surface Realisation Shared Task 2020. Pre-trained models such as BERT and GPT-2 are well known for their success in several tasks; however, this is not yet the case for data-to-text generation, which only recently has attracted initiatives of this kind. We therefore explore how a pre-trained model (GPT-2) performs on the UD-to-text generation task. Overall, the results achieved were poor, but they suggest some interesting ideas to explore. Among the lessons learned, we note the need to study strategies for representing UD inputs and for introducing structural knowledge into these pre-trained models.
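The abstract does not spell out how UD inputs are turned into model inputs. As a rough illustration only, here is a minimal sketch (assumptions, not the authors' setup) of one way to linearize a UD dependency tree from CoNLL-U into a flat string for causal-LM fine-tuning; the lemma/UPOS/deprel token format and the `<SEP>` input/target separator are hypothetical choices.

```python
# Minimal sketch: linearize a CoNLL-U dependency tree into a flat string
# paired with its surface sentence, as one hypothetical training instance
# for GPT-2 fine-tuning. Format choices here are illustrative assumptions.

CONLLU = """\
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tresults\tresult\tNOUN\t_\t_\t4\tnsubj\t_\t_
3\twere\tbe\tAUX\t_\t_\t4\tcop\t_\t_
4\tpoor\tpoor\tADJ\t_\t_\t0\troot\t_\t_
"""

def parse_conllu(block):
    """Parse one CoNLL-U sentence into a list of token dicts."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]), "form": cols[1], "lemma": cols[2],
            "upos": cols[3], "head": int(cols[6]), "deprel": cols[7],
        })
    return tokens

def linearize(tokens):
    """Depth-first traversal from the root, emitting lemma/UPOS/deprel."""
    children = {}
    for tok in tokens:
        children.setdefault(tok["head"], []).append(tok)
    out = []
    def visit(tok):
        out.append(f"{tok['lemma']}/{tok['upos']}/{tok['deprel']}")
        for child in children.get(tok["id"], []):
            visit(child)
    for root in children.get(0, []):  # head 0 marks the root
        visit(root)
    return " ".join(out)

tokens = parse_conllu(CONLLU)
target = " ".join(tok["form"] for tok in tokens)
# One training string for causal-LM fine-tuning: "input <SEP> output".
print(f"{linearize(tokens)} <SEP> {target}")
```

Note that such a depth-first linearization discards word order, which is exactly what the realiser must recover; the abstract's conclusion that UD input representation needs further study concerns precisely this kind of design choice.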
- Anthology ID: 2020.msr-1.6
- Volume: Proceedings of the Third Workshop on Multilingual Surface Realisation
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Simon Mille, Leo Wanner
- Venue: MSR
- SIG: SIGGEN
- Publisher: Association for Computational Linguistics
- Pages: 50–56
- URL: https://aclanthology.org/2020.msr-1.6
- Cite (ACL): Marco Antonio Sobrevilla Cabezudo and Thiago Pardo. 2020. NILC at SR’20: Exploring Pre-Trained Models in Surface Realisation. In Proceedings of the Third Workshop on Multilingual Surface Realisation, pages 50–56, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal): NILC at SR’20: Exploring Pre-Trained Models in Surface Realisation (Sobrevilla Cabezudo & Pardo, MSR 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.msr-1.6.pdf