BME-TUW at SR’20: Lexical grammar induction for surface realization
Gábor Recski, Ádám Kovács, Kinga Gémes, Judit Ács, Andras Kornai
Abstract
We present a system for mapping Universal Dependency structures to raw text which learns to restore word order by training an Interpreted Regular Tree Grammar (IRTG) that establishes a mapping between string and graph operations. The reinflection step is handled by a standard sequence-to-sequence architecture with a biLSTM encoder and an LSTM decoder with attention. We modify our 2019 system (Kovács et al., 2019) with a new grammar induction mechanism that allows IRTG rules to operate on lemmata in addition to part-of-speech tags and ensures that each word and its dependents are reordered using the most specific set of learned patterns. We also introduce a hierarchical approach to word order restoration that independently determines the word order of each clause in a sentence before arranging them with respect to the main clause, thereby improving overall readability and also making the IRTG parsing task tractable. We participated in the 2020 Surface Realization Shared Task, subtrack T1a (shallow, closed). Human evaluation shows we achieve significant improvements on two of the three out-of-domain datasets compared to the 2019 system we modified. Both components of our system are available on GitHub under an MIT license.
- Anthology ID: 2020.msr-1.2
- Volume: Proceedings of the Third Workshop on Multilingual Surface Realisation
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Simon Mille, Leo Wanner
- Venue: MSR
- SIG: SIGGEN
- Publisher: Association for Computational Linguistics
- Pages: 21–29
- URL: https://aclanthology.org/2020.msr-1.2
- Cite (ACL): Gábor Recski, Ádám Kovács, Kinga Gémes, Judit Ács, and Andras Kornai. 2020. BME-TUW at SR’20: Lexical grammar induction for surface realization. In Proceedings of the Third Workshop on Multilingual Surface Realisation, pages 21–29, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal): BME-TUW at SR’20: Lexical grammar induction for surface realization (Recski et al., MSR 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.msr-1.2.pdf