BME-TUW at SR’20: Lexical grammar induction for surface realization

Gábor Recski, Ádám Kovács, Kinga Gémes, Judit Ács, Andras Kornai


Abstract
We present a system for mapping Universal Dependency structures to raw text which learns to restore word order by training an Interpreted Regular Tree Grammar (IRTG) that establishes a mapping between string and graph operations. The reinflection step is handled by a standard sequence-to-sequence architecture with a biLSTM encoder and an LSTM decoder with attention. We modify our 2019 system (Kovács et al., 2019) with a new grammar induction mechanism that allows IRTG rules to operate on lemmata in addition to part-of-speech tags and ensures that each word and its dependents are reordered using the most specific set of learned patterns. We also introduce a hierarchical approach to word order restoration that independently determines the word order of each clause in a sentence before arranging them with respect to the main clause, thereby improving overall readability and also making the IRTG parsing task tractable. We participated in the 2020 Surface Realization Shared Task, subtrack T1a (shallow, closed). Human evaluation shows we achieve significant improvements on two of the three out-of-domain datasets compared to the 2019 system we modified. Both components of our system are available on GitHub under an MIT license.
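The two ordering ideas in the abstract (preferring the most specific learned pattern, and ordering each subtree independently before placing it relative to its head) can be illustrated with a minimal Python sketch. This is not the authors' IRTG implementation: the grammar is replaced here by plain lookup tables, and every name in it (`Node`, `LEMMA_PATTERNS`, `POS_PATTERNS`, `choose_pattern`, `linearize`) as well as the pattern entries themselves are hypothetical. The output consists of lemmata, since reinflection is a separate step in the described system.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word of a dependency tree (hypothetical, simplified structure)."""
    lemma: str
    upos: str
    deprel: str = "root"
    children: list = field(default_factory=list)


# Hypothetical "learned" patterns. Keys pair a head description with the
# sorted relation labels of its dependents; values give the surface order,
# where "HEAD" marks the position of the head word itself.
LEMMA_PATTERNS = {
    ("give", ("iobj", "nsubj", "obj")): ("nsubj", "HEAD", "iobj", "obj"),
}
POS_PATTERNS = {
    ("VERB", ("nsubj", "obj")): ("nsubj", "HEAD", "obj"),
    ("NOUN", ("det",)): ("det", "HEAD"),
}


def choose_pattern(node):
    """Return the most specific pattern: lemma first, then POS, then a default."""
    rels = tuple(sorted(child.deprel for child in node.children))
    for table, key in ((LEMMA_PATTERNS, node.lemma), (POS_PATTERNS, node.upos)):
        pattern = table.get((key, rels))
        if pattern is not None:
            return pattern
    # Assumed fallback for unseen configurations: head first, then dependents.
    return ("HEAD",) + rels


def linearize(node):
    """Order each dependent subtree on its own, then arrange around the head."""
    pending = {}
    for child in node.children:
        pending.setdefault(child.deprel, []).append(child)
    words = []
    for slot in choose_pattern(node):
        if slot == "HEAD":
            words.append(node.lemma)
        else:
            words.extend(linearize(pending[slot].pop(0)))
    return words


tree = Node("chase", "VERB", children=[
    Node("cat", "NOUN", "obj", [Node("a", "DET", "det")]),
    Node("dog", "NOUN", "nsubj", [Node("the", "DET", "det")]),
])
print(" ".join(linearize(tree)))  # the dog chase a cat
```

Trying the lemma table before the part-of-speech table mirrors the back-off from specific to general patterns described above; a head with no matching entry in either table falls through to the assumed default order.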
Anthology ID:
2020.msr-1.2
Volume:
Proceedings of the Third Workshop on Multilingual Surface Realisation
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Simon Mille, Leo Wanner
Venue:
MSR
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
21–29
URL:
https://aclanthology.org/2020.msr-1.2
Cite (ACL):
Gábor Recski, Ádám Kovács, Kinga Gémes, Judit Ács, and Andras Kornai. 2020. BME-TUW at SR’20: Lexical grammar induction for surface realization. In Proceedings of the Third Workshop on Multilingual Surface Realisation, pages 21–29, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
BME-TUW at SR’20: Lexical grammar induction for surface realization (Recski et al., MSR 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2020.msr-1.2.pdf