Abstract
This paper describes our submission to the E2E NLG Challenge. Recently, neural seq2seq approaches have become mainstream in NLG, often resorting to pre- (respectively post-) processing delexicalization (relexicalization) steps at the word-level to handle rare words. By contrast, we train a simple character level seq2seq model, which requires no pre/post-processing (delexicalization, tokenization or even lowercasing), with surprisingly good results. For further improvement, we explore two re-ranking approaches for scoring candidates. We also introduce a synthetic dataset creation procedure, which opens up a new way of creating artificial datasets for Natural Language Generation.
- Anthology ID:
- W18-6555
- Volume:
- Proceedings of the 11th International Conference on Natural Language Generation
- Month:
- November
- Year:
- 2018
- Address:
- Tilburg University, The Netherlands
- Editors:
- Emiel Krahmer, Albert Gatt, Martijn Goudbeek
- Venue:
- INLG
- SIG:
- SIGGEN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 451–456
- URL:
- https://aclanthology.org/W18-6555
- DOI:
- 10.18653/v1/W18-6555
- Cite (ACL):
- Shubham Agarwal, Marc Dymetman, and Éric Gaussier. 2018. Char2char Generation with Reranking for the E2E NLG Challenge. In Proceedings of the 11th International Conference on Natural Language Generation, pages 451–456, Tilburg University, The Netherlands. Association for Computational Linguistics.
- Cite (Informal):
- Char2char Generation with Reranking for the E2E NLG Challenge (Agarwal et al., INLG 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/W18-6555.pdf
- Data
- E2E