FELIX: Flexible Text Editing Through Tagging and Insertion

Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, Guillermo Garrido


Abstract
We present FELIX – a flexible text-editing approach for generation, designed to derive maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pretraining. In contrast to conventional sequence-to-sequence (seq2seq) models, FELIX is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model (MLM). Both of these models are chosen to be non-autoregressive to guarantee faster inference. FELIX performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification.
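The two-stage decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`build_template`, `fill_masks`, `toy_predict`), the `order` list standing in for the pointer output, the `masks_after` mapping standing in for insertion tags, and the example sentence are all assumptions made for illustration. A real system would replace `toy_predict` with a masked language model.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"


def build_template(tokens: List[str], order: List[int],
                   masks_after: Dict[int, int]) -> List[str]:
    """Tagging stage (hypothetical interface).

    `order` lists the indices of the kept source tokens in output order;
    source tokens that never appear in it are dropped. `masks_after` maps
    a source index to the number of empty slots to open after that token.
    """
    template: List[str] = []
    for idx in order:
        template.append(tokens[idx])
        template.extend([MASK] * masks_after.get(idx, 0))
    return template


def fill_masks(template: List[str],
               predict: Callable[[List[str], int], str]) -> List[str]:
    """Insertion stage (hypothetical interface).

    Every slot is predicted from the same unfilled template, so no
    prediction depends on another one (non-autoregressive filling).
    """
    predictions = {
        i: predict(template, i)
        for i, tok in enumerate(template)
        if tok == MASK
    }
    return [predictions.get(i, tok) for i, tok in enumerate(template)]


def toy_predict(template: List[str], position: int) -> str:
    # Stand-in for a masked language model; always returns the same word.
    return "old"


source = ["The", "big", "very", "loud", "cat"]
# Keep "The", "very", "big", "cat" in that order; drop "loud";
# open one slot after "big" (source index 1).
template = build_template(source, order=[0, 2, 1, 4], masks_after={1: 1})
output = fill_masks(template, toy_predict)

print(" ".join(template))  # The very big [MASK] cat
print(" ".join(output))    # The very big old cat
```

Keeping the two stages as separate functions mirrors the split in the abstract: the first decides what to keep and in which order, and the second only fills the slots that the first one opened.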
Anthology ID:
2020.findings-emnlp.111
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1244–1255
URL:
https://aclanthology.org/2020.findings-emnlp.111
DOI:
10.18653/v1/2020.findings-emnlp.111
Cite (ACL):
Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, and Guillermo Garrido. 2020. FELIX: Flexible Text Editing Through Tagging and Insertion. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1244–1255, Online. Association for Computational Linguistics.
Cite (Informal):
FELIX: Flexible Text Editing Through Tagging and Insertion (Mallinson et al., Findings 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.findings-emnlp.111.pdf
Code
google-research/google-research + additional community code
Data
DiscoFuse, WikiLarge