Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
Reno Kriz, João Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, Chris Callison-Burch
Abstract
Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.- Anthology ID:
- N19-1317
- Volume:
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Jill Burstein, Christy Doran, Thamar Solorio
- Venue:
- NAACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 3137–3147
- Language:
- URL:
- https://aclanthology.org/N19-1317
- DOI:
- 10.18653/v1/N19-1317
- Cite (ACL):
- Reno Kriz, João Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, and Chris Callison-Burch. 2019. Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3137–3147, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification (Kriz et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/N19-1317.pdf
- Code
- rekriz11/sockeye-recipes + additional community code
- Data
- Newsela, WikiLarge