Abstract
We report on experiments in automatic text simplification (ATS) for German with multiple simplification levels along the Common European Framework of Reference for Languages (CEFR), simplifying standard German into levels A1, A2 and B1. For that purpose, we investigate the use of source labels and pretraining on standard German, allowing us to simplify standard language to a specific CEFR level. We show that these approaches are especially effective in low-resource scenarios, where we are able to outperform a standard transformer baseline. Moreover, we introduce copy labels, which we show can help the model make a distinction between sentences that require further modifications and sentences that can be copied as-is.
- Anthology ID:
- 2021.ranlp-1.150
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1339–1349
- URL:
- https://aclanthology.org/2021.ranlp-1.150
- Cite (ACL):
- Nicolas Spring, Annette Rios, and Sarah Ebling. 2021. Exploring German Multi-Level Text Simplification. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1339–1349, Held Online. INCOMA Ltd.
- Cite (Informal):
- Exploring German Multi-Level Text Simplification (Spring et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2021.ranlp-1.150.pdf
- Code:
- zurichnlp/ranlp2021-german-ats
- Data:
- Newsela
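
The source labels and copy labels mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the label format, so everything here is an assumption for illustration only: the token spellings (`<A1>`, `<copy>`, `<edit>`), the function name `add_source_labels`, and the choice to derive the copy label by comparing source and target strings during training data preparation. This is not the authors' implementation; see the linked code repository for that.

```python
from typing import Optional

# Target levels named in the abstract.
CEFR_LEVELS = ("A1", "A2", "B1")


def add_source_labels(source: str, level: str, target: Optional[str] = None) -> str:
    """Prepend a target-level label (and optionally a copy label) to a source sentence.

    The label tokens are hypothetical; the abstract does not specify them.
    """
    if level not in CEFR_LEVELS:
        raise ValueError(f"unsupported CEFR level: {level!r}")

    labels = [f"<{level}>"]

    # Assumption: when a reference target is available, the copy label
    # records whether the target is identical to the source.
    if target is not None:
        is_copy = source.strip() == target.strip()
        labels.append("<copy>" if is_copy else "<edit>")

    return " ".join(labels + [source.strip()])


if __name__ == "__main__":
    src = "Das ist ein Satz."
    print(add_source_labels(src, "A2"))
    print(add_source_labels(src, "A2", target=src))
    print(add_source_labels(src, "A1", target="Das ist ein kurzer Satz."))
```

Under these assumptions, a single model could be steered toward a specific CEFR level at inference time by choosing the level label placed in front of the input sentence.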