Abstract
In this paper, we describe the system we presented at the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) regarding the shared task on Lexical Simplification for English, Portuguese, and Spanish. We proposed an unsupervised approach in two steps: First, we used a masked language model with word masking for each language to extract possible candidates for the replacement of a difficult word; second, we ranked the candidates according to three different Transformer-based metrics. Finally, we determined our list of candidates based on the lowest average rank across different metrics.
- Anthology ID:
- 2022.tsar-1.24
- Volume:
- Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates (Virtual)
- Editors:
- Sanja Štajner, Horacio Saggion, Daniel Ferrés, Matthew Shardlow, Kim Cheng Sheang, Kai North, Marcos Zampieri, Wei Xu
- Venue:
- TSAR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 225–230
- URL:
- https://aclanthology.org/2022.tsar-1.24
- DOI:
- 10.18653/v1/2022.tsar-1.24
- Cite (ACL):
- Emmanuele Chersoni and Yu-Yin Hsu. 2022. PolyU-CBS at TSAR-2022 Shared Task: A Simple, Rank-Based Method for Complex Word Substitution in Two Steps. In Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), pages 225–230, Abu Dhabi, United Arab Emirates (Virtual). Association for Computational Linguistics.
- Cite (Informal):
- PolyU-CBS at TSAR-2022 Shared Task: A Simple, Rank-Based Method for Complex Word Substitution in Two Steps (Chersoni & Hsu, TSAR 2022)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2022.tsar-1.24.pdf
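
The final step in the abstract, choosing candidates by lowest average rank across several metrics, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the candidate words, and the scores below are hypothetical, and the sketch assumes that a higher score means a better candidate under every metric. The abstract does not say how ties are handled; here they keep their input order.

```python
from statistics import mean


def ranks_from_scores(scores: dict[str, float]) -> dict[str, int]:
    """Map each candidate to its rank under one metric (1 = highest score).

    Assumption: higher score is better. Ties keep their input order,
    because Python's sort is stable.
    """
    ordered = sorted(scores, key=lambda cand: scores[cand], reverse=True)
    return {cand: position + 1 for position, cand in enumerate(ordered)}


def average_rank_order(metric_scores: list[dict[str, float]]) -> list[str]:
    """Order candidates by their mean rank across all metrics, best first."""
    per_metric_ranks = [ranks_from_scores(scores) for scores in metric_scores]
    candidates = list(metric_scores[0])
    avg_rank = {
        cand: mean(ranks[cand] for ranks in per_metric_ranks)
        for cand in candidates
    }
    return sorted(candidates, key=lambda cand: avg_rank[cand])


# Hypothetical scores from three metrics for three substitution candidates.
metric_a = {"hard": 0.9, "tough": 0.7, "complicated": 0.2}
metric_b = {"hard": 0.6, "tough": 0.8, "complicated": 0.1}
metric_c = {"hard": 0.5, "tough": 0.4, "complicated": 0.3}

ranking = average_rank_order([metric_a, metric_b, metric_c])
print(ranking)
```

Averaging ranks rather than raw scores means the metrics do not need to share a scale, which is one plausible reason to prefer a rank-based combination.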