Improving Tokenisation by Alternative Treatment of Spaces
Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton, Aline Villavicencio
- Anthology ID: 2022.emnlp-main.786
- Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month: December
- Year: 2022
- Address: Abu Dhabi, United Arab Emirates
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 11430–11443
- URL: https://aclanthology.org/2022.emnlp-main.786
- Cite (ACL): Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton, and Aline Villavicencio. 2022. Improving Tokenisation by Alternative Treatment of Spaces. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11430–11443, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal): Improving Tokenisation by Alternative Treatment of Spaces (Gow-Smith et al., EMNLP 2022)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2022.emnlp-main.786.pdf