@inproceedings{juffs-naismith-2025-identifying,
    title = "Identifying and analyzing `noisy' spelling errors in a second language corpus",
    author = "Juffs, Alan  and
      Naismith, Ben",
    editor = "Bak, JinYeong  and
      Goot, Rob van der  and
      Jang, Hyeju  and
      Buaphet, Weerayut  and
      Ramponi, Alan  and
      Xu, Wei  and
      Ritter, Alan",
    booktitle = "Proceedings of the Tenth Workshop on Noisy and User-generated Text",
    month = may,
    year = "2025",
    address = "Albuquerque, New Mexico, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.wnut-1.4/",
    doi = "10.18653/v1/2025.wnut-1.4",
    pages = "26--37",
    ISBN = "979-8-89176-232-9",
    abstract = "This paper addresses the problem of identifying and analyzing `noisy' spelling errors in texts written by second language (L2) learners in a written corpus. Using Python, spelling errors were identified in 5774 texts of at least 66 words (total=1,814,209 words), selected from a corpus of 4.2 million words (Authors-1). The statistical analysis used hurdle() models in R, which are appropriate for non-normal count data with many zeros."
}