Abstract
We study highly granular dialect normalization and phonological dialect translation for Limburgish, a non-standardized, low-resource language with wide variation in spelling conventions and phonology. We find that embedding the geographic coordinates of dialects improves on the traditional transformer in dialect normalization tasks, and we use these geographically embedded transformers to translate words between the phonologies of different dialects. The results are consistent with notions in traditional Limburgish dialectology.
- Anthology ID:
- 2024.vardial-1.13
- Original:
- 2024.vardial-1.13v1
- Version 2:
- 2024.vardial-1.13v2
- Volume:
- Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Yves Scherrer, Tommi Jauhiainen, Nikola Ljubešić, Marcos Zampieri, Preslav Nakov, Jörg Tiedemann
- Venues:
- VarDial | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 152–162
- URL:
- https://aclanthology.org/2024.vardial-1.13
- DOI:
- 10.18653/v1/2024.vardial-1.13
- Cite (ACL):
- Andreas Simons, Stefano De Pascale, and Karlien Franco. 2024. Highly Granular Dialect Normalization and Phonological Dialect Translation for Limburgish. In Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024), pages 152–162, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Highly Granular Dialect Normalization and Phonological Dialect Translation for Limburgish (Simons et al., VarDial-WS 2024)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2024.vardial-1.13.pdf