Highly Granular Dialect Normalization and Phonological Dialect Translation for Limburgish

Andreas Simons, Stefano De Pascale, Karlien Franco


Abstract
We study highly granular dialect normalization and phonological dialect translation on Limburgish, a non-standardized low-resource language with wide variation in spelling conventions and phonology. We find that embedding the geographic coordinates of dialects improves on the traditional transformer in dialect normalization tasks, and we use these geographically embedded transformers to translate words between the phonologies of different dialects. These results are consistent with notions in traditional Limburgish dialectology.
Anthology ID:
2024.vardial-1.13
Original:
2024.vardial-1.13v1
Version 2:
2024.vardial-1.13v2
Volume:
Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Yves Scherrer, Tommi Jauhiainen, Nikola Ljubešić, Marcos Zampieri, Preslav Nakov, Jörg Tiedemann
Venues:
VarDial | WS
Publisher:
Association for Computational Linguistics
Pages:
152–162
URL:
https://aclanthology.org/2024.vardial-1.13
DOI:
10.18653/v1/2024.vardial-1.13
Cite (ACL):
Andreas Simons, Stefano De Pascale, and Karlien Franco. 2024. Highly Granular Dialect Normalization and Phonological Dialect Translation for Limburgish. In Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024), pages 152–162, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Highly Granular Dialect Normalization and Phonological Dialect Translation for Limburgish (Simons et al., VarDial-WS 2024)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2024.vardial-1.13.pdf
Supplementary material:
2024.vardial-1.13.SupplementaryMaterial.txt