Dia-Lingle: A Gamified Interface for Dialectal Data Collection

Jiugeng Sun, Rita Sevastjanova, Sina Ahmadi, Rico Sennrich, Mennatallah El-Assady


Abstract
Dialects suffer from a scarcity of computational textual resources because they exist predominantly in spoken rather than written form and exhibit remarkable geographical diversity. Collecting dialect data and subsequently integrating it into current language technologies present significant obstacles. Gamification has been shown to facilitate remote data collection and to enable it on a substantially wider scale. This paper introduces Dia-Lingle, a gamified interface that aims to improve and facilitate dialectal data collection tasks such as corpus expansion and dialect labelling. The platform features two key components: the first challenges users to rewrite sentences in their dialects, identifies the dialects through a classifier, and solicits feedback; the second asks users to match sentences to their geographical locations. Dia-Lingle combines active learning with gamified difficulty levels, strategically encouraging prolonged user engagement while efficiently enriching the dialect corpus. A usability evaluation shows high levels of user satisfaction with the interface. Dia-Lingle is available at https://dia-lingle.ivia.ch/, with a demo video at https://youtu.be/0QyJsB8ym64.
Anthology ID:
2025.acl-demo.15
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Pushkar Mishra, Smaranda Muresan, Tao Yu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
148–158
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.15/
Cite (ACL):
Jiugeng Sun, Rita Sevastjanova, Sina Ahmadi, Rico Sennrich, and Mennatallah El-Assady. 2025. Dia-Lingle: A Gamified Interface for Dialectal Data Collection. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 148–158, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Dia-Lingle: A Gamified Interface for Dialectal Data Collection (Sun et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.15.pdf
Copyright agreement:
 2025.acl-demo.15.copyright_agreement.pdf