German Dialect Identification Using Classifier Ensembles

Alina Maria Ciobanu, Shervin Malmasi, Liviu P. Dinu


Abstract
In this paper we present the GDI classification entry to the second German Dialect Identification (GDI) shared task organized within the scope of the VarDial Evaluation Campaign 2018. We present a system based on SVM classifier ensembles trained on characters and words. The system was trained on a collection of speech transcripts of four Swiss-German dialects provided by the organizers. The transcripts in the dataset came from speakers from Basel, Bern, Lucerne, and Zurich. Our entry in the challenge reached an F1 score of 62.03% and was ranked third out of eight teams.
Anthology ID:
W18-3933
Volume:
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Marcos Zampieri, Preslav Nakov, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
288–294
URL:
https://aclanthology.org/W18-3933
Cite (ACL):
Alina Maria Ciobanu, Shervin Malmasi, and Liviu P. Dinu. 2018. German Dialect Identification Using Classifier Ensembles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 288–294, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
German Dialect Identification Using Classifier Ensembles (Ciobanu et al., VarDial 2018)
PDF:
https://preview.aclanthology.org/naacl24-info/W18-3933.pdf