Abstract
This paper presents the experiments and results obtained by the SUKI team in the Discriminating between Dutch and Flemish in Subtitles shared task of the VarDial 2018 Evaluation Campaign. Our best submission was ranked 8th, obtaining a macro F1-score of 0.61. Our best results were produced by a language identifier implementing the HeLI method without any modifications. In addition to the best method we used, we describe some of our experiments with unsupervised clustering.
- Anthology ID:
- W18-3915
- Volume:
- Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Marcos Zampieri, Preslav Nakov, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali
- Venue:
- VarDial
- Publisher:
- Association for Computational Linguistics
- Pages:
- 137–144
- URL:
- https://aclanthology.org/W18-3915
- Cite (ACL):
- Tommi Jauhiainen, Heidi Jauhiainen, and Krister Lindén. 2018. HeLI-based Experiments in Discriminating Between Dutch and Flemish Subtitles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 137–144, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- HeLI-based Experiments in Discriminating Between Dutch and Flemish Subtitles (Jauhiainen et al., VarDial 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/W18-3915.pdf