Identification of Differences between Dutch Language Varieties with the VarDial2018 Dutch-Flemish Subtitle Data

Hans van Halteren, Nelleke Oostdijk


Abstract
With the goal of discovering differences between Belgian and Netherlandic Dutch, we participated as Team Taurus in the Dutch-Flemish Subtitles task of VarDial2018. We used a rather simple marker-based method, but a wide range of features, including lexical, lexico-syntactic and syntactic ones, and achieved a second position in the ranking. Inspection of highly distinguishing features did point towards differences between the two language varieties, but because of the nature of the experimental data, we have to treat our observations as very tentative and in need of further investigation.
Anthology ID:
W18-3923
Volume:
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
199–209
URL:
https://aclanthology.org/W18-3923
Cite (ACL):
Hans van Halteren and Nelleke Oostdijk. 2018. Identification of Differences between Dutch Language Varieties with the VarDial2018 Dutch-Flemish Subtitle Data. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 199–209, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Identification of Differences between Dutch Language Varieties with the VarDial2018 Dutch-Flemish Subtitle Data (van Halteren & Oostdijk, VarDial 2018)
PDF:
https://preview.aclanthology.org/remove-xml-comments/W18-3923.pdf