Dirk Speelman


2018

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign
Marcos Zampieri | Shervin Malmasi | Preslav Nakov | Ahmed Ali | Suwon Shon | James Glass | Yves Scherrer | Tanja Samardžić | Nikola Ljubešić | Jörg Tiedemann | Chris van der Lee | Stefan Grondelaers | Nelleke Oostdijk | Dirk Speelman | Antal van den Bosch | Ritesh Kumar | Bornini Lahiri | Mayank Jain
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

We present the results and findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, co-located with COLING 2018. This year, the campaign comprised five shared tasks: two re-runs, Arabic Dialect Identification (ADI) and German Dialect Identification (GDI), and three new tasks, Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.

2014

Multidimensional Scaling on the specialised corpus TALN (Analyse de positionnement multidimensionnel sur le corpus spécialisé TALN) [in French]
Ann Bertels | Dirk Speelman
TALN-RECITAL 2014 Workshop SemDis 2014 : Enjeux actuels de la sémantique distributionnelle (SemDis 2014: Current Challenges in Distributional Semantics)

2012

Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets
Kris Heylen | Dirk Speelman | Dirk Geeraerts
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

2009

Word Space Models of Lexical Variation
Yves Peirsman | Dirk Speelman
Proceedings of the Workshop on Geometrical Models of Natural Language Semantics

2008

Modelling Word Similarity: an Evaluation of Automatic Synonymy Extraction Algorithms
Kris Heylen | Yves Peirsman | Dirk Geeraerts | Dirk Speelman
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we use three vector-based models to retrieve semantically related words for a set of Dutch nouns and we analyse whether three linguistic properties of the nouns influence the results. In particular, we compare results from a dependency-based model with those from a first-order and a second-order bag-of-words model, and we examine the effect of the nouns’ frequency, semantic specificity and semantic class. We find that all three models find more synonyms for high-frequency nouns and for nouns belonging to abstract semantic classes. Semantic specificity does not have a clear influence.

2006

Quantitative and statistical analysis of semantics in a technical corpus (Analyse quantitative et statistique de la sémantique dans un corpus technique) [in French]
Ann Bertels | Dirk Speelman | Dirk Geeraerts
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This article presents the methodology and results of a quantitative semantic analysis of about 5,000 specific words (spécificités) in the technical domain of machine tools for metalworking. The specific words are identified with the KeyWords Method. They are then subjected to a quantitative semantic analysis, based on the overlap of the co-occurrences of their co-occurrences, which makes it possible to determine their degree of monosemy. Finally, the quantitative data on specificity and monosemy are subjected to regression analyses. We hypothesize that the (most) specific words of the technical corpus are not the (most) monosemous ones. We present the statistical results here, together with a linguistic interpretation. The aim of this study is thus to verify whether, and to what extent, the specific words of the technical corpus are monosemous or polysemous, and which factors are decisive.

1994

A Dutch to SQL database interface using Generalized Quantifier Theory
Dirk Speelman | Geert Adriaens
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics