Natalia Klyueva

Also published as: Natalia Kljueva

2019

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019. This year, the campaign comprised five shared tasks: one task re-run – German Dialect Identification (GDI) – and four new tasks – Cross-lingual Morphological Analysis (CMA), Discriminating between Mainland and Taiwan variation of Mandarin Chinese (DMT), Moldavian vs. Romanian Cross-dialect Topic identification (MRC), and Cuneiform Language Identification (CLI). A total of 22 teams submitted runs across the five shared tasks. After the end of the competition, we received 14 system description papers, which are published in the VarDial workshop proceedings and referred to in this report.

2018

Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Menghan Jiang | Natalia Klyueva | Hongzhi Xu | Chu-Ren Huang
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Improving a Neural-based Tagger for Multiword Expressions Identification
Dušan Variš | Natalia Klyueva
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Neural Networks for Multi-Word Expression Detection
Natalia Klyueva | Antoine Doucet | Milan Straka
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)

In this paper we describe the MUMULS system, which participated in the 2017 shared task on automatic identification of verbal multiword expressions (VMWEs). The MUMULS system was implemented using a supervised approach based on recurrent neural networks, built with the open-source library TensorFlow. The model was trained on a data set containing annotated VMWEs as well as morphological and syntactic information. The MUMULS system identified VMWEs in 15 languages and was one of the few systems that could categorize VMWE types in nearly all languages.

Querying Multi-word Expressions Annotation with CQL
Natalia Klyueva | Anna Vernerová | Behrang Qasemizadeh
Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories

2016

Improving corpus search via parsing
Natalia Klyueva | Pavel Straňák
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we describe an addition to the corpus query system Kontext that enhances search with syntactic attributes in addition to the existing features, mainly lemmas and morphological categories. We present the enhancements to the corpus query system itself, the attributes we use to represent syntactic structures in the data, and some examples of querying syntactically annotated corpora, such as treebanks in various languages as well as a large automatically parsed corpus.

Incorporation of a valency lexicon into a TectoMT pipeline
Natalia Klyueva | Vladislav Kuboň
Proceedings of the 2nd Deep Machine Translation Workshop