2020
GeCzLex: Lexicon of Czech and German Anaphoric Connectives
Lucie Poláková | Kateřina Rysová | Magdaléna Rysová | Jiří Mírovský
Proceedings of the Twelfth Language Resources and Evaluation Conference
We introduce the first version of GeCzLex, an online electronic resource for translation equivalents of Czech and German discourse connectives. The lexicon is one of the outcomes of research on anaphoricity and long-distance relations in discourse. At present it contains anaphoric connectives (ACs) for Czech and German, together with their possible translations (not necessarily anaphoric) documented in bilingual parallel corpora. As a basis, we use two existing monolingual lexicons of connectives, the Lexicon of Czech Discourse Connectives (CzeDLex) and the Lexicon of Discourse Markers (DiMLex) for German, and interlink their relevant entries via semantic annotation of the connectives (according to the PDTB 3 sense taxonomy) and statistical information on translation possibilities from the Czech and German parallel data of the InterCorp project. The lexicon is, as far as we know, the first bilingual inventory of connectives with linkage at the level of individual entries, and a first attempt to systematically describe devices engaged in long-distance, non-local discourse coherence. The lexicon is freely available under a Creative Commons license.
2019
A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
Kateřina Rysová | Magdaléna Rysová | Tomáš Musil | Lucie Poláková | Ondřej Bojar
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
As the quality of machine translation rises and neural machine translation (NMT) moves from sentence-level to document-level translation, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing how MT systems participating in the News Translation Task handle discourse phenomena. We manually checked the outputs and identified types of translation errors that are relevant to document-level translation.
Ordering of Adverbials of Time and Place in Grammars and in an Annotated English-Czech Parallel Corpus
Eva Hajičová | Jiří Mírovský | Kateřina Rysová
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)
2018
EvalD Reference-Less Discourse Evaluation for WMT18
Ondřej Bojar | Jiří Mírovský | Kateřina Rysová | Magdaléna Rysová
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
We present the results of automatic evaluation of discourse in machine translation (MT) outputs using the EVALD tool. EVALD was originally designed and trained to assess the quality of human writing by native speakers and foreign-language learners. MT has seen a tremendous leap in translation quality at the level of sentences, and it is thus interesting to see whether human-level evaluation is becoming relevant.
2017
Introducing EVALD – Software Applications for Automatic Evaluation of Discourse in Czech
Kateřina Rysová | Magdaléna Rysová | Jiří Mírovský | Michal Novák
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
In the paper, we introduce two software applications for automatic evaluation of coherence in Czech texts, called EVALD (Evaluator of Discourse). The first one, EVALD 1.0, evaluates texts written by native speakers of Czech on a five-step scale commonly used at Czech schools (grade 1 is the best, grade 5 is the worst). The second application, EVALD 1.0 for Foreigners, assesses texts by non-native speakers of Czech using a six-step scale (A1–C2) according to CEFR. Both applications are available online at https://lindat.mff.cuni.cz/services/evald-foreign/.
2016
Automatic evaluation of surface coherence in L2 texts in Czech
Kateřina Rysová | Magdaléna Rysová | Jiří Mírovský
Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016)
2015
Secondary Connectives in the Prague Dependency Treebank
Magdaléna Rysová | Kateřina Rysová
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)
2014
The Centre and Periphery of Discourse Connectives
Magdaléna Rysová | Kateřina Rysová
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing
Valency and Word Order in Czech — A Corpus Probe
Kateřina Rysová | Jiří Mírovský
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
We present part of a broader study of word order that aims to identify the factors influencing word order in Czech (i.e. in an inflectional language) and their strength. The main aim of the paper is to test the hypothesis that obligatory adverbials (in terms of valency) follow non-obligatory (i.e. optional) ones in the surface word order. The hypothesis was tested by creating a list of features for the decision tree algorithm and by searching the data of the Prague Dependency Treebank with the search tool PML Tree Query. Apart from valency, our experiment also evaluates the importance of several other features, such as argument length and deep syntactic function. Neither method confirmed the hypothesis, but the results indicate that several other features influence the word order of contextually non-bound free modifiers of a verb in Czech, namely the position of the sentence in the text, the form and length of the verb modifiers (the whole subtrees), and the semantic dependency relation (functor) of the modifiers.
2013
(Pre-)Annotation of Topic-Focus Articulation in Prague Czech-English Dependency Treebank
Jiří Mírovský | Kateřina Rysová | Magdaléna Rysová | Eva Hajičová
Proceedings of the Sixth International Joint Conference on Natural Language Processing