Michal Auersperger


2022

Defending Compositionality in Emergent Languages
Michal Auersperger | Pavel Pecina
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop

Compositionality has traditionally been understood as a major factor in the productivity of language and, more broadly, of human cognition. Recently, however, some research has begun to question its status, showing that artificial neural networks generalize well even without noticeable compositional behavior. We argue that some of these conclusions are too strong and/or incomplete. In the context of a two-agent communication game, we show that compositionality indeed seems essential for successful generalization when the evaluation is done on a suitable dataset.
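A minimal, hypothetical sketch of the kind of two-agent communication game involved (the paper's actual agents are learned neural networks; the toy sizes and the hand-coded compositional sender below are illustrative assumptions only):

```python
# Toy two-agent communication game: a sender encodes an attribute-value
# object into a discrete message; a receiver must reconstruct the object.
import random

ATTRIBUTES, VALUES = 3, 4  # assumed toy sizes, not the paper's setting

def sample_object():
    # An object is a tuple of attribute values, e.g. (2, 0, 3).
    return tuple(random.randrange(VALUES) for _ in range(ATTRIBUTES))

def compositional_sender(obj):
    # A perfectly compositional code: one symbol per attribute value.
    return list(obj)

def receiver(message):
    # Inverts the compositional code symbol by symbol.
    return tuple(message)

obj = sample_object()
msg = compositional_sender(obj)
assert receiver(msg) == obj  # a compositional code decodes any object,
                             # including combinations never seen in training
```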

2021

Solving SCAN Tasks with Data Augmentation and Input Embeddings
Michal Auersperger | Pavel Pecina
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

We address the compositionality challenge posed by the SCAN benchmark. Using data augmentation and a modification of the standard seq2seq architecture with attention, we achieve state-of-the-art results on all the relevant tasks from the benchmark, showing that the models can generalize to words used in unseen contexts. We propose extending the benchmark with a harder task, which cannot be solved by the proposed method.
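As an illustration of primitive-swapping data augmentation for SCAN-style command/action pairs (the paper's actual augmentation procedure may differ; the `augment` helper below is a hypothetical sketch):

```python
# SCAN maps commands such as "jump twice" to action sequences such as
# "JUMP JUMP". One simple augmentation: create new training pairs by
# consistently substituting one primitive for another on both sides.
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def augment(command, action):
    # Yield variants of a (command, action) pair with primitives swapped.
    for src, src_act in PRIMITIVES.items():
        if src not in command:
            continue
        for tgt, tgt_act in PRIMITIVES.items():
            if tgt != src:
                yield command.replace(src, tgt), action.replace(src_act, tgt_act)

for cmd, act in augment("jump twice", "JUMP JUMP"):
    print(cmd, "->", act)  # e.g. walk twice -> WALK WALK
```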

2019

English-Czech Systems in WMT19: Document-Level Transformer
Martin Popel | Dominik Macháček | Michal Auersperger | Ondřej Bojar | Pavel Pecina
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe our NMT systems submitted to the WMT19 shared task in English→Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or the Marian framework. We aimed to improve the adequacy and coherence of the translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this “document-level”-trained system achieves a +0.6 BLEU improvement (p < 0.05) relative to the same system applied to isolated sentences. To assess the potential effect document-level models might have on lexical coherence, we performed a semi-automatic analysis, which revealed only a few sentences improved in this aspect. Thus, we cannot draw any conclusions from this weak evidence.
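A sketch of the segment-splitting idea described above, assuming a simple sliding window; the `split_document` helper and its `seg_len` and `stride` parameters are illustrative assumptions, not the submitted systems' exact pipeline:

```python
# Split a document into overlapping multi-sentence segments so that each
# sentence can be translated with surrounding context.
def split_document(sentences, seg_len=3, stride=2):
    """Return overlapping windows of `seg_len` sentences, advancing by `stride`."""
    segments = []
    for start in range(0, max(len(sentences) - seg_len, 0) + 1, stride):
        segments.append(sentences[start:start + seg_len])
    return segments

doc = ["S1.", "S2.", "S3.", "S4.", "S5."]
for seg in split_document(doc):
    print(" ".join(seg))
# S1. S2. S3.
# S3. S4. S5.
```

With `stride < seg_len`, consecutive segments share sentences, so overlapping translations of the same sentence are available and can be reconciled when reassembling the document.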