Nathaniel Berger
2023
Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training
Nathaniel Berger | Miriam Exel | Matthias Huck | Stefan Riezler
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
Supervised learning in Neural Machine Translation (NMT) conventionally follows a teacher forcing paradigm, where the conditioning context for the model's predictions consists of reference tokens rather than its own previous predictions. To alleviate this lack of exploration in the space of translations, we present a simple extension of standard maximum likelihood estimation by a contrastive marking objective. The additional training signals are extracted automatically from reference translations by comparing the system hypothesis against the reference, and are used to up-weight correct and down-weight incorrect tokens. The proposed training procedure requires one additional translation pass over the training set and does not alter the standard inference setup. We show that training with contrastive markings yields improvements on top of supervised learning, and is especially useful when learning from post-edits, where contrastive markings indicate human error corrections to the original hypotheses.
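The abstract describes the mechanics but not the exact objective. Below is a minimal sketch of one plausible reading: markings are extracted by aligning the system hypothesis to the reference, then used as token weights on a hypothesis-level likelihood. The difflib alignment and the symmetric 1 ± delta weighting are illustrative assumptions, not the paper's formulation.

```python
import difflib

import torch
import torch.nn.functional as F

def extract_markings(hyp_tokens, ref_tokens):
    """Mark a hypothesis token 1.0 if it falls inside a matching block of a
    token-level alignment with the reference, else 0.0 (difflib stands in
    for whatever alignment procedure the authors actually use)."""
    markings = [0.0] * len(hyp_tokens)
    matcher = difflib.SequenceMatcher(a=hyp_tokens, b=ref_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            markings[i] = 1.0
    return torch.tensor(markings)

def marked_nll_loss(logits, hyp_ids, markings, delta=0.5):
    """Token-level NLL over the model's own hypothesis, reweighted by the
    markings: weight 1 + delta for tokens marked correct, 1 - delta for
    the rest. The 1 +/- delta scheme is an assumption for illustration.

    logits:   (T, V) decoder outputs at each hypothesis position
    hyp_ids:  (T,)   hypothesis token ids
    markings: (T,)   1.0 = marked correct, 0.0 = marked incorrect
    """
    token_nll = F.cross_entropy(logits, hyp_ids, reduction="none")  # (T,)
    weights = 1.0 + delta * (2.0 * markings - 1.0)
    return (weights * token_nll).mean()
```

Extracting the markings only needs the one extra translation pass the abstract mentions: decode each training source once, align the hypothesis to its reference, and cache the resulting token weights for training.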
2021
Don’t Search for a Search Method — Simple Heuristics Suffice for Adversarial Text Attacks
Nathaniel Berger | Stefan Riezler | Sebastian Ebert | Artem Sokolov
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Recently, more attention has been given to adversarial attacks on neural networks for natural language processing (NLP). A central research topic has been the investigation of search algorithms and search constraints, accompanied by benchmark algorithms and tasks. We implement an algorithm inspired by zeroth-order optimization-based attacks and compare it with the benchmark results in the TextAttack framework. Surprisingly, we find that optimization-based methods do not yield any improvement in a constrained setup, and benefit only slightly from approximate gradient information in unconstrained setups where search spaces are larger. In contrast, simple heuristics that exploit nearest neighbors without querying the target function achieve substantial success rates in constrained setups, and a nearly full success rate in unconstrained setups, with an order of magnitude fewer queries. We conclude from these results that current TextAttack benchmark tasks are too easy and their constraints too strict, preventing meaningful research on black-box adversarial text attacks.
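As a sketch of the kind of query-free heuristic the abstract credits, the snippet below swaps each in-vocabulary token for its nearest neighbor in an embedding space without ever calling the target model. The `emb` dictionary and the cosine-similarity neighbor choice are assumptions for illustration; a real attack would also respect the benchmark's substitution constraints and typically perturb only a subset of positions.

```python
import numpy as np

def nearest_neighbor_substitutions(tokens, emb):
    """Replace each in-vocabulary token by its nearest embedding-space
    neighbor, never querying the target classifier.
    emb: dict mapping word -> 1-D numpy vector."""
    words = list(emb)
    mat = np.stack([emb[w] for w in words])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)  # unit rows
    attacked = []
    for tok in tokens:
        if tok not in emb:
            attacked.append(tok)  # leave out-of-vocabulary tokens untouched
            continue
        query = emb[tok] / np.linalg.norm(emb[tok])
        sims = mat @ query  # cosine similarity to every vocabulary word
        for i in np.argsort(-sims):  # most similar first
            if words[i] != tok:  # skip the token itself
                attacked.append(words[i])
                break
    return attacked
```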
2020
Correct Me If You Can: Learning from Error Corrections and Markings
Julia Kreutzer | Nathaniel Berger | Stefan Riezler
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Sequence-to-sequence learning involves a trade-off between the signal strength and annotation cost of training data. For example, machine translation data range from costly expert-generated translations that enable supervised learning to weak quality-judgment feedback that facilitates reinforcement learning. We present the first user study on annotation cost and machine learnability for the less popular annotation mode of error markings. We show that error markings for translations of TED talks from English to German allow precise credit assignment while requiring significantly less human effort than correcting/post-editing, and that error-marked data can be used successfully to fine-tune neural machine translation models.
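A minimal sketch of how human error markings might enter fine-tuning, assuming a PyTorch-style per-token loss: tokens the annotator marked as errors are simply excluded, so only tokens judged correct receive credit. This masking reading is an assumption; the paper's exact objective may differ.

```python
import torch.nn.functional as F

def error_marked_loss(logits, target_ids, error_mask):
    """Cross-entropy over a human-marked hypothesis, keeping only tokens
    the annotator left unmarked (i.e., judged correct).

    logits:     (T, V) decoder outputs
    target_ids: (T,)   token ids of the marked hypothesis
    error_mask: (T,)   1.0 where the annotator marked an error, else 0.0
    """
    nll = F.cross_entropy(logits, target_ids, reduction="none")  # (T,)
    keep = 1.0 - error_mask  # 1.0 on tokens judged correct
    return (keep * nll).sum() / keep.sum().clamp(min=1.0)
```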