Haluk Açarçiçek




2020

Filtering Noisy Parallel Corpus using Transformers with Proxy Task Learning
Haluk Açarçiçek | Talha Çolakoğlu | Pınar Ece Aktan Hatipoğlu | Chong Hsuan Huang | Wei Peng
Proceedings of the Fifth Conference on Machine Translation

This paper illustrates Huawei’s submission to the WMT20 low-resource parallel corpus filtering shared task. Our approach focuses on developing a proxy task learner on top of a transformer-based multilingual pre-trained language model to boost the filtering capability for noisy parallel corpora. Such a supervised task also helps us to iterate much more quickly than using an existing neural machine translation system to perform the same task. After performing empirical analyses of the finetuning task, we benchmark our approach by comparing the results with past years’ state-of-the-art records. This paper wraps up with a discussion of limitations and future work. The scripts for this study will be made publicly available.