SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering

MinhQuang Pham, Josep Crego, Jean Senellart


Abstract
This paper describes SYSTRAN's participation in the shared task on parallel corpus filtering at the Third Conference on Machine Translation (WMT 2018). We participate for the first time, using a neural sentence similarity classifier that aims to predict the relatedness of sentence pairs in a multilingual context. The paper describes the main characteristics of our approach and discusses the results obtained on the data sets published for the shared task.
Anthology ID:
W18-6485
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
934–938
URL:
https://aclanthology.org/W18-6485
DOI:
10.18653/v1/W18-6485
Cite (ACL):
MinhQuang Pham, Josep Crego, and Jean Senellart. 2018. SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 934–938, Belgium, Brussels. Association for Computational Linguistics.
Cite (Informal):
SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering (Pham et al., WMT 2018)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/W18-6485.pdf