Construction of a Multilingual Corpus Annotated with Translation Relations

Yuming Zhai, Aurélien Max, Anne Vilnat


Abstract
Translation relations, which distinguish literal translation from other translation techniques, are an important subject of study for human translators (Chuquet and Paillard, 1989). However, automatic processing techniques based on interlingual relations, such as machine translation or paraphrase generation exploiting translational equivalence, have so far not used these relations explicitly. In this work, we present a categorisation of translation relations and annotate them in a parallel multilingual (English, French, Chinese) corpus of oral presentations, the TED Talks. Our long-term objective is to detect these relations automatically and to use them as important features when searching for monolingual segments that stand in a relation of equivalence (paraphrases) or entailment. The annotated corpus resulting from our work will be made available to the community.
Anthology ID:
W18-3814
Volume:
Proceedings of the First Workshop on Linguistic Resources for Natural Language Processing
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Peter Machonis, Anabela Barreiro, Kristina Kocijan, Max Silberztein
Venue:
LR4NLP
Publisher:
Association for Computational Linguistics
Pages:
102–111
URL:
https://aclanthology.org/W18-3814
Cite (ACL):
Yuming Zhai, Aurélien Max, and Anne Vilnat. 2018. Construction of a Multilingual Corpus Annotated with Translation Relations. In Proceedings of the First Workshop on Linguistic Resources for Natural Language Processing, pages 102–111, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Construction of a Multilingual Corpus Annotated with Translation Relations (Zhai et al., LR4NLP 2018)
PDF:
https://preview.aclanthology.org/landing_page/W18-3814.pdf