TransBank: Metadata as the Missing Link between NLP and Traditional Translation Studies

Michael Ustaszewski, Andy Stauder


Abstract
Despite the growing importance of data in translation, no data repository meets the requirements of both the translation industry and academia. We therefore plan to develop a freely available, multilingual and expandable bank of translations and their source texts, aligned at the sentence level. Special emphasis will be placed on labelling metadata that precisely describe the relations between translated texts and their originals. This metadata-centric approach lets users compile and download custom corpora on demand. Such a general-purpose data repository may help to bridge the gap between translation theory and the language industry, including translation technology providers and NLP.
Anthology ID:
W17-7904
Volume:
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Editors:
Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
Venue:
RANLP
Publisher:
Association for Computational Linguistics, Shoumen, Bulgaria
Pages:
29–35
URL:
https://doi.org/10.26615/978-954-452-042-7_004
DOI:
10.26615/978-954-452-042-7_004
Cite (ACL):
Michael Ustaszewski and Andy Stauder. 2017. TransBank: Metadata as the Missing Link between NLP and Traditional Translation Studies. In Proceedings of the Workshop Human-Informed Translation and Interpreting Technology, pages 29–35, Varna, Bulgaria. Association for Computational Linguistics, Shoumen, Bulgaria.
Cite (Informal):
TransBank: Metadata as the Missing Link between NLP and Traditional Translation Studies (Ustaszewski & Stauder, RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-042-7_004