Abstract
Despite the growing importance of data in translation, there is no data repository that meets the requirements of both the translation industry and academia. Therefore, we plan to develop a freely available, multilingual and expandable bank of translations and their source texts aligned at the sentence level. Special emphasis will be placed on the labelling of metadata that precisely describe the relations between translated texts and their originals. This metadata-centric approach gives users the opportunity to compile and download custom corpora on demand. Such a general-purpose data repository may help to bridge the gap between translation theory and the language industry, including translation technology providers and NLP.
- Anthology ID:
- W17-7904
- Volume:
- Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
- Venue:
- RANLP
- Publisher:
- Association for Computational Linguistics, Shoumen, Bulgaria
- Pages:
- 29–35
- URL:
- https://doi.org/10.26615/978-954-452-042-7_004
- DOI:
- 10.26615/978-954-452-042-7_004
- Cite (ACL):
- Michael Ustaszewski and Andy Stauder. 2017. TransBank: Metadata as the Missing Link between NLP and Traditional Translation Studies. In Proceedings of the Workshop Human-Informed Translation and Interpreting Technology, pages 29–35, Varna, Bulgaria. Association for Computational Linguistics, Shoumen, Bulgaria.
- Cite (Informal):
- TransBank: Metadata as the Missing Link between NLP and Traditional Translation Studies (Ustaszewski & Stauder, RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-042-7_004