Exploring Transfer Learning and Domain Data Selection for the Biomedical Translation
Noor-e- Hira, Sadaf Abdul Rauf, Kiran Kiani, Ammara Zafar, Raheel Nawaz
Abstract
Transfer learning and selective data training are two of the many approaches being extensively investigated to improve the quality of Neural Machine Translation systems. This paper presents a series of experiments applying transfer learning and selective data training for participation in the biomedical shared task of WMT19. We used Information Retrieval to select related sentences from out-of-domain data and used them as additional training data through transfer learning. We also report the effect of tokenization on translation model performance.
- Anthology ID: W19-5419
- Volume: Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
- Month: August
- Year: 2019
- Address: Florence, Italy
- Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 156–163
- URL: https://aclanthology.org/W19-5419
- DOI: 10.18653/v1/W19-5419
- Cite (ACL): Noor-e- Hira, Sadaf Abdul Rauf, Kiran Kiani, Ammara Zafar, and Raheel Nawaz. 2019. Exploring Transfer Learning and Domain Data Selection for the Biomedical Translation. In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), pages 156–163, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal): Exploring Transfer Learning and Domain Data Selection for the Biomedical Translation (Hira et al., WMT 2019)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/W19-5419.pdf