Investigating Techniques for a Deeper Understanding of Neural Machine Translation (NMT) Systems through Data Filtering and Fine-tuning Strategies
Lichao Zhu, Maria Zimina, Maud Bénard, Behnoosh Namdar, Nicolas Ballier, Guillaume Wisniewski, Jean-Baptiste Yunès
Abstract
In the context of this biomedical shared task, we have implemented data filters to enhance the selection of relevant training data for fine-tuning from the available training data sources. Specifically, we have employed textometric analysis to detect repetitive segments within the test set, which we have then used to refine the training data used to fine-tune the mBART-50 baseline model. Through this approach, we aim to achieve several objectives: developing a practical fine-tuning strategy for training biomedical in-domain fr<>en models, defining criteria for filtering in-domain training data, and comparing model predictions and fine-tuning data against the test set to gain deeper insight into the functioning of Neural Machine Translation (NMT) systems.
- Anthology ID:
- 2023.wmt-1.28
- Volume:
- Proceedings of the Eighth Conference on Machine Translation
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 275–281
- URL:
- https://aclanthology.org/2023.wmt-1.28
- DOI:
- 10.18653/v1/2023.wmt-1.28
- Cite (ACL):
- Lichao Zhu, Maria Zimina, Maud Bénard, Behnoosh Namdar, Nicolas Ballier, Guillaume Wisniewski, and Jean-Baptiste Yunès. 2023. Investigating Techniques for a Deeper Understanding of Neural Machine Translation (NMT) Systems through Data Filtering and Fine-tuning Strategies. In Proceedings of the Eighth Conference on Machine Translation, pages 275–281, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Investigating Techniques for a Deeper Understanding of Neural Machine Translation (NMT) Systems through Data Filtering and Fine-tuning Strategies (Zhu et al., WMT 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.wmt-1.28.pdf
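The abstract's filtering idea (detect segments that repeat in the test set, then keep only training pairs that contain them) can be sketched as follows. This is a minimal illustration under assumed choices, not the authors' implementation: the paper's textometric analysis is approximated here by a simple word n-gram count, and the function names, the trigram window, and the repetition threshold are all hypothetical.

```python
from collections import Counter

def repeated_segments(sentences, n=3, min_count=2):
    """Collect word n-grams that occur at least min_count times
    across the given sentences (a crude stand-in for textometric
    repeated-segment detection)."""
    counts = Counter()
    for sent in sentences:
        tokens = sent.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {seg for seg, c in counts.items() if c >= min_count}

def filter_training_pairs(pairs, segments):
    """Keep (source, target) training pairs whose source side
    contains at least one of the repeated segments."""
    return [(src, tgt) for src, tgt in pairs
            if any(seg in src.lower() for seg in segments)]
```

In this sketch, segments mined from the test set act as a relevance filter over the parallel training data, so that only in-domain pairs overlapping the test material are retained for fine-tuning.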