Abstract
Very large language models have been shown to translate with few-shot in-context examples. However, they have not achieved state-of-the-art results when translating out of English. In this work, we investigate an extremely lightweight fixed-parameter method for conditioning a large language model to better translate into the target language. Our method introduces additional embeddings, known as prefix embeddings, which do not interfere with the existing weights of the model. Using unsupervised and weakly semi-supervised methods that train only 0.0001% of the model parameters, this simple method improves ~0.2–1.3 BLEU points across 3 domains and 3 languages. We analyze the resulting embeddings' training dynamics and where they lie in the embedding space, and show that our trained embeddings can be used both for in-context translation and for diverse generation of the target sentence.
- Anthology ID:
- 2022.amta-research.4
- Volume:
- Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
- Month:
- September
- Year:
- 2022
- Address:
- Orlando, USA
- Venue:
- AMTA
- Publisher:
- Association for Machine Translation in the Americas
- Pages:
- 45–57
- URL:
- https://aclanthology.org/2022.amta-research.4
- Cite (ACL):
- Suzanna Sia and Kevin Duh. 2022. Prefix Embeddings for In-context Machine Translation. In Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pages 45–57, Orlando, USA. Association for Machine Translation in the Americas.
- Cite (Informal):
- Prefix Embeddings for In-context Machine Translation (Sia & Duh, AMTA 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.amta-research.4.pdf
- Data
- MTNT, The Pile
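The core idea in the abstract, prepending a small set of trainable prefix embeddings to the input while the pretrained model's weights stay frozen, can be sketched minimally as below. This is an illustrative NumPy stand-in, not the authors' implementation; all names, dimensions, and the random embedding table are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size, n_prefix = 16, 100, 4

# Frozen embedding table, standing in for the pretrained LM's
# token embeddings (these weights are never updated).
token_emb = rng.normal(size=(vocab_size, d_model))

# Trainable prefix embeddings -- the only new parameters.
# A tiny fraction of the model's total parameter count.
prefix_emb = rng.normal(scale=0.02, size=(n_prefix, d_model))

def embed_with_prefix(token_ids):
    """Prepend the learned prefix vectors to the token embeddings,
    yielding a sequence of length n_prefix + len(token_ids)."""
    return np.concatenate([prefix_emb, token_emb[token_ids]], axis=0)

x = embed_with_prefix(np.array([5, 7, 9]))
print(x.shape)  # (n_prefix + seq_len, d_model) = (7, 16)
```

During training, gradients would flow only into `prefix_emb`, which is why the method leaves the existing model weights untouched.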