Combining Multi-Domain Statistical Machine Translation Models using Automatic Classifiers
Pratyush Banerjee, Jinhua Du, Baoli Li, Sudip Naskar, Andy Way, Josef van Genabith
Abstract
This paper presents a set of experiments on domain adaptation of statistical machine translation (SMT) systems. The experiments focus on Chinese-English translation and two domain-specific corpora. We present a novel approach for combining multiple domain-trained translation models to improve translation quality for both domain-specific and combined sets of sentences. We train a statistical classifier to assign each sentence to the appropriate domain and use the corresponding domain-specific MT model to translate it. Experimental results show that, on the combined-domain test set, the method achieves a statistically significant absolute improvement of 1.58 BLEU points (2.86% relative) over a translation model trained on the combined data, and considerable improvements over a model using multiple decoding paths of the Moses decoder. Furthermore, even on domain-specific test sets, our approach works almost as well as dedicated domain-specific models and perfect classification.
- Anthology ID:
- 2010.amta-papers.16
- Volume:
- Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
- Month:
- October 31-November 4
- Year:
- 2010
- Address:
- Denver, Colorado, USA
- Venue:
- AMTA
- Publisher:
- Association for Machine Translation in the Americas
- URL:
- https://aclanthology.org/2010.amta-papers.16
- Cite (ACL):
- Pratyush Banerjee, Jinhua Du, Baoli Li, Sudip Naskar, Andy Way, and Josef van Genabith. 2010. Combining Multi-Domain Statistical Machine Translation Models using Automatic Classifiers. In Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, Denver, Colorado, USA. Association for Machine Translation in the Americas.
- Cite (Informal):
- Combining Multi-Domain Statistical Machine Translation Models using Automatic Classifiers (Banerjee et al., AMTA 2010)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2010.amta-papers.16.pdf
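The classify-then-translate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which classifier or features were used, so a small Naive Bayes classifier over whitespace tokens stands in here. The domain labels `domain_a` and `domain_b`, the toy training sentences, and the two translator functions are all hypothetical placeholders; in a real setup each translator would wrap a domain-specific MT system, and the input would be segmented source-language text.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesDomainClassifier:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing.

    Assumption: a stand-in for the unspecified statistical classifier.
    """

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # domain -> token frequencies
        self.sentence_counts = Counter()          # domain -> number of sentences
        self.vocabulary = set()

    def train(self, labelled_sentences):
        for sentence, domain in labelled_sentences:
            tokens = sentence.lower().split()
            self.token_counts[domain].update(tokens)
            self.sentence_counts[domain] += 1
            self.vocabulary.update(tokens)

    def classify(self, sentence):
        total_sentences = sum(self.sentence_counts.values())
        vocab_size = len(self.vocabulary)
        best_domain, best_score = None, -math.inf
        for domain in sorted(self.sentence_counts):
            counts = self.token_counts[domain]
            denominator = sum(counts.values()) + vocab_size
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.sentence_counts[domain] / total_sentences)
            for token in sentence.lower().split():
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_domain, best_score = domain, score
        return best_domain


def translate_routed(sentences, classifier, translators):
    """Send each sentence to the translator of its predicted domain."""
    return [translators[classifier.classify(s)](s) for s in sentences]


# Hypothetical toy data; real training data would come from the domain corpora.
classifier = NaiveBayesDomainClassifier()
classifier.train([
    ("reset your password in the settings menu", "domain_a"),
    ("install the software update", "domain_a"),
    ("the patient was given a dose", "domain_b"),
    ("take the medicine after meals", "domain_b"),
])

# Placeholder translators that only tag their input with the chosen domain.
translators = {
    "domain_a": lambda s: "[A] " + s,
    "domain_b": lambda s: "[B] " + s,
}

outputs = translate_routed(
    ["update the software settings", "the patient should take the medicine"],
    classifier,
    translators,
)
```

With this arrangement a mixed-domain test set needs no manual domain labels at translation time, since the routing decision is made per sentence by the classifier.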