Abstract
Some conventional word alignment methods employ statistical models or statistical measures, which require large-scale sentence-aligned bilingual training corpora. Others employ dictionaries to guide alignment selection. However, these methods achieve unsatisfactory results when aligning words in a small-scale domain-specific bilingual corpus without terminological lexicons. This paper proposes an approach to improve word alignment in a specific domain for which only a small-scale domain-specific corpus is available, by adapting word alignment information from the general domain to the specific domain. The approach first trains two statistical word alignment models, one on the large-scale general-domain corpus and one on the small-scale domain-specific corpus, and then uses both models to improve the domain-specific word alignment. Experimental results show a significant improvement in both alignment precision and recall, with a relative error rate reduction of 21.96% compared with state-of-the-art technologies.
- Anthology ID:
- 2004.amta-papers.29
- Volume:
- Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
- Month:
- September 28 - October 2
- Year:
- 2004
- Address:
- Washington, USA
- Editors:
- Robert E. Frederking, Kathryn B. Taylor
- Venue:
- AMTA
- Publisher:
- Springer
- Pages:
- 262–271
- URL:
- https://link.springer.com/chapter/10.1007/978-3-540-30194-3_29
- Cite (ACL):
- Hua Wu and Haifeng Wang. 2004. Improving domain-specific word alignment with a general bilingual corpus. In Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 262–271, Washington, USA. Springer.
- Cite (Informal):
- Improving domain-specific word alignment with a general bilingual corpus (Wu & Wang, AMTA 2004)
- PDF:
- https://link.springer.com/chapter/10.1007/978-3-540-30194-3_29