Abstract
Machine Translation traditionally treats documents as sets of independent sentences. In many genres, however, documents are highly structured, and their structure contains information that can be used to improve translation quality. We present a preliminary approach to document translation that uses structural features to modify the behaviour of a language model, at sentence-level granularity. To our knowledge, this is the first attempt to incorporate structural information into statistical MT. In experiments on structured English/French documents from the Hansard corpus, we demonstrate small but statistically significant improvements.
- Anthology ID: 2010.amta-papers.24
- Volume: Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
- Month: October 31-November 4
- Year: 2010
- Address: Denver, Colorado, USA
- Venue: AMTA
- Publisher: Association for Machine Translation in the Americas
- URL: https://aclanthology.org/2010.amta-papers.24
- Cite (ACL): George Foster, Pierre Isabelle, and Roland Kuhn. 2010. Translating Structured Documents. In Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, Denver, Colorado, USA. Association for Machine Translation in the Americas.
- Cite (Informal): Translating Structured Documents (Foster et al., AMTA 2010)
- PDF: https://preview.aclanthology.org/emnlp22-frontmatter/2010.amta-papers.24.pdf