Linguistic knowledge and complexity in an EBMT system based on translation patterns

Kevin McTait


Abstract
An approach to Example-Based Machine Translation is presented which operates by extracting translation patterns from a bilingual corpus aligned at the level of the sentence. This is carried out using a language-neutral recursive machine-learning algorithm based on the principle of similar distributions of strings. The translation patterns extracted represent generalisations of sentences that are translations of each other and, to some extent, resemble transfer rules but with fewer constraints. The strings and variables of which translation patterns are composed are aligned in order to provide a more refined bilingual knowledge source, necessary for the recombination phase. A non-structural approach based on surface forms is error-prone and liable to produce translation patterns that are false translations. Such errors are highlighted and solutions are proposed by the addition of external linguistic resources, namely morphological analysis and part-of-speech tagging. The amount of linguistic resources added has consequences for computational complexity and portability.
Anthology ID:
2001.mtsummit-ebmt.3
Volume:
Workshop on Example-Based Machine Translation
Month:
September 18-22
Year:
2001
Address:
Santiago de Compostela, Spain
Venue:
MTSummit
URL:
https://aclanthology.org/2001.mtsummit-ebmt.3
Cite (ACL):
Kevin McTait. 2001. Linguistic knowledge and complexity in an EBMT system based on translation patterns. In Workshop on Example-Based Machine Translation, Santiago de Compostela, Spain.
Cite (Informal):
Linguistic knowledge and complexity in an EBMT system based on translation patterns (McTait, MTSummit 2001)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2001.mtsummit-ebmt.3.pdf