Using Sublexical Translations to Handle the OOV Problem in MT
Chung-chi Huang | Ho-ching Yen | Shih-ting Huang | Jason Chang
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
We introduce a method for learning to translate out-of-vocabulary (OOV) words. The method focuses on combining sublexical/constituent translations of an OOV to generate its translation candidates. In our approach, wild-card searches are formulated based on our OOV analysis, aimed at maximizing the probability of retrieving OOVs’ sublexical translations from existing resource of machine translation (MT) systems. At run-time, translation candidates of the unknown words are generated from their suitable sublexical translations and ranked based on monolingual and bilingual information. We have incorporated the OOV model into a state-of-the-art MT system and experimental results show that our model indeed helps to ease the negative impact of OOVs on translation quality, especially for sentences containing more OOVs (significant improvement).