Neha Mourya


2016

pdf
Dealing with Linguistic Divergences in English-Bhojpuri Machine Translation
Pitambar Behera | Neha Mourya | Vandana Pandey
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)

In Machine Translation, divergence is one of the major barriers which plays a deciding role in determining the efficiency of the system at hand. Translation divergences originate when there is structural discrepancies between the input and the output languages. It can be of various types based on the issues we are addressing to such as linguistic, cultural, communicative and so on. Owing to the fact that two languages owe their origin to different language families, linguistic divergences emerge. The present study attempts at categorizing different types of linguistic divergences: the lexical-semantic and syntactic. In addition, it also helps identify and resolve the divergent linguistic features between English as source language and Bhojpuri as target language pair. Dorr’s theoretical framework (1994, 1994a) has been followed in the classification and resolution procedure. Furthermore, so far as the methodology is concerned, we have adhered to the Dorr’s Lexical Conceptual Structure for the resolution of divergences. This research will prove to be beneficial for developing efficient MT systems if the mentioned factors are incorporated considering the inherent structural constraints between source and target languages.ated considering the inherent structural constraints between SL and TL pairs.