Amit Sangodkar


2014

A Domain-Restricted, Rule Based, English-Hindi Machine Translation System Based on Dependency Parsing
Pratik Desai | Amit Sangodkar | Om P. Damani
Proceedings of the 11th International Conference on Natural Language Processing

2012

Re-ordering Source Sentences for SMT
Amit Sangodkar | Om Damani
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We propose a pre-processing stage for Statistical Machine Translation (SMT) systems in which the words of the source sentence are re-ordered according to the syntax of the target language before the alignment process, so that the alignment found by the statistical system improves. We take a dependency parse of the source sentence and linearize it according to the syntax of the target language before it is used in either the training or the decoding phase. During this linearization, ordering decisions among dependency nodes that share a common parent are made on the basis of two aspects: parent-child positioning and relation priority. To make the linearization process rule-driven, we assume that the relative word order of a dependency relation's relata depends neither on the semantic properties of the relata nor on the rest of the expression. We also assume that the relative word order of various relations sharing a relatum does not depend on the rest of the expression. We experiment with a publicly available English-Hindi parallel corpus and show that our scheme improves the BLEU score.
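The linearization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the relation labels, and both rule tables (`CHILD_BEFORE_PARENT` for parent-child positioning, `RELATION_PRIORITY` for ordering siblings) are hypothetical values chosen only to show how an English subject-verb-object sentence could be re-ordered towards a verb-final order such as Hindi's.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    relation: str = "root"          # dependency relation to the parent
    children: list = field(default_factory=list)

# Hypothetical rule tables, for illustration only (not taken from the paper).
# Parent-child positioning: does a child with this relation precede its parent?
CHILD_BEFORE_PARENT = {"nsubj": True, "dobj": True, "det": True, "amod": True}
# Relation priority: a lower value is placed earlier among siblings on the same side.
RELATION_PRIORITY = {"nsubj": 0, "dobj": 1, "det": 0, "amod": 1}

def linearize(node):
    """Return the words of the subtree rooted at `node` in target-language order."""
    def priority(child):
        # Relations without a rule sort last; sorted() is stable, so ties keep source order.
        return RELATION_PRIORITY.get(child.relation, len(RELATION_PRIORITY))

    before = [c for c in node.children if CHILD_BEFORE_PARENT.get(c.relation, False)]
    after = [c for c in node.children if not CHILD_BEFORE_PARENT.get(c.relation, False)]

    words = []
    for child in sorted(before, key=priority):
        words.extend(linearize(child))
    words.append(node.word)
    for child in sorted(after, key=priority):
        words.extend(linearize(child))
    return words

# Dependency tree for "Ram eats the apple".
sentence = Node("eats", children=[
    Node("Ram", "nsubj"),
    Node("apple", "dobj", children=[Node("the", "det")]),
])

print(" ".join(linearize(sentence)))  # prints "Ram the apple eats"
```

Because each decision looks only at a relation label and never at the words themselves or the rest of the sentence, the sketch reflects the two independence assumptions stated in the abstract. A real system would need rule tables covering the full relation inventory of the parser in use.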