Karthik Visweswariah


2014

When Transliteration Met Crowdsourcing : An Empirical Study of Transliteration via Crowdsourcing using Efficient, Non-redundant and Fair Quality Control
Mitesh M. Khapra | Ananthakrishnan Ramanathan | Anoop Kunchukuttan | Karthik Visweswariah | Pushpak Bhattacharyya
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Sufficient parallel transliteration pairs are needed for training state-of-the-art transliteration engines. Given the cost involved, it is often infeasible to collect such data using experts. Crowdsourcing could be a cheaper alternative, provided that a good quality control (QC) mechanism can be devised for this task. Most QC mechanisms employed in crowdsourcing are aggressive (unfair to workers) and expensive (unfair to requesters). In contrast, we propose a low-cost QC mechanism which is fair to both workers and requesters. At the heart of our approach lies a rule-based Transliteration Equivalence approach, which takes as input a list of vowels in the two languages and a mapping of the consonants in the two languages. We empirically show that our approach outperforms other popular QC mechanisms (viz., consensus and sampling) on two vital parameters: (i) fairness to requesters (lower cost per correct transliteration) and (ii) fairness to workers (lower rate of rejecting correct answers). Further, as an extrinsic evaluation, we use the standard NEWS 2010 test set and show that such quality-controlled crowdsourced data compares well to expert data when used for training a transliteration engine.

Unsupervised Solution Post Identification from Discussion Forums
Deepak P | Karthik Visweswariah
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation
Karthik Visweswariah | Mitesh M. Khapra | Ananthakrishnan Ramanathan
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improving reordering performance using higher order and structural features
Mitesh M. Khapra | Ananthakrishnan Ramanathan | Karthik Visweswariah
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Semi-Supervised Answer Extraction from Discussion Forums
Rose Catherine | Rashmi Gangadharaiah | Karthik Visweswariah | Dinesh Raghu
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

Proceedings of the Workshop on Reordering for Statistical Machine Translation
Karthik Visweswariah | Ananthakrishnan Ramanathan | Mitesh M. Khapra
Proceedings of the Workshop on Reordering for Statistical Machine Translation

Whitepaper for Shared Task on Learning Reordering from Word Alignments at RSMT 2012
Mitesh M. Khapra | Ananthakrishnan Ramanathan | Karthik Visweswariah
Proceedings of the Workshop on Reordering for Statistical Machine Translation

Report of the Shared Task on Learning Reordering from Word Alignments at RSMT 2012
Mitesh M. Khapra | Ananthakrishnan Ramanathan | Karthik Visweswariah
Proceedings of the Workshop on Reordering for Statistical Machine Translation

A Comparison of Syntactic Reordering Methods for English-German Machine Translation
Jiří Navrátil | Karthik Visweswariah | Ananthakrishnan Ramanathan
Proceedings of COLING 2012

Does Similarity Matter? The Case of Answer Extraction from Technical Discussion Forums
Rose Catherine | Amit Singh | Rashmi Gangadharaiah | Dinesh Raghu | Karthik Visweswariah
Proceedings of COLING 2012: Posters

A Study of Word-Classing for MT Reordering
Ananthakrishnan Ramanathan | Karthik Visweswariah
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

MT systems typically use parsers to help reorder constituents. However, most languages do not have adequate treebank data to learn good parsers, and such training data is extremely time-consuming to annotate. Our earlier work has shown that a reordering model learned from word alignments using part-of-speech (POS) tags as features can improve MT performance (Visweswariah et al., 2011). In this paper, we investigate the effect of word-classing on reordering performance using this model. We show that unsupervised word clusters perform somewhat worse than a POS tagger built with a small amount of annotated data, but still reasonably well, while a richer tag set including case and gender-number-person further improves reordering performance by around 1.2 monolingual BLEU points. Although annotating this richer tag set is more complicated than annotating the base tag set, it is much easier than annotating treebank data.

2011

Handling verb phrase morphology in highly inflected Indian languages for Machine Translation
Ankur Gandhe | Rashmi Gangadharaiah | Karthik Visweswariah | Ananthakrishnan Ramanathan
Proceedings of 5th International Joint Conference on Natural Language Processing

Clause-Based Reordering Constraints to Improve Statistical Machine Translation
Ananthakrishnan Ramanathan | Pushpak Bhattacharyya | Karthik Visweswariah | Kushal Ladha | Ankur Gandhe
Proceedings of 5th International Joint Conference on Natural Language Processing

A Word Reordering Model for Improved Machine Translation
Karthik Visweswariah | Rajakrishnan Rajkumar | Ankur Gandhe | Ananthakrishnan Ramanathan | Jiri Navratil
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation
Karthik Visweswariah | Jiri Navratil | Jeffrey Sorensen | Vijil Chenthamarakshan | Nandakishore Kambhatla
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Urdu and Hindi: Translation and sharing of linguistic resources
Karthik Visweswariah | Vijil Chenthamarakshan | Nandakishore Kambhatla
Coling 2010: Posters