Avneesh Saluja


2021

Hierarchical Encoders for Modeling and Interpreting Screenplays
Gayatri Bhat | Avneesh Saluja | Melody Dye | Jan Florjanczyk
Proceedings of the Third Workshop on Narrative Understanding

While natural language understanding of long-form documents remains an open challenge, such documents often contain structural information that can inform the design of models encoding them. Movie scripts are an example of such richly structured text – scripts are segmented into scenes, which decompose into dialogue and descriptive components. In this work, we propose a neural architecture to encode this structure, which performs robustly on two multi-label tag classification tasks without using handcrafted features. We add a layer of insight by augmenting the encoder with an unsupervised ‘interpretability’ module, which can be used to extract and visualize narrative trajectories. Though this work specifically tackles screenplays, we discuss how the underlying approach can be generalized to a range of structured documents.

2018

Using Aspect Extraction Approaches to Generate Review Summaries and User Profiles
Christopher Mitcheltree | Skyler Wharton | Avneesh Saluja
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

Reviews of products or services on Internet marketplace websites contain a rich amount of information. Users often wish to survey reviews or review snippets from the perspective of a certain aspect, which has resulted in a large body of work on aspect identification and extraction from such corpora. In this work, we evaluate a newly-proposed neural model for aspect extraction on two practical tasks. The first is to extract canonical sentences of various aspects from reviews, and is judged by human evaluators against alternatives. A k-means baseline does remarkably well in this setting. The second experiment focuses on the suitability of the recovered aspect distributions to represent users by the reviews they have written. Through a set of review reranking experiments, we find that aspect-based profiles can largely capture notions of user preferences, by showing that divergent users generate markedly different review rankings.

2014

Graph-based Semi-Supervised Learning of Translation Models from Monolingual Data
Avneesh Saluja | Hany Hassan | Kristina Toutanova | Chris Quirk
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Language Modeling with Power Low Rank Ensembles
Ankur P. Parikh | Avneesh Saluja | Chris Dyer | Eric Xing
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Latent-Variable Synchronous CFGs for Hierarchical Translation
Avneesh Saluja | Chris Dyer | Shay B. Cohen
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Graph-Based Unsupervised Learning of Word Similarities Using Heterogeneous Feature Types
Avneesh Saluja | Jiří Navrátil
Proceedings of TextGraphs-8 Graph-based Methods for Natural Language Processing

2012

Machine Translation with Binary Feedback: a Large-Margin Approach
Avneesh Saluja | Ian Lane | Ying Zhang
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

Viewing machine translation as a structured classification problem has provided a gateway for a host of structured prediction techniques to enter the field. In particular, large-margin structured prediction methods for discriminative training of feature weights, such as the structured perceptron or MIRA, have started to match or exceed the performance of existing methods such as MERT. One issue with structured problems in general is the difficulty in obtaining fully structured labels, e.g., in machine translation, obtaining reference translations or parallel sentence corpora for arbitrary language pairs. Another issue, more specific to the translation domain, is the difficulty in online training of machine translation systems, since existing methods often require bilingual knowledge to correct translation output online. We propose a solution to these two problems, by demonstrating a way to incorporate binary-labeled feedback (i.e., feedback on whether a translation hypothesis is a “good” or understandable one or not), a form of supervision that can be easily integrated in an online manner, into a machine translation framework. Experimental results show marked improvement by incorporating binary feedback on unseen test data, with gains exceeding 5.5 BLEU points.

2011

Context-aware Language Modeling for Conversational Speech Translation
Avneesh Saluja | Ian Lane | Ying Zhang
Proceedings of Machine Translation Summit XIII: Papers