This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
GeetanjaliRakshit
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
Figurative language is commonplace in natural language, and while making communication memorable and creative, can be difficult to understand. In this work, we investigate the robustness of Question Answering (QA) models on figurative text. Yes/no questions, in particular, are a useful probe of figurative language understanding capabilities of large language models. We propose FigurativeQA, a set of 1000 yes/no questions with figurative and non-figurative contexts, extracted from the domains of restaurant and product reviews. We show that state-of-the-art BERT-based QA models exhibit an average performance drop of up to 15% points when answering questions from figurative contexts, as compared to non-figurative ones. While models like GPT-3 and ChatGPT are better at handling figurative texts, we show that further performance gains can be achieved by automatically simplifying the figurative contexts into their non-figurative (literal) counterparts. We find that the best overall model is ChatGPT with chain-of-thought prompting to generate non-figurative contexts. Our work provides a promising direction for building more robust QA models with figurative language understanding capabilities.
Figurative language is widespread in human language (Lakoff and Johnson, 2008) posing potential challenges in NLP applications. In this paper, we investigate the effect of figurative language on the task of question answering (QA). We construct FigQA, a test set of 400 yes-no questions with figurative and non-figurative contexts, extracted from product reviews and restaurant reviews. We demonstrate that a state-of-the-art RoBERTa QA model has considerably lower performance in question answering when the contexts are figurative rather than literal, indicating a gap in current models. We propose a general method for improving the performance of QA models by converting the figurative contexts into non-figurative by prompting GPT-3, and demonstrate its effectiveness. Our results indicate a need for building QA models infused with figurative language understanding capabilities.
Prepositional Phrase (PP) attachment is a classical problem in NLP for languages like English, which suffer from structural ambiguity. In this work, we solve this problem with the help of another language free from such ambiguities, using the parse tree of the parallel sentence in the other language, and word alignments. We formulate an optimization framework that encourages agreement between the parse trees for two languages, and solve it using a novel Dual Decomposition (DD) based algorithm. Experiments on the English-Hindi language pair show promising improvements over the baseline.