Priyanka Agrawal


2024

pdf
𝜇PLAN: Summarizing using a Content Plan as Cross-Lingual Bridge
Fantine Huot | Joshua Maynez | Chris Alberti | Reinald Kim Amplayo | Priyanka Agrawal | Constanza Fierro | Shashi Narayan | Mirella Lapata
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Cross-lingual summarization aims to generate a summary in one languagegiven input in a different language, allowing for the dissemination ofrelevant content among different language speaking populations. Thetask is challenging mainly due to the paucity of cross-lingualdatasets and the compounded difficulty of summarizing andtranslating.This work presents 𝜇PLAN, an approach to cross-lingual summarization that uses an intermediate planning step as a cross-lingual bridge. We formulate the plan as a sequence of entities capturing thesummary’s content and the order in which it should becommunicated. Importantly, our plans abstract from surface form: usinga multilingual knowledge base, we align entities to their canonicaldesignation across languages and generate the summary conditioned onthis cross-lingual bridge and the input. Automatic and human evaluation on the XWikis dataset (across four language pairs) demonstrates that our planning objective achieves state-of-the-art performance interms of informativeness and faithfulness. Moreover, 𝜇PLAN modelsimprove the zero-shot transfer to new cross-lingual language pairscompared to baselines without a planning component.

2023

pdf
Benchmarking Large Language Model Capabilities for Conditional Generation
Joshua Maynez | Priyanka Agrawal | Sebastian Gehrmann
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Pre-trained large language models (PLMs) underly most new developments in natural language processing. They have shifted the field from application-specific model pipelines to a single model that is adapted to a wide range of tasks. Autoregressive PLMs like GPT-3 or PaLM and associated techniques like fewshot learning, have additionally shifted the output modality to generation instead of classification or regression. Despite their ubiquitous use, the generation quality of language models is rarely evaluated when these models are introduced. Additionally, it is unclear how existing generation tasks–while they can be used to compare systems at a high level–relate to the real world use cases for which people have been adopting them. In this work, we discuss how to adapt existing application-specific generation benchmarks to PLMs and provide an in-depth, empirical study of the limitations and capabilities of PLMs in natural language generation tasks along dimensions such as scale, architecture, input and output language. Our results show that PLMs differ in their applicability to different data regimes and their generalization to multiple languages. They further inform practitioners as to which PLMs to use for a given generation task setup. We share best practices to be taken into consideration when benchmarking generation capabilities during the development of upcoming PLMs.

2022

pdf bib
Proceedings of the Sixth Workshop on Structured Prediction for NLP
Andreas Vlachos | Priyanka Agrawal | André Martins | Gerasimos Lampouras | Chunchuan Lyu
Proceedings of the Sixth Workshop on Structured Prediction for NLP

2021

pdf bib
Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021)
Zornitsa Kozareva | Sujith Ravi | Andreas Vlachos | Priyanka Agrawal | André Martins
Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021)

2020

pdf bib
Proceedings of the Fourth Workshop on Structured Prediction for NLP
Priyanka Agrawal | Zornitsa Kozareva | Julia Kreutzer | Gerasimos Lampouras | André Martins | Sujith Ravi | Andreas Vlachos
Proceedings of the Fourth Workshop on Structured Prediction for NLP

2019

pdf
Unified Semantic Parsing with Weak Supervision
Priyanka Agrawal | Ayushi Dalmia | Parag Jain | Abhishek Bansal | Ashish Mittal | Karthik Sankaranarayanan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Semantic parsing over multiple knowledge bases enables a parser to exploit structural similarities of programs across the multiple domains. However, the fundamental challenge lies in obtaining high-quality annotations of (utterance, program) pairs across various domains needed for training such models. To overcome this, we propose a novel framework to build a unified multi-domain enabled semantic parser trained only with weak supervision (denotations). Weakly supervised training is particularly arduous as the program search space grows exponentially in a multi-domain setting. To solve this, we incorporate a multi-policy distillation mechanism in which we first train domain-specific semantic parsers (teachers) using weak supervision in the absence of the ground truth programs, followed by training a single unified parser (student) from the domain specific policies obtained from these teachers. The resultant semantic parser is not only compact but also generalizes better, and generates more accurate programs. It further does not require the user to provide a domain label while querying. On the standard Overnight dataset (containing multiple domains), we demonstrate that the proposed model improves performance by 20% in terms of denotation accuracy in comparison to baseline techniques.

2012

pdf bib
Proceedings of the Workshop on Question Answering for Complex Domains
Nanda Kambhatla | Sachindra Joshi | Ganesh Ramakrishnan | Kiran Kate | Priyanka Agrawal
Proceedings of the Workshop on Question Answering for Complex Domains