Sumit Agarwal


2022

Model Transfer for Event tracking as Transcript Understanding for Videos of Small Group Interaction
Sumit Agarwal | Rosanna Vitiello | Carolyn Rosé
Proceedings of the First Workshop On Transcript Understanding

Videos of group interactions contain a wealth of information beyond what is directly communicated in a transcript of the discussion. Tracking who has participated throughout an extended interaction, and what each of their trajectories has been in relation to one another, is the foundation for joint activity understanding, though it comes with some unique challenges in videos of tightly coupled group work. Motivated by insights into the properties of such scenarios, including group composition and the properties of task-oriented, goal-directed tasks, we present a successful proof-of-concept. In particular, we present a transfer experiment to a dyadic robot construction task, an ablation study, and a qualitative analysis.

R3 : Refined Retriever-Reader pipeline for Multidoc2dial
Srijan Bansal | Suraj Tripathi | Sumit Agarwal | Sireesh Gururaja | Aditya Srikanth Veerubhotla | Ritam Dutt | Teruko Mitamura | Eric Nyberg
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering

In this paper, we present our submission to the DialDoc shared task based on the MultiDoc2Dial dataset. MultiDoc2Dial is a conversational question answering dataset that grounds dialogues in multiple documents. The task involves grounding a user’s query in a document and then generating an appropriate response. We propose several improvements over the baseline’s retriever-reader architecture to aid in modeling goal-oriented dialogues grounded in multiple documents. Our proposed approach employs sparse representations for passage retrieval, a passage re-ranker, the fusion-in-decoder architecture for generation, and a curriculum learning training paradigm. Our approach shows a 12-point improvement in BLEU score compared to the baseline RAG model.

Zero-shot cross-lingual open domain question answering
Sumit Agarwal | Suraj Tripathi | Teruko Mitamura | Carolyn Penstein Rose
Proceedings of the Workshop on Multilingual Information Access (MIA)

People who speak different languages search for information in a cross-lingual manner. They tend to ask questions in their own language and expect the answer to be in the same language, even when the evidence lies in another language. In this paper, we present our approach to this task of cross-lingual open-domain question answering. Our proposed method employs a passage reranker, the fusion-in-decoder technique for generation, and a Wikidata entity-based post-processing system to tackle the inability to generate entities across all languages. Our end-to-end pipeline shows an improvement of 3 and 4.6 points on the F1 and EM metrics, respectively, when compared with the baseline CORA model on the XOR-TyDi dataset. We also evaluate the effectiveness of our proposed techniques in the zero-shot setting using the MKQA dataset and show an improvement of 5 points in F1 for high-resource and 3 points for low-resource zero-shot languages. The submission from our team, CMUmQA, to the MIA shared task ranked 1st in the constrained setup for the dev setting and 2nd in the test setting.