Saptarashmi Bandyopadhyay


2024

You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions
Tasnim Kabir | Yoo Yeon Sung | Saptarashmi Bandyopadhyay | Hao Zou | Abhranil Chandra | Jordan Lee Boyd-Graber
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Training question-answering (QA) and information retrieval systems for web queries requires large, expensive datasets that are difficult to annotate and time-consuming to gather. Moreover, while natural datasets of information-seeking questions are often ambiguous or ill-formed, there are troves of freely available, carefully crafted question datasets for many languages. Thus, we automatically generate shorter, information-seeking questions that resemble web queries in the style of the Natural Questions (NQ) dataset from longer trivia data. Training a QA system on these transformed questions is a viable alternative to more expensive training setups: contrasting the final systems shows an F1 score difference of less than six points.

2021

The University of Maryland, College Park Submission to Large-Scale Multilingual Shared Task at WMT 2021
Saptarashmi Bandyopadhyay | Tasnim Kabir | Zizhen Lian | Marine Carpuat
Proceedings of the Sixth Conference on Machine Translation

This paper describes the system submitted to the Large-Scale Multilingual Shared Task (Small Task #2) at WMT 2021. It is based on the massively multilingual open-source FLORES101_MM100 model, with selective fine-tuning. Our best-performing system achieved an average BLEU score of 15.72 on the task.

2020

UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian
Sourav Dutta | Jesujoba Alabi | Saptarashmi Bandyopadhyay | Dana Ruiter | Josef van Genabith
Proceedings of the Fifth Conference on Machine Translation

This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference on Machine Translation (WMT20). We submit systems for both the supervised and unsupervised tracks. Apart from various experimental approaches such as bitext mining, model pre-training, and iterative back-translation, we employ a factored machine translation approach on a small BPE vocabulary.

Natural Language Response Generation from SQL with Generalization and Back-translation
Saptarashmi Bandyopadhyay | Tianyang Zhao
Proceedings of the First Workshop on Interactive and Executable Semantic Parsing

Generating natural language responses to queries in a structured language like SQL is very challenging, as it requires generalization to new domains and the ability to answer ambiguous queries, among other issues. We participated in the CoSQL shared task organized at the IntEx-SemPar workshop at EMNLP 2020. We trained a number of Neural Machine Translation (NMT) models to efficiently generate natural language responses from SQL. Our shuffled back-translation model achieved a BLEU score of 7.47 on the unknown test dataset. In this paper, we discuss our methodologies for approaching the problem and future directions for improving the quality of the generated natural language responses.

2019

Factored Neural Machine Translation at LoResMT 2019
Saptarashmi Bandyopadhyay
Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages

2016

Content selection as semantic-based ontology exploration
Laura Perez-Beltrachini | Claire Gardent | Anselme Revuz | Saptarashmi Bandyopadhyay
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)