Md Arafat Sultan

Also published as: Md. Arafat Sultan, Md. Sultan

2020

Transfer learning techniques are particularly useful for NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pretrained language model (LM) on in-domain text before fine-tuning on downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To even greater effect, we use structure in the unlabeled data to create auxiliary synthetic tasks, which help the LM transfer to downstream tasks. We apply these approaches incrementally on a pretrained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain: Extractive Reading Comprehension, Document Ranking and Duplicate Question Detection.

Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair. Previous work has looked at re-assessing the “answerability” of the question given the extracted answer. Here we address a different problem: the tendency of existing MRC systems to produce partially correct answers when presented with answerable questions. We explore the nature of such errors and propose a post-processing correction method that yields statistically significant performance improvements over state-of-the-art MRC systems in both monolingual and multilingual evaluation.

Abstract Meaning Representations (AMRs) are broad-coverage sentence-level semantic graphs. Existing approaches to generating text from AMR have focused on training sequence-to-sequence or graph-to-sequence models on AMR annotated data only. In this paper, we propose an alternative approach that combines a strong pre-trained language model with cycle consistency-based re-scoring. Despite the simplicity of the approach, our experimental results show these models outperform all previous techniques on the English LDC2017T10 dataset, including the recent use of transformer architectures. In addition to the standard evaluation metrics, we provide human evaluation experiments that further substantiate the strength of our approach.

On the Importance of Diversity in Question Generation for QA
Md Arafat Sultan | Shubham Chandel | Ramón Fernandez Astudillo | Vittorio Castelli
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Automatic question generation (QG) has shown promise as a source of synthetic training data for question answering (QA). In this paper we ask: Is textual diversity in QG beneficial for downstream QA? Using top-p nucleus sampling to derive samples from a transformer-based question generator, we show that diversity-promoting QG indeed provides better QA training than likelihood maximization approaches such as beam search. We also show that standard QG evaluation metrics such as BLEU, ROUGE and METEOR are inversely correlated with diversity, and propose a diversity-aware intrinsic measure of overall QG quality that correlates well with extrinsic evaluation on QA.
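
As a rough illustration of the top-p (nucleus) sampling mentioned in this abstract, the sketch below keeps the smallest set of highest-probability tokens whose cumulative probability reaches p, renormalizes, and samples from that set. It is a minimal standalone sketch, not the authors' implementation: the function names `top_p_filter` and `sample_top_p` and the toy distribution are assumptions made for illustration.

```python
import random


def top_p_filter(probs, p):
    """Return {token_index: renormalized_prob} for the top-p nucleus.

    `probs` is a list of next-token probabilities (hypothetical input);
    `p` is the cumulative probability threshold, with 0 < p <= 1.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus = []
    cumulative = 0.0
    for i in ranked:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # smallest prefix whose mass reaches p
    total = sum(probs[i] for i in nucleus)
    return {i: probs[i] / total for i in nucleus}


def sample_top_p(probs, p, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = top_p_filter(probs, p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution, not real model output
    print(top_p_filter(toy_probs, 0.7))
    print(sample_top_p(toy_probs, 0.7, random.Random(0)))
```

Lower values of p restrict sampling to fewer high-probability tokens, while values near 1 sample from almost the full distribution and therefore produce more varied output.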

2019

Cross-Task Knowledge Transfer for Query-Based Text Summarization
Elozino Egonmwan | Vittorio Castelli | Md Arafat Sultan
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization. Using an MRC model trained on the SQuAD1.1 dataset as a core system component, we first build an extractive query-based summarizer. For better precision, this summarizer also compresses the output of the MRC model using a novel sentence compression technique. We further leverage pre-trained machine translation systems to abstract our extracted summaries. Our models achieve state-of-the-art results on the publicly available CNN/Daily Mail and Debatepedia datasets, and can serve as simple yet powerful baselines for future systems. We also hope that these results will encourage research on transfer learning from large MRC corpora to query-based summarization.

2016

Bayesian Supervised Domain Adaptation for Short Text Similarity
Md Arafat Sultan | Jordan Boyd-Graber | Tamara Sumner
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Fast and Easy Short Answer Grading with High Accuracy
Md Arafat Sultan | Cristobal Salazar | Tamara Sumner
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

DLS@CU at SemEval-2016 Task 1: Supervised Models of Sentence Similarity
Md Arafat Sultan | Steven Bethard | Tamara Sumner
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

A Joint Model for Answer Sentence Ranking and Answer Extraction
Md Arafat Sultan | Vittorio Castelli | Radu Florian
Transactions of the Association for Computational Linguistics, Volume 4

Answer sentence ranking and answer extraction are two key challenges in question answering that have traditionally been treated in isolation, i.e., as independent tasks. In this article, we (1) explain how both tasks are related at their core by a common quantity, and (2) propose a simple and intuitive joint probabilistic model that addresses both via joint computation but task-specific application of that quantity. In our experiments with two TREC datasets, our joint model substantially outperforms state-of-the-art systems in both tasks.

2015

DLS@CU: Sentence Similarity from Word Alignment and Semantic Vector Composition
Md Arafat Sultan | Steven Bethard | Tamara Sumner
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Feature-Rich Two-Stage Logistic Regression for Monolingual Alignment
Md Arafat Sultan | Steven Bethard | Tamara Sumner
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

DLS@CU: Sentence Similarity from Word Alignment
Md Arafat Sultan | Steven Bethard | Tamara Sumner
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence
Md Arafat Sultan | Steven Bethard | Tamara Sumner
Transactions of the Association for Computational Linguistics, Volume 2

We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based on the hypothesis that words with similar meanings represent potential pairs for alignment if located in similar contexts, we propose a system that operates by finding such pairs. In two intrinsic evaluations on alignment test data, our system achieves F1 scores of 88–92%, demonstrating 1–3% absolute improvement over the previous best system. Moreover, in two extrinsic evaluations our aligner outperforms existing aligners, and even a naive application of the aligner approaches state-of-the-art performance in each extrinsic task.
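
For readers unfamiliar with how alignment F1 scores like those quoted in this abstract are typically computed, the sketch below scores a predicted word alignment against a gold alignment, with each alignment represented as a set of (source index, target index) pairs. This is a generic sketch under that assumption, not the paper's evaluation script; the function name `alignment_prf` and the example pairs are hypothetical.

```python
def alignment_prf(predicted, gold):
    """Return (precision, recall, F1) for a predicted word alignment.

    Both arguments are iterables of (source_index, target_index) pairs;
    this pair representation is an assumption made for illustration.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # pairs present in both alignments
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy alignments, not data from the paper.
    predicted_pairs = {(0, 0), (1, 1), (2, 3)}
    gold_pairs = {(0, 0), (1, 1), (2, 2), (3, 3)}
    print(alignment_prf(predicted_pairs, gold_pairs))
```

F1 is the harmonic mean of precision and recall, so an aligner must both avoid spurious pairs and recover most gold pairs to score well.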