2022
Unsupervised Generation of Long-form Technical Questions from Textbook Metadata using Structured Templates
Indrajit Bhattacharya | Subhasish Ghosh | Arpita Kundu | Pratik Saini | Tapas Nayak
Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning
We explore the task of generating long-form technical questions from textbooks. The semi-structured metadata of a textbook, namely the table of contents (ToC) and the index, provide rich cues for technical question generation. Existing literature on long-form question generation focuses mostly on reading comprehension assessment and does not use such semi-structured metadata. We design unsupervised template-based algorithms for generating questions based on structural and contextual patterns in the index and ToC. We evaluate our approach on textbooks on diverse subjects and show that it generates high-quality questions of diverse types. In comparison, zero-shot question generation using pre-trained LLMs on the same metadata has much poorer quality.
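As a rough illustration of the idea, the sketch below instantiates question templates from index terms and ToC sections; the templates, the input structures and the term-pairing rule are assumptions made for this example, not the paper's actual patterns.

```python
# Hypothetical sketch: instantiate question templates from textbook metadata.
# The templates and the index/ToC structures below are illustrative assumptions.

TEMPLATES = [
    "What is {term}?",
    "Explain the role of {term} in the context of {section}.",
    "Compare {term} and {other} as discussed under {section}.",
]

def generate_questions(index_terms, toc_sections):
    """index_terms: dict mapping an index term to the ToC section it is indexed under.
    toc_sections: list of chapter/section titles from the table of contents."""
    questions = []
    for term, section in index_terms.items():
        questions.append(TEMPLATES[0].format(term=term))
        if section in toc_sections:
            questions.append(TEMPLATES[1].format(term=term, section=section))
    # Pair terms that share a section to build comparison-style questions.
    by_section = {}
    for term, section in index_terms.items():
        by_section.setdefault(section, []).append(term)
    for section, terms in by_section.items():
        for a, b in zip(terms, terms[1:]):
            questions.append(TEMPLATES[2].format(term=a, other=b, section=section))
    return questions

print(generate_questions(
    {"backpropagation": "Training Neural Networks",
     "gradient descent": "Training Neural Networks"},
    ["Training Neural Networks"]))
```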
Weakly Supervised Context-based Interview Question Generation
Samiran Pal | Kaamraan Khan | Avinash Kumar Singh | Subhasish Ghosh | Tapas Nayak | Girish Palshikar | Indrajit Bhattacharya
Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM)
We explore the task of automated generation of technical interview questions from a given textbook. Such questions differ from the reading comprehension questions studied in the question generation literature. We curate a context-based interview question dataset for Machine Learning and Deep Learning from two popular textbooks. We first explore the possibility of using a large generative language model (GPT-3) for this task in a zero-shot setting. We then evaluate the performance of smaller generative models such as BART fine-tuned on weakly supervised data obtained using GPT-3 and hand-crafted templates. We also deploy an automatic question importance assignment technique to assess the suitability of a question for a technical interview, which improves the evaluation results along many dimensions. Finally, we dissect the performance of these models on this task and scrutinize the suitability of the questions they generate for use in technical interviews.
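A minimal sketch of the fine-tuning step, assuming weakly labelled (context, question) pairs have already been produced by GPT-3 or hand-crafted templates; the checkpoint, toy data and hyperparameters are illustrative, not the paper's settings.

```python
# Illustrative sketch only: fine-tune BART on weakly supervised (context, question) pairs.
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Weak supervision: pairs produced by templates or a large LM, not human annotators.
weak_pairs = [
    ("Gradient descent iteratively updates parameters in the direction of the negative gradient.",
     "How does gradient descent update model parameters?"),
]

model.train()
for context, question in weak_pairs:
    inputs = tokenizer(context, return_tensors="pt", truncation=True)
    labels = tokenizer(question, return_tensors="pt", truncation=True).input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```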
A Weak Supervision Approach for Predicting Difficulty of Technical Interview Questions
Arpita Kundu | Subhasish Ghosh | Pratik Saini | Tapas Nayak | Indrajit Bhattacharya
Proceedings of the 29th International Conference on Computational Linguistics
Predicting the difficulty of questions is crucial for technical interviews. However, such questions are long-form and more open-ended than the factoid and multiple-choice questions explored so far for question difficulty prediction. Existing models also require large volumes of candidate response data for training. We study weak supervision and use unsupervised algorithms for both question generation and difficulty prediction. We create a dataset of deep learning interview questions with difficulty scores and use it to evaluate state-of-the-art models for question difficulty prediction trained with weak supervision. Our analysis brings out the difficulty of the task as well as the promise of weak supervision for it.
2021
Joint Learning of Representations for Web-tables, Entities and Types using Graph Convolutional Network
Aniket Pramanick | Indrajit Bhattacharya
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Existing approaches for annotating tables with entities and types either capture the structure of a table using graphical models or learn embeddings of table entries without accounting for the complete syntactic structure. We propose TabGCN, which uses Graph Convolutional Networks to capture the complete structure of the tables, the knowledge graph and the training annotations, and jointly learns embeddings for table elements as well as for entities and types. To account for knowledge incompleteness, TabGCN's embeddings can be used to discover new entities and types. Through experiments on 5 benchmark datasets, we show that TabGCN significantly outperforms multiple state-of-the-art baselines for table annotation, while showing promising performance on downstream table-related applications.
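The kind of graph-convolution update involved can be sketched as follows; the node set (table cells plus candidate entities), the adjacency construction and the dimensions are toy assumptions, not TabGCN's actual architecture.

```python
# Illustrative GCN layer: H' = ReLU(D^-1 A H W), applied over a graph whose nodes
# are table cells, KG entities and types (the node set here is an assumption).
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adjacency, features):
        # Add self-loops and row-normalise the adjacency matrix.
        adjacency = adjacency + torch.eye(adjacency.size(0))
        degree = adjacency.sum(dim=1, keepdim=True)
        return torch.relu((adjacency / degree) @ self.weight(features))

# Toy graph: 3 table cells + 2 candidate entities, 16-dimensional initial features.
adj = torch.zeros(5, 5)
adj[0, 3] = adj[3, 0] = 1.0   # cell 0 linked to candidate entity 3
feats = torch.randn(5, 16)
embeddings = GCNLayer(16, 8)(adj, feats)
```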
Complex Question Answering on knowledge graphs using machine translation and multi-task learning
Saurabh Srivastava | Mayur Patidar | Sudip Chowdhury | Puneet Agarwal | Indrajit Bhattacharya | Gautam Shroff
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Question answering (QA) over a knowledge graph (KG) is the task of answering a natural language (NL) query using the information stored in the KG. In a real-world industrial setting, this involves addressing multiple challenges, including entity linking and multi-hop reasoning over the KG. Traditional approaches handle these challenges in a modularized, sequential manner, where errors in one module accumulate in downstream modules. These challenges are often inter-related, and their solutions can reinforce each other when handled simultaneously in an end-to-end learning setup. To this end, we propose a multi-task, BERT-based Neural Machine Translation (NMT) model to address them. Through experimental analysis, we demonstrate the efficacy of our proposed approach on one publicly available and one proprietary dataset.
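A toy sketch of the shared-encoder, multi-task idea follows. The paper's model is BERT-based; the GRU encoder-decoder, dimensions and loss weighting below are simplifying assumptions used only to illustrate joint training of query generation and entity tagging.

```python
# Illustrative multi-task sketch: one shared encoder, two heads: a decoder that
# "translates" the NL question into a KG query, and a tagger that marks entity mentions.
import torch
import torch.nn as nn

class MultiTaskKGQA(nn.Module):
    def __init__(self, vocab_size=1000, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)   # shared encoder
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)   # query-generation head
        self.generator = nn.Linear(hidden, vocab_size)
        self.entity_tagger = nn.Linear(hidden, 2)                 # entity-mention head

    def forward(self, question_ids, query_ids):
        enc_out, state = self.encoder(self.embed(question_ids))
        dec_out, _ = self.decoder(self.embed(query_ids), state)
        return self.generator(dec_out), self.entity_tagger(enc_out)

model = MultiTaskKGQA()
q = torch.randint(0, 1000, (1, 8))        # tokenised NL question (toy ids)
kg_query = torch.randint(0, 1000, (1, 10))  # tokenised target KG query (toy ids)
tags = torch.zeros(1, 8, dtype=torch.long)  # gold entity-mention tags (toy)

gen_logits, tag_logits = model(q, kg_query)
loss = (nn.CrossEntropyLoss()(gen_logits.transpose(1, 2), kg_query)
        + 0.5 * nn.CrossEntropyLoss()(tag_logits.transpose(1, 2), tags))
```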
Generating An Optimal Interview Question Plan Using A Knowledge Graph And Integer Linear Programming
Soham Datta | Prabir Mallick | Sangameshwar Patil | Indrajit Bhattacharya | Girish Palshikar
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Given the diversity of candidates, the complexity of job requirements, and the inherently subjective nature of interviewing, it is important to ensure consistent, uniform, efficient and objective interviews that result in high-quality recruitment. We propose an interview assistant system that automatically, and in an objective manner, selects an optimal set of technical questions (from question banks) personalized for a candidate. This set can help a human interviewer plan an upcoming interview of that candidate. We formalize the selection of a set of questions as an integer linear programming problem and use standard solvers to obtain a solution. We use a knowledge graph as background knowledge in this formulation and derive our objective functions and constraints from it. We use the candidate's resume to personalize the selection of questions. We propose an intrinsic evaluation to compare a set of suggested questions with the questions actually asked. We also use expert interviewers to comparatively evaluate our approach against a set of reasonable baselines.
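A hedged sketch of such an ILP using the open-source PuLP solver; the relevance scores, topic-coverage constraint and question budget are stand-ins for the objective and constraints the paper derives from its knowledge graph and the candidate's resume.

```python
# Illustrative question-selection ILP; the scores and constraints are assumptions.
import pulp

questions = ["q1", "q2", "q3", "q4"]
relevance = {"q1": 0.9, "q2": 0.4, "q3": 0.7, "q4": 0.6}   # match to candidate's resume
topic = {"q1": "ML", "q2": "ML", "q3": "DL", "q4": "DL"}
budget = 2  # number of questions to ask

prob = pulp.LpProblem("interview_plan", pulp.LpMaximize)
pick = pulp.LpVariable.dicts("pick", questions, cat="Binary")

prob += pulp.lpSum(relevance[q] * pick[q] for q in questions)   # objective: total relevance
prob += pulp.lpSum(pick[q] for q in questions) == budget        # interview length
for t in set(topic.values()):                                   # cover every topic at least once
    prob += pulp.lpSum(pick[q] for q in questions if topic[q] == t) >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([q for q in questions if pick[q].value() == 1])
```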
2020
Discovering Knowledge Graph Schema from Short Natural Language Text via Dialog
Subhasis Ghosh | Arpita Kundu | Aniket Pramanick | Indrajit Bhattacharya
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue
We study the problem of schema discovery for knowledge graphs. We propose a solution where an agent engages in multi-turn dialog with an expert for this purpose. Each mini-dialog focuses on a short natural language statement, and looks to elicit the expert’s desired schema-based interpretation of that statement, taking into account possible augmentations to the schema. The overall schema evolves by performing dialog over a collection of such statements. We take into account the probability that the expert does not respond to a query, and model this probability as a function of the complexity of the query. For such mini-dialogs with response uncertainty, we propose a dialog strategy that looks to elicit the schema over as short a dialog as possible. By combining the notion of uncertainty sampling from active learning with generalized binary search, the strategy asks the query with the highest expected reduction of entropy. We show that this significantly reduces dialog complexity while engaging the expert in meaningful dialog.
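The query-selection criterion can be sketched roughly as follows, with a toy hypothesis space over schema interpretations and an assumed response-probability model; none of these specifics come from the paper.

```python
# Illustrative sketch of entropy-based query selection over candidate schema
# interpretations. The hypothesis set, answer model and response-probability
# function are toy assumptions, not the paper's exact formulation.
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_entropy_reduction(hypotheses, query, p_response):
    """hypotheses: dict {interpretation: probability}.
    query(h) -> the answer the expert would give if h were the true interpretation."""
    before = entropy(hypotheses.values())
    # Group hypotheses by the answer they induce; each answer prunes to one group.
    answers = {}
    for h, p in hypotheses.items():
        answers.setdefault(query(h), []).append(p)
    after = 0.0
    for probs in answers.values():
        mass = sum(probs)
        after += mass * entropy([p / mass for p in probs])
    # With probability (1 - p_response) the expert stays silent and nothing is learned.
    return p_response * (before - after)

# Toy example: three candidate interpretations of "X acquired Y".
hyps = {"acquired(Org,Org)": 0.5, "acquired(Person,Org)": 0.3, "acquired(Org,Person)": 0.2}
ask_subject_type = lambda h: h.split("(")[1].split(",")[0]   # query the subject's type
complexity = 1                                               # a simple query
p_resp = math.exp(-0.2 * complexity)   # response probability decays with query complexity
print(expected_entropy_reduction(hyps, ask_subject_type, p_resp))
```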
2017
Stance Classification of Context-Dependent Claims
Roy Bar-Haim | Indrajit Bhattacharya | Francesco Dinuzzo | Amrita Saha | Noam Slonim
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Recent work has addressed the problem of detecting relevant claims for a given controversial topic. We introduce the complementary task of Claim Stance Classification, along with the first benchmark dataset for this task. We decompose this problem into: (a) open-domain target identification for the topic and the claim, (b) sentiment classification for each target, and (c) open-domain contrast detection between the topic and claim targets. Manual annotation of the dataset confirms the applicability and validity of our model. We describe an implementation of our model, focusing on a novel algorithm for contrast detection. Our approach achieves promising results and is shown to outperform several baselines, which represent the common practice of applying a single, monolithic classifier for stance classification.
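A minimal sketch of how the three sub-decisions might compose into a stance label, treating sentiments as +1/-1 and target contrast as a sign flip; this simplified composition is an assumption for illustration, not the paper's exact model.

```python
# Illustrative composition of the three sub-decisions into a stance label.
def claim_stance(claim_target_sentiment: int,
                 topic_target_sentiment: int,
                 targets_contrast: bool) -> str:
    """Each sentiment is +1 (positive) or -1 (negative) toward its own target;
    targets_contrast says whether the claim target contrasts with the topic target."""
    relation = -1 if targets_contrast else 1
    score = claim_target_sentiment * topic_target_sentiment * relation
    return "PRO" if score > 0 else "CON"

# Topic: "We should ban violent video games" (target: violent video games, sentiment -1).
# Claim: "Violent games improve reaction time" (target sentiment +1, consistent target),
# so the claim opposes the topic.
print(claim_stance(+1, -1, targets_contrast=False))   # -> "CON"
```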
2004
The University of Maryland Senseval-3 system descriptions
Clara Cabezas | Indrajit Bhattacharya | Philip Resnik
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text
Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models
Indrajit Bhattacharya | Lise Getoor | Yoshua Bengio
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)