Language use changes over time, and this impacts the effectiveness of NLP systems. This phenomenon is even more pronounced in social media data during crisis events, where the meaning and frequency of word usage may change over the course of days. Contextual language models fail to adapt temporally, emphasizing the need for temporal adaptation in models that must be deployed over extended periods of time. While existing approaches consider data spanning long periods (years to decades), shorter time spans are critical for crisis data. We quantify temporal degradation in this scenario and propose methods to cope with the performance loss by leveraging techniques from domain adaptation. To the best of our knowledge, this is the first effort to explore the effects of rapid language change driven by adversarial adaptations, particularly during natural and human-induced disasters. Through extensive experimentation on diverse crisis datasets, we analyze under what conditions our approaches outperform strong baselines, and highlight the current limitations of temporal adaptation methods in scenarios where access to unlabeled data is scarce.
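As a minimal sketch of temporal adaptation in this setting, the example below continues masked-language-model pretraining of an off-the-shelf encoder on unlabeled in-period crisis text before fine-tuning on labeled data from an earlier period. The model name, file path, and hyperparameters are hypothetical; the abstract does not specify the paper's actual adaptation method.

```python
# Hypothetical sketch: continued masked-language-model pretraining on unlabeled
# in-period crisis text as a form of temporal domain adaptation. Paths and
# hyperparameters are illustrative, not the paper's actual configuration.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Unlabeled text drawn from the target time window (e.g. the first days of an event).
raw = load_dataset("text", data_files={"train": "crisis_week1_unlabeled.txt"})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)
args = TrainingArguments(output_dir="temporal-mlm", num_train_epochs=1,
                         per_device_train_batch_size=32)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
# The temporally adapted encoder is then fine-tuned on labeled data
# from the earlier period for the downstream crisis task.
```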
Existing approaches for annotating tables with entities and types either capture the structure of a table using graphical models, or learn embeddings of table entries without accounting for the complete syntactic structure. We propose TabGCN, which uses Graph Convolutional Networks to capture the complete structure of the tables, the knowledge graph, and the training annotations, and jointly learns embeddings for table elements as well as for entities and types. To account for knowledge incompleteness, TabGCN’s embeddings can also be used to discover new entities and types. Through experiments on five benchmark datasets, we show that TabGCN significantly outperforms multiple state-of-the-art baselines for table annotation, while showing promising performance on downstream table-related applications.
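As a rough illustration of the graph-convolutional message passing that TabGCN builds on (not its actual architecture, which the abstract does not detail), a single GCN layer over a toy table graph might look like this:

```python
# Illustrative single GCN layer over a toy table graph; node layout and
# dimensions are hypothetical, not TabGCN's actual design.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: each node aggregates its neighbors' features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # adj: dense (N, N) adjacency with self-loops; row-normalize, then propagate.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return torch.relu(self.linear((adj / deg) @ x))

# Toy graph: 3 cell nodes plus 1 column-header node linked to each cell.
x = torch.randn(4, 16)           # initial node features
adj = torch.eye(4)               # self-loops
adj[3, :3] = adj[:3, 3] = 1      # header <-> cell edges
emb = GCNLayer(16, 8)(x, adj)    # (4, 8) jointly learned node embeddings
```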
We study the problem of schema discovery for knowledge graphs. We propose a solution in which an agent engages in multi-turn dialog with an expert for this purpose. Each mini-dialog focuses on a short natural language statement and seeks to elicit the expert’s desired schema-based interpretation of that statement, taking into account possible augmentations to the schema. The overall schema evolves through dialog over a collection of such statements. We account for the probability that the expert does not respond to a query, modeling it as a function of the query’s complexity. For such mini-dialogs with response uncertainty, we propose a dialog strategy that aims to elicit the schema in as short a dialog as possible. Combining the notion of uncertainty sampling from active learning with generalized binary search, the strategy asks the query with the highest expected reduction in entropy. We show that this significantly reduces dialog complexity while engaging the expert in meaningful dialog.
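The greedy query-selection step can be sketched as below; the hypothesis and query representations are illustrative, and `p_respond` stands in for the paper's complexity-dependent non-response model:

```python
# Illustrative sketch of entropy-based query selection with response
# uncertainty; representations and p_respond are hypothetical.
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_entropy_after(prior, query, p_respond):
    """Expected posterior entropy over schema hypotheses if we ask `query`.

    prior: dict hypothesis -> probability.
    query: function hypothesis -> bool (the answer the expert would give
    if that hypothesis were the desired schema)."""
    yes = {h: p for h, p in prior.items() if query(h)}
    no = {h: p for h, p in prior.items() if not query(h)}
    p_yes, p_no = sum(yes.values()), sum(no.values())
    h_yes = entropy([p / p_yes for p in yes.values()]) if p_yes else 0.0
    h_no = entropy([p / p_no for p in no.values()]) if p_no else 0.0
    h_none = entropy(list(prior.values()))  # no response: belief unchanged
    return (p_respond * (p_yes * h_yes + p_no * h_no)
            + (1 - p_respond) * h_none)

def pick_query(prior, queries, p_respond):
    # Greedy generalized-binary-search step: choose the query that
    # minimizes expected posterior entropy, i.e. maximizes expected
    # entropy reduction, accounting for possible non-response.
    return min(queries,
               key=lambda q: expected_entropy_after(prior, q, p_respond(q)))
```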