Also published as: Christopher Re


Metadata Shaping: A Simple Approach for Knowledge-Enhanced Language Models
Simran Arora | Sen Wu | Enci Liu | Christopher Re
Findings of the Association for Computational Linguistics: ACL 2022

Popular language models (LMs) struggle to capture knowledge about rare tail facts and entities. Since widely used systems such as search and personal-assistants must support the long tail of entities that users ask about, there has been significant effort towards enhancing these base LMs with factual knowledge. We observe proposed methods typically start with a base LM and data that has been annotated with entity metadata, then change the model, by modifying the architecture or introducing auxiliary loss terms to better capture entity knowledge. In this work, we question this typical process and ask to what extent can we match the quality of model modifications, with a simple alternative: using a base LM and only changing the data. We propose metadata shaping, a method which inserts substrings corresponding to the readily available entity metadata, e.g. types and descriptions, into examples at train and inference time based on mutual information. Despite its simplicity, metadata shaping is quite effective. On standard evaluation benchmarks for knowledge-enhanced LMs, the method exceeds the base-LM baseline by an average of 4.3 F1 points and achieves state-of-the-art results. We further show the gains are on average 4.4x larger for the slice of examples containing tail vs. popular entities.

TABi: Type-Aware Bi-Encoders for Open-Domain Entity Retrieval
Megan Leszczynski | Daniel Fu | Mayee Chen | Christopher Re
Findings of the Association for Computational Linguistics: ACL 2022

Entity retrieval—retrieving information about entity mentions in a query—is a key step in open-domain tasks, such as question answering or fact checking. However, state-of-the-art entity retrievers struggle to retrieve rare entities for ambiguous mentions due to biases towards popular entities. Incorporating knowledge graph types during training could help overcome popularity biases, but there are several challenges: (1) existing type-based retrieval methods require mention boundaries as input, but open-domain tasks run on unstructured text, (2) type-based methods should not compromise overall performance, and (3) type-based methods should be robust to noisy and missing types. In this work, we introduce TABi, a method to jointly train bi-encoders on knowledge graph types and unstructured text for entity retrieval for open-domain tasks. TABi leverages a type-enforced contrastive loss to encourage entities and queries of similar types to be close in the embedding space. TABi improves retrieval of rare entities on the Ambiguous Entity Retrieval (AmbER) sets, while maintaining strong overall retrieval performance on open-domain tasks in the KILT benchmark compared to state-of-the-art retrievers. TABi is also robust to incomplete type systems, improving rare entity retrieval over baselines with only 5% type coverage of the training dataset. We make our code publicly available.


Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text
Maya Varma | Laurel Orr | Sen Wu | Megan Leszczynski | Xiao Ling | Christopher Ré
Findings of the Association for Computational Linguistics: EMNLP 2021

Named entity disambiguation (NED), which involves mapping textual mentions to structured entities, is particularly challenging in the medical domain due to the presence of rare entities. Existing approaches are limited by the presence of coarse-grained structural resources in biomedical knowledge bases as well as the use of training datasets that provide low coverage over uncommon resources. In this work, we address these issues by proposing a cross-domain data integration method that transfers structural knowledge from a general text knowledge base to the medical domain. We utilize our integration scheme to augment structural resources and generate a large biomedical NED dataset for pretraining. Our pretrained model with injected structural knowledge achieves state-of-the-art performance on two benchmark medical NED datasets: MedMentions and BC5CDR. Furthermore, we improve disambiguation of rare entities by up to 57 accuracy points.

Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel | Nazneen Fatema Rajani | Jesse Vig | Zachary Taschdjian | Mohit Bansal | Christopher Ré
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations

Despite impressive performance on standard benchmarks, natural language processing (NLP) models are often brittle when deployed in real-world systems. In this work, we identify challenges with evaluating NLP systems and propose a solution in the form of Robustness Gym (RG), a simple and extensible evaluation toolkit that unifies 4 standard evaluation paradigms: subpopulations, transformations, evaluation sets, and adversarial attacks. By providing a common platform for evaluation, RG enables practitioners to compare results from disparate evaluation paradigms with a single click, and to easily develop and share novel evaluation methods using a built-in set of abstractions. RG is under active development and we welcome feedback & contributions from the community.

Goodwill Hunting: Analyzing and Repurposing Off-the-Shelf Named Entity Linking Systems
Karan Goel | Laurel Orr | Nazneen Fatema Rajani | Jesse Vig | Christopher Ré
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Named entity linking (NEL) or mapping “strings” to “things” in a knowledge base is a fundamental preprocessing step in systems that require knowledge of entities such as information extraction and question answering. In this work, we lay out and investigate two challenges faced by individuals or organizations building NEL systems. Can they directly use an off-the-shelf system? If not, how easily can such a system be repurposed for their use case? First, we conduct a study of off-the-shelf commercial and academic NEL systems. We find that most systems struggle to link rare entities, with commercial solutions lagging their academic counterparts by 10%+. Second, for a use case where the NEL model is used in a sports question-answering (QA) system, we investigate how to close the loop in our analysis by repurposing the best off-the-shelf model (Bootleg) to correct sport-related errors. We show how tailoring a simple technique for patching models using weak labeling can provide a 25% absolute improvement in accuracy of sport-related errors.


Contextual Embeddings: When Are They Worth It?
Simran Arora | Avner May | Jian Zhang | Christopher Ré
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We study the settings for which deep contextual embeddings (e.g., BERT) give large improvements in performance relative to classic pretrained embeddings (e.g., GloVe), and an even simpler baseline—random word embeddings—focusing on the impact of the training set size and the linguistic properties of the task. Surprisingly, we find that both of these simpler baselines can match contextual embeddings on industry-scale data, and often perform within 5 to 10% accuracy (absolute) on benchmark tasks. Furthermore, we identify properties of data for which contextual embeddings give particularly large gains: language containing complex structure, ambiguous word usage, and words unseen in training.

Low-Dimensional Hyperbolic Knowledge Graph Embeddings
Ines Chami | Adva Wolf | Da-Cheng Juan | Frederic Sala | Sujith Ravi | Christopher Ré
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Knowledge graph (KG) embeddings learn low- dimensional representations of entities and relations to predict missing facts. KGs often exhibit hierarchical and logical patterns which must be preserved in the embedding space. For hierarchical data, hyperbolic embedding methods have shown promise for high-fidelity and parsimonious representations. However, existing hyperbolic embedding methods do not account for the rich logical patterns in KGs. In this work, we introduce a class of hyperbolic KG embedding models that simultaneously capture hierarchical and logical patterns. Our approach combines hyperbolic reflections and rotations with attention to model complex relational patterns. Experimental results on standard KG benchmarks show that our method improves over previous Euclidean- and hyperbolic-based efforts by up to 6.1% in mean reciprocal rank (MRR) in low dimensions. Furthermore, we observe that different geometric transformations capture different types of relations while attention- based transformations generalize to multiple relations. In high dimensions, our approach yields new state-of-the-art MRRs of 49.6% on WN18RR and 57.7% on YAGO3-10.


Training Classifiers with Natural Language Explanations
Braden Hancock | Paroma Varma | Stephanie Wang | Martin Bringmann | Percy Liang | Christopher Ré
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores from 5-100 faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.


Understanding Tables in Context Using Standard NLP Toolkits
Vidhya Govindaraju | Ce Zhang | Christopher Ré
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Big Data versus the Crowd: Looking for Relationships in All the Right Places
Ce Zhang | Feng Niu | Christopher Ré | Jude Shavlik
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)