Jeremiah Milbauer


2023

pdf
LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction
Jeremiah Milbauer | Annie Louis | Mohammad Javad Hosseini | Alex Fabrikant | Donald Metzler | Tal Schuster
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segments is highly beneficial for many tasks, we hypothesize that this interaction can be delayed until later encoding stages. To this end, we introduce Layer-Adjustable Interactions in Transformers (LAIT). Within LAIT, segmented inputs are first encoded independently, and then jointly. This partial two-tower architecture bridges the gap between a Dual Encoder’s ability to pre-compute representations for segments and a fully self-attentive Transformer’s capacity to model cross-segment attention. The LAIT framework effectively leverages existing pretrained Transformers and converts them into the hybrid of the two aforementioned architectures, allowing for easy and intuitive control over the performance-efficiency tradeoff. Experimenting on a wide range of NLP tasks, we find LAIT able to reduce 30-50% of the attention FLOPs on many tasks, while preserving high accuracy; in some practical settings, LAIT could reduce actual latency by orders of magnitude.

pdf
NewsSense: Reference-free Verification via Cross-document Comparison
Jeremiah Milbauer | Ziqi Ding | Zhijin Wu | Tongshuang Wu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present NewsSense, a novel sensemaking tool and reading interface designed to collect and integrate information from multiple news articles on a central topic. NewsSense provides “reference-free verification,” augmenting a central grounding article of the user’s choice by: (1) linking to related articles from different sources; and (2) providing inline highlights on how specific claims are either supported or contradicted by information from other articles. Using NewsSense, users can seamlessly digest and cross-check multiple information sources without disturbing their natural reading flow. Our pilot study shows that NewsSense has the potential to help users identify key information, verify the credibility of news articles, explore different perspectives, and understand what content is supported, contradicted, or missing.

2021

pdf
Aligning Multidimensional Worldviews and Discovering Ideological Differences
Jeremiah Milbauer | Adarsh Mathew | James Evans
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

The Internet is home to thousands of communities, each with their own unique worldview and associated ideological differences. With new communities constantly emerging and serving as ideological birthplaces, battlegrounds, and bunkers, it is critical to develop a framework for understanding worldviews and ideological distinction. Most existing work, however, takes a predetermined view based on political polarization: the “right vs. left” dichotomy of U.S. politics. In reality, both political polarization – and worldviews more broadly – transcend one-dimensional difference, and deserve a more complete analysis. Extending the ability of word embedding models to capture the semantic and cultural characteristics of their training corpora, we propose a novel method for discovering the multifaceted ideological and worldview characteristics of communities. Using over 1B comments collected from the largest communities on Reddit.com representing ~40% of Reddit activity, we demonstrate the efficacy of this approach to uncover complex ideological differences across multiple axes of polarization.