Leah Findlater

2020

Which Evaluations Uncover Sense Representations that Actually Make Sense?
Jordan Boyd-Graber | Fenfei Guo | Leah Findlater | Mohit Iyyer
Proceedings of the Twelfth Language Resources and Evaluation Conference

Text representations are critical for modern natural language processing. One form of text representation, sense-specific embeddings, reflects a word’s sense in a sentence better than single-prototype word embeddings tied to each type. However, existing sense representations are not uniformly better: although they work well for computer-centric evaluations, they fail for human-centric tasks like inspecting a language’s sense inventory. To expose this discrepancy, we propose a new coherence evaluation for sense embeddings. We also describe a minimal model (Gumbel Attention for Sense Induction) optimized for discovering interpretable sense representations that are more coherent than existing sense embeddings.

Interactive Refinement of Cross-Lingual Word Embeddings
Michelle Yuan | Mozhi Zhang | Benjamin Van Durme | Leah Findlater | Jordan Boyd-Graber
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Cross-lingual word embeddings transfer knowledge between languages: models trained on high-resource languages can predict in low-resource languages. We introduce CLIME, an interactive system to quickly refine cross-lingual word embeddings for a given classification problem. First, CLIME ranks words by their salience to the downstream task. Then, users mark similarity between keywords and their nearest neighbors in the embedding space. Finally, CLIME updates the embeddings using the annotations. We evaluate CLIME on identifying health-related text in four low-resource languages: Ilocano, Sinhalese, Tigrinya, and Uyghur. Embeddings refined by CLIME capture more nuanced word semantics and have higher test accuracy than the original embeddings. CLIME often improves accuracy faster than an active learning baseline and can be easily combined with active learning to improve results.

2019

Why Didn’t You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models
Varun Kumar | Alison Smith-Renner | Leah Findlater | Kevin Seppi | Jordan Boyd-Graber
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM modeling approaches using simulation experiments. These approaches extend previously proposed frameworks, including constraints and informed prior-based methods. Users should have a sense of control in HLTM systems, so we propose a control metric to measure whether refinement operations’ results match users’ expectations. Informed prior-based methods provide better control than constraints, but constraints yield higher quality topics.

2017

Evaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Topic Labels
Alison Smith | Tak Yeon Lee | Forough Poursabzi-Sangdeh | Jordan Boyd-Graber | Niklas Elmqvist | Leah Findlater
Transactions of the Association for Computational Linguistics, Volume 5

Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniques—word lists, word lists with bars, word clouds, and network graphs—against each other and against automatically generated labels. Our basis of comparison is participant ratings of how well labels describe documents from the topic. Our study has two phases: a labeling phase where participants label visualized topics and a validation phase where different participants select which labels best describe the topics’ documents. Although all visualizations produce similar quality labels, simple visualizations such as word lists allow participants to quickly understand topics, while complex visualizations take longer but expose multi-word expressions that simpler visualizations obscure. Automatic labels lag behind user-created labels, but our dataset of manually labeled topics highlights linguistic patterns (e.g., hypernyms, phrases) that can be used to improve automatic topic labeling algorithms.
