Clara Seyfried


Fixing paper assignments

  1. Please select all papers that do not belong to this person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Automating Alternative Generation in Decision-Making
Yevhen Kostiuk | Clara Seyfried | Chris Reed
Findings of the Association for Computational Linguistics: EMNLP 2025

In decision making, generating alternative solutions is crucial for solving a problem. However, cognitive biases can impede this process by constraining individual decision makers’ creativity. To address this issue, we introduce a new task for automatically generating alternatives, inspired by the process of human “brainstorming”. We define alternative options based on atomic action components and present a dataset of 106 annotated Reddit r/Advice posts containing unique alternative options extracted from users’ replies. We also introduce new metrics to assess the quality of generated components, including distinctiveness, creativity, upvote-weighted, crowd intersection, and final commit intersection scores. As a baseline, we evaluated the large language models (LLMs) LLaMa3:8b, LLaMa3.1:8b, and Gemma 2:9b on the alternative component generation task. On the one hand, models demonstrated high creativity (ability to generate options beyond what Reddit users suggested) and performed well at proposing distinct alternatives. A subset of generated components was manually evaluated and found overall useful. This indicates that LLMs might be used to extend lists of alternative options, helping decision makers consider a problem from different perspectives. On the other hand, LLMs’ outputs often failed to align with human suggestions, implying that they still tend to miss important components.