Sarah Harrell




2025

ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting
Steven H Wang | Maksim Zubkov | Kexin Fan | Sarah Harrell | Yuyang Sun | Wei Chen | Andreas Plesner | Roger Wattenhofer
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Contract clause retrieval is foundational to contract drafting because lawyers rarely draft contracts from scratch; instead, they locate and revise the most relevant precedent clauses. We introduce the Atticus Clause Retrieval Dataset (ACORD), the first expert-annotated benchmark specifically designed for contract clause retrieval to support contract drafting tasks. ACORD focuses on complex contract clauses such as Limitation of Liability, Indemnification, Change of Control, and Most Favored Nation. It includes 114 queries and over 126,000 query-clause pairs, each ranked on a scale from 1 to 5 stars. The task is to find the most relevant precedent clauses for a query. A bi-encoder retriever paired with pointwise LLM re-rankers shows promising results; however, substantial improvements are still needed to effectively handle the complex legal work typically undertaken by lawyers. As the first expert-annotated benchmark for contract clause retrieval, ACORD can serve as a valuable IR benchmark for the NLP community.
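
For a concrete sense of the baseline pipeline the abstract mentions, below is a minimal sketch of bi-encoder retrieval followed by pointwise re-ranking. The embedding model name, the toy clauses, and the `llm_relevance_score` stub are illustrative assumptions, not the paper's actual configuration; in the real setup the pointwise scorer would be an LLM rating each (query, clause) pair independently.

```python
# Sketch: bi-encoder retrieval + pointwise re-ranking over precedent clauses.
# Assumes the sentence-transformers package is installed; all data is toy data.
from sentence_transformers import SentenceTransformer, util

# Stand-ins for precedent clauses of the kinds ACORD covers.
clauses = [
    "In no event shall either party's aggregate liability exceed the fees paid hereunder.",
    "Supplier shall indemnify Buyer against third-party intellectual property claims.",
    "Either party may terminate this Agreement upon a Change of Control of the other party.",
]
query = "Limitation of liability cap tied to fees paid under the agreement"

# Stage 1: bi-encoder retrieval. The query and clauses are embedded
# independently, then ranked by cosine similarity; keep the top-k candidates.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
q_emb = encoder.encode(query, convert_to_tensor=True)
c_embs = encoder.encode(clauses, convert_to_tensor=True)
scores = util.cos_sim(q_emb, c_embs)[0]
top_k = scores.argsort(descending=True)[:2].tolist()

# Stage 2: pointwise re-ranking. Each retrieved clause is scored on its own.
# This stub is a hypothetical placeholder for an LLM call returning a
# relevance rating; here a crude token-overlap proxy keeps the sketch runnable.
def llm_relevance_score(query: str, clause: str) -> int:
    q_tokens = set(query.lower().split())
    c_tokens = set(clause.lower().split())
    return len(q_tokens & c_tokens)

reranked = sorted(
    (clauses[i] for i in top_k),
    key=lambda clause: llm_relevance_score(query, clause),
    reverse=True,
)
for clause in reranked:
    print(clause)
```

The two-stage design trades accuracy for efficiency: the bi-encoder cheaply narrows a large clause pool to a few candidates, and the more expensive pointwise scorer is applied only to that shortlist.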