Steven H Wang

2025

Contract clause retrieval is foundational to contract drafting because lawyers rarely draft contracts from scratch; instead, they locate and revise the most relevant precedent clauses. We introduce the Atticus Clause Retrieval Dataset (ACORD), the first expert-annotated benchmark specifically designed for contract clause retrieval to support contract drafting tasks. ACORD focuses on complex contract clauses such as Limitation of Liability, Indemnification, Change of Control, and Most Favored Nation. It includes 114 queries and over 126,000 query-clause pairs, each ranked on a scale from 1 to 5 stars. The task is to find the most relevant precedent clauses to a query. The bi-encoder retriever paired with pointwise LLMs re-rankers shows promising results. However, substantial improvements are still needed to manage the complex legal work typically undertaken by lawyers effectively. As the first expert-annotated benchmark for contract clause retrieval, ACORD can serve as a valuable IR benchmark for the NLP community.

Co-authors

Roger Wattenhofer 1

Maksim Zubkov 1

Venues

acl1

Fix author