Christopher Hahn


2025

LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain
Joel Niklaus | Lucia Zheng | Arya D. McCarthy | Christopher Hahn | Brian M Rosen | Peter Henderson | Daniel E. Ho | Garrett Honke | Percy Liang | Christopher D Manning
Findings of the Association for Computational Linguistics: NAACL 2025

Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructions for each, creating LawInstruct. LawInstruct covers 17 global jurisdictions, 24 languages, and a total of 12M examples across diverse tasks such as legal QA, summarization of court cases, and legal argument mining. We evaluate our models on LegalBench, which measures legal reasoning across five categories in 162 challenging and realistic legal tasks, and on MMLU, to measure potential drops in general reasoning capability. We find that legal-specific instruction tuning on Flan-T5, yielding FLawN-T5, improves performance on LegalBench across all model sizes, with an aggregate increase of 15 points (50%) over Flan-T5 at the base size. No model size shows a performance drop on MMLU. We publish LawInstruct as a resource for further study of instruction tuning in the legal domain.
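
To make the construction concrete, the sketch below (Python; a minimal illustration under assumed conventions, not the paper's actual schema) shows how an annotated legal example might be paired with a written instruction to form an instruction-tuning record for a seq2seq model such as Flan-T5. The field names, instruction text, and example content are all illustrative assumptions.

from dataclasses import dataclass

@dataclass
class InstructionExample:
    instruction: str  # task description written for the source dataset
    input: str        # the annotated example, e.g., facts from a court case
    output: str       # the gold label or target text

# Hypothetical judgment-prediction example; all content below is illustrative.
example = InstructionExample(
    instruction="Given the facts of the case, decide whether the appeal should "
                "be approved or dismissed. Answer 'approved' or 'dismissed'.",
    input="The appellant argues that the lower court erred in weighing ...",
    output="dismissed",
)

# Seq2seq instruction tuning trains on (prompt, target) pairs.
prompt = f"{example.instruction}\n\n{example.input}"
target = example.output

The same wrapping applies per source dataset: one hand-written instruction per task, reused across that task's examples.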

Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection
Abhinav Lalwani | Tasha Kim | Lovish Chopra | Christopher Hahn | Zhijing Jin | Mrinmaya Sachan
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Translating natural language into a formal language such as First-Order Logic (FOL) is a foundational challenge in NLP, with wide-ranging applications in automated reasoning, misinformation tracking, and knowledge validation. In this paper, we introduce Natural Language to First-Order Logic (NL2FOL), a framework that autoformalizes natural language to FOL step by step using Large Language Models (LLMs). Our approach addresses key challenges in this translation process, including the integration of implicit background knowledge. By leveraging the structured representations generated by NL2FOL, we use Satisfiability Modulo Theories (SMT) solvers to reason about the logical validity of natural language statements. We present logical fallacy detection as a case study to evaluate the efficacy of NL2FOL. Being neurosymbolic, our approach also provides interpretable insights into the reasoning process and demonstrates robustness without requiring model fine-tuning or labeled training data. Our framework performs well on multiple datasets: on the Logic dataset, NL2FOL achieves an F1-score of 78%, and it generalizes effectively to the LogicClimate dataset with an F1-score of 80%.
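
To illustrate the solver-based validity check, the sketch below (Python with the Z3 SMT solver; a hypothetical formalization, not output from the NL2FOL pipeline) uses the standard reduction: an argument is logically valid exactly when the negation of "premises imply conclusion" is unsatisfiable. A propositional instance keeps the example short; Z3 also supports the quantified formulas that full FOL requires.

from z3 import And, Bool, Implies, Not, Solver, unsat

# Hypothetical formalization of the fallacy "affirming the consequent":
# premises: Rain -> WetStreet, WetStreet; claimed conclusion: Rain.
rain, wet = Bool("Rain"), Bool("WetStreet")
premises = And(Implies(rain, wet), wet)
conclusion = rain

# The argument is valid iff (premises -> conclusion) holds in every model,
# i.e., iff its negation is unsatisfiable.
solver = Solver()
solver.add(Not(Implies(premises, conclusion)))
if solver.check() == unsat:
    print("valid argument")
else:
    print("invalid (potential fallacy); countermodel:", solver.model())

Running this reaches the "invalid" branch and prints a countermodel in which the street is wet but it did not rain, which is the kind of interpretable evidence a neurosymbolic fallacy detector can surface.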