The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI
Dana R Alsagheer, Abdulrahman Kamal, Mohammad Kamal, Cosmo Yang Wu, Weidong Shi
Abstract
Large Language Models (LLMs) are increasingly used in high-stakes domains like law and research, yet their inconsistencies and response instability raise concerns about trustworthiness. This study evaluates six leading LLMs—GPT-3.5, GPT-4, Claude, Gemini, Mistral, and LLaMA 2—on rationality, stability, and ethical fairness through reasoning tests, legal challenges, and bias-sensitive scenarios. Results reveal significant inconsistencies, highlighting trade-offs between model scale, architecture, and logical coherence. These findings underscore the risks of deploying LLMs in legal and policy settings, emphasizing the need for AI systems that prioritize transparency, fairness, and ethical robustness.
- Anthology ID: 2025.acl-long.491
- Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 9943–9954
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.491/
- Cite (ACL): Dana R Alsagheer, Abdulrahman Kamal, Mohammad Kamal, Cosmo Yang Wu, and Weidong Shi. 2025. The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9943–9954, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI (Alsagheer et al., ACL 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.491.pdf