Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study

Renzo Alva Principe, Marco Viviani, Nicola Chiarini


Abstract
Information Extraction (IE) is a key task in Natural Language Processing (NLP) that transforms unstructured text into structured data. This study compares human annotation, rule-based systems, and Large Language Models (LLMs) for domain-specific IE, focusing on real estate auction documents. We assess each method in terms of accuracy, scalability, and cost-efficiency, highlighting the associated trade-offs. Our findings provide valuable insights into the effectiveness of using LLMs for the considered task and, more broadly, offer guidance on how organizations can balance automation, maintainability, and performance when selecting the most suitable IE solution.
Anthology ID:
2025.ldk-1.25
Volume:
Proceedings of the 5th Conference on Language, Data and Knowledge
Month:
September
Year:
2025
Address:
Naples, Italy
Editors:
Mehwish Alam, Andon Tchechmedjiev, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues:
LDK | WS
Publisher:
Unior Press
Note:
Pages:
243–254
URL:
https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.25/
Cite (ACL):
Renzo Alva Principe, Marco Viviani, and Nicola Chiarini. 2025. Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study. In Proceedings of the 5th Conference on Language, Data and Knowledge, pages 243–254, Naples, Italy. Unior Press.
Cite (Informal):
Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study (Principe et al., LDK 2025)
PDF:
https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.25.pdf