Erick Cedeño
2026
pAtChWoRK: Patching the Pieces of Public Procurement Documents
Lorena Calvo-Bartolomé | Saúl Blanco Fortes | Erick Cedeño | Jerónimo Arenas-García
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Public procurement data is legally open, yet practically locked inside thousands of unstructured PDFs and inconsistent portal metadata. pAtChWoRK starts from these fragmented, unstructured sources and applies a hybrid pipeline, combining traditional NLP with LLM-based techniques, to restructure the information into a navigable knowledge base. Specifically, pAtChWoRK corrects manual classification errors, extracts complex unstructured fields such as award criteria, solvency criteria, and tender objectives, and helps users navigate the tender landscape. This unified process enables more effective handling of the transparency bottlenecks that hinder competition and oversight in public administration. A user study with practitioners across different procurement