AttributeForge: An Agentic LLM Framework for Automated Product Schema Modeling

Yunhan Huang, Klevis Ramo, Andrea Iovine, Melvin Monteiro, Sedat Gokalp, Arjun Bakshi, Hasan Turalic, Arsh Kumar, Jona Neumeier, Ripley Yates, Rejaul Monir, Simon Hartmann, Tushar Manglik, Mohamed Yakout


Abstract
Effective product schema modeling is fundamental to e-commerce success, enabling accurate product discovery and superior customer experience. However, traditional manual schema modeling processes are severely bottlenecked, producing only tens of attributes per month, which is insufficient for modern e-commerce platforms managing thousands of product types. This paper introduces AttributeForge, the first framework to automate end-to-end product schema modeling using Large Language Models (LLMs). Our key innovation lies in orchestrating 43 specialized LLM agents through strategic workflow patterns to handle the complex interdependencies in schema generation. The framework incorporates two novel components: MC2-Eval, a comprehensive validation system that assesses schemas against technical, business, and customer experience requirements; and AutoFix, an intelligent mechanism that automatically corrects modeling defects through iterative refinement. Deployed in production, AttributeForge achieves an 88× increase in modeling throughput while delivering superior quality: a 59.83% Good-to-Good (G2G) conversion rate compared to 37.50% for manual approaches. This significant improvement in both speed and quality enables e-commerce platforms to rapidly adapt their product schemas to evolving market needs.
Anthology ID:
2025.emnlp-industry.148
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2025
Address:
Suzhou (China)
Editors:
Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2106–2121
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.148/
Cite (ACL):
Yunhan Huang, Klevis Ramo, Andrea Iovine, Melvin Monteiro, Sedat Gokalp, Arjun Bakshi, Hasan Turalic, Arsh Kumar, Jona Neumeier, Ripley Yates, Rejaul Monir, Simon Hartmann, Tushar Manglik, and Mohamed Yakout. 2025. AttributeForge: An Agentic LLM Framework for Automated Product Schema Modeling. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 2106–2121, Suzhou (China). Association for Computational Linguistics.
Cite (Informal):
AttributeForge: An Agentic LLM Framework for Automated Product Schema Modeling (Huang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.148.pdf