Leonardo Lezcano
2025
Cost-Effective E-Commerce Catalog Translation at Scale Ensuring Named Entity Protection
Asier Gutiérrez-Fandiño | Jorge Yero Salazar | Clement Ruin | Alejandro Quintero-Roba | Shangeetha Ravichandran | Jesus Perez-Martin | Pankaj Adsul | Suruchi Garg | Leonardo Lezcano
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
We present an enterprise-grade translation platform for global e-commerce that combines daily batch and real-time API pipelines with optimized T5-based models and a Reference Generator to enforce >99% non-translatable entity preservation. A linguist-driven rule engine and explainable evaluation framework (BLEU, COMET, and a custom e-commerce metric) enable continuous quality improvements. Deployed on GPU-accelerated inference servers and CPU-based processing nodes, our system processes millions of listings per day with sub-second latency and achieves 10×–100× cost savings over general-purpose LLMs for English→Spanish and English→French translation, all while version-tracking every update for robust enterprise rollouts.
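To make the ">99% non-translatable entity preservation" figure concrete, the following minimal sketch shows one way such a rate could be computed. The abstract does not describe how the authors measure it or how the Reference Generator works, so everything here is an assumption: the function name `entity_preservation_rate`, the verbatim substring check, and the sample listings are all hypothetical.

```python
from typing import Iterable


def entity_preservation_rate(
    translations: Iterable[str],
    entities_per_item: Iterable[list[str]],
) -> float:
    """Share of non-translatable entities that appear verbatim in the output.

    Assumption: an entity counts as preserved when it occurs as an exact,
    case-sensitive substring of the translated listing.
    """
    total = 0
    preserved = 0
    for translation, entities in zip(translations, entities_per_item):
        for entity in entities:
            total += 1
            if entity in translation:
                preserved += 1
    # With no entities to protect, nothing was violated.
    return preserved / total if total else 1.0


# Hypothetical English->Spanish outputs; the first one wrongly alters a product name.
translations = [
    "Zapatillas Nike Aire Max 270 para hombre",
    "Funda para Apple iPhone 15 Pro",
    "Auriculares inalámbricos Sony WH-1000XM5",
]
entities_per_item = [
    ["Nike", "Air Max 270"],
    ["Apple", "iPhone 15 Pro"],
    ["Sony", "WH-1000XM5"],
]

rate = entity_preservation_rate(translations, entities_per_item)
print(f"Entity preservation rate: {rate:.1%}")
```

In this toy sample, five of the six entities survive unchanged, because "Air Max 270" was rendered as "Aire Max 270". A production check would likely need normalization for casing, whitespace, and punctuation, which this sketch deliberately omits.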