Joseph Giroux


2024

Explicit Attribute Extraction in e-Commerce Search
Robyn Loughnane | Jiaxin Liu | Zhilin Chen | Zhiqi Wang | Joseph Giroux | Tianchuan Du | Benjamin Schroeder | Weiyi Sun
Proceedings of the Seventh Workshop on e-Commerce and NLP @ LREC-COLING 2024

This paper presents a model architecture and training pipeline for attribute value extraction from search queries. The model uses weak labels generated from customer interactions to train a transformer-based NER model. A two-stage normalization process is then applied to deal with the problem of a large label space: first, the model output is normalized onto common generic attribute values, then it is mapped onto a larger range of actual product attribute values. This approach lets us successfully apply a transformer-based NER model to the extraction of a broad range of attribute values in a real-time production environment for e-commerce applications, contrary to previous research. In an online test, we demonstrate business value by integrating the model into a system for semantic product retrieval and ranking.
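
The two-stage normalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables, the example values, and the function name `normalize_span` are all hypothetical, and the abstract does not say how either mapping is built. The sketch only shows the shape of the idea, in which an extracted span is first normalized onto a small set of generic values and then mapped onto a larger set of product attribute values.

```python
# Hypothetical stage-1 table: surface text from the query -> generic attribute value.
# The real mappings are not given in the abstract; these entries are assumptions.
GENERIC_VALUES = {
    "navy": "blue",
    "navy blue": "blue",
    "blue": "blue",
    "crimson": "red",
}

# Hypothetical stage-2 table: (attribute, generic value) -> product attribute values.
PRODUCT_VALUES = {
    ("color", "blue"): ["Blue", "Navy", "Royal Blue"],
    ("color", "red"): ["Red", "Crimson"],
}


def normalize_span(attribute, surface_text):
    """Return (generic_value, product_values) for one extracted span.

    Stage 1 normalizes the span text onto a generic value.
    Stage 2 maps that generic value onto the product attribute values.
    An unknown span yields (None, []).
    """
    generic = GENERIC_VALUES.get(surface_text.strip().lower())
    if generic is None:
        return None, []
    return generic, PRODUCT_VALUES.get((attribute, generic), [])
```

Under these assumptions, a span such as "Navy" tagged as a color by the NER model would normalize to the generic value "blue" and then map to every product value listed for that generic color. Keeping the NER label space small and deferring the large value space to lookup stages is one plausible reading of how the approach handles a large label space.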