TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories

Giannis Karamanolakis; Jun Ma; Xin Luna Dong

doi:10.18653/v1/2020.acl-main.751

Abstract

Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. Through category conditional self-attention and multi-task learning, our approach is both scalable, as it trains a single model for thousands of categories, and effective, as it extracts category-specific attribute values. Experiments on products from a taxonomy with 4,000 categories show that TXtract outperforms state-of-the-art approaches by up to 10% in F1 and 15% in coverage across all categories.

Anthology ID: 2020.acl-main.751
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 8489–8502
URL: https://aclanthology.org/2020.acl-main.751
DOI: 10.18653/v1/2020.acl-main.751
Cite (ACL): Giannis Karamanolakis, Jun Ma, and Xin Luna Dong. 2020. TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8489–8502, Online. Association for Computational Linguistics.
Cite (Informal): TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories (Karamanolakis et al., ACL 2020)
PDF: https://preview.aclanthology.org/remove-xml-comments/2020.acl-main.751.pdf
Video: http://slideslive.com/38929154