Learning Cross-Task Attribute - Attribute Similarity for Multi-task Attribute-Value Extraction
Mayank Jain, Sourangshu Bhattacharya, Harshit Jain, Karimulla Shaik, Muthusamy Chelliah
Abstract
Automatic extraction of product attribute-value pairs from unstructured text such as product descriptions is an important problem for e-commerce companies. The attribute schema typically varies from one category of products (referred to here as a vertical) to another. This leads to heavy annotation effort for training supervised deep sequence labeling models such as LSTM-CRF, and consequently to insufficient labeled data for some vertical-attribute pairs. In this work, we propose a technique for alleviating this problem by using annotated data from related verticals in a multi-task learning framework. Our approach relies on the availability of similar attributes (labels) in another related vertical. Our model jointly learns the similarity between attributes of the two verticals along with the parameters of the sequence tagging model. The main advantage of our approach is that it does not need any prior annotation of attribute similarity. Our system has been tested with datasets of size more than 10000 from a large e-commerce company in India. We perform detailed experiments to show that our method increases the macro-F1 scores for attribute-value extraction in general, and for labels with little training data in particular. We also report the top labels from other verticals that contribute to learning particular labels.
- Anthology ID:
- 2021.ecnlp-1.10
- Volume:
- Proceedings of the 4th Workshop on e-Commerce and NLP
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Shervin Malmasi, Surya Kallumadi, Nicola Ueffing, Oleg Rokhlenko, Eugene Agichtein, Ido Guy
- Venue:
- ECNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 79–87
- URL:
- https://aclanthology.org/2021.ecnlp-1.10
- DOI:
- 10.18653/v1/2021.ecnlp-1.10
- Cite (ACL):
- Mayank Jain, Sourangshu Bhattacharya, Harshit Jain, Karimulla Shaik, and Muthusamy Chelliah. 2021. Learning Cross-Task Attribute - Attribute Similarity for Multi-task Attribute-Value Extraction. In Proceedings of the 4th Workshop on e-Commerce and NLP, pages 79–87, Online. Association for Computational Linguistics.
- Cite (Informal):
- Learning Cross-Task Attribute - Attribute Similarity for Multi-task Attribute-Value Extraction (Jain et al., ECNLP 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2021.ecnlp-1.10.pdf