Li Zhang

AWS

Other people with similar names: Li Zhang (University of Pennsylvania), Li Zhang (Birmingham), Li Zhang (Google), Li Zhang (UC San Diego), Li Zhang (Newcastle, UK), Li Zhang (Teesside University), Li Zhang (Nankai), Li Zhang (UK), Li Zhang (IBM-china), Li Zhang (Wuhan)


2025

GAVEL: Generative Attribute-Value Extraction Using LLMs on LLM-Augmented Datasets
Pollawat Hongwimol | Dong Sheng | Li Zhang | Kai Liu | Xiufei Wang
Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing

In the evolving e-commerce landscape, accurate product attribute-value extraction is crucial for enhancing user experience and increasing sales. This paper introduces GAVEL, a generative approach leveraging large language models (LLMs) to augment training data for attribute extraction from diverse textual sources. Our method extracts over 1,000 unique attributes across 2,000 product categories in multiple Southeast Asian languages, including Thai, Vietnamese, and Indonesian. Rigorous evaluations show significant improvements in accuracy and coverage compared to seller-provided attributes, with enhanced recall and F1 scores. Additionally, GAVEL reduces operational costs by minimizing instruction token usage and improves inference speed. The results of the A/B testing indicate that our model has a positive impact on Gross Merchandise Value (GMV) per page view (PV) across all three operating countries. This research highlights the potential of generative techniques for optimizing attribute extraction in multi-language e-commerce applications.
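The abstract does not spell out GAVEL's prompts, attribute schema, or data-augmentation pipeline, so the following is only a minimal illustrative sketch of LLM-based attribute-value extraction for a single product. The function call_llm is a hypothetical stand-in for whatever text-generation endpoint is used; here it returns a canned answer so the sketch runs end to end.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM text-generation endpoint.
    Replace with a real client call; the canned reply below only
    keeps the sketch self-contained."""
    return '{"brand": "Acme", "material": "cotton", "color": "navy"}'

def extract_attributes(title: str, description: str, category: str) -> dict:
    """Ask the model to emit attribute-value pairs for one product as JSON."""
    prompt = (
        f"Product category: {category}\n"
        f"Title: {title}\n"
        f"Description: {description}\n"
        "List the product's attribute-value pairs as a JSON object."
    )
    raw = call_llm(prompt)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}  # malformed generations are simply dropped

if __name__ == "__main__":
    attrs = extract_attributes(
        title="Acme navy cotton t-shirt",
        description="Soft 100% cotton tee in navy blue.",
        category="Apparel > T-Shirts",
    )
    print(attrs)  # {'brand': 'Acme', 'material': 'cotton', 'color': 'navy'}
```

In practice such extracted pairs would feed back into training data (the LLM-augmented datasets of the title), but that loop is not shown here.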

2021

Complementary Evidence Identification in Open-Domain Question Answering
Xiangyang Mou | Mo Yu | Shiyu Chang | Yufei Feng | Li Zhang | Hui Su
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This paper proposes a new problem of complementary evidence identification for open-domain question answering (QA). The task is to efficiently find a small set of passages that together covers the full evidence, from multiple aspects, needed to answer a complex question. To this end, we propose a method that learns vector representations of passages and models the sufficiency and diversity within the selected set, in addition to the relevance between the question and the passages. Our experiments demonstrate that our method captures the dependence within the supporting evidence and significantly improves the accuracy of complementary evidence selection for QA.
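The paper's learned objective is not reproduced in the abstract, so the sketch below only illustrates the underlying idea with fixed passage embeddings: a candidate evidence set is scored on relevance to the question minus a redundancy penalty, so that diverse, complementary passages score higher than repetitive ones. The weights alpha and beta are illustrative, and sufficiency (whether the set fully covers the answer's evidence) would need answer supervision that this sketch omits.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def set_score(question_vec: np.ndarray, passage_vecs: list,
              alpha: float = 1.0, beta: float = 1.0) -> float:
    """Score a candidate evidence set (illustrative, not the paper's objective).

    relevance  : mean question-passage similarity
    redundancy : mean pairwise passage similarity, penalized so the selected
                 set covers different aspects rather than repeating one passage
    """
    relevance = np.mean([cosine(question_vec, p) for p in passage_vecs])
    if len(passage_vecs) > 1:
        pairs = [cosine(p, q) for i, p in enumerate(passage_vecs)
                 for q in passage_vecs[i + 1:]]
        redundancy = np.mean(pairs)
    else:
        redundancy = 0.0
    return alpha * relevance - beta * redundancy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    question = rng.normal(size=8)
    candidate_set = [rng.normal(size=8) for _ in range(3)]
    print(round(set_score(question, candidate_set), 3))
```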