Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction

Junjie Li, Jianfei Yu, Rui Xia


Abstract
As a fundamental task in opinion mining, aspect and opinion co-extraction aims to identify the aspect terms and opinion terms in reviews. However, due to the lack of fine-grained annotated resources, it is hard to train a robust model for many domains. To alleviate this issue, unsupervised domain adaptation has been proposed to transfer knowledge from a labeled source domain to an unlabeled target domain. In this paper, we propose a new Generative Cross-Domain Data Augmentation framework for unsupervised domain adaptation. The proposed framework aims to generate target-domain data with fine-grained annotation by exploiting the labeled data in the source domain. Specifically, we remove the domain-specific segments in a source-domain labeled sentence, and then use the result as input to the pre-trained sequence-to-sequence model BART, which simultaneously generates a target-domain sentence and predicts the corresponding label for each word. Experimental results on three datasets demonstrate that our approach is more effective than previous domain adaptation methods.
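The first step described in the abstract, removing domain-specific segments from a source-domain sentence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_domain_specific`, the word-list lookup used to decide what counts as domain-specific, and the choice to collapse each removed run into a single `<mask>` placeholder are all assumptions made for illustration. The generation and labeling step with BART is not shown.

```python
from typing import List, Set

# Placeholder string for a removed segment (assumption; chosen to resemble
# the mask token used by BART tokenizers).
MASK = "<mask>"


def mask_domain_specific(tokens: List[str], domain_words: Set[str]) -> List[str]:
    """Replace each contiguous run of domain-specific tokens with one MASK.

    `domain_words` is a hypothetical lowercase lexicon of source-domain
    vocabulary; how such segments are actually identified is not specified
    in the abstract.
    """
    masked: List[str] = []
    for tok in tokens:
        if tok.lower() in domain_words:
            # Collapse adjacent domain-specific tokens into a single mask.
            if not masked or masked[-1] != MASK:
                masked.append(MASK)
        else:
            masked.append(tok)
    return masked


# Hypothetical restaurant-domain sentence and lexicon.
sentence = "The pasta was delicious but the waiter was rude".split()
restaurant_words = {"pasta", "waiter"}
masked_sentence = mask_domain_specific(sentence, restaurant_words)
print(" ".join(masked_sentence))
```

Under these assumptions, the masked sequence would then be passed to a sequence-to-sequence model that fills the gaps with target-domain words and emits a label per word.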
Anthology ID:
2022.naacl-main.312
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4219–4229
URL:
https://aclanthology.org/2022.naacl-main.312
DOI:
10.18653/v1/2022.naacl-main.312
Cite (ACL):
Junjie Li, Jianfei Yu, and Rui Xia. 2022. Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4219–4229, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction (Li et al., NAACL 2022)
PDF:
https://preview.aclanthology.org/landing_page/2022.naacl-main.312.pdf
Video:
https://preview.aclanthology.org/landing_page/2022.naacl-main.312.mp4
Code:
nustm/gcdda