Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction

Junjie Li; Jianfei Yu; Rui Xia

doi:10.18653/v1/2022.naacl-main.312


Abstract

As a fundamental task in opinion mining, aspect and opinion co-extraction aims to identify the aspect terms and opinion terms in reviews. However, due to the lack of fine-grained annotated resources, it is hard to train a robust model for many domains. To alleviate this issue, unsupervised domain adaptation has been proposed to transfer knowledge from a labeled source domain to an unlabeled target domain. In this paper, we propose a new Generative Cross-Domain Data Augmentation framework for unsupervised domain adaptation. The proposed framework aims to generate target-domain data with fine-grained annotation by exploiting the labeled data in the source domain. Specifically, we remove the domain-specific segments in a source-domain labeled sentence, and then use the result as input to the pre-trained sequence-to-sequence model BART to simultaneously generate a target-domain sentence and predict the corresponding label for each word. Experimental results on three datasets demonstrate that our approach is more effective than previous domain adaptation methods.
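
The first step described in the abstract, removing domain-specific segments from a labeled source sentence, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how domain-specific segments are identified, so the sketch uses a hypothetical hand-supplied vocabulary (`domain_vocab`), hypothetical label names (`B-ASP`, `I-ASP`, `B-OPN`), and the string `<mask>` as a placeholder. The generation step with BART is not shown.

```python
from typing import List, Set, Tuple

# Placeholder for a removed segment (assumption: any marker would do here).
MASK = "<mask>"


def mask_domain_specific(
    tokens: List[str],
    labels: List[str],
    domain_vocab: Set[str],
) -> Tuple[List[str], List[str]]:
    """Replace each maximal run of domain-specific tokens with one MASK.

    `domain_vocab` is a hypothetical lowercase word set standing in for
    whatever criterion is actually used to find domain-specific segments.
    """
    out_tokens: List[str] = []
    out_labels: List[str] = []
    in_run = False  # True while inside a run of domain-specific tokens
    for tok, lab in zip(tokens, labels):
        if tok.lower() in domain_vocab:
            if not in_run:
                # Emit a single placeholder for the whole run; its label is
                # left open, to be filled in by a downstream generator.
                out_tokens.append(MASK)
                out_labels.append(MASK)
                in_run = True
        else:
            out_tokens.append(tok)
            out_labels.append(lab)
            in_run = False
    return out_tokens, out_labels


# Toy source-domain (laptop review) sentence with word-level labels.
tokens = ["The", "battery", "life", "is", "great"]
labels = ["O", "B-ASP", "I-ASP", "O", "B-OPN"]
masked_tokens, masked_labels = mask_domain_specific(
    tokens, labels, {"battery", "life"}
)
print(" ".join(masked_tokens))
```

In this toy example the aspect term "battery life" is collapsed into a single placeholder while the domain-independent opinion word "great" and its label are kept, so a generator could fill the gap with a target-domain aspect term and predict its label.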

Anthology ID: 2022.naacl-main.312
Volume: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 4219–4229
URL: https://preview.aclanthology.org/landing_page/2022.naacl-main.312/
DOI: 10.18653/v1/2022.naacl-main.312
Cite (ACL): Junjie Li, Jianfei Yu, and Rui Xia. 2022. Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4219–4229, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction (Li et al., NAACL 2022)
PDF: https://preview.aclanthology.org/landing_page/2022.naacl-main.312.pdf
Video: https://preview.aclanthology.org/landing_page/2022.naacl-main.312.mp4
Code: nustm/gcdda
