Disentangled Learning with Synthetic Parallel Data for Text Style Transfer
Jingxuan Han, Quan Wang, Zikang Guo, Benfeng Xu, Licheng Zhang, Zhendong Mao
Abstract
Text style transfer (TST) is an important task in natural language generation that aims to change the style of a text (e.g., its sentiment) while preserving its semantic content. Because parallel datasets for supervision are scarce, most existing studies operate in an unsupervised manner, and the generated sentences often suffer from high semantic divergence and thus poor semantic preservation. In this paper, we propose a novel disentanglement-based framework for TST named DisenTrans, where disentanglement means that we separate the attribute and content components of a natural language corpus and approach the task from these two perspectives. Concretely, we first design a disentangled Chain-of-Thought prompting procedure to synthesize parallel data, together with the corresponding attribute components, for supervision. We then develop a disentangled learning method over the synthetic data, with two losses designed to sharpen the focus on attribute properties and to constrain the semantic space, thereby benefiting style control and semantic preservation, respectively. Guided by the disentanglement concept, our framework creates valuable supervised signals and exploits them effectively for TST. Extensive experiments on mainstream datasets show that our framework achieves strong performance with high sample efficiency.
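The abstract outlines a two-stage method: synthesize parallel pairs (plus their attribute components) via a disentangled Chain-of-Thought prompt, then train with two auxiliary losses for style control and semantic preservation. The sketch below is only a rough illustration of that pipeline under assumed design choices; the prompt wording, the `parse_reply` helper, and both loss formulations are hypothetical stand-ins, not the paper's actual implementation.

```python
# Hypothetical sketch of the two-stage pipeline the abstract describes.
# The prompt wording, the parse helper, and both loss formulations are
# illustrative assumptions, not the authors' released implementation.
import torch
import torch.nn.functional as F

DISENTANGLED_COT_PROMPT = """\
Sentence: {source}
Think step by step, answering on three labeled lines:
Attribute: the words in the sentence that carry its {src_style} style
Content: the sentence with those attribute words masked out
Rewrite: a {tgt_style} sentence that keeps the same content
"""

def parse_reply(reply: str) -> dict:
    """Naive parser for the three labeled lines above (assumes the LLM
    follows the requested format; real output needs more robust parsing)."""
    fields = {}
    for line in reply.splitlines():
        for key in ("Attribute", "Content", "Rewrite"):
            if line.startswith(key + ":"):
                fields[key.lower()] = line.split(":", 1)[1].strip()
    return fields

def synthesize_parallel_pair(llm, source, src_style, tgt_style):
    """Query an LLM (any `prompt -> text` callable) with the disentangled
    CoT prompt, returning (source, rewrite, attribute words) as one
    synthetic supervised example."""
    reply = llm(DISENTANGLED_COT_PROMPT.format(
        source=source, src_style=src_style, tgt_style=tgt_style))
    fields = parse_reply(reply)
    return source, fields["rewrite"], fields["attribute"]

def disentangled_loss(gen_loss, attr_logits, attr_labels,
                      src_content, tgt_content,
                      lambda_attr=1.0, lambda_sem=1.0):
    """Generation loss plus the two auxiliary terms the abstract mentions:
    (i) an attribute-focused loss, sketched here as token-level tagging of
    the synthesized attribute words, and (ii) a semantic-space constraint,
    sketched here as cosine distance between content representations."""
    attr_loss = F.binary_cross_entropy_with_logits(attr_logits, attr_labels)
    sem_loss = 1 - F.cosine_similarity(src_content, tgt_content, dim=-1).mean()
    return gen_loss + lambda_attr * attr_loss + lambda_sem * sem_loss
```

In this reading, tagging attribute tokens pushes the model to localize style markers, while the content-similarity term penalizes drift in the shared semantic space; the paper may define and weight these terms differently.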
- Anthology ID: 2024.acl-long.811
- Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 15187–15201
- URL: https://aclanthology.org/2024.acl-long.811
- DOI: 10.18653/v1/2024.acl-long.811
- Cite (ACL): Jingxuan Han, Quan Wang, Zikang Guo, Benfeng Xu, Licheng Zhang, and Zhendong Mao. 2024. Disentangled Learning with Synthetic Parallel Data for Text Style Transfer. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15187–15201, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): Disentangled Learning with Synthetic Parallel Data for Text Style Transfer (Han et al., ACL 2024)
- PDF: https://preview.aclanthology.org/landing_page/2024.acl-long.811.pdf