Low-Resource Text Style Transfer for Bangla: Data & Models
Sourabrata Mukherjee, Akanksha Bansal, Pritha Majumdar, Atul Kr. Ojha, Ondřej Dušek
Abstract
Text style transfer (TST) involves modifying the linguistic style of a given text while retaining its core content. This paper addresses the challenging task of text style transfer in the Bangla language, which is low-resourced in this area. We present a novel Bangla dataset that facilitates text sentiment transfer, a subtask of TST, enabling the transformation of positive sentiment sentences to negative and vice versa. To establish a high-quality base for further research, we refined and corrected an existing English dataset of 1,000 sentences for sentiment transfer based on Yelp reviews, and we introduce a new human-translated Bangla dataset that parallels its English counterpart. Furthermore, we offer multiple benchmark models that serve as a validation of the dataset and baseline for further research.
- Anthology ID:
- 2023.banglalp-1.5
- Volume:
- Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
- Venue:
- BanglaLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 34–47
- URL:
- https://aclanthology.org/2023.banglalp-1.5
- DOI:
- 10.18653/v1/2023.banglalp-1.5
- Cite (ACL):
- Sourabrata Mukherjee, Akanksha Bansal, Pritha Majumdar, Atul Kr. Ojha, and Ondřej Dušek. 2023. Low-Resource Text Style Transfer for Bangla: Data & Models. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 34–47, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Low-Resource Text Style Transfer for Bangla: Data & Models (Mukherjee et al., BanglaLP 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.banglalp-1.5.pdf