Abstract
Individuals engaging in online communication frequently express personal opinions with informal styles (e.g., memes and emojis). While Language Models (LMs) with informal communications have been widely discussed, a unique and emphatic style, the Repetitive Lengthening Form (RLF), has been overlooked for years. In this paper, we explore answers to two research questions: 1) Is RLF important for SA? 2) Can LMs understand RLF? Inspired by previous linguistic research, we curate **Lengthening**, the first multi-domain dataset with 850k samples focused on RLF for sentiment analysis. Moreover, we introduce **Explnstruct**, a two-stage Explainable Instruction Tuning framework aimed at improving both the performance and explainability of LLMs for RLF. We further propose a novel unified approach to quantify LMs’ understanding of informal expressions. We show that RLF sentences are expressive expressions and can serve as signatures of document-level sentiment. Additionally, RLF has potential value for online content analysis. Our comprehensive results show that fine-tuned Pre-trained Language Models (PLMs) can surpass zero-shot GPT-4 in performance but not in explanation for RLF. Finally, we show ExpInstruct can improve the open-sourced LLMs to match zero-shot GPT-4 in performance and explainability for RLF with limited samples. Code and sample data are available at https://github.com/Tom-Owl/OverlookedRLF- Anthology ID:
- 2024.findings-emnlp.952
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2024
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 16225–16238
- Language:
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-emnlp.952/
- DOI:
- 10.18653/v1/2024.findings-emnlp.952
- Cite (ACL):
- Lei Wang and Eduard Dragut. 2024. The Overlooked Repetitive Lengthening Form in Sentiment Analysis. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16225–16238, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- The Overlooked Repetitive Lengthening Form in Sentiment Analysis (Wang & Dragut, Findings 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-emnlp.952.pdf