Cai Xu


2025

Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer
Long Chen | Shuoyu Guan | Xiaohua Huang | Wen-Jing Wang | Cai Xu | Ziyu Guan | Wei Zhao
Findings of the Association for Computational Linguistics: ACL 2025

Existing multimodal sentiment analysis (MSA) methods have achieved significant success, leveraging cross-modal large-scale models (LLMs) and extensive pre-training data. However, these methods struggle to handle MSA tasks in low-resource languages. While multilingual LLMs enable cross-lingual transfer, they are limited to textual data and cannot address multimodal scenarios. To achieve MSA in low-resource languages, we propose a novel transfer learning framework named Language Family Disentanglement and Rethinking Transfer (LFD-RT). During pre-training, we establish cross-lingual and cross-modal alignments, followed by a language family disentanglement module that enhances the sharing of language universals within families while reducing noise from cross-family alignments. We propose a rethinking strategy for unsupervised fine-tuning that adapts the pre-trained model to MSA tasks in low-resource languages. Experimental results demonstrate the superiority of our method and its strong language-transfer capability on target low-resource languages. We commit to making our code and data publicly available, and the access link will be provided here.

2024

SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer
Jie Zhao | Ziyu Guan | Cai Xu | Wei Zhao | Yue Jiang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Text style transfer (TST) aims to vary the style polarity of text while preserving the semantic content. Although recent advancements have demonstrated remarkable progress in short TST, it remains a relatively straightforward task with limited practical applications. The more comprehensive long TST task presents two challenges: (1) existing methods encounter difficulties in accurately evaluating content attributes in multiple words, leading to content degradation; (2) the conventional vanilla style classifier loss encounters obstacles in maintaining consistent style across multiple generated sentences. In this paper, we propose a novel method SC2, where a multilayer Joint Style-Content Weighed (JSCW) module and a Style Consistency loss are designed to address the two issues. The JSCW simultaneously assesses the amounts of style and content attributes within a token, aiming to acquire a lossless content representation and thereby enhancing content preservation. The multiple JSCW layers further progressively refine content representations. We design a style consistency loss to ensure the generated multiple sentences consistently reflect the target style polarity. Moreover, we incorporate a denoising non-autoregressive decoder to accelerate the training. We conduct plentiful experiments and the results show significant improvements of SC2 over competitive baselines. Our code: https://github.com/jiezhao6/SC2.