Yanbin Hao


2025

Mixture of Multimodal Adapters for Sentiment Analysis
Kezhou Chen | Shuo Wang | Huixia Ben | Shengeng Tang | Yanbin Hao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Pre-trained language models (PLMs) have achieved great success in text sentiment analysis. In practical applications, however, sentiment is conveyed not only through language but is also hidden in other modalities. Multimodal sentiment analysis (MSA) has therefore attracted increasing research interest. Compared to text sentiment analysis, MSA is challenging because (1) emotions hidden in body movements or vocal timbre elude traditional text-based methods, and (2) transferring a PLM to the MSA task requires a huge number of trainable parameters. To address these issues, we introduce the Mixture of Multimodal Adapters (MMA) into the PLM. Specifically, we first design a mixture-of-multimodal-experts module to capture and fuse emotional cues from the different modalities. Meanwhile, we apply a compression parameter to each expert to reduce the training burden. We apply our method to two benchmark datasets and achieve state-of-the-art performance with a tiny number of trainable parameters. For example, compared to the current state-of-the-art method, AcFormer, we need only 1/22 of its trainable parameters (130M → 6M) to achieve better results.
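
To make the architecture concrete, here is a minimal, hypothetical sketch of a mixture-of-multimodal-adapters layer, assuming PyTorch and a frozen PLM backbone. All names (MultimodalAdapterMoE, bottleneck_dim, num_experts) are illustrative, not taken from the paper's code; the low-rank bottleneck stands in for the per-expert compression parameter the abstract mentions.

    # Hypothetical sketch, not the authors' released implementation.
    import torch
    import torch.nn as nn

    class MultimodalAdapterMoE(nn.Module):
        """Bottleneck adapter experts (one per modality) fused by a router."""
        def __init__(self, hidden_dim: int, bottleneck_dim: int = 32, num_experts: int = 3):
            super().__init__()
            # Each expert is a low-rank bottleneck: the compression that keeps
            # the trainable parameter count small relative to the frozen PLM.
            self.experts = nn.ModuleList(
                nn.Sequential(
                    nn.Linear(hidden_dim, bottleneck_dim),
                    nn.GELU(),
                    nn.Linear(bottleneck_dim, hidden_dim),
                )
                for _ in range(num_experts)
            )
            self.router = nn.Linear(hidden_dim, num_experts)

        def forward(self, h: torch.Tensor) -> torch.Tensor:
            # h: (batch, seq_len, hidden_dim) hidden states from a frozen PLM layer.
            gates = torch.softmax(self.router(h), dim=-1)                    # (B, T, E)
            expert_out = torch.stack([e(h) for e in self.experts], dim=-1)  # (B, T, H, E)
            fused = (expert_out * gates.unsqueeze(2)).sum(dim=-1)           # gated fusion
            return h + fused  # residual connection keeps the PLM path intact

Training only such adapter modules while the PLM stays frozen is what would keep the trainable parameter count in the single-digit millions, consistent with the 6M figure reported above.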

2020

Cross-sentence Pre-trained Model for Interactive QA matching
Jinmeng Wu | Yanbin Hao
Proceedings of the Twelfth Language Resources and Evaluation Conference

Semantic matching measures the dependencies between query and answer representations and is an important criterion for evaluating whether a match is successful. Such matching should not examine each sentence in isolation: context information outside a sentence should be considered as important as the syntactic context inside it. We propose a new QA matching model built upon a cross-sentence, context-aware architecture. An interactive attention mechanism with a pre-trained language model automatically selects the salient positional answer representations that contribute most to the answer's relevance to a given question. In addition to the context information captured at each word position, we incorporate a new quantity, the context information jump, into the attention weight formulation. It reflects the amount of new information brought by the next word and is computed by modeling the joint probability between two adjacent word states. The proposed method is compared to multiple state-of-the-art methods on the TREC QA, WikiQA, and Yahoo! community question datasets. Experimental results show that the proposed method outperforms the competing ones.
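
As a rough, hypothetical sketch of the information-jump idea, assuming PyTorch: a bilinear layer stands in for the joint probability of two adjacent word states, and the complement of that score boosts the attention weights of words that bring new information. Class and variable names (JumpAwareAttention, joint, jump) are illustrative, not from the paper.

    # Hypothetical sketch, not the paper's exact formulation.
    import torch
    import torch.nn as nn

    class JumpAwareAttention(nn.Module):
        """Attention over answer word states, reweighted by how much new
        information each word brings relative to the previous word state."""
        def __init__(self, hidden_dim: int):
            super().__init__()
            self.joint = nn.Bilinear(hidden_dim, hidden_dim, 1)  # adjacent-state joint score
            self.attn = nn.Linear(hidden_dim, 1)

        def forward(self, states: torch.Tensor) -> torch.Tensor:
            # states: (batch, seq_len, hidden_dim) contextual word states.
            prev = states[:, :-1, :]
            nxt = states[:, 1:, :]
            # A high joint probability of adjacent states means little new
            # information, so the jump is the complement of the joint score.
            joint_p = torch.sigmoid(self.joint(prev, nxt)).squeeze(-1)   # (B, T-1)
            jump = torch.cat([joint_p.new_ones(states.size(0), 1),       # first word: full jump
                              1.0 - joint_p], dim=1)                     # (B, T)
            scores = self.attn(states).squeeze(-1) + torch.log(jump + 1e-8)
            weights = torch.softmax(scores, dim=-1)                      # (B, T)
            return (weights.unsqueeze(-1) * states).sum(dim=1)           # pooled answer vector

Adding the log of the jump to the attention logits multiplies the resulting softmax weights by the jump, so positions that merely repeat the preceding context are down-weighted; this is one plausible reading of the abstract's attention-weight formulation.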