Yanbing Liu
2026
Two Streams, One Sarcasm: Orthogonal Expert Tuning for Holistic Multimodal Sarcasm Understanding
Diandian Guo | Cong Cao | Fangfang Yuan | Pin Xu | Cheng Hu | Zhicheng Zhang | Yu Liu | Yanbing Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Multimodal Sarcasm Understanding (MSU) comprises multiple subtasks, demanding both incongruity perception and intent reasoning. However, progress on MSU is impeded by two bottlenecks. First, the lack of a unified benchmark for holistic satirical cognition hinders comprehensive evaluation of MSU. Second, jointly modeling these heterogeneous subtasks often leads to feature entanglement. Specifically, while subtasks share a dependence on incongruity, they diverge in granular focus, causing specific execution patterns to erode the fundamental perception capability. To address these challenges, we make two contributions. First, we introduce DocMSU-PLUS, a comprehensive benchmark covering five cognitive dimensions of MSU. All tasks are reformulated into multiple-choice questions (MCQs), enabling a unified accuracy-based evaluation. Second, we propose the Dual Orthogonal Stream Experts (DOSE) framework. DOSE structurally decouples experts into orthogonal shared perception and private execution streams to physically block gradient interference between tasks. Experiments demonstrate that DOSE achieves superior performance on DocMSU-PLUS, effectively balancing general perception with task-specific adaptation.
LitVISTA: A Benchmark for Narrative Orchestration in Literary Text
Mingzhe Lu | Yiwen Wang | Yanbing Liu | Qi You | Chong Liu | Ruize Qin | Haoyu Dong | Wenyu Zhang | JiaRui Zhang | Yue Hu | Yunpeng Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Computational narrative analysis aims to capture rhythm, tension, and emotional dynamics in literary texts. Existing large language models can generate long stories but overly focus on causal coherence, neglecting the complex story arcs and orchestration inherent in human narratives. This suggests a structural misalignment between model- and human-generated narratives. We therefore position narrative analysis as a diagnostic proxy for generation and propose VISTA Space, a high-dimensional framework for narrative orchestration that unifies human and model perspectives while jointly characterizing narrative function and structure in a common space. We further introduce LitVISTA, a structurally annotated benchmark grounded in literary texts, which operationalizes VISTA Space for systematic evaluation of models’ narrative orchestration capabilities. Under an oracle setting with gold event anchors, we evaluate frontier LLMs including GPT, Claude, Grok, and Gemini. Results reveal systematic deficiencies, as current models struggle to jointly capture narrative function and structure and fail to form an integrated global view of literary narrative orchestration. End-to-end analysis further shows that failures are dominated by anchor identification and localization errors. Even advanced thinking modes yield mixed and often limited gains for literary narrative understanding.
2025
Can We Steer Reasoning Direction by Thinking Intervention?
Xingsheng Zhang | Luxi Xing | Chen Zhang | Yanbing Liu | Yifan Deng | Yunpeng Li | Yue Hu | Chenxu Niu
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Reasoning Models (LRMs) rely on long reasoning processes to solve complex tasks. However, due to the lack of fine-grained control, they often suffer from overthinking and erroneous reasoning, risking accuracy loss. To address this issue, we introduce Reasoning Direction Steering (RDS) to enable fine-grained control over LRMs’ reasoning behaviors by aligning reasoning trajectories with specific cognitive patterns. We develop a simple yet effective paradigm, Thinking Intervention, which explores two key dimensions (intervention positions and intervention styles) to achieve integrated intervention throughout the model’s reasoning process. To validate the effectiveness of our approach, we conduct comprehensive experiments on multi-hop question answering tasks using state-of-the-art LRMs, including Qwen3-Series and R1-Series models. Experimental results demonstrate the efficacy of Thinking Intervention, with a 9.4% average improvement on R1-Series models and a 1.9% improvement on Qwen3-Series models.
Emotion Transfer with Enhanced Prototype for Unseen Emotion Recognition in Conversation
Kun Peng | Cong Cao | Hao Peng | Guanlin Wu | Zhifeng Hao | Lei Jiang | Yanbing Liu | Philip S. Yu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Current Emotion Recognition in Conversation (ERC) research follows a closed-domain assumption. However, there is no clear consensus on emotion classification in psychology, which presents a challenge for models when it comes to recognizing previously unseen emotions in real-world applications. To bridge this gap, we introduce the Unseen Emotion Recognition in Conversation (UERC) task for the first time and propose **ProEmoTrans**, a solid prototype-based emotion transfer framework. This prototype-based approach shows promise but still faces key challenges: First, implicit expressions complicate emotion definition, which we address by proposing an LLM-enhanced description approach. Second, utterance encoding in long conversations is difficult, which we tackle with a proposed parameter-free mechanism for efficient encoding and overfitting prevention. Finally, the Markovian flow nature of emotions is hard to transfer, which we address with an improved Attention Viterbi Decoding (AVD) method to transfer seen emotion transitions to unseen emotions. Extensive experiments on three datasets show that our method serves as a strong baseline for preliminary exploration in this new area.
Multi-View Incongruity Learning for Multimodal Sarcasm Detection
Diandian Guo | Cong Cao | Fangfang Yuan | Yanbing Liu | Guangjie Zeng | Xiaoyan Yu | Hao Peng | Philip S. Yu
Proceedings of the 31st International Conference on Computational Linguistics
Multimodal sarcasm detection (MSD) is essential for various downstream tasks. Existing MSD methods tend to rely on spurious correlations. These methods often mistakenly prioritize non-essential features yet still make correct predictions, demonstrating poor generalizability beyond training environments. Regarding this phenomenon, this paper undertakes several initiatives. Firstly, we identify two primary causes that lead to the reliance on spurious correlations. Secondly, we address these challenges by proposing a novel method that integrates Multimodal Incongruities via Contrastive Learning (MICL) for multimodal sarcasm detection. Specifically, we first leverage incongruity to drive multi-view learning from three views: token-patch, entity-object, and sentiment. Then, we introduce extensive data augmentation to mitigate the biased learning of the textual modality. Additionally, we construct a test set, SPMSD, which consists of potential spurious correlations, to evaluate the model’s generalizability. Experimental results demonstrate the superiority of MICL on benchmark datasets, and further analyses show MICL’s advantage in mitigating the effect of spurious correlations.
2023
Mulan: A Multi-Level Alignment Model for Video Question Answering
Yu Fu | Cong Cao | Yuling Yang | Yuhai Lu | Fangfang Yuan | Dakui Wang | Yanbing Liu
Findings of the Association for Computational Linguistics: EMNLP 2023
Video Question Answering (VideoQA) aims to answer questions about the visual content of a video. Current methods mainly focus on improving joint representations of video and text. However, these methods pay little attention to the fine-grained semantic interaction between video and text. In this paper, we propose Mulan: a Multi-Level Alignment Model for Video Question Answering, which establishes alignment between visual and textual modalities at the object-level, frame-level, and video-level. Specifically, for object-level alignment, we propose a mask-guided visual feature encoding method and a visual-guided text description method to learn fine-grained spatial information. For frame-level alignment, we introduce the use of visual features from individual frames, combined with a caption generator, to learn overall spatial information within the scene. For video-level alignment, we propose an expandable ordinal prompt for textual descriptions, combined with visual features, to learn temporal information. Experimental results show that our method outperforms the state-of-the-art methods, even when utilizing the smallest amount of extra visual-language pre-training data and a reduced number of trainable parameters.
2021
Deep Differential Amplifier for Extractive Summarization
Ruipeng Jia | Yanan Cao | Fang Fang | Yuchen Zhou | Zheng Fang | Yanbing Liu | Shi Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
For sentence-level extractive summarization, there is a disproportionate ratio of selected to unselected sentences, which flattens the summary features when accuracy is maximized. This imbalanced classification is inherent to summarization and cannot easily be addressed by common algorithms. In this paper, we conceptualize single-document extractive summarization as a rebalance problem and present a deep differential amplifier framework. Specifically, we first calculate and amplify the semantic difference between each sentence and all other sentences, and then apply the residual unit as the second item of the differential amplifier to deepen the architecture. Finally, to compensate for the imbalance, the corresponding objective loss of the minority class is boosted by a weighted cross-entropy. In contrast to previous approaches, this model pays more attention to the pivotal information of one sentence, instead of modeling all the informative context with a recurrent or Transformer architecture. We demonstrate experimentally on two benchmark datasets that our summarizer performs competitively against state-of-the-art methods. Our source code will be available on GitHub.
Co-authors
- Cong Cao 4
- Fangfang Yuan 3
- Diandian Guo 2
- Yue Hu (胡月) 2
- Yunpeng Li 2
- Philip S. Yu 2
- Yanan Cao 1
- Yifan Deng 1
- Haoyu Dong 1
- Fang Fang 1
- Zheng Fang 1
- Yu Fu 1
- Zhifeng Hao 1
- Cheng Hu 1
- Ruipeng Jia 1
- Lei Jiang 1
- Chong Liu 1
- Yu Liu 1
- Mingzhe Lu 1
- Yuhai Lu 1
- Chenxu Niu 1
- Hao Peng 1
- Hao Peng 1
- Kun Peng 1
- Ruize Qin 1
- Dakui Wang 1
- Shi Wang 1
- Yiwen Wang 1
- Guanlin Wu 1
- Luxi Xing 1
- Pin Xu 1
- Yuling Yang 1
- Qi You 1
- Xiaoyan Yu 1
- Guangjie Zeng 1
- Chen Zhang 1
- JiaRui Zhang 1
- Wenyu Zhang 1
- Xingsheng Zhang 1
- Zhicheng Zhang 1
- Yuchen Zhou 1