Dynamic Routing Transformer Network for Multimodal Sarcasm Detection

Yuan Tian; Nan Xu; Ruike Zhang; Wenji Mao

doi:10.18653/v1/2023.acl-long.139

Abstract

Multimodal sarcasm detection is an important research topic in natural language processing and multimedia computing, and benefits a wide range of applications in multiple domains. Most existing studies regard the incongruity between image and text as the indicative clue in identifying multimodal sarcasm. To capture cross-modal incongruity, previous methods rely on fixed architectures in network design, which prevents the model from dynamically adjusting to diverse image-text pairs. Inspired by routing-based dynamic networks, we model the dynamic mechanism in multimodal sarcasm detection and propose the Dynamic Routing Transformer Network (DynRT-Net). Our method utilizes dynamic paths to activate different routing transformer modules with hierarchical co-attention adapting to cross-modal incongruity. Experimental results on a public dataset demonstrate the effectiveness of our method compared to state-of-the-art methods. Our code is available at https://github.com/TIAN-viola/DynRT.

Anthology ID: 2023.acl-long.139
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2468–2480
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2023.acl-long.139/
DOI: 10.18653/v1/2023.acl-long.139
Cite (ACL): Yuan Tian, Nan Xu, Ruike Zhang, and Wenji Mao. 2023. Dynamic Routing Transformer Network for Multimodal Sarcasm Detection. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2468–2480, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Dynamic Routing Transformer Network for Multimodal Sarcasm Detection (Tian et al., ACL 2023)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2023.acl-long.139.pdf
Video: https://preview.aclanthology.org/jlcl-multiple-ingestion/2023.acl-long.139.mp4
