Abstract
In multimodal machine learning, additive late-fusion is a straightforward approach to combine the feature representations from different modalities, in which the final prediction can be formulated as the sum of unimodal predictions. While it has been found that certain late-fusion models can achieve competitive performance with lower computational costs compared to complex multimodal interactive models, how to effectively search for a good late-fusion model is still an open question. Moreover, for different modalities, the best unimodal models may work under significantly different learning rates due to the nature of the modality and the computational flow of the model; thus, selecting a global learning rate for late-fusion models can result in a vanishing gradient for some modalities. To help address these issues, we propose a Modality-Specific Learning Rate (MSLR) method to effectively build late-fusion multimodal models from fine-tuned unimodal models. We investigate three different strategies to assign learning rates to different modalities. Our experiments show that MSLR outperforms global learning rates on multiple tasks and settings, and enables the models to effectively learn each modality.
- Anthology ID:
- 2022.findings-acl.143
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2022
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1824–1834
- URL:
- https://aclanthology.org/2022.findings-acl.143
- DOI:
- 10.18653/v1/2022.findings-acl.143
- Cite (ACL):
- Yiqun Yao and Rada Mihalcea. 2022. Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1824–1834, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion (Yao & Mihalcea, Findings 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-acl.143.pdf
- Data
- MELD
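
The abstract describes additive late-fusion as summing unimodal predictions, and contrasts a single global learning rate with one learning rate per modality. The following is a minimal illustrative sketch of that idea only, not the authors' implementation and not one of the three assignment strategies the paper investigates. The modality names (`text`, `audio`), the linear unimodal predictors, the feature values, and the learning rates are all assumptions chosen for illustration.

```python
import math

def unimodal_logits(weights, features):
    """Hypothetical linear unimodal predictor: one logit per class."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def fuse(logits_by_modality):
    """Additive late fusion: final logits are the sum of the unimodal logits."""
    return [sum(vals) for vals in zip(*logits_by_modality.values())]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sgd_step(params, features, label, lrs):
    """One SGD step on cross-entropy, with a separate learning rate per modality."""
    logits = fuse({m: unimodal_logits(params[m], features[m]) for m in params})
    probs = softmax(logits)
    # Gradient of cross-entropy w.r.t. the fused logits: probs - one_hot(label).
    # Because fusion is a plain sum, every modality receives this same gradient.
    grad = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    for m in params:
        for c, row in enumerate(params[m]):
            for j in range(len(row)):
                row[j] -= lrs[m] * grad[c] * features[m][j]
    return -math.log(probs[label])  # loss measured before the update

# Assumed toy setup: two classes, two modalities with different feature sizes.
params = {
    "text": [[0.0, 0.0], [0.0, 0.0]],
    "audio": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
}
features = {"text": [1.0, -0.5], "audio": [0.2, 0.1, -0.3]}
lrs = {"text": 0.1, "audio": 0.001}  # assumed modality-specific learning rates

losses = [sgd_step(params, features, label=1, lrs=lrs) for _ in range(20)]
```

In this sketch the only difference from a global learning rate is the lookup `lrs[m]` inside the update loop. In a framework with optimizer parameter groups, the same effect would typically be achieved by placing each unimodal model's parameters in its own group with its own learning rate.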