Abstract
In text ranking, it is generally believed that cross-encoders already gather sufficient token interaction information via the attention mechanism in the hidden layers. However, our results show that cross-encoders can consistently benefit from additional token interaction in the similarity computation at the last layer. We introduce CELI (Cross-Encoder with Late Interaction), which incorporates a late interaction layer into current cross-encoder models. This simple method brings a 5% improvement on BEIR without compromising in-domain effectiveness or search latency. Extensive experiments show that this finding is consistent across different sizes of cross-encoder models and across first-stage retrievers. Our findings suggest that boiling all information down into the [CLS] token is a suboptimal use of cross-encoders, and we advocate further studies to investigate their relevance scoring mechanism.
- Anthology ID:
- 2024.naacl-short.16
- Volume:
- Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Kevin Duh, Helena Gomez, Steven Bethard
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 188–196
- URL:
- https://aclanthology.org/2024.naacl-short.16
- Cite (ACL):
- Crystina Zhang, Minghan Li, and Jimmy Lin. 2024. CELI: Simple yet Effective Approach to Enhance Out-of-Domain Generalization of Cross-Encoders. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 188–196, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- CELI: Simple yet Effective Approach to Enhance Out-of-Domain Generalization of Cross-Encoders. (Zhang et al., NAACL 2024)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2024.naacl-short.16.pdf
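
The abstract describes adding a late interaction term to the cross-encoder's [CLS]-based relevance score, but does not spell out the computation. As a rough illustration only, the sketch below assumes a ColBERT-style MaxSim over final-layer token vectors, added to the [CLS] score with a weight. The function names, the additive combination, and the `weight` parameter are all hypothetical and not taken from the paper.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosines."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_vecs, doc_vecs):
    """Assumed late interaction: for each query token, take its highest
    cosine similarity to any document token, then sum over query tokens."""
    q = [_normalize(v) for v in query_vecs]
    d = [_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def combined_score(cls_score, query_vecs, doc_vecs, weight=1.0):
    """Hypothetical combination: the [CLS] relevance score plus a weighted
    late-interaction term. The additive form is an assumption."""
    return cls_score + weight * maxsim_score(query_vecs, doc_vecs)


# Toy token vectors standing in for a cross-encoder's last-layer outputs.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[1.0, 0.0], [1.0, 1.0]]
score = combined_score(0.5, query_tokens, doc_tokens, weight=0.5)
```

In a real model the token vectors would come from the cross-encoder's final hidden states for the query and document spans; here they are hand-written so the arithmetic is easy to follow.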