MICO: Selective Search with Mutual Information Co-training
Zhanyu Wang, Xiao Zhang, Hyokun Yun, Choon Hui Teo, Trishul Chilimbi
Abstract
In contrast to traditional exhaustive search, selective search first clusters documents into several groups so that, at query time, the search is executed within only one or a few groups rather than over the entire corpus. Selective search is designed to reduce the latency and computation of modern large-scale search systems. In this study, we propose MICO, a Mutual Information CO-training framework for selective search with minimal supervision using search logs. After training, MICO not only clusters the documents but also routes unseen queries to the relevant clusters for efficient retrieval. In our empirical experiments, MICO significantly improves performance on multiple selective search metrics and outperforms a number of existing competitive baselines.
- Anthology ID: 2022.coling-1.102
- Volume: Proceedings of the 29th International Conference on Computational Linguistics
- Month: October
- Year: 2022
- Address: Gyeongju, Republic of Korea
- Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 1179–1192
- URL: https://aclanthology.org/2022.coling-1.102
- Cite (ACL): Zhanyu Wang, Xiao Zhang, Hyokun Yun, Choon Hui Teo, and Trishul Chilimbi. 2022. MICO: Selective Search with Mutual Information Co-training. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1179–1192, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal): MICO: Selective Search with Mutual Information Co-training (Wang et al., COLING 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/2022.coling-1.102.pdf
- Code: aws/selective-search-with-mutual-information-cotraining
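The cluster-then-route idea described in the abstract can be sketched as follows. This is a minimal illustration of generic selective search, not the paper's MICO co-training method: it assumes documents have already been partitioned into clusters with known centroid vectors, and routes each query to the single best-matching cluster so that only that cluster's documents are scored.

```python
# Toy selective search: route a query to one cluster, then rank only
# the documents inside that cluster instead of the whole corpus.
# All names and vectors below are hypothetical, for illustration only.

def dot(u, v):
    """Inner-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def route(query_vec, centroids):
    """Pick the index of the cluster centroid most similar to the query."""
    return max(range(len(centroids)), key=lambda c: dot(query_vec, centroids[c]))

def selective_search(query_vec, centroids, clusters, doc_vecs, k=2):
    """Search only inside the routed cluster, returning the top-k doc ids."""
    c = route(query_vec, centroids)
    candidates = clusters[c]  # only this cluster's documents are scored
    ranked = sorted(candidates, key=lambda d: dot(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:k]

# Hand-made 2-d "embeddings": two well-separated document clusters.
doc_vecs = {"d0": (1.0, 0.1), "d1": (0.9, 0.0),
            "d2": (0.0, 1.0), "d3": (0.1, 0.9)}
clusters = {0: ["d0", "d1"], 1: ["d2", "d3"]}
centroids = [(0.95, 0.05), (0.05, 0.95)]

print(selective_search((1.0, 0.0), centroids, clusters, doc_vecs))  # ['d0', 'd1']
```

MICO's contribution is in how the cluster assignments and the query router are learned jointly from search logs via mutual-information co-training; the routing step at inference time still has this general shape, where only a fraction of the corpus is scored per query.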