SEARCHER: Shared Embedding Architecture for Effective Retrieval
Joel Barry, Elizabeth Boschee, Marjorie Freedman, Scott Miller
Abstract
We describe an approach to cross-lingual information retrieval (CLIR) that does not rely on explicit translation of either document or query terms. Instead, both queries and documents are mapped into a shared embedding space where retrieval is performed. We discuss potential advantages of the approach in handling polysemy and synonymy, present a method for training the model, and give details of the model implementation. We report experimental results for two cases: Somali-English and Bulgarian-English CLIR.
- Anthology ID:
- 2020.clssts-1.4
- Volume:
- Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Kathy McKeown, Douglas W. Oard, Elizabeth, Richard Schwartz
- Venue:
- CLSSTS
- Publisher:
- European Language Resources Association
- Pages:
- 22–25
- Language:
- English
- URL:
- https://aclanthology.org/2020.clssts-1.4
- Cite (ACL):
- Joel Barry, Elizabeth Boschee, Marjorie Freedman, and Scott Miller. 2020. SEARCHER: Shared Embedding Architecture for Effective Retrieval. In Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020), pages 22–25, Marseille, France. European Language Resources Association.
- Cite (Informal):
- SEARCHER: Shared Embedding Architecture for Effective Retrieval (Barry et al., CLSSTS 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.clssts-1.4.pdf
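The retrieval setup described in the abstract, where queries and documents are mapped into one shared space and ranked by vector similarity with no explicit translation step, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `embed` function here is a random stand-in for a trained shared bilingual encoder, and all names are hypothetical.

```python
import numpy as np

DIM = 8  # embedding dimensionality (illustrative)

def embed(text: str) -> np.ndarray:
    # Stand-in encoder: a deterministic pseudo-embedding per string.
    # In the paper's setting, a trained model maps queries and documents
    # from different languages into the same space; here we just produce
    # a unit vector so cosine similarity reduces to a dot product.
    seed = abs(hash(text)) % (2**32)
    v = np.random.default_rng(seed).normal(size=DIM)
    return v / np.linalg.norm(v)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query in the shared space.
    q = embed(query)
    scores = [float(q @ embed(d)) for d in docs]
    order = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in order]

docs = ["qoraal af-Soomaali ah", "документ на български", "a document in English"]
print(retrieve("an English query", docs, k=2))
```

Because both sides live in one space, the same ranking loop serves monolingual and cross-lingual retrieval; handling polysemy and synonymy then falls to the quality of the shared encoder rather than to a translation table.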