SEARCHER: Shared Embedding Architecture for Effective Retrieval

Joel Barry, Elizabeth Boschee, Marjorie Freedman, Scott Miller


Abstract
We describe an approach to cross-lingual information retrieval that does not rely on explicit translation of either document or query terms. Instead, both queries and documents are mapped into a shared embedding space where retrieval is performed. We discuss potential advantages of the approach in handling polysemy and synonymy. We present a method for training the model, and give details of the model implementation. We present experimental results for two cases: Somali-English and Bulgarian-English CLIR.
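The core retrieval step the abstract describes, ranking documents by similarity to a query once both are mapped into a shared embedding space, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy vectors stand in for outputs of a trained multilingual encoder, and cosine similarity is one common choice of scoring function.

```python
import numpy as np

def normalize(vecs):
    """L2-normalize vectors so dot products equal cosine similarity."""
    vecs = np.asarray(vecs, dtype=float)
    return vecs / np.linalg.norm(vecs, axis=-1, keepdims=True)

def retrieve(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query in the shared space.

    Returns the top-k (document index, score) pairs, best first.
    """
    q = normalize(query_vec)
    d = normalize(doc_vecs)
    scores = d @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

# Toy example: three "documents" and one "query", assumed already embedded
# in a 3-dimensional shared space by some cross-lingual encoder.
docs = [[1.0, 0.1, 0.0],
        [0.0, 1.0, 0.2],
        [0.9, 0.0, 0.1]]
query = [1.0, 0.0, 0.0]
print(retrieve(query, docs))  # documents 0 and 2 score highest
```

Because queries and documents share one space, no translation step is needed at query time; the same scoring applies whether the query is in English and the documents in Somali or Bulgarian, provided the encoder was trained to align the languages.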
Anthology ID:
2020.clssts-1.4
Volume:
Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Kathy McKeown, Douglas W. Oard, Elizabeth, Richard Schwartz
Venue:
CLSSTS
Publisher:
European Language Resources Association
Pages:
22–25
Language:
English
URL:
https://aclanthology.org/2020.clssts-1.4
Cite (ACL):
Joel Barry, Elizabeth Boschee, Marjorie Freedman, and Scott Miller. 2020. SEARCHER: Shared Embedding Architecture for Effective Retrieval. In Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020), pages 22–25, Marseille, France. European Language Resources Association.
Cite (Informal):
SEARCHER: Shared Embedding Architecture for Effective Retrieval (Barry et al., CLSSTS 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.clssts-1.4.pdf