Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering

Piotr Rybak, Maciej Ogrodniczuk


Abstract
Modern open-domain question answering systems often rely on accurate and efficient retrieval components to find passages containing the facts necessary to answer the question. Recently, neural retrievers have gained popularity over lexical alternatives due to their superior performance. However, most of the work concerns popular languages such as English or Chinese. For others, such as Polish, few models are available. In this work, we present Silver Retriever, a neural retriever for Polish trained on a diverse collection of manually or weakly labeled datasets. Silver Retriever achieves much better results than other Polish models and is competitive with larger multilingual models. Together with the model, we open-source five new passage retrieval datasets.
Anthology ID:
2024.lrec-main.1291
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
14826–14831
URL:
https://aclanthology.org/2024.lrec-main.1291
Cite (ACL):
Piotr Rybak and Maciej Ogrodniczuk. 2024. Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 14826–14831, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering (Rybak & Ogrodniczuk, LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.lrec-main.1291.pdf