CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval
Dominik Benchert, Severin Meßlinger, Sven Goller, Jonas Kaiser, Jan Pfister, Andreas Hotho
Abstract
The focus of SemEval-2025 Task 7 is the retrieval of relevant fact-checks for social media posts across multiple languages. We approach this task with an enhanced bi-encoder retrieval setup that uses synthetic data from LLMs to match social media posts with relevant fact-checks. We explored and analyzed two main approaches for generating synthetic posts: one based on existing fact-checks and one based on existing posts. Our approach achieved an S@10 score of 89.53% for the monolingual task and 74.48% for the crosslingual task, ranking 16th out of 28 and 13th out of 29, respectively. Without data augmentation, the scores would have been 88.69% (17th) and 72.93% (15th).
- Anthology ID:
- 2025.semeval-1.214
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1623–1638
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.214/
- Cite (ACL):
- Dominik Benchert, Severin Meßlinger, Sven Goller, Jonas Kaiser, Jan Pfister, and Andreas Hotho. 2025. CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1623–1638, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval (Benchert et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.214.pdf
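
The abstract reports results as S@10. As a rough illustration only, the sketch below assumes S@10 is read as success-at-10: the share of posts for which at least one relevant fact-check appears among the top 10 retrieved candidates. The function name `success_at_k` and the toy identifiers are hypothetical and are not taken from the paper or the task's official scorer.

```python
from typing import Sequence, Set


def success_at_k(
    rankings: Sequence[Sequence[str]],
    relevant: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Fraction of queries with at least one relevant item in the top k.

    rankings[i] is the ranked list of fact-check IDs retrieved for post i;
    relevant[i] is the set of fact-check IDs judged relevant for post i.
    (Hypothetical helper: an assumed reading of S@k, not the official scorer.)
    """
    if len(rankings) != len(relevant):
        raise ValueError("rankings and relevant must have the same length")
    if not rankings:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(rankings, relevant)
        if any(doc_id in gold for doc_id in ranked[:k])
    )
    return hits / len(rankings)


# Toy data: two posts, each with a ranked list of candidate fact-check IDs.
toy_rankings = [["fc_a", "fc_b"], ["fc_c", "fc_d"]]
toy_relevant = [{"fc_b"}, {"fc_x"}]
toy_score = success_at_k(toy_rankings, toy_relevant, k=10)
```

In this toy case only the first post has a relevant fact-check in its ranked list, so the score is one hit out of two posts.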