Juan Pérez


2025

Shouth NLP at SemEval-2025 Task 7: Multilingual Fact-Checking Retrieval Using Contrastive Learning
Juan Pérez | Santiago Lares
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

We present a multilingual fact-checking retrieval system for the SemEval-2025 task of matching social media posts with relevant fact-checks. Our approach utilizes a contrastive learning framework built on the multilingual E5 model architecture, fine-tuned on the provided dataset. The system achieves a Success@10 score of 0.867 on the official test set, with performance variations between languages. We demonstrate that input prefixes and language-specific corpus filtering significantly improve retrieval performance. Our analysis reveals interesting patterns in cross-lingual transfer, with particularly strong results on Malaysian and Thai languages. We make our code public for further research and development.
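
As an illustration of the three ingredients the abstract names (input prefixes, language-specific corpus filtering, and the Success@10 metric), here is a minimal sketch in plain Python. It is not the authors' code. The prefix strings follow the usual E5 convention of "query: " and "passage: ", the corpus layout with `id`, `lang`, and `text` keys is hypothetical, and Success@K is assumed to mean that at least one relevant fact-check appears among the top K results for a post.

```python
from typing import Dict, List, Set

# Assumption: E5-style prefixes; the paper's exact strings are not given here.
QUERY_PREFIX = "query: "      # prepended to social media posts
PASSAGE_PREFIX = "passage: "  # prepended to fact-checks


def add_prefix(texts: List[str], prefix: str) -> List[str]:
    """Prepend a role prefix to each text before it is encoded."""
    return [prefix + text.strip() for text in texts]


def filter_by_language(corpus: List[dict], language: str) -> List[dict]:
    """Keep only fact-checks in the given language.

    Each corpus entry is a dict with hypothetical keys: id, lang, text.
    """
    return [doc for doc in corpus if doc["lang"] == language]


def success_at_k(
    ranked_ids: Dict[str, List[int]],
    relevant_ids: Dict[str, Set[int]],
    k: int = 10,
) -> float:
    """Fraction of posts with at least one relevant fact-check in the top k."""
    if not ranked_ids:
        return 0.0
    hits = 0
    for post_id, ranking in ranked_ids.items():
        # A post counts as a success if any top-k result is relevant.
        if set(ranking[:k]) & relevant_ids.get(post_id, set()):
            hits += 1
    return hits / len(ranked_ids)
```

In such a setup, a post would be prefixed with `QUERY_PREFIX`, the candidate fact-checks narrowed with `filter_by_language` and prefixed with `PASSAGE_PREFIX`, and the resulting rankings scored with `success_at_k`.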