SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog

Jennifer D’Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, Diana Slawig


Abstract
We present SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject tagging for scientific and technical records in English and German using the GND taxonomy. Participants developed LLM-based systems to recommend top-k subjects, evaluated through quantitative metrics (precision, recall, F1-score) and qualitative assessments by subject specialists. Results highlight the effectiveness of LLM ensembles, synthetic data generation, and multilingual processing, offering insights into applying LLMs for digital library classification.
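The quantitative metrics named in the abstract (precision, recall, F1-score over the top-k recommended subjects) can be illustrated with a minimal sketch for a single record. This is not the task's official evaluation script: the function name `precision_recall_f1_at_k`, the example subject labels, and the choice to divide precision by the number of subjects actually returned are all assumptions made for illustration. How the task averages scores across records or cutoffs is not stated here.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[str], gold: Set[str], k: int
) -> Tuple[float, float, float]:
    """Score one record's ranked subject list against its gold subjects.

    Hypothetical helper for illustration only; not the official scorer.
    """
    # Keep only the k highest-ranked predictions.
    top_k = list(ranked[:k])
    # Count predictions that also appear in the gold annotation.
    hits = sum(1 for subject in top_k if subject in gold)
    # Assumption: precision is taken over the predictions actually returned.
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Placeholder subject labels; real systems would emit GND subject identifiers.
predicted = ["subject_a", "subject_b", "subject_c", "subject_d"]
annotated = {"subject_a", "subject_c", "subject_e"}
p, r, f = precision_recall_f1_at_k(predicted, annotated, k=2)
```

With k=2, only `subject_a` is a hit, so one of two predictions is correct and one of three gold subjects is recovered.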
Anthology ID:
2025.semeval-1.328
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2570–2583
URL:
https://preview.aclanthology.org/more-markup/2025.semeval-1.328/
Cite (ACL):
Jennifer D’Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, and Diana Slawig. 2025. SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2570–2583, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
SemEval-2025 Task 5: LLMs4Subjects - LLM-based Automated Subject Tagging for a National Technical Library’s Open-Access Catalog (D’Souza et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/more-markup/2025.semeval-1.328.pdf