Jim at SemEval-2025 Task 5: Multilingual BERT Ensemble

Jim Hahn


Abstract
SemEval-2025 Task 5 calls for using LLM capabilities to assign controlled subject labels to record descriptions in the multilingual library collection of the German National Library of Science and Technology. The multilingual BERT ensemble system described here produces subject labels for several record types: articles, books, conference papers, reports, and theses. Results indicate that, for English-language article records, bidirectional encoder-only LLMs can achieve high recall in automated subject assignment.
Anthology ID:
2025.semeval-1.313
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2407–2412
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.313/
Cite (ACL):
Jim Hahn. 2025. Jim at SemEval-2025 Task 5: Multilingual BERT Ensemble. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2407–2412, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Jim at SemEval-2025 Task 5: Multilingual BERT Ensemble (Hahn, SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.313.pdf