Improving Occupational ISCO Classification of Multilingual Swiss Job Postings with LLM-Refined Training Data

Ann-Sophie Gnehm, Simon Clematide


Abstract
Classifying occupations in multilingual job postings is challenging due to noisy labels, language variation, and domain-specific terminology. We present a method that refines silver-standard ISCO labels by consolidating them with predictions from pre-fine-tuned models, using large language model (LLM) evaluations to resolve discrepancies. The refined labels are used in Multiple Negatives Ranking (MNR) training for SentenceBERT-based classification. This approach substantially improves performance, raising Top-1 accuracy on silver data from 37.2% to 58.3% and reaching up to 80% precision on held-out data, an over 30-point gain validated by both GPT and human raters. The model benefits from cross-lingual transfer, with particularly strong gains in French and Italian. These results demonstrate that LLM-guided label refinement can substantially improve multilingual occupation classification in fine-grained taxonomies such as CH-ISCO with its 670 classes.
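
To make the training setup concrete, the following minimal Python sketch shows how MNR loss can fine-tune a SentenceBERT encoder on pairs of job-posting texts and ISCO class descriptions, as the abstract describes. It assumes the sentence-transformers library; the base model name, example pairs, and hyperparameters are illustrative, not the authors' exact configuration.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Illustrative (posting, CH-ISCO class description) pairs; in the paper's
# setting these would carry the LLM-refined silver labels.
train_pairs = [
    ("Nous cherchons un infirmier diplômé HES ...", "Nursing professionals"),
    ("Softwareentwickler:in Backend (Java) ...", "Software developers"),
]
train_examples = [InputExample(texts=[posting, label])
                  for posting, label in train_pairs]

# Assumed multilingual base model (hypothetical choice for this sketch).
model = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# MNR treats each paired class description as the positive and the other
# descriptions in the batch as in-batch negatives, so no explicit
# negative mining is needed.
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)

At inference, classification reduces to encoding a posting and retrieving the nearest neighbor among the embeddings of the 670 CH-ISCO class descriptions.
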
Anthology ID:
2025.findings-acl.1124
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
21834–21847
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1124/
Cite (ACL):
Ann-Sophie Gnehm and Simon Clematide. 2025. Improving Occupational ISCO Classification of Multilingual Swiss Job Postings with LLM-Refined Training Data. In Findings of the Association for Computational Linguistics: ACL 2025, pages 21834–21847, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Improving Occupational ISCO Classification of Multilingual Swiss Job Postings with LLM-Refined Training Data (Gnehm & Clematide, Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1124.pdf