Abstract
This paper presents text mining approaches on German-speaking job advertisements to enable social science research on the development of the labour market over the last 30 years. In order to build text mining applications providing information about profession and main task of a job, as well as experience and ICT skills needed, we experiment with transfer learning and domain adaptation. Our main contribution consists in building language models which are adapted to the domain of job advertisements, and their assessment on a broad range of machine learning problems. Our findings show the large value of domain adaptation in several respects. First, it boosts the performance of fine-tuned task-specific models consistently over all evaluation experiments. Second, it helps to mitigate rapid data shift over time in our special domain, and enhances the ability to learn from small updates with new, labeled task data. Third, domain adaptation of language models is efficient: with continued in-domain pre-training we are able to outperform general-domain language models pre-trained on ten times more data. We share our domain-adapted language models and data with the research community.
- Anthology ID:
- 2022.lrec-1.414
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 3892–3901
- URL:
- https://aclanthology.org/2022.lrec-1.414
- Cite (ACL):
- Ann-Sophie Gnehm, Eva Bühlmann, and Simon Clematide. 2022. Evaluation of Transfer Learning and Domain Adaptation for Analyzing German-Speaking Job Advertisements. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3892–3901, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Evaluation of Transfer Learning and Domain Adaptation for Analyzing German-Speaking Job Advertisements (Gnehm et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2022.lrec-1.414.pdf