Julia Wunderle

2025

LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch
Jan Pfister | Julia Wunderle | Andreas Hotho
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We transparently create two German-only decoder models, LLäMmlein 120M and 1B, from scratch and publish them, along with the training data, for the (German) NLP research community to use. The model training involved several key steps, including data preprocessing/filtering, the creation of a German tokenizer, the training itself, as well as the evaluation of the final models on various benchmarks, also against existing models. Throughout the training process, multiple checkpoints were saved in equal intervals and analyzed using the German SuperGLEBer benchmark to gain insights into the models’ learning process. Compared to state-of-the-art models on the SuperGLEBer benchmark, both LLäMmlein models performed competitively, consistently matching or surpassing models with similar parameter sizes. The results show that the models’ quality scales with size as expected, but performance improvements on some tasks plateaued early during training, offering valuable insights into resource allocation for future models.

2024

OtterlyObsessedWithSemantics at SemEval-2024 Task 4: Developing a Hierarchical Multi-Label Classification Head for Large Language Models
Julia Wunderle | Julian Schubert | Antonella Cacciatore | Albin Zehe | Jan Pfister | Andreas Hotho
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

For our submission for Subtask 1, we developed a custom classification head that is designed to be applied on top of a Large Language Model. We reconstructed the hierarchy across multiple fully connected layers, allowing us to incorporate previous foundational decisions in subsequent, more fine-grained layers. To find the best hyperparameters, we conducted a grid search, and to compete in the multilingual setting, we translated all documents to English.

2023

Pointer Networks: A Unified Approach to Extracting German Opinions
Julia Wunderle | Jan Pfister | Andreas Hotho
Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)
