Joey Öhman


2023

Superlim: A Swedish Language Understanding Evaluation Benchmark
Aleksandrs Berdicevskis | Gerlof Bouma | Robin Kurtz | Felix Morger | Joey Öhman | Yvonne Adesam | Lars Borin | Dana Dannélls | Markus Forsberg | Tim Isbister | Anna Lindahl | Martin Malmsten | Faton Rekathati | Magnus Sahlgren | Elena Volodina | Love Börjeson | Simon Hengchen | Nina Tahmasebi
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating Swedish language models, a counterpart to the English-language (Super)GLUE suite. We describe the dataset, the tasks, and the leaderboard, and report the baseline results yielded by a reference implementation. The tested models do not approach ceiling performance on any of the tasks, which suggests that Superlim is truly difficult, a desirable quality for a benchmark. We address methodological challenges, such as mitigating the Anglocentric bias when creating datasets for a less-resourced language; choosing the most appropriate measures; documenting the datasets; and making the leaderboard convenient and transparent. We also highlight other potential usages of the dataset, such as the evaluation of cross-lingual transfer learning.

DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish
Elena Volodina | Yousuf Ali Mohammed | Aleksandrs Berdicevskis | Gerlof Bouma | Joey Öhman
Proceedings of the 12th Workshop on NLP for Computer Assisted Language Learning

2022

Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish
Ariel Ekgren | Amaru Cuba Gyllensten | Evangelia Gogoulou | Alice Heiman | Severine Verlinden | Joey Öhman | Fredrik Carlsson | Magnus Sahlgren
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present GPT-SW3, a 3.5 billion parameter autoregressive language model, trained on a newly created 100 GB Swedish corpus. This paper provides insights with regard to data collection and training, while highlighting the challenges of proper model evaluation. The results of quantitative evaluation through perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study which reveals the good text generation capabilities of GPT-SW3.

Fine-Grained Controllable Text Generation Using Non-Residual Prompting
Fredrik Carlsson | Joey Öhman | Fangyu Liu | Severine Verlinden | Joakim Nivre | Magnus Sahlgren
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The introduction of immensely large Causal Language Models (CLMs) has rejuvenated the interest in open-ended text generation. However, controlling the generative process for these Transformer-based models is largely an unsolved problem. Earlier work has explored either plug-and-play decoding strategies or more powerful but blunt approaches such as prompting. Hence, there currently exists a trade-off between fine-grained control and the capability for more expressive high-level instructions. To alleviate this trade-off, we propose an encoder-decoder architecture that enables intermediate text prompts at arbitrary time steps. We propose a resource-efficient method for converting a pre-trained CLM into this architecture, and demonstrate its potential on various experiments, including the novel task of contextualized word inclusion. Our method provides strong results on multiple experimental settings, proving itself to be both expressive and versatile.