Gustav Eje Henter
2026
How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation
Anna Klezovich | Johanna Mesch | Gustav Eje Henter | Jonas Beskow
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We present STS Mocap v1, a new 4.1-hour, high-quality motion capture sign language dataset for Swedish Sign Language. The dataset consists of high-quality multimodal data: body movement tracked with markers, fingers tracked with Manus Quantum Metagloves, the face tracked with the iPhone LiveLink app in MetaHuman Animator mode, and corresponding textual sentence translations into spoken Swedish. Using this dataset, we show that four hours of motion capture data is enough for generative modeling of sign language conditioned on 2D pose. In comparison, training the same flow-matching model on only 30 minutes of this data, a common size for sign language motion capture datasets, leads to a significant degradation in the quality of the synthesized motion.
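The abstract mentions training a flow-matching model on the motion data. The paper's actual architecture and conditioning are not given here, but the core flow-matching training target can be sketched in NumPy under the common linear (rectified-flow) path assumption, with toy arrays standing in for mocap frames:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's actual data):
# x1 = "clean" motion frames, x0 = Gaussian noise, shape (batch, dims).
batch, dims = 8, 6
x1 = rng.normal(size=(batch, dims))
x0 = rng.normal(size=(batch, dims))
t = rng.uniform(size=(batch, 1))        # random time in [0, 1]

# Linear probability path and its conditional target velocity:
x_t = (1.0 - t) * x0 + t * x1           # point on the path at time t
v_target = x1 - x0                      # velocity the network should predict

# A network v_theta(x_t, t, pose) would be regressed onto v_target;
# here a zero predictor illustrates the flow-matching MSE loss.
v_pred = np.zeros_like(v_target)
loss = np.mean((v_pred - v_target) ** 2)
```

At inference, one would integrate the learned velocity field from noise at t=0 to a motion sample at t=1; the pose conditioning would enter as an extra input to the network.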
2024
Exploring Internal Numeracy in Language Models: A Case Study on ALBERT
Ulme Wennberg | Gustav Eje Henter
Proceedings of the 2nd Workshop on Mathematical Natural Language Processing @ LREC-COLING 2024
Transformer-based language models have been found to perform basic quantitative reasoning. In this paper, we propose a method for studying how these models internally represent numerical data, and use our proposal to analyze the ALBERT family of language models. Specifically, we extract the learned embeddings these models use to represent tokens corresponding to numbers and ordinals, and subject these embeddings to Principal Component Analysis (PCA). The PCA results reveal that ALBERT models of different sizes, trained and initialized separately, consistently learn to use the axes of greatest variation to represent the approximate ordering of various numerical concepts. Numerals and their textual counterparts are represented in separate clusters, but increase along the same direction in 2D space. Our findings illustrate that language models, trained purely to model text, can intuit basic mathematical concepts, opening avenues for NLP applications that intersect with quantitative reasoning.
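The analysis pipeline described here (extract number-token embeddings, run PCA, inspect the leading components) can be sketched in NumPy. Loading ALBERT's actual embedding table is out of scope for a short example, so synthetic embeddings in which magnitude is encoded along one latent direction stand in for the real ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for learned token embeddings: the paper extracts
# ALBERT's embeddings for number tokens; here each "number" 0..9 gets a
# 32-dim vector whose dominant direction encodes its magnitude, plus noise.
numbers = np.arange(10)
direction = rng.normal(size=32)
E = numbers[:, None] * direction[None, :] + 0.1 * rng.normal(size=(10, 32))

# PCA: center the embeddings, then take top components via SVD.
E_centered = E - E.mean(axis=0)
U, S, Vt = np.linalg.svd(E_centered, full_matrices=False)
coords_2d = E_centered @ Vt[:2].T       # projection onto the top-2 PCs

# If ordering is encoded along the axis of greatest variation, the PC1
# coordinates should be monotone in the number's value (up to sign).
pc1 = coords_2d[:, 0]
monotone = bool(np.all(np.diff(pc1) > 0) or np.all(np.diff(pc1) < 0))
```

With real model embeddings, the same projection would be plotted separately for numeral tokens ("3") and word tokens ("three") to check whether both clusters increase along the same direction.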
2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg | Gustav Eje Henter
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Mechanisms for encoding positional information are central to transformer-based language models. In this paper, we analyze the position embeddings of existing language models, finding strong evidence of translation invariance, both for the embeddings themselves and for their effect on self-attention. The degree of translation invariance increases during training and correlates positively with model performance. Our findings lead us to propose translation-invariant self-attention (TISA), which accounts for the relative position between tokens in an interpretable fashion without needing conventional position embeddings. Our proposal has several theoretical advantages over existing position-representation approaches. Proof-of-concept experiments show that it improves on regular ALBERT on GLUE tasks, while adding orders of magnitude fewer positional parameters.
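The key property of a translation-invariant positional mechanism is that the positional contribution to an attention score depends only on the relative offset j − i, not on absolute positions. TISA's actual parameterization is not spelled out in this abstract; the sketch below uses a simple lookup table over relative offsets as a stand-in, which makes the positional term a Toeplitz bias added to the attention logits:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d = 5, 8
q = rng.normal(size=(seq_len, d))
k = rng.normal(size=(seq_len, d))

# Translation-invariant positional term: one learned scalar per relative
# offset j - i in the range -(L-1)..(L-1), instead of parameters tied to
# absolute positions. (A lookup table is a simplified stand-in for the
# paper's parameterization.)
rel_bias = rng.normal(size=2 * seq_len - 1)

offsets = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
bias = rel_bias[offsets + seq_len - 1]      # bias[i, j] = f(j - i)

logits = q @ k.T / np.sqrt(d) + bias
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Translation invariance shows up as the bias being constant along
# diagonals of the attention matrix (a Toeplitz structure).
is_toeplitz = all(
    bias[i, j] == bias[i + 1, j + 1]
    for i in range(seq_len - 1)
    for j in range(seq_len - 1)
)
```

Because the bias depends only on offsets, shifting the whole sequence leaves the positional term unchanged, and the number of positional parameters grows with the attention window rather than with the model's maximum sequence length.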