Aki Härmä

Also published as: Aki Harma


2025

Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data
Anton Changalidis | Aki Härmä
Proceedings of the First Workshop on Large Language Model Memorization (L2M2)

This paper studies how model architecture and data configuration influence the empirical memorization capacity of generative transformers. The models are trained on synthetic text datasets derived from the Systematized Nomenclature of Medicine (SNOMED) knowledge graph: triplets, which represent static connections, and sequences, which simulate complex relation patterns. The results show that embedding size is the primary determinant of learning speed and capacity, while additional layers provide limited benefit and may hinder performance on simpler datasets. Activation functions also play a crucial role, with Softmax demonstrating greater stability and capacity. Furthermore, increasing the complexity of the dataset appears to improve final memorization. These insights improve our understanding of transformer memory mechanisms and provide a framework for optimizing model design with structured real-world data.
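
The following Python sketch (a minimal illustration, not the authors' code) mimics this kind of capacity measurement: a small transformer is trained on synthetic (head, relation, tail) triplets and "capacity" is read off as the number of triplets it reproduces exactly. The entity and relation counts, model sizes, training schedule, and the use of an encoder over a two-token context are all illustrative assumptions.

# Hypothetical sketch: how many synthetic knowledge-graph triplets can a
# tiny generative transformer memorize? (Not the authors' code.)
import torch
import torch.nn as nn

torch.manual_seed(0)

NUM_ENTITIES, NUM_RELATIONS, NUM_TRIPLETS = 200, 20, 2000
VOCAB = NUM_ENTITIES + NUM_RELATIONS
EMB, LAYERS, HEADS = 64, 2, 4   # embedding size is the main knob studied in the paper

# Synthetic triplets with unique (head, relation) pairs: predict the tail entity.
pairs = torch.randperm(NUM_ENTITIES * NUM_RELATIONS)[:NUM_TRIPLETS]
heads = pairs // NUM_RELATIONS
rels = NUM_ENTITIES + pairs % NUM_RELATIONS
tails = torch.randint(0, NUM_ENTITIES, (NUM_TRIPLETS,))
inputs = torch.stack([heads, rels], dim=1)            # shape (N, 2)

class TinyTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.pos = nn.Parameter(torch.zeros(2, EMB))   # learned positions for the 2 input tokens
        layer = nn.TransformerEncoderLayer(EMB, HEADS, dim_feedforward=4 * EMB, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, LAYERS)
        self.out = nn.Linear(EMB, NUM_ENTITIES)        # logits over tail entities

    def forward(self, x):
        h = self.encoder(self.emb(x) + self.pos)
        return self.out(h[:, -1])                      # predict from the last position

model = TinyTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):                               # full-batch training on all triplets
    opt.zero_grad()
    loss = loss_fn(model(inputs), tails)
    loss.backward()
    opt.step()

# Empirical memorization capacity: number of triplets recalled exactly.
model.eval()
with torch.no_grad():
    recalled = (model(inputs).argmax(dim=-1) == tails).sum().item()
print(f"memorized {recalled}/{NUM_TRIPLETS} triplets")

Sweeping EMB and LAYERS in such a harness is one way to reproduce the kind of embedding-size-versus-depth comparison the abstract describes.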

Emergence of symbolic abstraction heads for in-context learning in large language models
Ali Al-Saeedi | Aki Harma
Proceedings of Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning @ COLING 2025

Large Language Models (LLMs) based on self-attention circuits are able to perform novel reasoning tasks at inference time, but the mechanisms inside the models are not yet fully understood. We assume that LLMs can generalize abstract patterns from the input and form an internal symbolic representation of the content. In this paper, we study this by analyzing the performance of small language models trained on sequences of instantiations of abstract sequential symbolic patterns, or templates. We show that even a two-layer model is able to learn an abstract template and use it to generate correct output that follows the pattern, which can be seen as a form of symbolic inference taking place inside the network. We call this emergent mechanism an abstraction head. Identifying mechanisms of symbolic reasoning in a neural network can help to find new ways to merge symbolic and neural processing.
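
A minimal sketch of the kind of probe such an analysis implies (assumed, not taken from the paper): every trial instantiates the same abstract template with freshly drawn tokens, so a model can only complete a held-out instantiation by abstracting the template itself rather than memorizing surface tokens. The template, vocabulary size, and the reference "copy from two positions back" rule are illustrative assumptions.

# Hypothetical sketch: measuring template abstraction on novel instantiations.
import random

VOCAB_SIZE = 100
TEMPLATE = [0, 1, 0, 1, 0, 1]   # abstract slots: alternate two symbols

def instantiate(template, vocab_size):
    """Bind each abstract slot to a concrete, randomly drawn token."""
    slots = sorted(set(template))
    binding = {s: random.randrange(vocab_size) for s in slots}
    return [binding[s] for s in template]

def abstraction_score(model_fn, n_trials=1000):
    """Fraction of novel instantiations whose final token model_fn predicts."""
    correct = 0
    for _ in range(n_trials):
        seq = instantiate(TEMPLATE, VOCAB_SIZE)
        prompt, target = seq[:-1], seq[-1]
        correct += int(model_fn(prompt) == target)
    return correct / n_trials

# An induction-style rule ("repeat the token two positions back") solves this
# template; a transformer that discovers it in-context behaves like the
# abstraction head described in the paper.
print(abstraction_score(lambda prompt: prompt[-2]))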

2024

A Hybrid Retrieval Approach for Advancing Retrieval-Augmented Generation Systems
Nguyen Nam Doan | Aki Härmä | Remzi Celebi | Valeria Gottardo
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)

2020

Iterative Neural Scoring of Validated Insight Candidates
Allmin Susaiyah | Aki Härmä | Ehud Reiter | Milan Petković
Proceedings of the Workshop on Intelligent Information Processing and Natural Language Generation

2018

Interactive health insight miner: an adaptive, semantic-based approach
Isabel Funke | Rim Helaoui | Aki Härmä
Proceedings of the 11th International Conference on Natural Language Generation

E-health applications aim to support the user in adopting healthy habits. An important feature is to provide insights into the user’s lifestyle. To actively engage the user in the insight mining process, we propose an ontology-based framework with a Controlled Natural Language interface, which enables the user to ask for specific insights and to customize personal information.
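
A minimal sketch, assuming a regex-based controlled grammar and a steps/sleep example vocabulary (neither is specified in the abstract), of how a Controlled Natural Language request could be mapped onto a structured insight query.

# Hypothetical sketch: parsing a CNL insight request into a structured query.
import re

CNL_PATTERN = re.compile(
    r"show my (?P<stat>average|total) (?P<measure>steps|sleep) "
    r"per (?P<unit>day|week) last (?P<period>week|month)"
)

def parse_insight_request(text: str) -> dict:
    """Map a controlled-language sentence onto the fields an insight miner needs."""
    match = CNL_PATTERN.fullmatch(text.strip().lower())
    if match is None:
        raise ValueError("request is outside the controlled language")
    return match.groupdict()

print(parse_insight_request("Show my average steps per day last week"))
# {'stat': 'average', 'measure': 'steps', 'unit': 'day', 'period': 'week'}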