@inproceedings{jeon-strube-2025-entity,
    title = "Entity Tracking in Small Language Models: An Attention-Based Study of Parameter-Efficient Fine-Tuning",
    author = "Jeon, Sungho  and
      Strube, Michael",
    editor = "Strube, Michael  and
      Braud, Chloe  and
      Hardmeier, Christian  and
      Li, Junyi Jessy  and
      Loaiciga, Sharid  and
      Zeldes, Amir  and
      Li, Chuyuan",
    booktitle = "Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025)",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.codi-1.4/",
    pages = "42--53",
    ISBN = "979-8-89176-343-2",
    abstract = "The ability to track entities is fundamental for language understanding, yet the internal mechanisms governing this capability in Small Language Models (SLMs) are poorly understood. Previous studies often rely on indirect probing or complex interpretability methods, leaving a gap for lightweight diagnostics that connect model behavior to performance. To bridge this gap, we introduce a framework to analyze entity tracking by measuring the attention flow between entity and non-entity tokens within SLMs. We apply this to analyze models both before and after Parameter-Efficient Fine-Tuning (PEFT). Our analysis reveals two key findings. First, SLMs' attentional strategies vary significantly with text type, but entities consistently receive a high degree of focus. Second, we show that PEFT {--} specifically QLoRA {--} dramatically improves classification performance on entity-centric tasks by increasing the model{'}s attentional focus on entity-related tokens. Our work provides direct evidence for how PEFT can refine a model{'}s internal mechanisms and establishes attention analysis as a valuable, lightweight diagnostic tool for interpreting and improving SLMs."
}