Jakub Binkowski
2025
When Will the Tokens End? Graph-Based Forecasting for LLMs Output Length
Grzegorz Piotrowski | Mateusz Bystroński | Mikołaj Hołysz | Jakub Binkowski | Grzegorz Chodak | Tomasz Jan Kajdanowicz
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Large Language Models (LLMs) are typically trained to predict the next token in a sequence. However, their internal representations often encode signals that go beyond immediate next-token prediction. In this work, we investigate whether these hidden states also carry information about the remaining length of the generated output, an implicit form of foresight (CITATION). We formulate this as a regression problem where, at generation step t, the target is the number of remaining tokens y_t = T − t, with T as the total output length. We propose two approaches: (1) an aggregation-based model that combines hidden states from multiple transformer layers ℓ ∈ {8, …, 15} using element-wise operations such as mean or sum, and (2) a Layerwise Graph Regressor that treats layerwise hidden states as nodes in a fully connected graph and applies a Graph Neural Network (GNN) to predict y_t. Both models operate on frozen LLM embeddings without requiring end-to-end fine-tuning. Accurately estimating remaining output length has both theoretical and practical implications. From an interpretability standpoint, it suggests that LLMs internally track their generation progress. From a systems perspective, it enables optimizations such as output-length-aware scheduling (CITATION). Our graph-based model achieves state-of-the-art performance on the Alpaca dataset using LLaMA-3-8B-Instruct, reducing normalized mean absolute error (NMAE) by over 50% in short-output scenarios.
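The sketch below illustrates the two approaches described in the abstract; it is not the authors' code. The layer range ℓ ∈ {8, …, 15} and regression target y_t = T − t follow the abstract, while the hidden size, MLP widths, and number of message-passing rounds are illustrative assumptions. Since the graph over layers is fully connected, one round of mean-aggregation message passing reduces to mixing each node with the mean over all nodes, which is how the GNN is sketched here.

```python
# Minimal sketch (assumed architecture details, not the paper's implementation).
import torch
import torch.nn as nn

LAYERS = range(8, 16)   # layers ℓ ∈ {8, ..., 15}, per the abstract
HIDDEN = 4096           # assumed hidden size (e.g., LLaMA-3-8B)

class AggregationRegressor(nn.Module):
    """Approach (1): element-wise aggregation (here, mean) over layerwise states."""
    def __init__(self, hidden=HIDDEN):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, h):
        # h: (batch, n_layers, hidden) frozen hidden states at generation step t
        return self.head(h.mean(dim=1)).squeeze(-1)   # predicts y_t = T - t

class LayerwiseGraphRegressor(nn.Module):
    """Approach (2): layerwise states as nodes of a fully connected graph.
    On a complete graph, mean-aggregation message passing amounts to
    combining each node with the mean of all node features."""
    def __init__(self, hidden=HIDDEN, dim=256, rounds=2):
        super().__init__()
        self.proj = nn.Linear(hidden, dim)
        self.msg = nn.ModuleList([nn.Linear(2 * dim, dim) for _ in range(rounds)])
        self.head = nn.Linear(dim, 1)

    def forward(self, h):
        # h: (batch, n_layers, hidden)
        x = torch.relu(self.proj(h))
        for lin in self.msg:
            m = x.mean(dim=1, keepdim=True).expand_as(x)   # messages from all other nodes
            x = torch.relu(lin(torch.cat([x, m], dim=-1)))
        return self.head(x.mean(dim=1)).squeeze(-1)        # graph readout -> y_t
```

Both modules consume precomputed hidden states, so the LLM itself stays frozen; only the small regression heads are trained, as the abstract describes.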
2024
Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings
Albert Sawczyn | Jakub Binkowski | Piotr Bielak | Tomasz Kajdanowicz
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Knowledge-intensive tasks pose a significant challenge for Machine Learning (ML) techniques. Commonly adopted methods, such as Large Language Models (LLMs), often exhibit limitations when applied to such tasks. Nevertheless, there have been notable endeavours to mitigate these challenges, with a significant emphasis on augmenting LLMs through Knowledge Graphs (KGs). While KGs provide many advantages for representing knowledge, their development costs can deter extensive research and applications. Addressing this limitation, we introduce a framework for enriching embeddings of small-scale domain-specific Knowledge Graphs with well-established general-purpose KGs. Adopting our method, a modest domain-specific KG can benefit from a performance boost in downstream tasks when linked to a substantial general-purpose KG. Experimental evaluations demonstrate a notable enhancement, with up to a 44% increase observed in the Hits@10 metric. This relatively unexplored research direction can catalyze more frequent incorporation of KGs in knowledge-intensive tasks, resulting in more robust, reliable ML implementations that hallucinate less than prevalent LLM solutions.
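A hedged toy sketch of the core idea, not the paper's framework: merge a small domain-specific KG with a general-purpose KG through shared entities, then train a single embedding table over the union so domain entities inherit structure from the larger graph. The example triples, the TransE scoring function, and all hyperparameters are illustrative assumptions.

```python
# Toy illustration (assumed triples and model; not the authors' method).
import torch
import torch.nn as nn

domain_kg  = [("aspirin", "treats", "headache")]                # small domain KG
general_kg = [("aspirin", "instance_of", "drug"),
              ("drug", "subclass_of", "chemical_substance")]    # general-purpose KG
triples = domain_kg + general_kg                                # linked via "aspirin"

ents = {e: i for i, e in enumerate({s for s, _, _ in triples} | {o for _, _, o in triples})}
rels = {r: i for i, r in enumerate({r for _, r, _ in triples})}

E = nn.Embedding(len(ents), 64)   # shared entity table: enrichment happens here
R = nn.Embedding(len(rels), 64)
opt = torch.optim.Adam(list(E.parameters()) + list(R.parameters()), lr=1e-2)

idx = torch.tensor([(ents[s], rels[r], ents[o]) for s, r, o in triples])
for _ in range(200):                              # TransE objective: ||s + r - o|| -> 0
    s, r, o = E(idx[:, 0]), R(idx[:, 1]), E(idx[:, 2])
    loss = (s + r - o).norm(dim=-1).mean()        # (negative sampling omitted for brevity)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the entity table is shared, gradients from the general-purpose triples shape the representations of domain entities as well, which is the intuition behind the reported Hits@10 gains.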