Evgeny Frolov

2026

Fast and Accurate Fisher-Guided Quantization via Efficient Kronecker Factorization
Viktoriia A. Chekalina | Gerasin Timofey | Andrey Kuznetsov | Evgeny Frolov
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Quantization has shown strong results in preserving model quality under compression. However, under aggressive bit-width reductions, even quantization may require additional information to prevent performance degradation. A natural source is second-order curvature information, captured by the Hessian. Since the Hessian of the model layers is prohibitively large, direct computation is infeasible, making structured parameterizations and approximations crucial in practice. In this work, we propose an efficient Kronecker-factored approximation that yields state-of-the-art performance when integrated into existing quantization schemes. Evaluations on the LLaMA and Qwen model families show near-baseline quality at 4-bit compression and only a 5–6% degradation at 2-bit. Moreover, our method substantially accelerates the most expensive component in second-order quantization, Hessian parameterization, achieving up to a 10× speedup over prior approaches.

2022

MEKER: Memory Efficient Knowledge Embedding Representation for Link Prediction and Question Answering
Viktoriia Chekalina | Anton Razzhigaev | Albert Sayapin | Evgeny Frolov | Alexander Panchenko
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Knowledge Graphs (KGs) are symbolically structured stores of facts. A KG embedding contains concise data used in NLP tasks that require implicit information about the real world. However, the KGs that are useful in practical NLP tasks are enormous, and creating embeddings over them incurs high memory costs. We represent a KG as a 3rd-order binary tensor and move beyond the standard CP decomposition (CITATION) by using a data-specific generalized version of it (CITATION). The generalization of the standard CP-ALS algorithm allows obtaining optimization gradients without a backpropagation mechanism. It reduces the memory needed in training while providing computational benefits. We propose MEKER, a memory-efficient KG embedding model, which yields SOTA-comparable performance on link prediction tasks and KG-based Question Answering.

Co-authors

Albert Sayapin 1

Gerasin Timofey 1

Venues

ACL 2
