Zuev Aleksandr


2025

GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Maxim Zhelnin | Viktor Moskvoretskii | Egor Shvetsov | Maria Krylova | Venediktov Egor | Zuev Aleksandr | Evgeny Burnaev
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Parameter-Efficient Fine-Tuning (PEFT) methods have gained popularity and democratized the usage of Large Language Models (LLMs). Recent studies have shown that a small subset of weights significantly impacts performance. Based on this observation, we introduce a novel PEFT method called Gaussian noise Injected Fine-Tuning of Salient Weights (GIFT-SW). Our method updates only salient columns, while injecting Gaussian noise into non-salient ones. To identify these columns, we develop a generalized sensitivity metric that extends and unifies metrics from previous studies. Experiments with LLaMA models demonstrate that GIFT-SW outperforms full fine-tuning and modern PEFT methods under the same computational budget. Moreover, GIFT-SW offers practical advantages for recovering the performance of models subjected to mixed-precision quantization while keeping salient weights in full precision.
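
The abstract describes the update rule only at a high level. Below is a minimal PyTorch sketch of that idea, intended as an illustration rather than the authors' implementation: the column score used here is a plain L2-norm proxy, not the generalized sensitivity metric proposed in the paper, and the names select_salient_columns, gift_sw_forward, delta and noise_std are illustrative.

    # Minimal sketch of the GIFT-SW idea from the abstract (not the authors' code).
    # The column score below is a simple magnitude proxy; the paper proposes a
    # generalized sensitivity metric that this placeholder does not reproduce.
    import torch

    def select_salient_columns(weight: torch.Tensor, k: int) -> torch.Tensor:
        """Return indices of the k columns with the largest L2 norm (placeholder metric)."""
        scores = weight.norm(dim=0)                   # one score per column
        return torch.topk(scores, k).indices

    def gift_sw_forward(weight: torch.Tensor,
                        salient_idx: torch.Tensor,
                        delta: torch.Tensor,
                        noise_std: float = 1e-3) -> torch.Tensor:
        """Trainable update on salient columns; Gaussian noise injected into the rest."""
        noise = noise_std * torch.randn_like(weight)  # noise for every column...
        noise[:, salient_idx] = 0.0                   # ...except the salient ones
        update = torch.zeros_like(weight).index_copy(1, salient_idx, delta)
        return weight + noise + update                # base weight stays frozen

    # Usage: only `delta` receives gradients; the pretrained weight is frozen.
    W = torch.randn(256, 1024)                        # frozen pretrained weight matrix
    idx = select_salient_columns(W, k=64)
    delta = torch.zeros(256, 64, requires_grad=True)  # trainable salient-column update
    W_eff = gift_sw_forward(W, idx, delta)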

Hypercomplex Transformer: Novel Attention Mechanism
Maxim Gordeev | Zuev Aleksandr | Mikhail Bakulin | Andrey Latyshev | Dmitry Kozlov | Yiwu Yao | Voronova Anastasia
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Self-attention mechanisms have become foundational across modern deep learning architectures. Recent efforts focus on improving their efficiency, particularly for signal processing tasks. Existing approaches employ complex-valued representations for inputs and weights and achieve higher accuracy at the cost of increased model size and inference latency. Dual-number algebra offers a promising alternative that allows efficient multiplication and faster inference with the same representational capacity. Inspired by previous studies on hypercomplex neural networks, we introduce a generalized hypercomplex attention block and integrate it into Transformer-based models for EEG classification. Our experiments adapt the hypercomplex models so that their parameter counts match those of their real-valued counterparts. Across all scenarios, the dual- and complex-valued models consistently outperform the real-valued ones, demonstrating superior accuracy. This work presents hypercomplex attention as a competitive and computationally efficient strategy with potential value for solving multiple NLP tasks.
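
As a concrete illustration of why dual numbers are attractive here: a dual number a + b·eps satisfies eps² = 0, so a dual-valued matrix product needs three real matrix multiplications instead of the four required for complex numbers. The sketch below shows this product and one possible way to feed it into attention scores; it is only a toy under assumed design choices (softmax over the real component, hypothetical names dual_matmul and dual_attention_scores), not the paper's generalized hypercomplex attention block.

    # Toy sketch of dual-number arithmetic (a + b*eps, eps**2 = 0) inside attention scores.
    # Illustration only; not the generalized hypercomplex attention block from the paper.
    import torch
    import torch.nn.functional as F

    def dual_matmul(a_re, a_du, b_re, b_du):
        """(A_re + A_du*eps) @ (B_re + B_du*eps).
        eps**2 = 0, so three real matmuls suffice (complex numbers need four)."""
        real = a_re @ b_re
        dual = a_re @ b_du + a_du @ b_re
        return real, dual

    def dual_attention_scores(q_re, q_du, k_re, k_du, scale):
        """Attention weights from the real part of Q K^T; the dual part is carried along."""
        s_re, s_du = dual_matmul(q_re, q_du,
                                 k_re.transpose(-2, -1), k_du.transpose(-2, -1))
        attn = F.softmax(s_re * scale, dim=-1)        # assumed choice: softmax on real part
        return attn, s_du

    # Usage with random tensors: batch of 2, 5 tokens, head dimension 16.
    q_re, q_du, k_re, k_du = (torch.randn(2, 5, 16) for _ in range(4))
    attn, _ = dual_attention_scores(q_re, q_du, k_re, k_du, scale=16 ** -0.5)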