Anastasia Voznyuk

2025

pdf bib abs
Quantifying Logical Consistency in Transformers via Query-Key Alignment
Eduard Tulchinskii | Laida Kushnareva | Anastasia Voznyuk | Andrei Andriiainen | Irina Piontkovskaya | Evgeny Burnaev | Serguei Barannikov
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) excel at many NLP tasks, yet their multi-step logical reasoning remains unreliable. Existing solutions such as Chain-of-Thought prompting generate intermediate steps but provide no internal check of their logical coherence. In this paper, we use the “QK-score”, a lightweight metric based on query–key alignments within transformer attention heads, to evaluate the logical reasoning capabilities of LLMs. Our method automatically identifies attention heads that play a key role in distinguishing valid from invalid logical inferences, enabling efficient inference-time evaluation via a single forward pass. It reveals latent reasoning structure in LLMs and provides a scalable mechanistic alternative to ablation-based analysis. Across three benchmarks: ProntoQA-OOD, PARARULE-Plus, and MultiLogicEval, with models ranging from 1.5B to 70B parameters, the selected heads predict logical validity up to 14% better than the model probabilities, and remain robust under distractors and increasing reasoning depth of d≤ 6.

Artificial Text Detection (ATD) is becoming increasingly important with the rise of advanced Large Language Models (LLMs). Despite numerous efforts, no single algorithm performs consistently well across different types of unseen text or guarantees effective generalization to new LLMs. Interpretability plays a crucial role in achieving this goal. In this study, we enhance ATD interpretability by using Sparse Autoencoders (SAE) to extract features from Gemma-2-2B’s residual stream. We identify both interpretable and efficient features, analyzing their semantics and relevance through domain- and model-specific statistics, a steering approach, and manual or LLM-based interpretation of obtained features. Our methods offer valuable insights into how texts from various models differ from human-written content. We show that modern LLMs have a distinct writing style, especially in information-dense domains, even though they can produce human-like outputs with personalized prompts. The code for this paper is available at https://github.com/pyashy/SAE_ATD.

pdf bib abs
Advacheck at GenAI Detection Task 1: AI Detection Powered by Domain-Aware Multi-Tasking
German Gritsai | Anastasia Voznyuk | Ildar Khabutdinov | Andrey Grabovoy
Proceedings of the 1stWorkshop on GenAI Content Detection (GenAIDetect)

The paper describes a system designed by Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of GenAI Detection Task 1 competition. Our developed system is a multi-task architecture with shared Transformer Encoder between several classification heads. One head is responsible for binary classification between human-written and machine-generated texts, while the other heads are auxiliary multiclass classifiers for texts of different domains from particular datasets. As multiclass heads were trained to distinguish the domains presented in the data, they provide a better understanding of the samples. This approach led us to achieve the first place in the official ranking with 83.07% macro F1-score on the test set and bypass the baseline by 10%. We further study obtained system through ablation, error and representation analyses, finding that multi-task learning outperforms single-task mode and simultaneous tasks form a cluster structure in embeddings space.

pdf bib abs
Advacheck at SemEval-2025 Task 3: Combining NER and RAG to Spot Hallucinations in LLM Answers
Anastasia Voznyuk | German Gritsai | Andrey Grabovoy
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

The Mu-SHROOM competition in the SemEval-2025 Task 3 aims to tackle the problem of detecting spans with hallucinations in texts, generated by Large Language Models (LLMs). Our developed system, submitted to this task, is a joint architecture that utilises Named Entity Recognition (NER), Retrieval-Augmented Generation (RAG) and LLMs to gather, compare and analyse information in the texts provided by organizers. We extract entities potentially capable of containing hallucinations with NER, aggregate relevant topics for them using RAG, then verify and provide a verdict on the extracted information using the LLMs. This approach allowed with a certain level of quality to find hallucinations not only in facts, but misspellings in names and titles, which was not always accepted by human annotators in ground truth markup. We also point out some inconsistencies within annotators spans, that perhaps affected scores of all participants.

2024

pdf bib abs
DeepPavlov 1.0: Your Gateway to Advanced NLP Models Backed by Transformers and Transfer Learning
Maksim Savkin | Anastasia Voznyuk | Fedor Ignatov | Anna Korzanova | Dmitry Karpov | Alexander Popov | Vasily Konovalov
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present DeepPavlov 1.0, an open-source framework for using Natural Language Processing (NLP) models by leveraging transfer learning techniques. DeepPavlov 1.0 is created for modular and configuration-driven development of state-of-the-art NLP models and supports a wide range of NLP model applications. DeepPavlov 1.0 is designed for practitioners with limited knowledge of NLP/ML. DeepPavlov is based on PyTorch and supports HuggingFace transformers. DeepPavlov is publicly released under the Apache 2.0 license and provides access to an online demo.

pdf bib abs
DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts
Anastasia Voznyuk | Vasily Konovalov
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

The Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection shared task in the SemEval-2024 competition aims to tackle the problem of misusing collaborative human-AI writing. Although there are a lot of existing detectors of AI content, they are often designed to give a binary answer and thus may not be suitable for more nuanced problem of finding the boundaries between human-written and machine-generated texts, while hybrid human-AI writing becomes more and more popular. In this paper, we address the boundary detection problem. Particularly, we present a pipeline for augmenting data for supervised fine-tuning of DeBERTaV3. We receive new best MAE score, according to the leaderboard of the competition, with this pipeline.