Savas Ozkan
2026
Decoding Text Spans for Efficient and Accurate Named-Entity Recognition
Andrea Maracani | Savas Ozkan | Junyi Zhu | Sinan Mutlu | Mete Ozay
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Named Entity Recognition (NER) is a key component in industrial information extraction pipelines, where systems must satisfy strict latency and throughput constraints in addition to strong accuracy. State-of-the-art NER accuracy is often achieved by span-based frameworks, which construct span representations from token encodings and classify candidate spans. However, many span-based methods enumerate large numbers of candidates and process each candidate with marker-augmented inputs, substantially increasing inference cost and limiting scalability in large-scale deployments. In this work, we propose SpanDec, an efficient span-based NER framework that targets this bottleneck. Our main insight is that span representation interactions can be computed effectively at the final transformer stage, avoiding redundant computation in earlier layers via a lightweight decoder dedicated to span representations. We further introduce a span filtering mechanism during enumeration to prune unlikely candidates before expensive processing. Across multiple benchmarks, SpanDec matches competitive span-based baselines while improving throughput and reducing computational cost, yielding a better accuracy–efficiency trade-off suitable for high-volume serving and on-device applications.
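To make the enumerate-then-filter idea in this abstract concrete, here is a minimal sketch, not SpanDec itself. It lists every candidate span up to a maximum width and keeps only those that a scoring function rates above a threshold. The names `enumerate_spans`, `filter_spans`, and `capitalised_fraction` are hypothetical, and the capitalisation heuristic is only a toy stand-in for whatever learned filter the paper uses.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """List every candidate span containing at most max_width tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_width, num_tokens))
    ]


def filter_spans(
    spans: List[Span], score: Callable[[Span], float], threshold: float
) -> List[Span]:
    """Keep spans scoring at least threshold, so later stages see fewer candidates."""
    return [span for span in spans if score(span) >= threshold]


# Toy example; the scorer below is an assumption made for illustration only.
tokens = ["Savas", "Ozkan", "works", "in", "London"]


def capitalised_fraction(span: Span) -> float:
    """Fraction of tokens in the span that start with an uppercase letter."""
    start, end = span
    words = tokens[start : end + 1]
    return sum(word[0].isupper() for word in words) / len(words)


candidates = enumerate_spans(len(tokens), max_width=2)
kept = filter_spans(candidates, capitalised_fraction, threshold=1.0)
print(len(candidates), "candidates,", len(kept), "kept:", kept)
```

In this toy run, 9 candidate spans are reduced to 4 before any expensive span classification would take place. The number of candidates grows roughly as the sentence length times the maximum width, which is why pruning early can matter for throughput.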
2025
Multi-Task Pre-Finetuning of Lightweight Transformer Encoders for Text Classification and NER
Junyi Zhu | Savas Ozkan | Andrea Maracani | Sinan Mutlu | Cho Jung Min | Mete Ozay
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Deploying natural language processing (NLP) models on mobile platforms requires models that can adapt across diverse applications while remaining efficient in memory and computation. We investigate pre-finetuning strategies to enhance the adaptability of lightweight BERT-like encoders for two fundamental NLP task families: named entity recognition (NER) and text classification. While pre-finetuning improves downstream performance for each task family individually, we find that naïve multi-task pre-finetuning introduces conflicting optimization signals that degrade overall performance. To address this, we propose a simple yet effective multi-task pre-finetuning framework based on task-primary LoRA modules, which enables a single shared encoder backbone with modular adapters. Our approach achieves performance comparable to individual pre-finetuning while meeting practical deployment constraints. Experiments on 21 downstream tasks show average improvements of +0.8% for NER and +8.8% for text classification, demonstrating the effectiveness of our method for versatile mobile NLP applications.
2024
A Study of Parameter Efficient Fine-tuning by Learning to Efficiently Fine-Tune
Taha Ceritli | Savas Ozkan | Jeongwon Min | Eunchung Noh | Cho Jung Min | Mete Ozay
Findings of the Association for Computational Linguistics: EMNLP 2024
The growing size of large language models (LLMs) requires parameter-efficient fine-tuning (PEFT) methods for their adaptation to new tasks. Existing methods, such as Low-Rank Adaptation (LoRA), typically adapt a model by training the PEFT parameters. An open problem in applying these methods effectively is how to identify the PEFT parameters. More precisely, related works identify PEFT parameters by projecting the high-dimensional parameters of LLMs onto low-dimensional parameter manifolds with predefined projections, or by identifying the PEFT parameters as the projections themselves. To study this problem, we propose a new approach called Learning to Efficiently Fine-tune (LEFT), in which we aim to learn spaces of PEFT parameters from data. To learn how to generate the PEFT parameters on a learned parameter space while fine-tuning the LLMs, we propose the Parameter Generation (PG) method. In the experimental analyses, we examine the effectiveness of our solutions by exploring the accuracy of fine-tuned LLMs and the characteristics of PEFT parameters on GLUE benchmark tasks.
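As background for the LoRA baseline mentioned in this abstract, the sketch below shows the standard low-rank update: a frozen weight matrix W is adapted by adding a scaled product of two small trainable matrices B and A. It illustrates plain LoRA only, not the LEFT or PG methods proposed in the paper. The helper names `matvec`, `lora_forward`, and `trainable_counts` are hypothetical and the matrices are toy values.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W: Matrix, A: Matrix, B: Matrix, x: Vector, alpha: float) -> Vector:
    """Compute W x + (alpha / r) * B (A x), where r is the adapter rank.

    W has shape (d_out, d_in) and stays frozen; A has shape (r, d_in)
    and B has shape (d_out, r), and only A and B would be trained.
    """
    rank = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / rank) * u for b, u in zip(base, update)]


def trainable_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full fine-tuning parameters, LoRA parameters) for one matrix."""
    return d_out * d_in, rank * (d_in + d_out)


# Toy values, chosen only so the arithmetic is easy to follow.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]            # rank 1
B_zero = [[0.0], [0.0]]     # LoRA initialises B to zero, so the model starts unchanged
B_trained = [[1.0], [2.0]]  # pretend values after some training
x = [1.0, 2.0]

print(lora_forward(W, A, B_zero, x, alpha=1.0))     # same as W x
print(lora_forward(W, A, B_trained, x, alpha=1.0))  # W x plus the low-rank update
print(trainable_counts(768, 768, 8))
```

For a 768 by 768 matrix at rank 8, the adapter trains 12,288 parameters instead of 589,824, which is the sense in which the method is parameter-efficient.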