Sinan Mutlu
2026
Decoding Text Spans for Efficient and Accurate Named-Entity Recognition
Andrea Maracani | Savas Ozkan | Junyi Zhu | Sinan Mutlu | Mete Ozay
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Named Entity Recognition (NER) is a key component in industrial information extraction pipelines, where systems must satisfy strict latency and throughput constraints in addition to strong accuracy. State-of-the-art NER accuracy is often achieved by span-based frameworks, which construct span representations from token encodings and classify candidate spans. However, many span-based methods enumerate large numbers of candidates and process each candidate with marker-augmented inputs, substantially increasing inference cost and limiting scalability in large-scale deployments. In this work, we propose SpanDec, an efficient span-based NER framework that targets this bottleneck. Our main insight is that span representation interactions can be computed effectively at the final transformer stage, avoiding redundant computation in earlier layers via a lightweight decoder dedicated to span representations. We further introduce a span filtering mechanism during enumeration to prune unlikely candidates before expensive processing. Across multiple benchmarks, SpanDec matches competitive span-based baselines while improving throughput and reducing computational cost, yielding a better accuracy–efficiency trade-off suitable for high-volume serving and on-device applications.
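The idea of pruning candidates during enumeration can be illustrated with a small sketch. The abstract does not say how SpanDec scores or filters spans, so everything here is an assumption made for illustration: the width limit, the per-token boundary scores, the threshold, and the function names `enumerate_spans` and `filter_spans` are all hypothetical, and the input is a toy example.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) token indices, both inclusive


def enumerate_spans(num_tokens: int, max_width: int) -> List[Span]:
    """List every span that covers at most max_width tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_width, num_tokens))
    ]


def filter_spans(
    spans: Sequence[Span],
    start_scores: Sequence[float],
    end_scores: Sequence[float],
    threshold: float,
) -> List[Span]:
    """Keep spans whose start and end tokens both score at or above threshold.

    The boundary scores are a hypothetical stand-in for whatever cheap
    signal a real filter would use; the abstract does not specify it.
    """
    return [
        (s, e)
        for (s, e) in spans
        if start_scores[s] >= threshold and end_scores[e] >= threshold
    ]


# Toy input: made-up tokens and made-up boundary scores.
tokens = ["Alice", "Smith", "visited", "New", "York"]
start_scores = [0.9, 0.1, 0.0, 0.8, 0.1]
end_scores = [0.2, 0.9, 0.0, 0.1, 0.9]

candidates = enumerate_spans(len(tokens), max_width=3)
kept = filter_spans(candidates, start_scores, end_scores, threshold=0.5)

print(len(candidates), "candidates ->", len(kept), "kept:", kept)
```

In this toy case only two of the twelve enumerated spans survive, so any more expensive per-span processing would run on a much smaller set.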
2025
Multi-Task Pre-Finetuning of Lightweight Transformer Encoders for Text Classification and NER
Junyi Zhu | Savas Ozkan | Andrea Maracani | Sinan Mutlu | Cho Jung Min | Mete Ozay
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Deploying natural language processing (NLP) models on mobile platforms requires models that can adapt across diverse applications while remaining efficient in memory and computation. We investigate pre-finetuning strategies to enhance the adaptability of lightweight BERT-like encoders for two fundamental NLP task families: named entity recognition (NER) and text classification. While pre-finetuning improves downstream performance for each task family individually, we find that naïve multi-task pre-finetuning introduces conflicting optimization signals that degrade overall performance. To address this, we propose a simple yet effective multi-task pre-finetuning framework based on task-primary LoRA modules, which enables a single shared encoder backbone with modular adapters. Our approach achieves performance comparable to individual pre-finetuning while meeting practical deployment constraints. Experiments on 21 downstream tasks show average improvements of +0.8% for NER and +8.8% for text classification, demonstrating the effectiveness of our method for versatile mobile NLP applications.
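The general pattern of one shared backbone with modular low-rank adapters can be sketched as follows. This shows generic per-task LoRA on a single linear layer, not the paper's task-primary design, whose details the abstract does not give. The class `LoRALinear`, the task names `"ner"` and `"cls"`, and all matrix values are hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LoRALinear:
    """A shared weight matrix plus one low-rank update per task (hypothetical)."""

    def __init__(self, weight: Matrix, scale: float = 1.0) -> None:
        self.weight = weight  # shared across all tasks
        self.scale = scale
        self.adapters: Dict[str, Tuple[Matrix, Matrix]] = {}

    def add_adapter(self, task: str, down: Matrix, up: Matrix) -> None:
        """Register a rank-r pair: down is r x d_in, up is d_out x r."""
        self.adapters[task] = (down, up)

    def forward(self, x: Vector, task: str) -> Vector:
        """Compute W x + scale * up (down x) with the adapter of the given task."""
        base = matvec(self.weight, x)
        down, up = self.adapters[task]
        delta = matvec(up, matvec(down, x))
        return [b + self.scale * d for b, d in zip(base, delta)]


# Toy values: a 2 x 2 identity backbone and two rank-1 adapters.
layer = LoRALinear(weight=[[1.0, 0.0], [0.0, 1.0]])
layer.add_adapter("ner", down=[[1.0, 0.0]], up=[[0.0], [1.0]])
layer.add_adapter("cls", down=[[0.0, 1.0]], up=[[1.0], [0.0]])

x = [2.0, 3.0]
ner_out = layer.forward(x, task="ner")
cls_out = layer.forward(x, task="cls")
print(ner_out, cls_out)
```

The same backbone weights serve both tasks; only the small adapter pair changes, which is what keeps the memory cost of supporting an additional task low.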