Adam Gonczarek
2023
Classical Out-of-Distribution Detection Methods Benchmark in Text Classification Tasks
Mateusz Baran | Joanna Baran | Mateusz Wójcik | Maciej Zięba | Adam Gonczarek
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
State-of-the-art models can perform well in controlled environments, but they often struggle when presented with out-of-distribution (OOD) examples, making OOD detection a critical component of NLP systems. In this paper, we focus on highlighting the limitations of existing approaches to OOD detection in NLP. Specifically, we evaluated eight OOD detection methods that are easily integrable into existing NLP systems and require no additional OOD data or model modifications. One of our contributions is providing a well-structured research environment that allows for full reproducibility of the results. Additionally, our analysis shows that existing OOD detection methods for NLP tasks are not yet sufficiently sensitive to capture all samples characterized by various types of distributional shifts. Particularly challenging testing scenarios arise in cases of background shift and randomly shuffled word order within in-domain texts. This highlights the need for future work to develop more effective OOD detection approaches for NLP problems, and our work provides a well-defined foundation for further research in this area.
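The abstract does not enumerate the eight methods here, but a representative post-hoc score that satisfies the stated constraints (no additional OOD data, no model modifications) is the Maximum Softmax Probability baseline of Hendrycks and Gimpel (2017). A minimal sketch in PyTorch, assuming logits from any already-trained classifier; the threshold value is purely illustrative:

```python
import torch
import torch.nn.functional as F

def msp_ood_score(logits: torch.Tensor) -> torch.Tensor:
    # Maximum Softmax Probability: a higher max class probability means
    # "more in-distribution", so we negate it to obtain an OOD score
    # (larger score = more likely out-of-distribution).
    probs = F.softmax(logits, dim=-1)
    return -probs.max(dim=-1).values

# Example: score a batch using any trained classifier's logits and flag
# inputs above a threshold calibrated on held-out in-distribution data.
logits = torch.randn(4, 10)   # hypothetical batch of 4, 10 classes
scores = msp_ood_score(logits)
threshold = -0.5              # illustrative value only
is_ood = scores > threshold
```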
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform
Mateusz Wójcik | Witold Kościukiewicz | Mateusz Baran | Tomasz Kajdanowicz | Adam Gonczarek
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Production deployments in complex systems require ML architectures to be highly efficient and usable across multiple tasks. Particularly demanding are classification problems in which data arrives in a streaming fashion and each class is presented separately. Recent methods based on stochastic gradient learning have been shown to struggle in such setups, or suffer from limitations such as memory buffers or restriction to specific domains, which prevent their use in real-world scenarios. For this reason, we present a fully differentiable architecture based on the Mixture of Experts model that enables the training of high-performance classifiers when examples from each class are presented separately. We conducted exhaustive experiments that demonstrate its applicability in various domains and its ability to learn online in production environments. The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
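The abstract does not spell out the architecture's details; as a rough illustration of the core idea only, here is a minimal fully differentiable Mixture-of-Experts classifier in PyTorch, where a softmax gating network weights the experts' class logits. The layer sizes and expert count are assumptions for the sketch, not the paper's configuration:

```python
import torch
import torch.nn as nn

class MoEClassifier(nn.Module):
    # Minimal fully differentiable Mixture-of-Experts classifier:
    # a softmax gate weights each expert's class logits, so the whole
    # model trains end-to-end with ordinary gradient descent.
    def __init__(self, in_dim: int, n_classes: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                          nn.Linear(128, n_classes))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)                  # (B, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, C)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)        # (B, C)

# Usage sketch (hypothetical feature dimension and class count):
# logits = MoEClassifier(in_dim=768, n_classes=5)(features)
```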