Unsupervised Out-of-Domain Detection via Pre-trained Transformers

Keyang Xu; Tongzheng Ren; Shikun Zhang; Yihao Feng; Caiming Xiong

doi:10.18653/v1/2021.acl-long.85

Abstract

Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs. Those out-of-domain inputs can lead to unpredictable outputs and sometimes catastrophic safety issues. Prior studies on out-of-domain detection require in-domain task labels and are limited to supervised classification scenarios. Our work tackles the problem of detecting out-of-domain samples with only unsupervised in-domain data. We utilize the latent representations of pre-trained transformers and propose a simple yet effective method to transform features across all layers to construct out-of-domain detectors efficiently. Two domain-specific fine-tuning approaches are further proposed to boost detection accuracy. Our empirical evaluations of related methods on two datasets validate that our method greatly improves out-of-domain detection ability in a more general scenario.
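As a rough illustration of building a detector from features of all layers using only unlabeled in-domain data, the sketch below fits per-layer statistics and sums a distance over layers. It is not the authors' implementation: the diagonal-covariance Mahalanobis distance, the summation across layers, the synthetic features, and every function name are assumptions made for illustration. In practice the per-layer vectors would come from the hidden states of a pre-trained transformer.

```python
import random
import statistics

def fit_layer(features):
    """Per-dimension mean and variance of in-domain features for one layer."""
    dims = list(zip(*features))
    means = [statistics.fmean(d) for d in dims]
    # Small constant keeps the division below well defined.
    variances = [statistics.pvariance(d) + 1e-6 for d in dims]
    return means, variances

def layer_distance(x, stats):
    """Squared Mahalanobis distance under a diagonal covariance (assumption)."""
    means, variances = stats
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))

def fit_detector(in_domain_layers):
    """in_domain_layers[l] is a list of feature vectors for layer l."""
    return [fit_layer(feats) for feats in in_domain_layers]

def ood_score(sample_layers, detector):
    """sample_layers[l] is one feature vector for layer l; higher means more OOD."""
    return sum(layer_distance(x, s) for x, s in zip(sample_layers, detector))

def synthetic(rng, center, n_layers=3, dim=4):
    """Hypothetical stand-in for transformer hidden states."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_layers)]

rng = random.Random(0)
train = [synthetic(rng, 0.0) for _ in range(200)]
# Regroup from per-sample lists to per-layer lists.
in_domain_layers = [[s[l] for s in train] for l in range(3)]
detector = fit_detector(in_domain_layers)

in_score = ood_score([[0.0] * 4] * 3, detector)   # near the in-domain center
out_score = ood_score([[6.0] * 4] * 3, detector)  # far from the in-domain center
print(in_score < out_score)
```

No task labels are used anywhere in the fit, which is the sense in which such a detector is unsupervised; a threshold on the score would still have to be chosen separately.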

Anthology ID: 2021.acl-long.85
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 1052–1061
URL: https://aclanthology.org/2021.acl-long.85
DOI: 10.18653/v1/2021.acl-long.85
Cite (ACL): Keyang Xu, Tongzheng Ren, Shikun Zhang, Yihao Feng, and Caiming Xiong. 2021. Unsupervised Out-of-Domain Detection via Pre-trained Transformers. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1052–1061, Online. Association for Computational Linguistics.
Cite (Informal): Unsupervised Out-of-Domain Detection via Pre-trained Transformers (Xu et al., ACL-IJCNLP 2021)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2021.acl-long.85.pdf
Video: https://preview.aclanthology.org/nschneid-patch-4/2021.acl-long.85.mp4
Code: rivercold/BERT-unsupervised-OOD
Data: Multi30K, SNLI, SST