Puyang Xu


2022

Open World Classification with Adaptive Negative Samples
Ke Bai | Guoyin Wang | Jiwei Li | Sunghyun Park | Sungjin Lee | Puyang Xu | Ricardo Henao | Lawrence Carin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Open world classification is a task in natural language processing with key practical relevance and impact. Since the open or unknown category data only manifests in the inference phase, finding a model with a suitable decision boundary accommodating for the identification of known classes and discrimination of the open category is challenging. The performance of existing models is limited by the lack of effective open category data during the training stage or the lack of a good mechanism to learn appropriate decision boundaries. We propose an approach based on Adaptive Negative Samples (ANS) designed to generate effective synthetic open category samples in the training stage and without requiring any prior knowledge or external datasets. Empirically, we find a significant advantage in using auxiliary one-versus-rest binary classifiers, which effectively utilize the generated negative samples and avoid the complex threshold-seeking stage in previous works. Extensive experiments on three benchmark datasets show that ANS achieves significant improvements over state-of-the-art methods.

2018

An End-to-end Approach for Handling Unknown Slot Values in Dialogue State Tracking
Puyang Xu | Qi Hu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We highlight a practical yet rarely discussed problem in dialogue state tracking (DST), namely handling unknown slot values. Previous approaches generally assume predefined candidate lists and thus are not designed to output unknown values, especially when the spoken language understanding (SLU) module is absent, as in many end-to-end (E2E) systems. We describe in this paper an E2E architecture based on the pointer network (PtrNet) that can effectively extract unknown slot values while still obtaining state-of-the-art accuracy on the standard DSTC2 benchmark. We also provide extensive empirical evidence to show that tracking unknown values can be challenging and that our approach can bring significant improvement with the help of an effective feature dropout technique.

2011

Efficient Subsampling for Training Complex Language Models
Puyang Xu | Asela Gunawardana | Sanjeev Khudanpur
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing