Chunyang Xiao


2024

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-Context Models
Xinyu Liu | Runsong Zhao | Pengcheng Huang | Chunyang Xiao | Bei Li | Jingang Wang | Tong Xiao | JingBo Zhu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Numerous recent works aim to extend the effective context length of language models, and various methods, tasks, and benchmarks exist to measure a model’s effective memory length. However, through thorough investigation, we find limitations in existing evaluations of model memory. We provide an extensive survey of these limitations and propose a new method, the forgetting curve, to measure the memorization capability of long-context models. We show that the forgetting curve is robust to the tested corpus and the experimental settings, does not rely on prompts, and can be applied to models of any size. We apply the forgetting curve to a wide variety of models, covering both transformer and RNN/SSM-based architectures. Our measurements provide empirical evidence for the effectiveness of transformer extension techniques while raising questions about the effective length of RNN/SSM-based models. We also examine the differences between our measurement and existing benchmarks, as well as popular metrics, across various models.

2019

Grammatical Sequence Prediction for Real-Time Neural Semantic Parsing
Chunyang Xiao | Christoph Teichmann | Konstantine Arkoudas
Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges

While sequence-to-sequence (seq2seq) models achieve state-of-the-art performance in many natural language processing tasks, they can be too slow for real-time applications. One performance bottleneck is predicting the most likely next token over a large vocabulary; methods to circumvent this bottleneck are a current research topic. We focus specifically on using seq2seq models for semantic parsing, where we observe that grammars often exist that specify valid formal representations of utterance semantics. By developing a generic approach for restricting the predictions of a seq2seq model to grammatically permissible continuations, we arrive at a widely applicable technique for speeding up semantic parsing. The technique leads to a 74% speed-up on an in-house dataset with a large vocabulary, compared to the same neural model without grammatical restrictions.

2016

Orthogonality regularizer for question answering
Chunyang Xiao | Guillaume Bouchard | Marc Dymetman | Claire Gardent
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

Sequence-based Structured Prediction for Semantic Parsing
Chunyang Xiao | Marc Dymetman | Claire Gardent
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Reversibility reconsidered: finite-state factors for efficient probabilistic sampling in parsing and generation
Marc Dymetman | Sriram Venkatapathy | Chunyang Xiao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing