Chen Chen


2021

On the Transformer Growth for Progressive BERT Training
Xiaotao Gu | Liyuan Liu | Hongkun Yu | Jing Li | Chen Chen | Jiawei Han
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

As excessive pre-training costs drive the need to improve efficiency, considerable effort has been made to train BERT progressively: start from an inferior but low-cost model and gradually increase the computational complexity. Our objective is to help advance the understanding of such Transformer growth and to discover principles that guide progressive training. First, we find that, similar to network architecture selection, Transformer growth also favors compound scaling. Specifically, while existing methods conduct network growth in a single dimension only, we observe that it is beneficial to use compound growth operators that balance multiple dimensions (e.g., the depth, width, and input length of the model). Moreover, we explore alternative growth operators in each dimension via controlled comparisons to give practical guidance for operator selection. In light of our analyses, the proposed method, CompoundGrow, speeds up BERT pre-training by 73.6% and 82.2% for the base and large models respectively, while achieving comparable performance.
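
To make the compound-scaling idea concrete, here is a minimal sketch of a growth schedule that expands depth, width, and input length together rather than one dimension at a time. The stage count, growth factors, and configuration fields are illustrative assumptions, not the paper's actual growth operators.

```python
# Minimal sketch of compound growth scheduling (illustrative; not the
# CompoundGrow implementation). All multipliers are assumed values.
from dataclasses import dataclass


@dataclass
class ModelConfig:
    depth: int    # number of Transformer layers
    width: int    # hidden dimension
    seq_len: int  # input sequence length


def compound_grow(cfg: ModelConfig, depth_mult: float, width_mult: float,
                  len_mult: float) -> ModelConfig:
    """Grow all three dimensions together, as compound growth suggests,
    instead of scaling a single dimension in isolation."""
    return ModelConfig(
        depth=round(cfg.depth * depth_mult),
        width=round(cfg.width * width_mult),
        seq_len=round(cfg.seq_len * len_mult),
    )


# Start from a small, low-cost model and grow it in stages.
stages = [ModelConfig(depth=3, width=256, seq_len=128)]
for _ in range(2):
    stages.append(compound_grow(stages[-1], depth_mult=2.0,
                                width_mult=1.5, len_mult=2.0))

for i, cfg in enumerate(stages):
    print(f"stage {i}: {cfg}")
```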

2020

Adversarial Self-Supervised Data-Free Distillation for Text Classification
Xinyin Ma | Yongliang Shen | Gongfan Fang | Chen Chen | Chenghao Jia | Weiming Lu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Large pre-trained transformer-based language models have achieved impressive results on a wide range of NLP tasks. In the past few years, Knowledge Distillation (KD) has become a popular paradigm for compressing a computationally expensive model into a resource-efficient, lightweight one. However, most KD algorithms, especially in NLP, rely on access to the original training dataset, which may be unavailable due to privacy issues. To tackle this problem, we propose a novel two-stage data-free distillation method, named Adversarial self-Supervised Data-Free Distillation (AS-DFD), designed for compressing large-scale transformer-based models (e.g., BERT). To avoid text generation in discrete space, we introduce a Plug & Play Embedding Guessing method to craft pseudo embeddings from the teacher’s hidden knowledge. Meanwhile, with a self-supervised module that quantifies the student’s ability, we adapt the difficulty of the pseudo embeddings in an adversarial training manner. To the best of our knowledge, ours is the first data-free distillation framework designed for NLP tasks. We verify the effectiveness of our method on several text classification datasets.
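
The core training loop of such a data-free setup can be sketched as alternating student and generator updates: the student imitates the teacher on crafted pseudo embeddings, while the generator is pushed adversarially toward embeddings the student handles poorly. The sketch below is an illustration of this loop with toy model sizes, not the authors' AS-DFD implementation, and it omits the self-supervised difficulty module.

```python
# Illustrative data-free distillation loop (toy models; not AS-DFD itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

hidden, n_classes, noise_dim, batch = 64, 4, 16, 8

teacher = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                        nn.Linear(hidden, n_classes))  # stands in for BERT
student = nn.Linear(hidden, n_classes)                 # lightweight model
generator = nn.Linear(noise_dim, hidden)               # crafts pseudo embeddings

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)

def disagreement(pseudo):
    """KL divergence between student and teacher predictions."""
    return F.kl_div(F.log_softmax(student(pseudo), dim=-1),
                    F.softmax(teacher(pseudo), dim=-1),
                    reduction="batchmean")

for step in range(200):
    noise = torch.randn(batch, noise_dim)

    # Student step: match the teacher on the crafted pseudo embeddings.
    loss_s = disagreement(generator(noise).detach())
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # Generator step: adversarially craft embeddings that maximize
    # student-teacher disagreement, i.e. harder training examples.
    loss_g = -disagreement(generator(noise))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```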

SynET: Synonym Expansion using Transitivity
Jiale Yu | Yongliang Shen | Xinyin Ma | Chenghao Jia | Chen Chen | Weiming Lu
Findings of the Association for Computational Linguistics: EMNLP 2020

In this paper, we study a new task of synonym expansion using transitivity and propose a novel approach named SynET, which considers the contexts of both given synonym pairs. It introduces an auxiliary task to reduce the impact of noisy sentences and a Multi-Perspective Entity Matching Network to match entities from multiple perspectives. Extensive experiments on a real-world dataset show the effectiveness of our approach.
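
The transitivity intuition itself is simple: known pairs (a, b) and (b, c) suggest the candidate (a, c), which then has to be verified against the contexts of both source pairs, since transitivity does not always hold. Below is a minimal sketch of the candidate-generation step only, with made-up example pairs and without the paper's verification model.

```python
# Candidate generation via transitivity (illustrative pairs; the
# context-based verification from the paper is not reproduced here).
known = {("notebook", "laptop"), ("laptop", "portable computer")}

candidates = {
    (a, d)
    for a, b in known
    for c, d in known
    if b == c and a != d and (a, d) not in known
}
print(candidates)  # {('notebook', 'portable computer')}
```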

2019

Essentia: Mining Domain-specific Paraphrases with Word-Alignment Graphs
Danni Ma | Chen Chen | Behzad Golshan | Wang-Chiew Tan
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

Paraphrases are important linguistic resources for a wide variety of NLP applications. Many techniques for automatically mining paraphrases from general corpora have been proposed. While these techniques are successful at discovering generic paraphrases, they often fail to identify domain-specific ones (e.g., staff and concierge in the hospitality domain). This is because current techniques are often based on statistical methods, and domain-specific corpora are too small for such methods to fit reliably. In this paper, we present an unsupervised graph-based technique to mine paraphrases from a small set of sentences that roughly share the same topic or intent. Our system, Essentia, relies on word-alignment techniques to create a word-alignment graph that merges and organizes tokens from the input sentences. The resulting graph is then used to generate candidate paraphrases. We demonstrate that our system obtains high-quality paraphrases, as evaluated by crowd workers. We further show that the majority of the identified paraphrases are domain-specific and thus complement existing paraphrase databases.
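
The word-alignment-graph idea can be illustrated with a toy example: tokens shared across sentences are merged into single graph nodes, edges follow word order, and divergent paths between shared anchors become paraphrase candidates. The sketch below uses exact-match merging for simplicity; Essentia's actual alignment is more sophisticated.

```python
# Toy word-alignment graph (exact-token merging; illustrative only).
sentences = [
    "please ask the staff".split(),
    "please ask the concierge".split(),
]

# Tokens are node ids, so identical tokens are merged automatically;
# edges follow word order within each sentence.
edges = set()
for sent in sentences:
    path = ["<s>"] + sent + ["</s>"]
    edges.update(zip(path, path[1:]))

def paths(node, goal, seen=()):
    """Enumerate simple paths from node to goal in the alignment graph."""
    if node == goal:
        yield (node,)
        return
    for a, b in edges:
        if a == node and b not in seen:
            for rest in paths(b, goal, seen + (node,)):
                yield (node,) + rest

# Divergent segments between shared anchors are paraphrase candidates.
for p in paths("the", "</s>"):
    print(" ".join(p[1:-1]))  # prints "staff" and "concierge"
```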

2016

Chinese Zero Pronoun Resolution with Deep Neural Networks
Chen Chen | Vincent Ng
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Chinese Zero Pronoun Resolution: A Joint Unsupervised Discourse-Aware Model Rivaling State-of-the-Art Resolvers
Chen Chen | Vincent Ng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Chinese Event Coreference Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
Chen Chen | Vincent Ng
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Compared to entity coreference resolution, there is relatively little work on event coreference resolution, and much of the existing work targets English. In fact, to our knowledge, there are no publicly available results on Chinese event coreference resolution. This paper describes the design, implementation, and evaluation of SinoCoreferencer, an end-to-end, state-of-the-art ACE-style Chinese event coreference system. We have made SinoCoreferencer publicly available, in the hope of facilitating the development of high-level Chinese natural language applications that can potentially benefit from event coreference information.

Chinese Zero Pronoun Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Relieving the Computational Bottleneck: Joint Inference for Event Extraction with High-Dimensional Features
Deepak Venugopal | Chen Chen | Vibhav Gogate | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Chinese Zero Pronoun Resolution: Some Recent Advances
Chen Chen | Vincent Ng
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Modeling Comma Placement in Chinese Text for Better Readability using Linguistic Features and Gaze Information
Tadayoshi Hara | Chen Chen | Yoshinobu Kano | Akiko Aizawa
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

Chinese Event Coreference Resolution: Understanding the State of the Art
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Linguistically Aware Coreference Evaluation Metrics
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

Combining the Best of Two Worlds: A Hybrid Approach to Multilingual Coreference Resolution
Chen Chen | Vincent Ng
Joint Conference on EMNLP and CoNLL - Shared Task

Joint Modeling for Chinese Event Extraction with Rich Linguistic Features
Chen Chen | Vincent Ng
Proceedings of COLING 2012

Chinese Noun Phrase Coreference Resolution: Insights into the State of the Art
Chen Chen | Vincent Ng
Proceedings of COLING 2012: Posters

2010

A Pipeline Approach to Chinese Personal Name Disambiguation
Yang Song | Zhengyan He | Chen Chen | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

Clustering Technique in Multi-Document Personal Name Disambiguation
Chen Chen | Junfeng Hu | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop