Chen Chen


2023

pdf
Ideology Prediction from Scarce and Biased Supervision: Learn to Disregard the “What” and Focus on the “How”!
Chen Chen | Dylan Walker | Venkatesh Saligrama
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs. This problem is motivated by the fact that manual data-labeling is expensive, while self-reported labels are often scarce and exhibit significant selection bias. We propose a novel statistical model that decomposes the document embeddings into a linear superposition of two vectors; a latent neutral context vector independent of ideology, and a latent position vector aligned with ideology. We train an end-to-end model that has intermediate contextual and positional vectors as outputs. At deployment time, our model predicts labels for input documents by exclusively leveraging the predicted positional vectors. On two benchmark datasets we show that our model is capable of outputting predictions even when trained with as little as 5% biased data, and is significantly more accurate than the state-of-the-art. Through crowd-sourcing we validate the neutrality of contextual vectors, and show that context filtering results in ideological concentration, allowing for prediction on out-of-distribution examples.

pdf
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition
Yuchen Hu | Chen Chen | Ruizhe Li | Heqing Zou | Eng Siong Chng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Audio-visual speech recognition (AVSR) attracts a surge of research interest recently by leveraging multimodal signals to understand human speech. Mainstream approaches addressing this task have developed sophisticated architectures and techniques for multi-modality fusion and representation learning. However, the natural heterogeneity of different modalities causes distribution gap between their representations, making it challenging to fuse them. In this paper, we aim to learn the shared representations across modalities to bridge their gap. Different from existing similar methods on other multimodal tasks like sentiment analysis, we focus on the temporal contextual dependencies considering the sequence-to-sequence task setting of AVSR. In particular, we propose an adversarial network to refine frame-level modality-invariant representations (MIR-GAN), which captures the commonality across modalities to ease the subsequent multimodal fusion process. Extensive experiments on public benchmarks LRS3 and LRS2 show that our approach outperforms the state-of-the-arts.

pdf
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
Yuchen Hu | Ruizhe Li | Chen Chen | Chengwei Qin | Qiu-Shi Zhu | Eng Siong Chng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Audio-visual speech recognition (AVSR) provides a promising solution to ameliorate the noise-robustness of audio-only speech recognition with visual information. However, most existing efforts still focus on audio modality to improve robustness considering its dominance in AVSR task, with noise adaptation techniques such as front-end denoise processing. Though effective, these methods are usually faced with two practical challenges: 1) lack of sufficient labeled noisy audio-visual training data in some real-world scenarios and 2) less optimal model generality to unseen testing noises. In this work, we investigate the noise-invariant visual modality to strengthen robustness of AVSR, which can adapt to any testing noises while without dependence on noisy training data, a.k.a., unsupervised noise adaptation. Inspired by human perception mechanism, we propose a universal viseme-phoneme mapping (UniVPM) approach to implement modality transfer, which can restore clean audio from visual signals to enable speech recognition under any noisy conditions. Extensive experiments on public benchmarks LRS3 and LRS2 show that our approach achieves the state-of-the-art under various noisy as well as clean conditions. In addition, we also outperform previous state-of-the-arts on visual speech recognition task.

pdf
TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities
Zhe Zhao | Yudong Li | Cheng Hou | Jing Zhao | Rong Tian | Weijie Liu | Yiren Chen | Ningyuan Sun | Haoyan Liu | Weiquan Mao | Han Guo | Weigang Gou | Taiqiang Wu | Tao Zhu | Wenhang Shi | Chen Chen | Shan Huang | Sihong Chen | Liqun Liu | Feifei Li | Xiaoshuai Chen | Xingwu Sun | Zhanhui Kang | Xiaoyong Du | Linlin Shen | Kimmo Yan
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Recently, the success of pre-training in text domain has been fully extended to vision, audio, and cross-modal scenarios. The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework. In this paper, we present TencentPretrain, a toolkit supporting pre-training models of different modalities. The core feature of TencentPretrain is the modular design. The toolkit uniformly divides pre-training models into 5 components: embedding, encoder, target embedding, decoder, and target. As almost all of common modules are provided in each component, users can choose the desired modules from different components to build a complete pre-training model. The modular design enables users to efficiently reproduce existing pre-training models or build brand-new one. We test the toolkit on text, vision, and audio benchmarks and show that it can match the performance of the original implementations.

pdf
UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou | Meng Shen | Chen Chen | Yuchen Hu | Deepu Rajan | Eng Siong Chng
Findings of the Association for Computational Linguistics: ACL 2023

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality relationship, treat each modality equally, suffer sensor noise, and thus reduce multimodal learning performance. In this work, we propose a novel multimodal contrastive method to explore more reliable multimodal representations under the weak supervision of unimodal predicting. Specifically, we first capture task-related unimodal representations and the unimodal predictions from the introduced unimodal predicting task. Then the unimodal representations are aligned with the more effective one by the designed multimodal contrastive method under the supervision of the unimodal predictions. Experimental results with fused features on two image-text classification benchmarks UPMC-Food-101 and N24News show that our proposed Unimodality-Supervised MultiModal Contrastive UniS-MMC learning method outperforms current state-of-the-art multimodal methods. The detailed ablation study and analysis further demonstrate the advantage of our proposed method.

pdf
Type Enhanced BERT for Correcting NER Errors
Kuai Li | Chen Chen | Tao Yang | Tianming Du | Peijie Yu | Dong Du | Feng Zhang
Findings of the Association for Computational Linguistics: ACL 2023

We introduce the task of correcting named entity recognition (NER) errors without re-training model.After an NER model is trained and deployed in production,it makes prediction errors, which usually need to be fixed quickly.To address this problem, we firstly construct a gazetteer containing named entities and corresponding possible entity types.And then, we propose type enhanced BERT (TyBERT),a method that integrates the named entity’s type information into BERT by an adapter layer.When errors are identified, we can repair the model by updating the gazetteer.In other words, the gazetteer becomes a trigger to control NER model’s output.The experiment results in multiple corpus show the effectiveness of our method, which outperforms strong baselines.x

pdf
Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting
Chen Chen | Yufei Wang | Aixin Sun | Bing Li | Kwok-Yan Lam
Findings of the Association for Computational Linguistics: ACL 2023

Knowledge Graph Completion (KGC) often requires both KG structural and textual information to be effective. Pre-trained Language Models (PLMs) have been used to learn the textual information, usually under the fine-tune paradigm for the KGC task. However, the fine-tuned PLMs often overwhelmingly focus on the textual information and overlook structural knowledge. To tackle this issue, this paper proposes CSProm-KG (Conditional Soft Prompts for KGC) which maintains a balance between structural information and textual knowledge. CSProm-KG only tunes the parameters of Conditional Soft Prompts that are generated by the entities and relations representations. We verify the effectiveness of CSProm-KG on three popular static KGC benchmarks WN18RR, FB15K-237 and Wikidata5M, and two temporal KGC benchmarks ICEWS14 and ICEWS05-15. CSProm-KG outperforms competitive baseline models and sets new state-of-the-art on these benchmarks. We conduct further analysis to show (i) the effectiveness of our proposed components, (ii) the efficiency of CSProm-KG, and (iii) the flexibility of CSProm-KG.

2022

pdf
Knowledge Is Flat: A Seq2Seq Generative Framework for Various Knowledge Graph Completion
Chen Chen | Yufei Wang | Bing Li | Kwok-Yan Lam
Proceedings of the 29th International Conference on Computational Linguistics

Knowledge Graph Completion (KGC) has been recently extended to multiple knowledge graph (KG) structures, initiating new research directions, e.g. static KGC, temporal KGC and few-shot KGC. Previous works often design KGC models closely coupled with specific graph structures, which inevitably results in two drawbacks: 1) structure-specific KGC models are mutually incompatible; 2) existing KGC methods are not adaptable to emerging KGs. In this paper, we propose KG-S2S, a Seq2Seq generative framework that could tackle different verbalizable graph structures by unifying the representation of KG facts into “flat” text, regardless of their original form. To remedy the KG structure information loss from the “flat” text, we further improve the input representations of entities and relations, and the inference algorithm in KG-S2S. Experiments on five benchmarks show that KG-S2S outperforms many competitive baselines, setting new state-of-the-art performance. Finally, we analyze KG-S2S’s ability on the different relations and the Non-entity Generations.

pdf
Rule Based Event Extraction for Artificial Social Intelligence
Remo Nitschke | Yuwei Wang | Chen Chen | Adarsh Pyarelal | Rebecca Sharp
Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning

Natural language (as opposed to structured communication modes such as Morse code) is by far the most common mode of communication between humans, and can thus provide significant insight into both individual mental states and interpersonal dynamics. As part of DARPA’s Artificial Social Intelligence for Successful Teams (ASIST) program, we are developing an AI agent team member that constructs and maintains models of their human teammates and provides appropriate task-relevant advice to improve team processes and mission performance. One of the key components of this agent is a module that uses a rule-based approach to extract task-relevant events from natural language utterances in real time, and publish them for consumption by downstream components. In this case study, we evaluate the performance of our rule-based event extraction system on a recently conducted ASIST experiment consisting of a simulated urban search and rescue mission in Minecraft. We compare the performance of our approach with that of a zero-shot neural classifier, and find that our approach outperforms the classifier for all event types, even when the classifier is used in an oracle setting where it knows how many events should be extracted from each utterance.

pdf
Extracted BERT Model Leaks More Information than You Think!
Xuanli He | Lingjuan Lyu | Chen Chen | Qiongkai Xu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

The collection and availability of big data, combined with advances in pre-trained models (e.g. BERT), have revolutionized the predictive performance of natural language processing tasks. This allows corporations to provide machine learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as APIs. Due to significant commercial interest, there has been a surge of attempts to steal remote services via model extraction. Although previous works have made progress in defending against model extraction attacks, there has been little discussion on their performance in preventing privacy leakage. This work bridges this gap by launching an attribute inference attack against the extracted BERT model. Our extensive experiments reveal that model extraction can cause severe privacy leakage even when victim models are facilitated with state-of-the-art defensive strategies.

2021

pdf
On the Transformer Growth for Progressive BERT Training
Xiaotao Gu | Liyuan Liu | Hongkun Yu | Jing Li | Chen Chen | Jiawei Han
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

As the excessive pre-training cost arouses the need to improve efficiency, considerable efforts have been made to train BERT progressively–start from an inferior but low-cost model and gradually increase the computational complexity. Our objective is to help advance the understanding of such Transformer growth and discover principles that guide progressive training. First, we find that similar to network architecture selection, Transformer growth also favors compound scaling. Specifically, while existing methods only conduct network growth in a single dimension, we observe that it is beneficial to use compound growth operators and balance multiple dimensions (e.g., depth, width, and input length of the model). Moreover, we explore alternative growth operators in each dimension via controlled comparison to give practical guidance for operator selection. In light of our analyses, the proposed method CompoundGrow speeds up BERT pre-training by 73.6% and 82.2% for the base and large models respectively while achieving comparable performances.

2020

pdf
Adversarial Self-Supervised Data-Free Distillation for Text Classification
Xinyin Ma | Yongliang Shen | Gongfan Fang | Chen Chen | Chenghao Jia | Weiming Lu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Large pre-trained transformer-based language models have achieved impressive results on a wide range of NLP tasks. In the past few years, Knowledge Distillation(KD) has become a popular paradigm to compress a computationally expensive model to a resource-efficient lightweight model. However, most KD algorithms, especially in NLP, rely on the accessibility of the original training dataset, which may be unavailable due to privacy issues. To tackle this problem, we propose a novel two-stage data-free distillation method, named Adversarial self-Supervised Data-Free Distillation (AS-DFD), which is designed for compressing large-scale transformer-based models (e.g., BERT). To avoid text generation in discrete space, we introduce a Plug & Play Embedding Guessing method to craft pseudo embeddings from the teacher’s hidden knowledge. Meanwhile, with a self-supervised module to quantify the student’s ability, we adapt the difficulty of pseudo embeddings in an adversarial training manner. To the best of our knowledge, our framework is the first data-free distillation framework designed for NLP tasks. We verify the effectiveness of our method on several text classification datasets.

pdf
SynET: Synonym Expansion using Transitivity
Jiale Yu | Yongliang Shen | Xinyin Ma | Chenghao Jia | Chen Chen | Weiming Lu
Findings of the Association for Computational Linguistics: EMNLP 2020

In this paper, we study a new task of synonym expansion using transitivity, and propose a novel approach named SynET, which considers both the contexts of two given synonym pairs. It introduces an auxiliary task to reduce the impact of noisy sentences, and proposes a Multi-Perspective Entity Matching Network to match entities from multiple perspectives. Extensive experiments on a real-world dataset show the effectiveness of our approach.

2019

pdf
Essentia: Mining Domain-specific Paraphrases with Word-Alignment Graphs
Danni Ma | Chen Chen | Behzad Golshan | Wang-Chiew Tan
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

Paraphrases are important linguistic resources for a wide variety of NLP applications. Many techniques for automatic paraphrase mining from general corpora have been proposed. While these techniques are successful at discovering generic paraphrases, they often fail to identify domain-specific paraphrases (e.g., staff, concierge in the hospitality domain). This is because current techniques are often based on statistical methods, while domain-specific corpora are too small to fit statistical methods. In this paper, we present an unsupervised graph-based technique to mine paraphrases from a small set of sentences that roughly share the same topic or intent. Our system, Essentia, relies on word-alignment techniques to create a word-alignment graph that merges and organizes tokens from input sentences. The resulting graph is then used to generate candidate paraphrases. We demonstrate that our system obtains high quality paraphrases, as evaluated by crowd workers. We further show that the majority of the identified paraphrases are domain-specific and thus complement existing paraphrase databases.

2016

pdf
Chinese Zero Pronoun Resolution with Deep Neural Networks
Chen Chen | Vincent Ng
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf
Chinese Event Coreference Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Chinese Zero Pronoun Resolution: A Joint Unsupervised Discourse-Aware Model Rivaling State-of-the-Art Resolvers
Chen Chen | Vincent Ng
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf
Chinese Zero Pronoun Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen Chen | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
Relieving the Computational Bottleneck: Joint Inference for Event Extraction with High-Dimensional Features
Deepak Venugopal | Chen Chen | Vibhav Gogate | Vincent Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
Chen Chen | Vincent Ng
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Compared to entity coreference resolution, there is a relatively small amount of work on event coreference resolution. Much work on event coreference was done for English. In fact, to our knowledge, there are no publicly available results on Chinese event coreference resolution. This paper describes the design, implementation, and evaluation of SinoCoreferencer, an end-to-end state-of-the-art ACE-style Chinese event coreference system. We have made SinoCoreferencer publicly available, in hope to facilitate the development of high-level Chinese natural language applications that can potentially benefit from event coreference information.

2013

pdf
Chinese Zero Pronoun Resolution: Some Recent Advances
Chen Chen | Vincent Ng
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf
Modeling Comma Placement in Chinese Text for Better Readability using Linguistic Features and Gaze Information
Tadayoshi Hara | Chen Chen | Yoshinobu Kano | Akiko Aizawa
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

pdf
Chinese Event Coreference Resolution: Understanding the State of the Art
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf
Linguistically Aware Coreference Evaluation Metrics
Chen Chen | Vincent Ng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

pdf
Joint Modeling for Chinese Event Extraction with Rich Linguistic Features
Chen Chen | Vincent Ng
Proceedings of COLING 2012

pdf
Chinese Noun Phrase Coreference Resolution: Insights into the State of the Art
Chen Chen | Vincent Ng
Proceedings of COLING 2012: Posters

pdf
Combining the Best of Two Worlds: A Hybrid Approach to Multilingual Coreference Resolution
Chen Chen | Vincent Ng
Joint Conference on EMNLP and CoNLL - Shared Task

2010

pdf
A Pipeline Approach to Chinese Personal Name Disambiguation
Yang Song | Zhengyan He | Chen Chen | Houfeng Wang
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

pdf
Clustering Technique in Multi-Document Personal Name Disambiguation
Chen Chen | Junfeng Hu | Houfeng Wang
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop