Liang Chen


2022

Hierarchical Curriculum Learning for AMR Parsing
Peiyi Wang | Liang Chen | Tianyu Liu | Damai Dai | Yunbo Cao | Baobao Chang | Zhifang Sui
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Abstract Meaning Representation (AMR) parsing aims to translate sentences into a semantic representation with a hierarchical structure, and has recently been empowered by pretrained sequence-to-sequence models. However, there is a gap between their flat training objective (i.e., one that treats all output tokens equally) and the hierarchical AMR structure, which limits model generalization. To bridge this gap, we propose a Hierarchical Curriculum Learning (HCL) framework with Structure-level (SC) and Instance-level Curricula (IC). SC switches progressively from core to detailed AMR semantic elements, while IC moves from structurally simple to structurally complex AMR instances during training. Through these two warm-up processes, HCL reduces the difficulty of learning complex structures, so the flat model can better adapt to the AMR hierarchy. Extensive experiments on AMR2.0, AMR3.0, and structure-complex and out-of-distribution settings verify the effectiveness of HCL.

Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation
Liang Chen | Runxin Xu | Baobao Chang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Label smoothing and vocabulary sharing are two widely used techniques in neural machine translation models. However, we argue that simply applying both techniques can conflict and even lead to sub-optimal performance. When allocating smoothed probability, original label smoothing treats source-side words that would never appear in the target language the same as real target-side words, which could bias the translation model. To address this issue, we propose Masked Label Smoothing (MLS), a new mechanism that masks the soft label probability of source-side words to zero. Simple yet effective, MLS better integrates label smoothing with vocabulary sharing. Our extensive experiments show that MLS consistently improves over original label smoothing on different datasets, including bilingual and multilingual translation, in terms of both translation quality and model calibration. Our code is released at https://github.com/PKUnlp-icler/MLS
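The masking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' released code: the function name `masked_label_smoothing`, the boolean `target_mask`, and the choice to spread the smoothing mass uniformly over the remaining target-side tokens are all assumptions made for illustration.

```python
def masked_label_smoothing(gold, target_mask, alpha=0.1):
    """Build a soft label distribution over a shared vocabulary.

    gold        -- index of the gold target token
    target_mask -- list of bools; True if the token can appear in the
                   target language (hypothetical input, assumed precomputed)
    alpha       -- total smoothing mass taken away from the gold token
    """
    if not target_mask[gold]:
        raise ValueError("gold token must belong to the target vocabulary")

    # Only target-side tokens other than the gold one receive smoothing mass.
    allowed = [i for i, ok in enumerate(target_mask) if ok and i != gold]
    dist = [0.0] * len(target_mask)
    if not allowed:
        # Nothing to smooth over: fall back to a one-hot label.
        dist[gold] = 1.0
        return dist

    share = alpha / len(allowed)
    for i in allowed:
        dist[i] = share
    dist[gold] = 1.0 - alpha
    # Source-only tokens keep probability 0.0 (the "mask").
    return dist


# Vocabulary of 4 tokens; token 2 occurs only on the source side.
soft = masked_label_smoothing(gold=0, target_mask=[True, True, False, True])
```

With plain label smoothing, token 2 would also receive a share of `alpha`; here that share is redistributed to tokens that can actually occur in the target language.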

ATP: AMRize Then Parse! Enhancing AMR Parsing with PseudoAMRs
Liang Chen | Peiyi Wang | Runxin Xu | Tianyu Liu | Zhifang Sui | Baobao Chang
Findings of the Association for Computational Linguistics: NAACL 2022

As Abstract Meaning Representation (AMR) implicitly involves compound semantic annotations, we hypothesize that auxiliary tasks which are semantically or formally related can better enhance AMR parsing. We find that 1) semantic role labeling (SRL) and dependency parsing (DP) bring more performance gain than other tasks, e.g., MT and summarization, in the text-to-AMR transition, even with much less data. 2) To better fit AMR, data from auxiliary tasks should be properly “AMRized” to PseudoAMR before training. Knowledge from shallow-level parsing tasks can be better transferred to AMR parsing with structure transformation. 3) Intermediate-task learning is a better paradigm than multitask learning for introducing auxiliary tasks to AMR parsing. From an empirical perspective, we propose a principled method for involving auxiliary tasks to boost AMR parsing. Extensive experiments show that our method achieves new state-of-the-art performance on different benchmarks, especially in topology-related scores. Code and models are released at https://github.com/PKUnlp-icler/ATP.

2021

DialogSum: A Real-Life Scenario Dialogue Summarization Dataset
Yulong Chen | Yang Liu | Liang Chen | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2003

A Differential LSI Method for Document Classification
Liang Chen | Naoyuki Tokuda | Akira Nagai
Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages

A Patent Document Retrieval System Addressing Both Semantic and Syntactic Properties
Liang Chen | Naoyuki Tokuda | Hisahiro Adachi
Proceedings of the ACL-2003 Workshop on Patent Corpus Processing

1999

A new diagnostic system for J-E translation ILTS by global matching algorithm and POST parser
Liang Chen | Naoyuki Tokuda
Proceedings of Machine Translation Summit VII

A new diagnostic system has been developed for an interactive template-structured intelligent language tutoring system (ILTS) for Japanese-English translation, in which an efficient heaviest common sequence (HCS) matching algorithm and a ‘part-of-speech tagged (POST) parser’ play a key role. The system exploits a template consisting of complex transition networks that comprise both model (correct) translations and many typical erroneous translations characteristic of nonnative beginners, all collected and extracted from the translations of about 200 monitors. From among the many candidate paths in the system template, the system selects as the best-matched sentence the path having an HCS with the student’s input translation. This template structure allows the potentially complicated bug-finding processes in natural language to be implemented by a much simpler and more efficient HCS string matching algorithm [20]. To improve the precision of the parser, we have developed a ‘probabilistic POST parser’ in which we eliminate part-of-speech ambiguity by manually pre-assigning POS tags to all words in potentially correct paths of the template. Combining the template-based diagnostic system and the parser, we found that the ILTS is capable of providing the most adequate diagnostic messages and a tutoring strategy with appropriate comments after analyzing the keyed-in translated sentences from students.
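The path selection described in this abstract can be sketched as a weighted longest-common-subsequence search. This is only an illustration under stated assumptions: the abstract does not give the weighting scheme or the network traversal, so the function names `hcs_weight` and `best_path`, the default unit weight per word, and the representation of the template as an explicit list of candidate paths (rather than a transition network) are all hypothetical simplifications.

```python
def hcs_weight(path, student, weight=lambda tok: 1.0):
    """Weight of the heaviest common subsequence of two token lists.

    weight -- hypothetical per-token weight; with the default of 1.0 this
              reduces to the ordinary longest common subsequence length.
    """
    # dp[i][j] = heaviest common subsequence weight of path[:i], student[:j]
    dp = [[0.0] * (len(student) + 1) for _ in range(len(path) + 1)]
    for i, x in enumerate(path, 1):
        for j, y in enumerate(student, 1):
            best = max(dp[i - 1][j], dp[i][j - 1])
            if x == y:
                best = max(best, dp[i - 1][j - 1] + weight(x))
            dp[i][j] = best
    return dp[-1][-1]


def best_path(paths, student, weight=lambda tok: 1.0):
    """Pick the candidate path that shares the heaviest subsequence."""
    return max(paths, key=lambda p: hcs_weight(p, student, weight))


# Two candidate paths: a model translation and a typical erroneous one.
paths = [
    ["he", "goes", "to", "school"],
    ["he", "go", "to", "school"],
]
student = "he go to school".split()
matched = best_path(paths, student)
```

In this toy case the student's input matches the erroneous path exactly, so a diagnostic message attached to that path could be returned to the student.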

1997

QuickSet: Multimodal Interaction for Simulation Set-up and Control
Philip R. Cohen | Michael Johnston | David McGee | Sharon Oviatt | Jay Pittman | Ira Smith | Liang Chen | Josh Clow
Fifth Conference on Applied Natural Language Processing