Yidong Chen


2022

Towards Robust Neural Machine Translation with Iterative Scheduled Data-Switch Training
Zhongjian Miao | Xiang Li | Liyan Kang | Wen Zhang | Chulun Zhou | Yidong Chen | Bin Wang | Min Zhang | Jinsong Su
Proceedings of the 29th International Conference on Computational Linguistics

Most existing methods for robust neural machine translation (NMT) construct adversarial examples by injecting noise into authentic examples and then exploit the two types of examples indiscriminately. They require the model to translate both the authentic source sentence and its adversarial counterpart into the identical target sentence within the same training stage, which may be a suboptimal way to achieve robust NMT. In this paper, we first conduct a preliminary study to confirm this claim and then propose an Iterative Scheduled Data-switch Training Framework to mitigate the problem. Specifically, we introduce two training stages and iteratively switch between authentic and adversarial examples. Compared with previous studies, our model focuses more on one type of example at each stage, which better exploits both authentic and adversarial examples and thus yields a more robust NMT model. Moreover, we introduce an improved curriculum learning method with a sampling strategy to better schedule the process of noise injection. Experimental results show that our model significantly surpasses several competitive baselines on four translation benchmarks. Our source code is available at https://github.com/DeepLearnXMU/RobustNMT-ISDST.

2021

A Multi-Task Approach for Improving Biomedical Named Entity Recognition by Incorporating Multi-Granularity information
Yiqi Tong | Yidong Chen | Xiaodong Shi
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

A Chinese Main Location Extraction Method based on IDLSTM+CRF
Yiqi Tong (童逸琦) | Peigen Ye (叶培根) | Biao Fu (付彪) | Yidong Chen (陈毅东) | Xiaodong Shi (史晓东)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

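News texts usually mention multiple locations, while the main location describes the geographic attribute of the text's public-opinion content and is a key attribute for public-opinion analysis. At present, there is relatively little deep-learning research on automatic main location extraction. To address this, this paper builds a main location extraction system based on IDLSTM+CRF. The system automatically extracts and completes main location labels through three modules: place name recognition, main location extraction, and main location completion. Experimental results on public datasets show that our method outperforms models such as BiLSTM+CRF on the place name recognition task. For the main location extraction task, however, there is currently no standard Chinese evaluation set. To fill this gap, we annotated and open-sourced a validation set of 1,226 examples and a test set of 1,500 examples. Our main location extraction system achieves extraction accuracies of 91.7% and 84.8% on the two sets, respectively, and has been successfully deployed in an online production environment.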

XMU’s Simultaneous Translation System at NAACL 2021
Shuangtao Li | Jinming Hu | Boli Wang | Xiaodong Shi | Yidong Chen
Proceedings of the Second Workshop on Automatic Simultaneous Translation

This paper describes our two systems submitted to the simultaneous translation evaluation at the 2nd automatic simultaneous translation workshop.

2020

A Document-Level Neural Machine Translation Model with Dynamic Caching Guided by Theme-Rheme Information
Yiqi Tong | Jiangbin Zheng | Hongkang Zhu | Yidong Chen | Xiaodong Shi
Proceedings of the 28th International Conference on Computational Linguistics

Research on document-level Neural Machine Translation (NMT) models has attracted increasing attention in recent years. Although existing work has shown that inter-sentence information helps improve the performance of NMT models, what information should be regarded as context remains ambiguous. To solve this problem, we propose a novel cache-based document-level NMT model that conducts dynamic caching guided by theme-rheme information. Experiments on NIST evaluation sets demonstrate that our proposed model achieves substantial improvements over state-of-the-art baseline NMT models. As far as we know, we are the first to introduce theme-rheme theory into the field of machine translation.

2018

XMU Neural Machine Translation Systems for WAT2018 Myanmar-English Translation Task
Boli Wang | Jinming Hu | Yidong Chen | Xiaodong Shi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation

2017

Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings
Changxing Wu | Xiaodong Shi | Yidong Chen | Jinsong Su | Boli Wang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We introduce a simple and effective method to learn discourse-specific word embeddings (DSWE) for implicit discourse relation recognition. Specifically, DSWE is learned by performing connective classification on massive explicit discourse data and is capable of capturing discourse relationships between words. On the PDTB data set, using DSWE as features achieves significant improvements over baselines.

XMU Neural Machine Translation Online Service
Boli Wang | Zhixing Tan | Jinming Hu | Yidong Chen | Xiaodong Shi
Proceedings of the IJCNLP 2017, System Demonstrations

We demonstrate a neural machine translation web service. Our NMT service provides web-based translation interfaces for a variety of language pairs. We describe the architecture of the NMT runtime pipeline and the training details of the NMT models. We also show several applications of our online translation interfaces.

XMU Neural Machine Translation Systems for WMT 17
Zhixing Tan | Boli Wang | Jinming Hu | Yidong Chen | Xiaodong Shi
Proceedings of the Second Conference on Machine Translation

XMU Neural Machine Translation Systems for WAT 2017
Boli Wang | Zhixing Tan | Jinming Hu | Yidong Chen | Xiaodong Shi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

This paper describes the Neural Machine Translation systems of Xiamen University for the shared translation tasks of WAT 2017. Our systems are based on the Encoder-Decoder framework with attention. We participated in three subtasks. We experimented with subword segmentation, synthetic training data, and model ensembling. Experiments show that all of these methods yield substantial improvements.

2016

Bilingually-constrained Synthetic Data for Implicit Discourse Relation Recognition
Changxing Wu | Xiaodong Shi | Yidong Chen | Yanzhou Huang | Jinsong Su
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2014

On-going Cooperative Research towards Developing Economy-Oriented Chinese-French SMT Systems with a New SMT Framework
Yidong Chen | Lingxiao Wang | Christian Boitet | Xiaodong Shi
Proceedings of TALN 2014 (Volume 2: Short Papers)

2013

Improving Alignment of System Combination by Using Multi-objective Optimization
Tian Xia | Zongcheng Ji | Shaodan Zhai | Yidong Chen | Qun Liu | Shaojun Wang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Towards Automatic Construction of Knowledge Bases from Chinese Online Resources
Liwei Chen | Yansong Feng | Yidong Chen | Lei Zou | Dongyan Zhao
Proceedings of ACL 2012 Student Research Workshop

Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information
Jinsong Su | Hua Wu | Haifeng Wang | Yidong Chen | Xiaodong Shi | Huailin Dong | Qun Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

Improving the Hierarchical Phrase-Based Translation Model
Xiaodong Shi | Xiang Zhu | Yidong Chen
Proceedings of Machine Translation Summit XIII: Papers

2010

Chinese Personal Name Disambiguation: Technical Report of Natural Language Processing Lab of Xiamen University
Xiang Zhu | Xiaodong Shi | Ningfeng Liu | YingMei Guo | Yidong Chen
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Chinese Word Sense Induction based on Hierarchical Clustering Algorithm
Ke Cai | Xiaodong Shi | Yidong Chen | Zhehuang Huang | Yan Gao
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2007

The XMU SMT system for IWSLT 2007
Yidong Chen | Xiaodong Shi | Changle Zhou
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper gives an overview of the XMU statistical machine translation (SMT) system for the 2007 IWSLT Speech Translation Evaluation. Our system is a phrase-based system with a reordering model based on chunking and reordering of the source language. In this year’s evaluation, we participated in the open data track for Clean Transcripts in the Chinese-English translation direction. The system ranked 12th among the 15 participating systems.

2006

The XMU phrase-based statistical machine translation system for IWSLT 2006
Yidong Chen | Xiaodong Shi | Changle Zhou
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign