Yu Hong


Sub-Word Alignment is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation
Minhan Xu | Yu Hong
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We leverage embedding duplication between aligned sub-words to extend the Parent-Child transfer learning method, so as to improve low-resource machine translation. We conduct experiments on benchmark datasets of My-En, Id-En and Tr-En translation scenarios. The test results show that our method produces substantial improvements, achieving BLEU scores of 22.5, 28.0 and 18.1 respectively. In addition, the method is computationally efficient: it reduces training time by 63.8%, to 1.6 hours when training on a Tesla 16GB P100 GPU. All the models and source code in the experiments will be made publicly available to support reproducible research.
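
As a rough illustration of what embedding duplication between aligned sub-words could look like, the sketch below initializes a child embedding table by copying parent rows for aligned sub-words and randomly initializing the rest. It is not the authors' implementation: the function name `init_child_embeddings`, the toy vocabularies, the alignment dictionary and the random-initialization range are all hypothetical, and how the alignment is obtained is outside the sketch.

```python
import random


def init_child_embeddings(parent_vocab, parent_emb, child_vocab, alignment, dim, seed=0):
    """Build a child embedding table (hypothetical helper).

    Rows for child sub-words that are aligned to a parent sub-word are
    duplicated from the parent table; all other rows are randomly initialized.
    Returns the table and the number of duplicated rows.
    """
    rng = random.Random(seed)
    parent_index = {tok: i for i, tok in enumerate(parent_vocab)}
    child_emb = []
    copied = 0
    for tok in child_vocab:
        parent_tok = alignment.get(tok)  # None when the sub-word is unaligned
        if parent_tok in parent_index:
            # Duplicate the parent row (copy, so later updates stay independent).
            child_emb.append(list(parent_emb[parent_index[parent_tok]]))
            copied += 1
        else:
            # Assumed fallback: small uniform random initialization.
            child_emb.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return child_emb, copied


# Toy data; every token and the alignment itself are made up for illustration.
parent_vocab = ["p_a", "p_b", "p_c"]
parent_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
child_vocab = ["c_x", "c_y", "c_z"]
alignment = {"c_x": "p_b", "c_z": "p_c"}

child_emb, copied = init_child_embeddings(
    parent_vocab, parent_emb, child_vocab, alignment, dim=2
)
```

In this toy setup, the rows for `c_x` and `c_z` start from the parent vectors of `p_b` and `p_c`, while `c_y` starts from random values; the child model would then be fine-tuned as usual.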

Capturing Conversational Interaction for Question Answering via Global History Reasoning
Jin Qian | Bowei Zou | Mengxing Dong | Xiao Li | AiTi Aw | Yu Hong
Findings of the Association for Computational Linguistics: NAACL 2022

Conversational Question Answering (ConvQA) is required to answer the current question, conditioned on the observable paragraph-level context and conversation history. Previous works have intensively studied history-dependent reasoning. They perceive and absorb topic-related information of prior utterances in the interactive encoding stage, which yields significant improvement compared to history-independent reasoning. This paper further strengthens the ConvQA encoder by establishing long-distance dependency among global utterances in multi-turn conversation. We use multi-layer transformers to resolve long-distance relationships, which potentially contribute to the reweighting of attentive information in historical utterances. Experiments on QuAC show that our method obtains a substantial improvement (1%), yielding an F1 score of 73.7%. All source code is available at https://github.com/jaytsien/GHR.

Unregulated Chinese-to-English Data Expansion Does NOT Work for Neural Event Detection
Zhongqiu Li | Yu Hong | Jie Wang | Shiming He | Jianmin Yao | Guodong Zhou
Proceedings of the 29th International Conference on Computational Linguistics

We leverage cross-language data expansion and retraining to enhance neural Event Detection (abbr., ED) on the English ACE corpus. Machine translation is utilized for expanding the English training set of ED from that of Chinese. However, experimental results illustrate that such a strategy actually results in performance degradation. The survey of translations suggests that the mistakenly-aligned triggers in the expanded data negatively influence the retraining process. We refer to this phenomenon as “trigger falsification”. To overcome the issue, we apply heuristic rules for regulating the expanded data, fixing the distracting samples that contain the falsified triggers. The supplementary experiments show that the rule-based regulation is beneficial, yielding an improvement of about 1.6% F1-score for ED. We additionally prove that, instead of transfer learning from the translated ED data, the straight data combination by random pouring surprisingly performs better.

Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing
Shilin Zhou | Qingrong Xia | Zhenghua Li | Yu Zhang | Yu Hong | Min Zhang
Proceedings of the 29th International Conference on Computational Linguistics

This paper proposes to cast end-to-end span-based SRL as a word-based graph parsing task. The major challenge is how to represent spans at the word level. Borrowing ideas from research on Chinese word segmentation and named entity recognition, we propose and compare four different schemata of graph representation, i.e., BES, BE, BIES, and BII, among which we find that the BES schema performs the best. We further gain interesting insights through detailed analysis. Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure. We conduct experiments on two widely used benchmark datasets, i.e., CoNLL05 and CoNLL12. Results show that our word-based graph parsing approach achieves consistently better performance than previous results, under all settings of end-to-end and predicate-given, without and with pre-trained language models (PLMs). More importantly, our model can parse 669/252 sentences per second, without and with PLMs respectively.

Taking Actions Separately: A Bidirectionally-Adaptive Transfer Learning Method for Low-Resource Neural Machine Translation
Xiaolin Xing | Yu Hong | Minhan Xu | Jianmin Yao | Guodong Zhou
Proceedings of the 29th International Conference on Computational Linguistics

Training Neural Machine Translation (NMT) models suffers from sparse parallel data, in the infrequent translation scenarios towards low-resource source languages. The existing solutions primarily concentrate on the utilization of Parent-Child (PC) transfer learning. It transfers well-trained NMT models on high-resource languages (namely Parent NMT) to low-resource languages, so as to produce Child NMT models by fine-tuning. It has been carefully demonstrated that a variety of PC variants yield significant improvements for low-resource NMT. In this paper, we intend to enhance PC-based NMT by a bidirectionally-adaptive learning strategy. Specifically, we divide inner constituents (6 transformers) of Parent encoder into two “teams”, i.e., T1 and T2. During representation learning, T1 learns to encode low-resource languages conditioned on bilingual shareable latent space. Generative adversarial network and masked language modeling are used for space-shareable encoding. On the other hand, T2 is straightforwardly transferred to low-resource languages, and fine-tuned together with T1 for low-resource translation. Briefly, T1 and T2 take actions separately for different goals. The former aims to adapt to characteristics of low-resource languages during encoding, while the latter adapts to translation experiences learned from high-resource languages. We experiment on benchmark corpora SETIMES, conducting low-resource NMT for Albanian (Sq), Macedonian (Mk), Croatian (Hr) and Romanian (Ro). Experimental results show that our method yields substantial improvements, which allows the NMT performance to reach BLEU4-scores of 62.24%, 56.93%, 50.53% and 54.65% for Sq, Mk, Hr and Ro, respectively.

DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models
Hongyu Zhu | Yan Chen | Jing Yan | Jing Liu | Yu Hong | Ying Chen | Hua Wu | Haifeng Wang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

In this paper, we focus on the robustness evaluation of Chinese Question Matching (QM) models. Most of the previous work on analyzing robustness issues focuses on just one or a few types of artificial adversarial examples. Instead, we argue that a comprehensive evaluation should be conducted on natural texts, which takes into account the fine-grained linguistic capabilities of QM models. For this purpose, we create a Chinese dataset, namely DuQM, which contains natural questions with linguistic perturbations to evaluate the robustness of QM models. DuQM contains 3 categories and 13 subcategories with 32 linguistic perturbations. The extensive experiments demonstrate that DuQM has a better ability to distinguish different models. Importantly, the detailed breakdown of evaluation by the linguistic phenomena in DuQM helps us easily diagnose the strengths and weaknesses of different models. Additionally, our experimental results show that the effect of artificial adversarial examples does not work on natural texts. Our baseline codes and a leaderboard are now publicly available.


DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications
Hongxuan Tang | Hongyu Li | Jing Liu | Yu Hong | Hua Wu | Haifeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Machine reading comprehension (MRC) is a crucial task in natural language processing and has achieved remarkable advancements. However, most of the neural MRC models are still far from robust and fail to generalize well in real-world applications. In order to comprehensively verify the robustness and generalization of MRC models, we introduce a real-world Chinese dataset – DuReader_robust. It is designed to evaluate the MRC models from three aspects: over-sensitivity, over-stability and generalization. Compared to previous work, the instances in DuReader_robust are natural texts, rather than the altered unnatural texts. It presents the challenges when applying MRC models to real-world applications. The experimental results show that MRC models do not perform well on the challenge test set. Moreover, we analyze the behavior of existing models on the challenge test set, which may provide suggestions for future model development. The dataset and codes are publicly available at https://github.com/baidu/DuReader.

Winnowing Knowledge for Multi-choice Question Answering
Yeqiu Li | Bowei Zou | Zhifeng Li | Ai Ti Aw | Yu Hong | Qiaoming Zhu
Findings of the Association for Computational Linguistics: EMNLP 2021

We tackle multi-choice question answering. Acquiring commonsense knowledge related to the question and options facilitates the recognition of the correct answer. However, the current reasoning models suffer from the noise in the retrieved knowledge. In this paper, we propose a novel encoding method which is able to conduct interception and soft filtering. This contributes to the harvesting and absorption of representative information with less interference from noise. We experiment on CommonsenseQA. Experimental results illustrate that our method yields substantial and consistent improvements compared to the strong BERT, RoBERTa and ALBERT-based baselines.

CVAE-based Re-anchoring for Implicit Discourse Relation Classification
Zujun Dou | Yu Hong | Yu Sun | Guodong Zhou
Findings of the Association for Computational Linguistics: EMNLP 2021

Training implicit discourse relation classifiers suffers from data sparsity. Variational AutoEncoder (VAE) appears to be the proper solution. It is because ideally VAE is capable of generating inexhaustible varying samples, and this facilitates selective data augmentation. However, our experiments show that coupling VAE with the RoBERTa-based classifier results in severe performance degradation. We ascribe the unusual phenomenon to erroneous sampling that would happen when VAE pursued variations. To overcome the problem, we develop a re-anchoring strategy, where Conditional VAE (CVAE) is used for estimating the risk of erroneous sampling, and meanwhile migrating the anchor to reduce the risk. The test results on PDTB v2.0 illustrate that, compared to the RoBERTa-based baseline, re-anchoring yields substantial improvements. Besides, we observe that re-anchoring can cooperate with other auxiliary strategies (transfer learning and interactive attention mechanism) to further improve the baseline, obtaining the F-scores of about 55%, 63%, 80% and 44% for the four main relation types (Comparison, Contingency, Expansion, Temporality) in the binary classification (Yes/No) scenario.

Feature-level Incongruence Reduction for Multimodal Translation
Zhifeng Li | Yu Hong | Yuchen Pan | Jian Tang | Jianmin Yao | Guodong Zhou
Proceedings of the Second Workshop on Advances in Language and Vision Research

Caption translation aims to translate image annotations (captions for short). Recently, Multimodal Neural Machine Translation (MNMT) has been explored as the essential solution. Besides linguistic features in captions, MNMT allows visual (image) features to be used. The integration of multimodal features reinforces the semantic representation and considerably improves translation performance. However, MNMT suffers from the incongruence between visual and linguistic features. To overcome the problem, we propose to extend the MNMT architecture with a harmonization network, which harmonizes multimodal features (linguistic and visual features) by unidirectional modal space conversion. It enables multimodal translation to be carried out in a seemingly monomodal translation pipeline. We experiment on the golden Multi30k-16 and 17. Experimental results show that, compared to the baseline, the proposed method yields improvements of 2.2% BLEU at best for the scenario of translating English captions into German (En→De), 7.6% for the case of English-to-French translation (En→Fr) and 1.5% for English-to-Czech (En→Cz). The utilization of the harmonization network leads to performance competitive with the state of the art.


Using a Penalty-based Loss Re-estimation Method to Improve Implicit Discourse Relation Classification
Xiao Li | Yu Hong | Huibin Ruan | Zhen Huang
Proceedings of the 28th International Conference on Computational Linguistics

We tackle implicit discourse relation classification, a task of automatically determining semantic relationships between arguments. The attention-worthy words in arguments are crucial clues for classifying the discourse relations. Attention mechanisms have been proven effective in highlighting the attention-worthy words during encoding. However, our survey shows that some inessential words are unintentionally misjudged as attention-worthy words and, therefore, assigned heavier attention weights than they should be. We propose a penalty-based loss re-estimation method to regulate the attention learning process, integrating penalty coefficients into the computation of loss by means of overstability of attention weight distributions. We conduct experiments on the Penn Discourse TreeBank (PDTB) corpus. The test results show that our loss re-estimation method leads to substantial improvements for a variety of attention mechanisms, and it obtains highly competitive performance compared to the state-of-the-art methods.

NUT-RC: Noisy User-generated Text-oriented Reading Comprehension
Rongtao Huang | Bowei Zou | Yu Hong | Wei Zhang | AiTi Aw | Guodong Zhou
Proceedings of the 28th International Conference on Computational Linguistics

Reading comprehension (RC) on social media such as Twitter is a critical and challenging task due to its noisy, informal, but informative nature. Most existing RC models are developed on formal datasets such as news articles and Wikipedia documents, which severely limits their performance when directly applied to the noisy and informal texts in social media. Moreover, these models only focus on a certain type of RC, extractive or generative, but ignore the integration of them. To address these challenges, we come up with a noisy user-generated text-oriented RC model. In particular, we first introduce a set of text normalizers to transform the noisy and informal texts to formal ones. Then, we integrate the extractive and the generative RC model by a multi-task learning mechanism and an answer selection module. Experimental results on TweetQA demonstrate that our NUT-RC model significantly outperforms the state-of-the-art social media-oriented RC models.

Interactively-Propagative Attention Learning for Implicit Discourse Relation Recognition
Huibin Ruan | Yu Hong | Yang Xu | Zhen Huang | Guodong Zhou | Min Zhang
Proceedings of the 28th International Conference on Computational Linguistics

We tackle implicit discourse relation recognition. Both self-attention and interactive-attention mechanisms have been applied for attention-aware representation learning, which improves the current discourse analysis models. To take advantage of the two attention mechanisms simultaneously, we develop a propagative attention learning model using a cross-coupled two-channel network. We experiment on the Penn Discourse Treebank. The test results demonstrate that our model yields substantial improvements over the baselines (BiLSTM and BERT).

Argumentation Mining on Essays at Multi Scales
Hao Wang | Zhen Huang | Yong Dou | Yu Hong
Proceedings of the 28th International Conference on Computational Linguistics

Argumentation mining on essays is a new challenging task in natural language processing, which aims to identify the types and locations of argumentation components. Recent research mainly models the task as a sequence tagging problem and deals with all the argumentation components at word level. However, this task is not scale-independent. Some types of argumentation components, which serve as core opinions on essays or paragraphs, are at essay level or paragraph level. The sequence tagging method conducts reasoning by local context words, and fails to effectively mine these components. To this end, we propose a multi-scale argumentation mining model, where we respectively mine different types of argumentation components at corresponding levels. Besides, an effective coarse-to-fine argumentation fusion mechanism is proposed to further improve the performance. We conduct a series of experiments on the Persuasive Essay dataset (PE2.0). Experimental results indicate that our model outperforms existing models on mining all types of argumentation components.

Don’t Eclipse Your Arts Due to Small Discrepancies: Boundary Repositioning with a Pointer Network for Aspect Extraction
Zhenkai Wei | Yu Hong | Bowei Zou | Meng Cheng | Jianmin Yao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The current aspect extraction methods suffer from boundary errors. In general, these errors lead to a relatively minor difference between the extracted aspects and the ground-truth. However, they hurt the performance severely. In this paper, we propose to utilize a pointer network for repositioning the boundaries. A recycling mechanism is used, which enables the training data to be collected without manual intervention. We conduct the experiments on the benchmark datasets SE14 of laptop and SE14-16 of restaurant. Experimental results show that our method achieves substantial improvements over the baseline, and outperforms state-of-the-art methods.

基于多任务学习的生成式阅读理解(Generative Reading Comprehension via Multi-task Learning)
Jin Qian (钱锦) | Rongtao Huang (黄荣涛) | Bowei Zou (邹博伟) | Yu Hong (洪宇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


汉语否定焦点识别研究:数据集与基线系统(Research on Chinese Negative Focus Identification: Dataset and Baseline)
Jiaxuan Sheng (盛佳璇) | Bowei Zou (邹博伟) | Longxiang Shen (沈龙骧) | Jing Ye (叶静) | Yu Hong (洪宇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics



Negative Focus Detection via Contextual Attention Mechanism
Longxiang Shen | Bowei Zou | Yu Hong | Guodong Zhou | Qiaoming Zhu | AiTi Aw
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Negation is a universal but complicated linguistic phenomenon, which has received considerable attention from the NLP community over the last decade, since a negated statement often carries both an explicit negative focus and implicit positive meanings. For the sake of understanding a negated statement, it is critical to precisely detect the negative focus in context. However, how to capture contextual information for negative focus detection is still an open challenge. To address this, we come up with an attention-based neural network to model contextual information. In particular, we introduce a framework which consists of a Bidirectional Long Short-Term Memory (BiLSTM) neural network and a Conditional Random Fields (CRF) layer to effectively encode the order information and the long-range context dependency in a sentence. Moreover, we design two types of attention mechanisms, word-level contextual attention and topic-level contextual attention, to take advantage of contextual information across sentences from both the word perspective and the topic perspective, respectively. Experimental results on the SEM’12 shared task corpus show that our approach achieves the best performance on negative focus detection, yielding an absolute improvement of 2.11% over the state-of-the-art. This demonstrates the great effectiveness of the two types of contextual attention mechanisms.


Self-regulation: Employing a Generative Adversarial Network to Improve Event Detection
Yu Hong | Wenxuan Zhou | Jingli Zhang | Guodong Zhou | Qiaoming Zhu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Due to the ability of encoding and mapping semantic information into a high-dimensional latent feature space, neural networks have been successfully used for detecting events to a certain extent. However, such a feature space can be easily contaminated by spurious features inherent in event detection. In this paper, we propose a self-regulated learning approach by utilizing a generative adversarial network to generate spurious features. On this basis, we employ a recurrent network to eliminate the fakes. Detailed experiments on the ACE 2005 and TAC-KBP 2015 corpora show that our proposed method is highly effective and adaptable.

Using active learning to expand training data for implicit discourse relation recognition
Yang Xu | Yu Hong | Huibin Ruan | Jianmin Yao | Min Zhang | Guodong Zhou
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We tackle discourse-level relation recognition, a problem of determining semantic relations between text spans. Implicit relation recognition is challenging due to the lack of explicit relational clues. The increasingly popular neural network techniques have been proven effective for semantic encoding, and are therefore widely employed to boost semantic relation discrimination. However, learning to predict semantic relations at a deep level heavily relies on a great deal of training data, but the scale of the publicly available data in this field is limited. In this paper, we follow Rutherford and Xue (2015) to expand the training data set using the corpus of explicitly-related arguments, by arbitrarily dropping the overtly presented discourse connectives. On this basis, we carry out an experiment of sampling, in which a simple active learning approach is used, so as to take the informative instances for data expansion. The goal is to verify whether the selective use of external data not only reduces the time consumption of retraining but also ensures a better system performance. Using the expanded training data, we retrain a convolutional neural network (CNN) based classifier which is a simplified version of Qin et al. (2016)’s stacking gated relation recognizer. Experimental results show that expanding the training set with small-scale carefully-selected external data yields substantial performance gain, with improvements of about 4% for accuracy and 3.6% for F-score. This allows a weak classifier to achieve a comparable performance against the state-of-the-art systems.

Incorporating Image Matching Into Knowledge Acquisition for Event-Oriented Relation Recognition
Yu Hong | Yang Xu | Huibin Ruan | Bowei Zou | Jianmin Yao | Guodong Zhou
Proceedings of the 27th International Conference on Computational Linguistics

Event relation recognition is a challenging language processing task. It is required to determine the relation class of a pair of query events, such as causality, under the condition that there isn’t any reliable clue for use. We follow the traditional statistical approach in this paper, speculating the relation class of the target events based on the relation-class distributions on the similar events. There is minimal supervision used during the speculation process. In particular, we incorporate image processing into the acquisition of similar event instances, including the utilization of images for visually representing event scenes, and the use of neural network based image matching for approximate calculation between events. We test our method on the ACE-R2 corpus and compare our model with the fully-supervised neural network models. Experimental results show that we achieve a comparable performance to CNN while slightly better than LSTM.

Adversarial Feature Adaptation for Cross-lingual Relation Classification
Bowei Zou | Zengzhuang Xu | Yu Hong | Guodong Zhou
Proceedings of the 27th International Conference on Computational Linguistics

Relation Classification aims to classify the semantic relationship between two marked entities in a given sentence. It plays a vital role in a variety of natural language processing applications. Most existing methods focus on exploiting mono-lingual data, e.g., in English, due to the lack of annotated data in other languages. In this paper, we come up with a feature adaptation approach for cross-lingual relation classification, which employs a generative adversarial network (GAN) to transfer feature representations from one language with rich annotated data to another language with scarce annotated data. Such a feature adaptation approach enables feature imitation via the competition between a relation classification network and a rival discriminator. Experimental results on the ACE 2005 multilingual training corpus, treating English as the source language and Chinese the target, demonstrate the effectiveness of our proposed approach, yielding an improvement of 5.7% over the state-of-the-art.


Building a Cross-document Event-Event Relation Corpus
Yu Hong | Tongtao Zhang | Tim O’Gorman | Sharone Horowit-Hendler | Heng Ji | Martha Palmer
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Image-Image Search for Comparable Corpora Construction
Yu Hong | Liang Yao | Mengyi Liu | Tongtao Zhang | Wenxuan Zhou | Jianmin Yao | Heng Ji
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)

We present a novel method of comparable corpora construction. Unlike the traditional methods which heavily rely on linguistic features, our method only takes image similarity into consideration. We use an image-image search engine to obtain similar images, together with the captions in source language and target language. On the basis, we utilize captions of similar images to construct sentence-level bilingual corpora. Experiments on 10,371 target captions show that our method achieves a precision of 0.85 in the top search results.


Biography-Dependent Collaborative Entity Archiving for Slot Filling
Yu Hong | Xiaobin Wang | Yadong Chen | Jian Wang | Tongtao Zhang | Heng Ji
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing


An Iterative Link-based Method for Parallel Web Page Mining
Le Liu | Yu Hong | Jun Lu | Jun Lang | Heng Ji | Jianmin Yao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Constructing Information Networks Using One Single Model
Qi Li | Heng Ji | Yu Hong | Sujian Li
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Effective Selection of Translation Model Training Data
Le Liu | Yu Hong | Hao Liu | Xing Wang | Jianmin Yao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Thread Cleaning and Merging for Microblog Topic Detection
Jianfeng Zhang | Yunqing Xia | Bin Ma | Jianmin Yao | Yu Hong
Proceedings of 5th International Joint Conference on Natural Language Processing

Using Cross-Entity Inference to Improve Event Extraction
Yu Hong | Jianfeng Zhang | Bin Ma | Jianmin Yao | Guodong Zhou | Qiaoming Zhu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Factual or Satisfactory: What Search Results Are Better?
Yu Hong | Jun Lu | Shiqi Zhao
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation


Jumping Distance based Chinese Person Name Disambiguation
Yu Hong | Fei Pei | Yue-hui Yang | Jian-min Yao | Qiao-ming Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Novel Method for Bilingual Web Page Acquisition from Search Engine Web Records
Yanhui Feng | Yu Hong | Zhenxiang Yan | Jianmin Yao | Qiaoming Zhu
Coling 2010: Posters

Negative Feedback: The Forsaken Nature Available for Re-ranking
Yu Hong | Qing-qing Cai | Song Hua | Jian-min Yao | Qiao-ming Zhu
Coling 2010: Posters