Zhengjie Huang


2020

PGL at TextGraphs 2020 Shared Task: Explanation Regeneration using Language and Graph Learning Methods
Weibin Li | Yuxiang Lu | Zhengjie Huang | Weiyue Su | Jiaxiang Liu | Shikun Feng | Yu Sun
Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs)

This paper describes the system designed by the Baidu PGL Team, which achieved first place in the TextGraphs 2020 Shared Task. The task focuses on generating explanations for elementary science questions: given a question and its correct answer, the system must select, from a large knowledge base, the facts that explain why the answer is correct. To address this problem, we use a pre-trained language model to recall the top-K relevant explanations for each question. We then adopt a re-ranking approach, also based on a pre-trained language model, to order the candidate explanations. To further improve the rankings, we develop an architecture that combines powerful pre-trained transformers with GNNs to tackle the multi-hop inference problem. The official evaluation shows that our system outperforms the second-best system by 1.91 points.
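The abstract describes a two-stage recall-then-re-rank pipeline. Below is a minimal sketch of that pattern, not the authors' code: TF-IDF stands in for the pre-trained recall model, a generic Hugging Face cross-encoder stands in for the ERNIE re-ranker, and the GNN multi-hop stage is omitted. Model names and parameters are illustrative assumptions.

```python
# Sketch of a recall-then-re-rank pipeline for explanation selection.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import CrossEncoder


def recall_top_k(query, facts, k=50):
    """Stage 1: cheap recall of the K facts most similar to the query."""
    vec = TfidfVectorizer().fit(facts + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(facts))[0]
    top = sims.argsort()[::-1][:k]
    return [facts[i] for i in top]


def rerank(query, candidates):
    """Stage 2: score (query, fact) pairs with a cross-encoder and sort."""
    scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder model
    scores = scorer.predict([(query, c) for c in candidates])
    return [c for _, c in sorted(zip(scores, candidates), reverse=True)]


# Usage: query = question text plus correct answer; facts = knowledge-base rows.
# ranked = rerank(query, recall_top_k(query, facts, k=50))
```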

Kk2018 at SemEval-2020 Task 9: Adversarial Training for Code-Mixing Sentiment Classification
Jiaxiang Liu | Xuyi Chen | Shikun Feng | Shuohuan Wang | Xuan Ouyang | Yu Sun | Zhengjie Huang | Weiyue Su
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Code switching is a linguistic phenomenon that may occur in multilingual settings where speakers share more than one language. With increasing communication between groups speaking different languages, this phenomenon is becoming more common. However, there is little research and data in this area, especially for code-mixing sentiment classification. In this work, domain transfer learning from the state-of-the-art monolingual model ERNIE is tested on the code-mixing dataset and, surprisingly, yields a strong baseline. Furthermore, adversarial training with a multilingual model is used to achieve first place in the SemEval-2020 Task 9 Hindi-English sentiment classification competition.
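The abstract does not specify the adversarial training formulation; a common choice for text classifiers is FGM-style perturbation of the embedding weights, sketched below under that assumption. PyTorch is assumed, and `model`, `loss_fn`, and the embedding parameter name "embeddings" are placeholders.

```python
# Minimal FGM-style adversarial training helper (illustrative, not the authors' code).
import torch


class FGM:
    def __init__(self, model, epsilon=1.0, emb_name="embeddings"):
        self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
        self.backup = {}

    def attack(self):
        # Perturb embedding weights along the gradient direction.
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0:
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        # Undo the perturbation before the optimizer step.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}


# Typical training step:
#   loss = loss_fn(model(batch), labels); loss.backward()
#   fgm.attack(); loss_adv = loss_fn(model(batch), labels); loss_adv.backward()
#   fgm.restore(); optimizer.step(); optimizer.zero_grad()
```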

ERNIE at SemEval-2020 Task 10: Learning Word Emphasis Selection by Pre-trained Language Model
Zhengjie Huang | Shikun Feng | Weiyue Su | Xuyi Chen | Shuohuan Wang | Jiaxiang Liu | Xuan Ouyang | Yu Sun
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the system designed by the ERNIE Team, which achieved first place in SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media. Given a sentence, the task is to identify its most important words as suggestions for automated design. We leverage unsupervised pre-trained models and fine-tune them on this task. Our investigation found that the following models perform well on this task: ERNIE 2.0, XLM-RoBERTa, RoBERTa, and ALBERT. To fine-tune our models, we combine a pointwise regression loss with a pairwise ranking loss, which is closer to the final Match_m metric. We also find that additional feature engineering and data augmentation help improve performance. Our best model achieves the highest score of 0.823 and ranks first on all metrics.
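A minimal sketch of the combined objective described above: a pointwise regression loss on per-token emphasis scores plus a pairwise ranking loss that rewards ordering tokens correctly. The mixing weight `alpha` and the margin are illustrative assumptions, not the authors' values.

```python
# Combined pointwise regression + pairwise ranking loss for emphasis selection.
import torch
import torch.nn.functional as F


def combined_loss(pred, target, alpha=0.5, margin=0.1):
    """pred, target: (num_tokens,) emphasis scores for one sentence."""
    # Pointwise regression term.
    pointwise = F.mse_loss(pred, target)

    # Pairwise ranking term: for every token pair (i, j) where token i should be
    # emphasized more than token j, penalize predictions that reverse the order.
    diff_true = target.unsqueeze(1) - target.unsqueeze(0)   # (n, n)
    diff_pred = pred.unsqueeze(1) - pred.unsqueeze(0)       # (n, n)
    mask = (diff_true > 0).float()
    pairwise = (F.relu(margin - diff_pred) * mask).sum() / mask.sum().clamp(min=1.0)

    return alpha * pointwise + (1.0 - alpha) * pairwise
```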