Kuo Liao


2024

pdf
Enhancing Reinforcement Learning with Label-Sensitive Reward for Natural Language Understanding
Kuo Liao | Shuang Li | Meng Zhao | Liqun Liu | Mengge Xue | Zhenyu Hu | Honglin Han | Chengguo Yin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent strides in large language models (LLMs) have yielded remarkable performance, leveraging reinforcement learning from human feedback (RLHF) to significantly enhance generation and alignment capabilities. However, RLHF encounters numerous challenges, including the objective mismatch issue, leading to suboptimal performance in Natural Language Understanding (NLU) tasks.To address this limitation, we propose a novel Reinforcement Learning framework enhanced with Label-sensitive Reward (RLLR) to amplify the performance of LLMs in NLU tasks. By incorporating label-sensitive pairs into reinforcement learning, our method aims to adeptly capture nuanced label-sensitive semantic features during RL, thereby enhancing natural language understanding.Experiments conducted on five diverse foundation models across eight tasks showcase promising results. In comparison to Supervised Fine-tuning models (SFT), RLLR demonstrates an average performance improvement of 1.54%. Compared with RLHF models, the improvement averages at 0.69%. These results reveal the effectiveness of our method for LLMs in NLU tasks.

pdf
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors
Mengge Xue | Zhenyu Hu | Liqun Liu | Kuo Liao | Shuang Li | Honglin Han | Meng Zhao | Chengguo Yin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Multiple-Choice Questions (MCQs) constitute a critical area of research in the study of Large Language Models (LLMs). Previous works have investigated the selection bias problem in MCQs within few-shot scenarios, in which the LLM’s performance may be influenced by the presentation of answer choices, leaving the selection bias during Supervised Fine-Tuning (SFT) unexplored. In this paper, we reveal that selection bias persists in the SFT phase , primarily due to the LLM’s inadequate Multiple Choice Symbol Binding (MCSB) ability. This limitation implies that the model struggles to associate the answer options with their corresponding symbols (e.g., A/B/C/D) effectively. To enhance the model’s MCSB capability, we first incorporate option contents into the loss function and subsequently adjust the weights of the option symbols and contents, guiding the model to understand the option content of the current symbol. Based on this, we introduce an efficient SFT algorithm for MCQs, termed Point-wise Intelligent Feedback (PIF). PIF constructs negative instances by randomly combin- ing the incorrect option contents with all candidate symbols, and proposes a point-wise loss to provide feedback on these negative samples into LLMs. Our experimental results demonstrate that PIF significantly reduces the model’s selection bias by improving its MCSB capability. Remarkably, PIF exhibits a substantial enhancement in the accuracy for MCQs.

2020

pdf
HIT-SCIR at SemEval-2020 Task 5: Training Pre-trained Language Model with Pseudo-labeling Data for Counterfactuals Detection
Xiao Ding | Dingkui Hao | Yuewei Zhang | Kuo Liao | Zhongyang Li | Bing Qin | Ting Liu
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We describe our system for Task 5 of SemEval 2020: Modelling Causal Reasoning in Language: Detecting Counterfactuals. Despite deep learning has achieved significant success in many fields, it still hardly drives today’s AI to strong AI, as it lacks of causation, which is a fundamental concept in human thinking and reasoning. In this task, we dedicate to detecting causation, especially counterfactuals from texts. We explore multiple pre-trained models to learn basic features and then fine-tune models with counterfactual data and pseudo-labeling data. Our team HIT-SCIR wins the first place (1st) in Sub-task 1 — Detecting Counterfactual Statements and is ranked 4th in Sub-task 2 — Detecting Antecedent and Consequence. In this paper we provide a detailed description of the approach, as well as the results obtained in this task.

2019

pdf
Event Representation Learning Enhanced with External Commonsense Knowledge
Xiao Ding | Kuo Liao | Ting Liu | Zhongyang Li | Junwen Duan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Prior work has proposed effective methods to learn event representations that can capture syntactic and semantic information over text corpus, demonstrating their effectiveness for downstream tasks such as script event prediction. On the other hand, events extracted from raw texts lacks of commonsense knowledge, such as the intents and emotions of the event participants, which are useful for distinguishing event pairs when there are only subtle differences in their surface realizations. To address this issue, this paper proposes to leverage external commonsense knowledge about the intent and sentiment of the event. Experiments on three event-related tasks, i.e., event similarity, script event prediction and stock market prediction, show that our model obtains much better event embeddings for the tasks, achieving 78% improvements on hard similarity task, yielding more precise inferences on subsequent events under given contexts, and better accuracies in predicting the volatilities of the stock market.