Ru Li

2024

Structured entailment tree can exhibit the reasoning chains from knowledge facts to predicted answers, which is important for constructing an explainable question answering system. Existing works mainly include directly generating the entire tree and stepwise generating the proof steps. The stepwise methods can exploit combinatoriality and generalize to longer steps, but they have large fact search spaces and error accumulation problems resulting in the generation of invalid steps. In this paper, inspired by the Dual Process Theory in cognitive science, we propose FRVA, a Fact-Retrieval and Verification Augmented bidirectional entailment tree generation method that contains two systems. Specifically, System 1 makes intuitive judgments through the fact retrieval module and filters irrelevant facts to reduce the search space. System 2 designs a deductive-abductive bidirectional reasoning module, and we construct cross-verification and multi-view contrastive learning to make the generated proof steps closer to the target hypothesis. We enhance the reliability of the stepwise proofs to mitigate error propagation. Experiment results on EntailmentBank show that FRVA outperforms previous models and achieves state-of-the-art performance in fact selection and structural correctness.

The task of model editing becomes popular for correcting inaccurate or outdated parametric knowledge in Large Language Models (LLMs). However, there are major limitations of state of the art (SOTA) model editing methods, including the excessive memorization issue caused by the direct editing methods, as well as the error propagation and knowledge conflict issues from the memory enhancement methods, resulting in hindering models’ *portability*, e.g., the ability to transfer the new knowledge to related one-hop or multi-hop content. To address these issues, we propose the InstructEd method, the idea of which is to insert soft instructions into the attention module so as to facilitate interactions between instructions and questions and to understand and utilize new facts. Our main findings are: (i) InstructEd has achieved SOTA performance on three datasets for one-hop/multi-hop evaluation with LLaMAs and GPT2, achieving 10% (5%) improvement in one-hop (multi-hop) model editing.(ii) Different from earlier methods on editing parameters in FFN, we show that editing attention can also help. (iii) Model editing is highly related to retrieval augmented methods, which can help improve the locality of model editing while slightly decrease the editing performance with hops.

pdf abs
NutFrame: Frame-based Conceptual Structure Induction with LLMs
Shaoru Guo | Yubo Chen | Kang Liu | Ru Li | Jun Zhao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Conceptual structure is fundamental to human cognition and natural language understanding. It is significant to explore whether Large Language Models (LLMs) understand such knowledge. Since FrameNet serves as a well-defined conceptual structure knowledge resource, with meaningful frames, fine-grained frame elements, and rich frame relations, we construct a benchmark for coNceptual structure induction based on FrameNet, called NutFrame. It contains three sub-tasks: Frame Induction, Frame Element Induction, and Frame Relation Induction. In addition, we utilize prompts to induce conceptual structure of Framenet with LLMs. Furthermore, we conduct extensive experiments on NutFrame to evaluate various widely-used LLMs. Experimental results demonstrate that FrameNet induction remains a challenge for LLMs.

pdf abs
Hyperspherical Multi-Prototype with Optimal Transport for Event Argument Extraction
Guangjun Zhang | Hu Zhang | YuJie Wang | Ru Li | Hongye Tan | Jiye Liang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Event Argument Extraction (EAE) aims to extract arguments for specified events from a text. Previous research has mainly focused on addressing long-distance dependencies of arguments, modeling co-occurrence relationships between roles and events, but overlooking potential inductive biases: (i) semantic differences among arguments of the same type and (ii) large margin separation between arguments of the different types. Inspired by prototype networks, we introduce a new model named HMPEAE, which takes the two inductive biases above as targets to locate prototypes and guide the model to learn argument representations based on these prototypes.Specifically, we set multiple prototypes to represent each role to capture intra-class differences. Simultaneously, we use hypersphere as the output space for prototypes, defining large margin separation between prototypes to encourage the model to learn significant differences between different types of arguments effectively.We solve the “argument-prototype” assignment as an optimal transport problem to optimize the argument representation and minimize the absolute distance between arguments and prototypes to achieve compactness within sub-clusters. Experimental results on the RAMS and WikiEvents datasets show that HMPEAE achieves state-of-the-art performances.

pdf abs
AGR: Reinforced Causal Agent-Guided Self-explaining Rationalization
Yunxiao Zhao | Zhiqiang Wang | Xiaoli Li | Jiye Liang | Ru Li
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Most existing rationalization approaches are susceptible to degeneration accumulation due to a lack of effective control over the learning direction of the model during training. To address this issue, we propose a novel approach AGR (Agent-Guided Rationalization), guiding the next action of the model based on its current training state. Specifically, we introduce causal intervention calculus to quantify the causal effects inherent during rationale training, and utilize reinforcement learning process to refine the learning bias of them. Furthermore, we pretrain an agent within this reinforced causal environment to guide the next step of the model. We theoretically demonstrate that a good model needs the desired guidance, and empirically show the effectiveness of our approach, outperforming existing state-of-the-art methods on BeerAdvocate and HotelReview datasets.

2023

pdf abs
基于框架语义场景图的零形式填充方法(A Null Instantiation Filling Method based Frame Semantic Scenario Graph)
Yuzhi Wang (王俞智) | Ru Li (李茹) | Xuefeng Su (苏雪峰) | Zhichao Yan (闫智超) | Juncai Li (李俊材)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“零形式填充是在篇章上下文中为给定句子中的隐式框架语义角色找到相应的填充内容。传统的零形式填充方法采用pipeline模型,容易造成错误传播,并且忽略了显式语义角色及其填充内容的重要性。针对上述问题,本文提出了一种端到端的零形式填充方法,该方法结合汉语框架网信息构建出框架语义场景图并利用GAT对其建模,得到融合了显式框架元素信息的候选填充项表示,增强了模型对句中隐式语义成分的识别能力。在汉语零形式填充数据集上的实验表明,本文提出的模型相较于基于Bert的基线模型F1值提升了9.16%,证明了本文提出方法的有效性。”

pdf abs
CCL23-Eval 任务3总结报告:汉语框架语义解析评测(Overview of CCL23-Eval Task 1:Chinese FrameNet Semantic Parsing)
Juncai Li (李俊材) | Zhichao Yan (闫智超) | Xuefeng Su (苏雪峰) | Boxiang Ma (马博翔) | Peiyuan Yang1 (杨沛渊) | Ru Li (李茹)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“汉语框架语义解析评测任务致力于提升机器模型理解细粒度语义信息的能力。该评测数据集包括20000条标注的框架语义解析例句和近700个框架信息。评测任务分为框架识别、论元范围识别和论元角色识别三个子任务,最终成绩根据这三个任务的得分综合计算。本次评测受到工业界和学术界的广泛关注,共有55支队伍报名参赛,其中12支队伍提交了结果,我们选取5支队伍的模型进行结果复现,最终来自四川的李作恒以71.49的分数排名第一。该任务的更多信息,包括系统提交、评测结果以及数据资源,可从CCL-2023汉语框架语义解析评测任务网址1查看。”

pdf abs
CCL23-Eval 任务9总结报告:汉语高考阅读理解对抗鲁棒评测 (Overview of CCL23-Eval Task 9: Adversarial Robustness Evaluation for Chinese Gaokao Reading Comprehension)
Yaxin Guo (郭亚鑫) | Guohang Yan (闫国航) | Hongye Tan (谭红叶) | Ru Li (李茹)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“汉语高考阅读理解对抗鲁棒评测任务致力于提升机器阅读理解模型在复杂、真实对抗环境下的鲁棒性。本次任务设计了四种对抗攻击策略(关键词扰动、推理逻辑扰动、时空属性扰动、因果关系扰动),构建了对抗鲁棒子集GCRC advRobust。任务需要根据给定的文章和问题从4个选项中选择正确的答案。本次评测受到工业界和学术界的广泛关注,共有29支队伍报名参赛,但由于难度较大,仅有8支队伍提交了结果。有关该任务的所有技术信息,包括系统提交、官方结果以及支持资源和软件的链接,可从任务网站获取1。”

pdf abs
Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
Yujie Wang | Hu Zhang | Jiye Liang | Ru Li
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, knowledge graphs (KGs) have won noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning by LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using LM, and dynamically removes irrelevant KG entities based on the attention weights of LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and perform answer reasoning on the HKG by RMSA.We evaluate DHLK at CommonsenseQA and OpenBookQA, and show its improvement on existing LM and LM+KG methods.

Event Detection (ED) is a critical task that aims to identify events of certain types in plain text. Neural models have achieved great success on ED, thus coming with a desire for higher interpretability. Existing works mainly exploit words or phrases of the input text to explain models’ inner mechanisms. However, for ED, the event structure, comprising of an event trigger and a set of arguments, are more enlightening clues to explain model behaviors. To this end, we propose a Trigger-Argument based Explanation method (TAE), which can utilize event structure knowledge to uncover a faithful interpretation for the existing ED models at neuron level. Specifically, we design group, sparsity, support mechanisms to construct the event structure from structuralization, compactness, and faithfulness perspectives. We evaluate our model on the large-scale MAVEN and the widely-used ACE 2005 datasets, and observe that TAE is able to reveal the process by which the model predicts. Experimental results also demonstrate that TAE can not only improve the interpretability on standard evaluation metrics, but also effectively facilitate the human understanding.

Event ontology provides a shared and formal specification about what happens in the real world and can benefit many natural language understanding tasks. However, the independent development of event ontologies often results in heterogeneous representations that raise the need for establishing alignments between semantically related events. There exists a series of works about ontology alignment (OA), but they only focus on the entity-based OA, and neglect the event-based OA. To fill the gap, we construct an Event Ontology Alignment (EventOA) dataset based on FrameNet and Wikidata, which consists of 900+ event type alignments and 8,000+ event argument alignments. Furthermore, we propose a multi-view event ontology alignment (MEOA) method, which utilizes description information (i.e., name, alias and definition) and neighbor information (i.e., subclass and superclass) to obtain richer representation of the event ontologies. Extensive experiments show that our MEOA outperforms the existing entity-based OA methods and can serve as a strong baseline for EventOA research.

The task of sequential model editing is to fix erroneous knowledge in Pre-trained Language Models (PLMs) efficiently, precisely and continuously. Although existing methods can deal with a small number of modifications, these methods experience a performance decline or require additional annotated data, when the number of edits increases. In this paper, we propose a Retrieval Augmented Sequential Model Editing framework (RASE) that leverages factual information to enhance editing generalization and to guide the identification of edits by retrieving related facts from the fact-patch memory we constructed. Our main findings are: (i) State-of-the-art models can hardly correct massive mistakes stably and efficiently; (ii) Even if we scale up to thousands of edits, RASE can significantly enhance editing generalization and maintain consistent performance and efficiency; (iii) RASE can edit large-scale PLMs and increase the performance of different editors. Moreover, it can integrate with ChatGPT and further improve performance. Our code and data are available at: https://github.com/sev777/RASE.

pdf abs
Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs
Zhiwei Hu | Victor Basulto | Zhiliang Xiang | Ru Li | Jeff Pan
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Knowledge graph entity typing (KGET) aims at inferring plausible types of entities in knowledge graphs. Existing approaches to KGET focus on how to better encode the knowledge provided by the neighbors and types of an entity into its representation. However, they ignore the semantic knowledge provided by the way in which types can be clustered together. In this paper, we propose a novel method called Multi-view Contrastive Learning for knowledge graph Entity Typing MCLET, which effectively encodes the coarse-grained knowledge provided by clusters into entity and type embeddings. MCLET is composed of three modules: i) Multi-view Generation and Encoder module, which encodes structured information from entity-type, entity-cluster and cluster-type views; ii) Cross-view Contrastive Learning module, which encourages different views to collaboratively improve view-specific representations of entities and types; iii) Entity Typing Prediction module, which integrates multi-head attention and a Mixture-of-Experts strategy to infer missing entity types. Extensive experiments show the strong performance of MCLET compared to the state-of-the-art

2022

pdf abs
Transformer-based Entity Typing in Knowledge Graphs
Zhiwei Hu | Victor Gutierrez-Basulto | Zhiliang Xiang | Ru Li | Jeff Pan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

We investigate the knowledge graph entity typing task which aims at inferring plausible entity types. In this paper, we propose a novel Transformer-based Entity Typing (TET) approach, effectively encoding the content of neighbours of an entity by means of a transformer mechanism. More precisely, TET is composed of three different mechanisms: a local transformer allowing to infer missing entity types by independently encoding the information provided by each of its neighbours; a global transformer aggregating the information of all neighbours of an entity into a single long sequence to reason about more complex entity types; and a context transformer integrating neighbours content in a differentiated way through information exchange between neighbour pairs, while preserving the graph structure. Furthermore, TET uses information about class membership of types to semantically strengthen the representation of an entity. Experiments on two real-world datasets demonstrate the superior performance of TET compared to the state-of-the-art.

pdf abs
基于GCN和门机制的汉语框架排歧方法(Chinese Frame Disambiguation Method Based on GCN and Gate Mechanism)
Yanan You (游亚男) | Ru Li (李茹) | Xuefeng Su (苏雪峰) | Zhichao Yan (闫智超) | Minshuai Sun (孙民帅) | Chao Wang (王超)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“汉语框架排歧旨在候选框架中给句子中的目标词选择一个符合其语义场景的框架。目前研究方法存在隐层向量的计算与目标词无关,并且忽略了句法结构信息对框架排歧的影响等缺陷。针对上述问题,使用GCN对句法结构信息进行建模;引入门机制过滤隐层向量中与目标词无关的噪声信息;并在此基础上,提出一种约束机制来约束模型的学习,改进向量表示。该模型在CFN、FN1.5和FN1.7数据集上优于当前最好模型,证明了方法的有效性。”

pdf abs
基于框架语义映射和类型感知的篇章事件抽取(Document-Level Event Extraction Based on Frame Semantic Mapping and Type Awareness)
Jiang Lu (卢江) | Ru Li (李茹) | Xuefeng Su (苏雪峰) | Zhichao Yan (闫智超) | Jiaxing Chen (陈加兴)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“篇章事件抽取是从给定的文本中识别其事件类型和事件论元。目前篇章事件普遍存在数据稀疏和多值论元耦合的问题。基于此,本文将汉语框架网(CFN)与中文篇章事件建立映射,同时引入滑窗机制和触发词释义改善了事件检测的数据稀疏问题;使用基于类型感知标签的多事件分离策略缓解了论元耦合问题。为了提升模型的鲁棒性,进一步引入对抗训练。本文提出的方法在DuEE-Fin和CCKS2021数据集上实验结果显著优于现有方法。”

2021

pdf abs
Integrating Semantic Scenario and Word Relations for Abstractive Sentence Summarization
Yong Guan | Shaoru Guo | Ru Li | Xiaoli Li | Hu Zhang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently graph-based methods have been adopted for Abstractive Text Summarization. However, existing graph-based methods only consider either word relations or structure information, which neglect the correlation between them. To simultaneously capture the word relations and structure information from sentences, we propose a novel Dual Graph network for Abstractive Sentence Summarization. Specifically, we first construct semantic scenario graph and semantic word relation graph based on FrameNet, and subsequently learn their representations and design graph fusion method to enhance their correlation and obtain better semantic representation for summary generation. Experimental results show our model outperforms existing state-of-the-art methods on two popular benchmark datasets, i.e., Gigaword and DUC 2004.

pdf abs
Frame Semantic-Enhanced Sentence Modeling for Sentence-level Extractive Text Summarization
Yong Guan | Shaoru Guo | Ru Li | Xiaoli Li | Hongye Tan
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Sentence-level extractive text summarization aims to select important sentences from a given document. However, it is very challenging to model the importance of sentences. In this paper, we propose a novel Frame Semantic-Enhanced Sentence Modeling for Extractive Summarization, which leverages Frame semantics to model sentences from both intra-sentence level and inter-sentence level, facilitating the text summarization task. In particular, intra-sentence level semantics leverage Frames and Frame Elements to model internal semantic structure within a sentence, while inter-sentence level semantics leverage Frame-to-Frame relations to model relationships among sentences. Extensive experiments on two benchmark corpus CNN/DM and NYT demonstrate that our model outperforms six state-of-the-art methods significantly.

pdf abs
A Knowledge-Guided Framework for Frame Identification
Xuefeng Su | Ru Li | Xiaoli Li | Jeff Z. Pan | Hu Zhang | Qinghua Chai | Xiaoqi Han
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Frame Identification (FI) is a fundamental and challenging task in frame semantic parsing. The task aims to find the exact frame evoked by a target word in a given sentence. It is generally regarded as a classification task in existing work, where frames are treated as discrete labels or represented using onehot embeddings. However, the valuable knowledge about frames is neglected. In this paper, we propose a Knowledge-Guided Frame Identification framework (KGFI) that integrates three types frame knowledge, including frame definitions, frame elements and frame-to-frame relations, to learn better frame representation, which guides the KGFI to jointly map target words and frames into the same embedding space and subsequently identify the best frame by calculating the dot-product similarity scores between the target word embedding and all of the frame embeddings. The extensive experimental results demonstrate KGFI significantly outperforms the state-of-the-art methods on two benchmark datasets.

2020

Sentence representation (SR) is the most crucial and challenging task in Machine Reading Comprehension (MRC). MRC systems typically only utilize the information contained in the sentence itself, while human beings can leverage their semantic knowledge. To bridge the gap, we proposed a novel Frame-based Sentence Representation (FSR) method, which employs frame semantic knowledge to facilitate sentence modelling. Specifically, different from existing methods that only model lexical units (LUs), Frame Representation Models, which utilize both LUs in frame and Frame-to-Frame (F-to-F) relations, are designed to model frames and sentences with attention schema. Our proposed FSR method is able to integrate multiple-frame semantic information to get much better sentence representations. Our extensive experimental results show that it performs better than state-of-the-art technologies on machine reading comprehension task.

pdf abs
Incorporating Syntax and Frame Semantics in Neural Network for Machine Reading Comprehension
Shaoru Guo | Yong Guan | Ru Li | Xiaoli Li | Hongye Tan
Proceedings of the 28th International Conference on Computational Linguistics

Machine reading comprehension (MRC) is one of the most critical yet challenging tasks in natural language understanding(NLU), where both syntax and semantics information of text are essential components for text understanding. It is surprising that jointly considering syntax and semantics in neural networks was never formally reported in literature. This paper makes the first attempt by proposing a novel Syntax and Frame Semantics model for Machine Reading Comprehension (SS-MRC), which takes full advantage of syntax and frame semantics to get richer text representation. Our extensive experimental results demonstrate that SS-MRC performs better than ten state-of-the-art technologies on machine reading comprehension task.

pdf abs
多模块联合的阅读理解候选句抽取(Evidence sentence extraction for reading comprehension based on multi-module)
Yu Ji (吉宇) | Xiaoyue Wang (王笑月) | Ru Li (李茹) | Shaoru Guo (郭少茹) | Yong Guan (关勇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

机器阅读理解作为自然语言理解的关键任务,受到国内外学者广泛关注。针对多项选择型阅读理解中无线索标注且涉及多步推理致使候选句抽取困难的问题,本文提出一种基于多模块联合的候选句抽取模型。首先采用部分标注数据微调预训练模型;其次通过TF-IDF递归式抽取多跳推理问题中的候选句;最后结合无监督方式进一步筛选模型预测结果降低冗余性。本文在高考语文选择题及RACE数据集上进行验证,在候选句抽取中,本文方法相比于最优基线模型F1值提升3.44%,在下游答题任务中采用候选句作为模型输入较全文输入时准确率分别提高3.68%和3.6%,上述结果证实本文所提方法有效性。

pdf abs
基于Self-Attention的句法感知汉语框架语义角色标注(Syntax-Aware Chinese Frame Semantic Role Labeling Based on Self-Attention)
Xiaohui Wang (王晓晖) | Ru Li (李茹) | Zhiqiang Wang (王智强) | Qinghua Chai (柴清华) | Xiaoqi Han (韩孝奇)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

框架语义角色标注(Frame Semantic Role Labeling, FSRL)是基于FrameNet标注体系的语义分析任务。语义角色标注通常对句法有很强的依赖性,目前的语义角色标注模型大多基于双向长短时记忆网络Bi-LSTM,虽然可以获取句子中的长距离依赖信息,但无法很好获取句子中的句法信息。因此,引入self-attention机制来捕获句子中每个词的句法信息。实验结果表明,该模型在CFN(Chinese FrameNet,汉语框架网)数据集上的F1达到83.77%,提升了近11%。

Co-authors

Venues

ijcnlp2

starsem1

lrec1