Yiquan Wu


2024

pdf
Unleashing the Power of LLMs in Court View Generation by Stimulating Internal Knowledge and Incorporating External Knowledge
Yifei Liu | Yiquan Wu | Ang Li | Yating Zhang | Changlong Sun | Weiming Lu | Fei Wu | Kun Kuang
Findings of the Association for Computational Linguistics: NAACL 2024

Court View Generation (CVG) plays a vital role in the realm of legal artificial intelligence, which aims to support judges in crafting legal judgment documents. The court view consists of three essential judgment parts: the charge-related, law article-related, and prison term-related parts, each requiring specialized legal knowledge, rendering CVG a challenging task.Although Large Language Models (LLMs) have made remarkable strides in language generation, they encounter difficulties in the knowledge-intensive legal domain.Actually, there can be two types of knowledge: internal knowledge stored within LLMs’ parameters and external knowledge sourced from legal documents outside the models.In this paper, we decompose court views into different parts, stimulate internal knowledge, and incorporate external information to unleash the power of LLMs in the CVG task.To validate our method, we conduct a series of experiment results on two real-world datasets LAIC2021 and CJO2022. The experiments demonstrate that our method is capable of generating more accurate and reliable court views.

pdf
Latent Learningscape Guided In-context Learning
Anlai Zhou | Sunshine Jiang | Yifei Liu | Yiquan Wu | Kun Kuang | Jun Xiao
Findings of the Association for Computational Linguistics ACL 2024

The growing interest in leveraging large language models is driven by their exceptional imitation and reasoning capabilities. In-context learning (ICL), a streamlined method, has shown potential in boosting these models’ performance without modifying their underlying parameters, especially when supplied with suitable demonstrations. However, existing methods mainly choose demonstrations by comparing surface-level semantic similarities (e.g., based on embedding) and fall short of identifying the most fitting ones. This paper introduces the concept of a “latent learningscape”, a more nuanced representation that describes the characteristic of the demonstrations. Building on this concept, we develop a results-driven approach to characterize the latent learningscape features of demonstrations, which then inform the creation of more effective prompts. Through comprehensive testing across datasets in arithmetic, commonsense, and symbolic reasoning tasks, our approach outperforms leading models, showing an average increase in scores by 7.4 percentage points.

pdf
Chain-of-Quizzes: Pedagogy-inspired Example Selection in In-Context-Learning
Yiquan Wu | Anlai Zhou | Yuhang Liu | Yifei Liu | Adam Jatowt | Weiming Lu | Jun Xiao | Kun Kuang
Findings of the Association for Computational Linguistics ACL 2024

In-context learning (ICL) has emerged as a powerful tool for enhancing large language models (LLMs) in addressing downstream tasks. In this paper, we explore the vital task of example selection in ICL by mimicking the human learning process. We propose a Chain-of-Quizzes (CoQ) framework inspired by educational theories such as Bruner’s Spiral Learning and Mastery Learning theory. Specifically, our framework employs the LLMs to answer the quiz (question in the example) to sift ‘good’ examples, combines these examples iteratively with the increasing complexity, and utilizes a final exam to gauge the combined example chains. Our extensive experiments on diverse reasoning datasets show the proposed approach outperforms baseline models. These findings underscore the framework’s potential for future research.

pdf
Enhancing Court View Generation with Knowledge Injection and Guidance
Ang Li | Yiquan Wu | Yifei Liu | Kun Kuang | Fei Wu | Ming Cai
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Court View Generation (CVG) is a challenging task in the field of Legal Artificial Intelligence (LegalAI), which aims to generate court views based on the plaintiff claims and the fact descriptions. While Pretrained Language Models (PLMs) have showcased their prowess in natural language generation, their application to the complex, knowledge-intensive domain of CVG often reveals inherent limitations. In this paper, we present a novel approach, named Knowledge Injection and Guidance (KIG), designed to bolster CVG using PLMs. To efficiently incorporate domain knowledge during the training stage, we introduce a knowledge-injected prompt encoder for prompt tuning, thereby reducing computational overhead. Moreover, to further enhance the model’s ability to utilize domain knowledge, we employ a generating navigator, which dynamically guides the text generation process in the inference stage without altering the model’s architecture, making it readily transferable. Comprehensive experiments on real-world data demonstrate the effectiveness of our approach compared to several established baselines, especially in the responsivity of claims, where it outperforms the best baseline by 11.87%.

pdf
From Graph to Word Bag: Introducing Domain Knowledge to Confusing Charge Prediction
Ang Li | Qiangchao Chen | Yiquan Wu | Xiang Zhou | Kun Kuang | Fei Wu | Ming Cai
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Confusing charge prediction is a challenging task in legal AI, which involves predicting confusing charges based on fact descriptions. While existing charge prediction methods have shown impressive performance, they face significant challenges when dealing with confusing charges, such as Snatch and Robbery. In the legal domain, constituent elements play a pivotal role in distinguishing confusing charges. Constituent elements are fundamental behaviors underlying criminal punishment and have subtle distinctions among charges. In this paper, we introduce a novel From Graph to Word Bag (FWGB) approach, which introduces domain knowledge regarding constituent elements to guide the model in making judgments on confusing charges, much like a judge’s reasoning process. Specifically, we first construct a legal knowledge graph containing constituent elements to help select keywords for each charge, forming a word bag. Subsequently, to guide the model’s attention towards the differentiating information for each charge within the context, we expand the attention mechanism and introduce a new loss function with attention supervision through words in the word bag. We construct the confusing charges dataset from real-world judicial documents. Experiments demonstrate the effectiveness of our method, especially in maintaining exceptional performance in imbalanced label distributions.

2023

pdf
Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration
Yiquan Wu | Siying Zhou | Yifei Liu | Weiming Lu | Xiaozhong Liu | Yating Zhang | Changlong Sun | Fei Wu | Kun Kuang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI, i.e., predicting the judgment of the case in terms of case fact description. Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems. Thus, it is worthwhile to explore the utilization of precedents in the LJP. Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task. These can be broken down into two categories: large language models (LLMs) and domain-specific models. LLMs are capable of interpreting and generating complex natural language, while domain models are efficient in learning task-specific information. In this paper, we propose the precedent-enhanced LJP framework (PLJP) – a system that leverages the strength of both LLM and domain models in the context of precedents. Specifically, the domain models are designed to provide candidate labels and find the proper precedents efficiently, and the large models will make the final prediction with an in-context precedents comprehension. Experiments on the real-world dataset demonstrate the effectiveness of our PLJP. Moreover, our work shows a promising direction for LLM and domain-model collaboration that can be generalized to other vertical domains.

pdf
Focus-aware Response Generation in Inquiry Conversation
Yiquan Wu | Weiming Lu | Yating Zhang | Adam Jatowt | Jun Feng | Changlong Sun | Fei Wu | Kun Kuang
Findings of the Association for Computational Linguistics: ACL 2023

Inquiry conversation is a common form of conversation that aims to complete the investigation (e.g., court hearing, medical consultation and police interrogation) during which a series of focus shifts occurs. While many models have been proposed to generate a smooth response to a given conversation history, neglecting the focus can limit performance in inquiry conversation where the order of the focuses plays there a key role. In this paper, we investigate the problem of response generation in inquiry conversation by taking the focus into consideration. We propose a novel Focus-aware Response Generation (FRG) method by jointly optimizing a multi-level encoder and a set of focal decoders to generate several candidate responses that correspond to different focuses. Additionally, a focus ranking module is proposed to predict the next focus and rank the candidate responses. Experiments on two orthogonal inquiry conversation datasets (judicial, medical domain) demonstrate that our method generates results significantly better in automatic metrics and human evaluation compared to the state-of-the-art approaches.

2022

pdf
Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework
Yiquan Wu | Yifei Liu | Weiming Lu | Yating Zhang | Jun Feng | Changlong Sun | Fei Wu | Kun Kuang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Legal judgment prediction (LJP) is a fundamental task in legal AI, which aims to assist the judge to hear the case and determine the judgment. The legal judgment usually consists of the law article, charge, and term of penalty. In the real trial scenario, the judge usually makes the decision step-by-step: first concludes the rationale according to the case’s facts and then determines the judgment. Recently, many models have been proposed and made tremendous progress in LJP, but most of them adopt an end-to-end manner that cannot be manually intervened by the judge for practical use. Moreover, existing models lack interpretability due to the neglect of rationale in the prediction process. Following the judge’s real trial logic, in this paper, we propose a novel Rationale-based Legal Judgment Prediction (RLJP) framework. In the RLJP framework, the LJP process is split into two steps. In the first phase, the model generates the rationales according to the fact description. Then it predicts the judgment based on the fact and the generated rationales. Extensive experiments on a real-world dataset show RLJP achieves the best results compared to the state-of-the-art models. Meanwhile, the proposed framework provides good interactivity and interpretability which enables practical use.

pdf
De-Bias for Generative Extraction in Unified NER Task
Shuai Zhang | Yongliang Shen | Zeqi Tan | Yiquan Wu | Weiming Lu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Named entity recognition (NER) is a fundamental task to recognize specific types of entities from a given sentence. Depending on how the entities appear in the sentence, it can be divided into three subtasks, namely, Flat NER, Nested NER, and Discontinuous NER. Among the existing approaches, only the generative model can be uniformly adapted to these three subtasks. However, when the generative model is applied to NER, its optimization objective is not consistent with the task, which makes the model vulnerable to the incorrect biases. In this paper, we analyze the incorrect biases in the generation process from a causality perspective and attribute them to two confounders: pre-context confounder and entity-order confounder. Furthermore, we design Intra- and Inter-entity Deconfounding Data Augmentation methods to eliminate the above confounders according to the theory of backdoor adjustment. Experiments show that our method can improve the performance of the generative NER model in various datasets.

2020

pdf
De-Biased Court’s View Generation with Causality
Yiquan Wu | Kun Kuang | Yating Zhang | Xiaozhong Liu | Changlong Sun | Jun Xiao | Yueting Zhuang | Luo Si | Fei Wu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Court’s view generation is a novel but essential task for legal AI, aiming at improving the interpretability of judgment prediction results and enabling automatic legal document generation. While prior text-to-text natural language generation (NLG) approaches can be used to address this problem, neglecting the confounding bias from the data generation mechanism can limit the model performance, and the bias may pollute the learning outcomes. In this paper, we propose a novel Attentional and Counterfactual based Natural Language Generation (AC-NLG) method, consisting of an attentional encoder and a pair of innovative counterfactual decoders. The attentional encoder leverages the plaintiff’s claim and fact description as input to learn a claim-aware encoder from which the claim-related information in fact description can be emphasized. The counterfactual decoders are employed to eliminate the confounding bias in data and generate judgment-discriminative court’s views (both supportive and non-supportive views) by incorporating with a synergistic judgment predictive model. Comprehensive experiments show the effectiveness of our method under both quantitative and qualitative evaluation metrics.