Huiyuan Xie
2026
LexRel: Benchmarking Legal Relation Extraction for Chinese Civil Cases
Yida Cai | Ranjuexiao Hu | Huiyuan Xie | Chenyang Li | Yun Liu | Yuxiao Ye | Zhenghao Liu | Weixing Shen | Zhiyuan Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Yida Cai | Ranjuexiao Hu | Huiyuan Xie | Chenyang Li | Yun Liu | Yuxiao Ye | Zhenghao Liu | Weixing Shen | Zhiyuan Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Legal relations serve as an important analytical framework for dispute resolution in civil cases. However, legal relations in Chinese civil cases remain underexplored in the field of legal AI, largely due to the absence of comprehensive schemas. In this work, we first introduce a comprehensive schema for legal relations in civil cases, which contains a hierarchical taxonomy and definitions of arguments. Based on this schema, we formulate a legal relation extraction task and present **LexRel**, an expert-annotated benchmark for legal relation extraction in the Chinese civil law domain. We use **LexRel** to evaluate state-of-the-art large language models (LLMs) on legal relation extraction, showing that current LLMs exhibit significant limitations in accurately identifying civil legal relations. Furthermore, we demonstrate that explicitly incorporating information about legal relations leads to promising performance gains on other downstream legal AI tasks.
2024
The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal
Huiyuan Xie | Felix Steffek | Joana De Faria | Christine Carter | Jonathan Rutherford
Proceedings of the Natural Legal Language Processing Workshop 2024
Huiyuan Xie | Felix Steffek | Joana De Faria | Christine Carter | Jonathan Rutherford
Proceedings of the Natural Legal Language Processing Workshop 2024
This paper explores the intersection of technological innovation and access to justice by developing a benchmark for predicting case outcomes in the UK Employment Tribunal (UKET). To address the challenge of extensive manual annotation, the study employs a large language model (LLM) for automatic annotation, resulting in the creation of the CLC-UKET dataset. The dataset consists of approximately 19,000 UKET cases and their metadata. Comprehensive legal annotations cover facts, claims, precedent references, statutory references, case outcomes, reasons and jurisdiction codes. Facilitated by the CLC-UKET data, we examine a multi-class case outcome prediction task in the UKET. Human predictions are collected to establish a performance reference for model comparison. Empirical results from baseline models indicate that finetuned transformer models outperform zero-shot and few-shot LLMs on the UKET prediction task. The performance of zero-shot LLMs can be enhanced by integrating task-related information into few-shot examples. We hope that the CLC-UKET dataset, along with human annotations and empirical findings, can serve as a valuable benchmark for employment-related dispute resolution.
2021
TIAGE: A Benchmark for Topic-Shift Aware Dialog Modeling
Huiyuan Xie | Zhenghao Liu | Chenyan Xiong | Zhiyuan Liu | Ann Copestake
Findings of the Association for Computational Linguistics: EMNLP 2021
Huiyuan Xie | Zhenghao Liu | Chenyan Xiong | Zhiyuan Liu | Ann Copestake
Findings of the Association for Computational Linguistics: EMNLP 2021
Human conversations naturally evolve around different topics and fluently move between them. In research on dialog systems, the ability to actively and smoothly transition to new topics is often ignored. In this paper we introduce TIAGE, a new topic-shift aware dialog benchmark constructed utilizing human annotations on topic shifts. Based on TIAGE, we introduce three tasks to investigate different scenarios of topic-shift modeling in dialog settings: topic-shift detection, topic-shift triggered response generation and topic-aware dialog generation. Experiments on these tasks show that the topic-shift signals in TIAGE are useful for topic-shift response generation. On the other hand, dialog systems still struggle to decide when to change topic. This indicates further research is needed in topic-shift aware dialog modeling.