Xiangyun Dong



2025

TermDiffuSum: A Term-guided Diffusion Model for Extractive Summarization of Legal Documents
Xiangyun Dong | Wei Li | Yuquan Le | Zhangyue Jiang | Junxi Zhong | Zhong Wang
Proceedings of the 31st International Conference on Computational Linguistics

Extractive summarization for legal documents aims to automatically extract key sentences from legal texts to form concise summaries. Recent studies have explored diffusion models for the extractive summarization task, showcasing their remarkable capabilities. Despite these advancements, these models often fall short in capturing and leveraging the specialized legal terminology that is crucial for accurate legal summarization. To address this limitation, this paper presents a novel term-guided diffusion model for extractive summarization of legal documents, named TermDiffuSum. It incorporates legal terminology into the diffusion model via a multifactor fusion noise weighting schedule, which allocates higher attention weight to sentences containing a higher concentration of legal terms during the diffusion process. Additionally, TermDiffuSum uses a re-ranking loss function to refine the model’s selection of more relevant summaries by leveraging the relationship between the candidate summaries generated by the diffusion process and the reference summaries. Experimental results on a self-constructed legal summarization dataset show that TermDiffuSum outperforms existing diffusion-based summarization models, achieving improvements of 3.10 in ROUGE-1, 2.84 in ROUGE-2, and 2.89 in ROUGE-L. To further validate the generalizability of TermDiffuSum, we conduct experiments on three public datasets from the news and social media domains, with results affirming the scalability of our approach.
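
The idea of giving more weight to sentences with a higher concentration of legal terms can be illustrated with a toy sketch. This is not the paper's multifactor fusion noise weighting schedule, whose details the abstract does not give; it only computes a single hypothetical factor (term density per sentence) and normalizes it into weights. The names `term_density`, `sentence_weights`, and `LEGAL_TERMS`, as well as the example sentences, are assumptions made for illustration.

```python
import re

def term_density(sentence, terms):
    """Fraction of word tokens in `sentence` that appear in the term set."""
    tokens = re.findall(r"\w+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(t in terms for t in tokens) / len(tokens)

def sentence_weights(sentences, terms):
    """Normalize term densities so the weights sum to 1.

    Falls back to uniform weights when no sentence contains a term.
    """
    if not sentences:
        return []
    densities = [term_density(s, terms) for s in sentences]
    total = sum(densities)
    if total == 0:
        return [1.0 / len(sentences)] * len(sentences)
    return [d / total for d in densities]

# Hypothetical term list and input, for illustration only.
LEGAL_TERMS = {"plaintiff", "defendant", "tort", "liability"}
sentences = [
    "The plaintiff alleges liability in tort.",
    "The hearing was held on a Tuesday.",
]
weights = sentence_weights(sentences, LEGAL_TERMS)
```

In a real system such weights would be only one input among several, and how they modulate the diffusion noise is specific to the paper itself.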