Fangwei Zhu


2024

CoUDA: Coherence Evaluation via Unified Data Augmentation
Dawei Zhu | Wenhao Wu | Yifan Song | Fangwei Zhu | Ziqiang Cao | Sujian Li
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Coherence evaluation aims to assess the organization and structure of a discourse, which remains challenging even in the era of large language models. Due to the scarcity of annotated data, data augmentation is commonly used for training coherence evaluation models. However, previous augmentations for this task primarily rely on heuristic rules, lacking design criteria as guidance. In this paper, we take inspiration from the linguistic theory of discourse structure and propose a data augmentation framework named CoUDA. CoUDA breaks down discourse coherence into global and local aspects, and designs augmentation strategies for each aspect. For local coherence in particular, we propose a novel generative strategy for constructing augmentation samples, which involves post-pretraining a generative model and applying two controlling mechanisms to control the difficulty of generated samples. During inference, CoUDA jointly evaluates both global and local aspects to comprehensively assess the overall coherence of a discourse. Extensive experiments in coherence evaluation show that, with only 233M parameters, CoUDA achieves state-of-the-art performance in both pointwise scoring and pairwise ranking tasks, even surpassing recent GPT-3.5 and GPT-4 based metrics.
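The abstract does not detail the augmentation operations themselves, but the overall recipe — derive incoherent negatives from coherent source text and check that a model ranks the original higher — can be sketched. The sketch below is a minimal illustration that uses plain sentence shuffling as a stand-in heuristic (explicitly not CoUDA's designed strategies) and a hypothetical score_coherence model:

```python
import random

def shuffled_negative(sentences, rng=random.Random(0)):
    """Permute sentence order to build a (globally) incoherent negative.

    NOTE: plain shuffling is a stand-in heuristic for illustration; CoUDA
    designs dedicated global/local strategies rather than relying on it.
    """
    if len(set(sentences)) < 2:
        raise ValueError("need at least two distinct sentences to shuffle")
    neg = sentences[:]
    while neg == sentences:  # make sure the order actually changed
        rng.shuffle(neg)
    return neg

def pairwise_accuracy(docs, score_coherence):
    """Fraction of (original, shuffled) pairs where the original wins.

    `score_coherence` is a hypothetical scorer: list[str] -> float.
    """
    wins = sum(
        score_coherence(sents) > score_coherence(shuffled_negative(sents))
        for sents in docs
    )
    return wins / len(docs)
```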

2023

Learn to Not Link: Exploring NIL Prediction in Entity Linking
Fangwei Zhu | Jifan Yu | Hailong Jin | Lei Hou | Juanzi Li | Zhifang Sui
Findings of the Association for Computational Linguistics: ACL 2023

Entity linking models have achieved significant success by utilizing pretrained language models to capture semantic features. However, the NIL prediction problem, which aims to identify mentions without a corresponding entity in the knowledge base, has received insufficient attention. We categorize mentions linking to NIL into Missing Entity and Non-Entity Phrase, and propose an entity linking dataset NEL that focuses on the NIL prediction problem. NEL takes ambiguous entities as seeds, collects relevant mention contexts from the Wikipedia corpus, and ensures the presence of mentions linking to NIL via human annotation and entity masking. We conduct a series of experiments with the widely used bi-encoder and cross-encoder entity linking models; the results show that both types of NIL mentions in the training data have a significant influence on the accuracy of NIL prediction. Our code and dataset can be accessed at https://github.com/solitaryzero/NIL_EL.
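The abstract leaves the NIL mechanism at a high level; one simple illustrative baseline (not necessarily the paper's approach) is a bi-encoder that predicts NIL by thresholding its best candidate similarity. A minimal sketch, assuming the sentence-transformers library and an off-the-shelf encoder:

```python
from sentence_transformers import SentenceTransformer, util

# Any sentence encoder works here; all-MiniLM-L6-v2 is just an example choice.
model = SentenceTransformer("all-MiniLM-L6-v2")

def link_or_nil(mention_context, entity_descriptions, nil_threshold=0.5):
    """Return the best-matching entity index, or None (NIL) if no candidate
    scores above `nil_threshold`. A thresholded bi-encoder is one simple
    baseline for NIL prediction, not necessarily the paper's method."""
    m = model.encode(mention_context, convert_to_tensor=True)
    e = model.encode(entity_descriptions, convert_to_tensor=True)
    sims = util.cos_sim(m, e)[0]  # cosine similarity to each candidate
    best = int(sims.argmax())
    return best if float(sims[best]) >= nil_threshold else None
```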

Overview of CCL23-Eval Task 4: The 3rd Chinese Spatial Cognition Evaluation
Liming Xiao (肖力铭) | Weidong Zhan (詹卫东) | Zhifang Sui (穗志方) | Yuhang Qin (秦宇航) | Chunhui Sun (孙春晖) | Dan Xing (邢丹) | Nan Li (李楠) | Fangwei Zhu (祝方韦) | Peiyi Wang (王培懿)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

The 3rd Chinese Spatial Cognition Evaluation (SpaCE2023) aims to test machines' ability to understand spatial semantics, and comprises three subtasks: (1) spatial information anomaly recognition; (2) spatial semantic role labeling; and (3) judging whether two spatial scenes are the same or different. Building on SpaCE2022, this edition refines the designs of subtasks 1 and 2, and introduces subtask 3 as an entirely new evaluation task. In the end, one team submitted results, and its score on subtask 1 surpassed the baseline model. This paper also reports the performance of the large language model ChatGPT on the three SpaCE2023 subtasks and, in light of the problems observed, suggests directions for improving the instruction design.

2022

UPER: Boosting Multi-Document Summarization with an Unsupervised Prompt-based Extractor
Shangqing Tu | Jifan Yu | Fangwei Zhu | Juanzi Li | Lei Hou | Jian-Yun Nie
Proceedings of the 29th International Conference on Computational Linguistics

Multi-Document Summarization (MDS) commonly employs the two-stage extract-then-abstract paradigm, which first extracts a relatively short meta-document and then feeds it into a deep neural network to generate an abstract. Previous work usually takes the ROUGE score as the label for training a scoring model to evaluate source documents. However, the trained scoring model is prone to under-fitting in low-resource settings, as it relies on the training data. To extract documents effectively, we construct prompting templates that invoke the underlying knowledge in a Pre-trained Language Model (PLM) to calculate the perplexity of documents and keywords, which assesses a document's semantic salience. Our unsupervised approach can be applied as a plug-in to boost other metrics for evaluating a document's salience, thus improving the subsequent abstract generation. We obtain positive results on 2 MDS datasets, 2 data settings, and 2 abstractive backbone models, showing our method's effectiveness. Our code is available at https://github.com/THU-KEG/UPER.
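The core scoring idea — rank candidate source documents by how plausible a PLM finds them under a query-conditioned prompt — can be sketched directly. The prompt template and the use of GPT-2 below are assumptions for illustration, not UPER's exact configuration:

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    """Mean-token perplexity of `text` under GPT-2 (lower = more expected)."""
    ids = tokenizer(text, return_tensors="pt", truncation=True).input_ids
    loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return math.exp(loss.item())

def rank_documents(topic: str, documents: list[str]) -> list[int]:
    """Rank source documents by perplexity under a topic-conditioned prompt.

    The template here is illustrative, not UPER's actual prompt.
    """
    scores = [perplexity(f"Document about {topic}: {doc}") for doc in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i])
```

The ranking is fully unsupervised — no scoring model is trained — which is what makes the approach robust when labeled data is scarce.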

2021

TWAG: A Topic-Guided Wikipedia Abstract Generator
Fangwei Zhu | Shangqing Tu | Jiaxin Shi | Juanzi Li | Lei Hou | Tong Cui
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Wikipedia abstract generation aims to distill a Wikipedia abstract from web sources and has achieved significant success by adopting multi-document summarization techniques. However, previous works generally view the abstract as plain text, ignoring the fact that it describes a certain entity and can be decomposed into different topics. In this paper, we propose TWAG, a two-stage model that guides abstract generation with topical information. First, we detect the topic of each input paragraph with a classifier trained on existing Wikipedia articles, dividing the input documents into different topics. Then, we predict the topic distribution of each abstract sentence and decode the sentence from topic-aware representations with a Pointer-Generator network. We evaluate our model on the WikiCatSum dataset, and the results show that TWAG outperforms various existing baselines and is capable of generating comprehensive abstracts.
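As a rough illustration of the first stage, each input paragraph can be routed through a topic classifier and grouped by predicted topic before decoding. The checkpoint name below is a placeholder rather than a released model, and this sketch omits TWAG's topic-aware decoder entirely:

```python
from collections import defaultdict
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical checkpoint: a classifier fine-tuned on Wikipedia sections,
# where each label is a topic such as "early life" or "career".
CKPT = "your-org/wiki-topic-classifier"  # placeholder, not a real model
tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForSequenceClassification.from_pretrained(CKPT)
model.eval()

@torch.no_grad()
def group_by_topic(paragraphs: list[str]) -> dict[int, list[str]]:
    """Stage 1 of a TWAG-style pipeline: route each input paragraph to a
    predicted topic; downstream, the decoder works over topic-grouped text."""
    groups = defaultdict(list)
    for para in paragraphs:
        ids = tokenizer(para, return_tensors="pt", truncation=True)
        topic = int(model(**ids).logits.argmax(dim=-1))
        groups[topic].append(para)
    return groups
```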