WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning

Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Sujian Li, Yajuan Lyu


Abstract
A crucial issue of current text generation models is that they often uncontrollably generate text that is factually inconsistent with inputs. Due to lack of annotated data, existing factual consistency metrics usually train evaluation models on synthetic texts or directly transfer from other related tasks, such as question answering (QA) and natural language inference (NLI).Bias in synthetic text or upstream tasks makes them perform poorly on text actually generated by language models, especially for general evaluation for various tasks. To alleviate this problem, we propose a weakly supervised framework named WeCheck that is directly trained on actual generated samples from language models with weakly annotated labels.WeCheck first utilizes a generative model to infer the factual labels of generated samples by aggregating weak labels from multiple resources.Next, we train a simple noise-aware classification model as the target metric using the inferred weakly supervised information.Comprehensive experiments on various tasks demonstrate the strong performance of WeCheck, achieving an average absolute improvement of 3.3% on the TRUE benchmark over 11B state-of-the-art methods using only 435M parameters.Furthermore, it is up to 30 times faster than previous evaluation methods, greatly improving the accuracy and efficiency of factual consistency evaluation.
Anthology ID:
2023.acl-long.18
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
307–321
Language:
URL:
https://aclanthology.org/2023.acl-long.18
DOI:
10.18653/v1/2023.acl-long.18
Bibkey:
Cite (ACL):
Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Sujian Li, and Yajuan Lyu. 2023. WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 307–321, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning (Wu et al., ACL 2023)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.acl-long.18.pdf
Video:
 https://preview.aclanthology.org/nschneid-patch-5/2023.acl-long.18.mp4