WYWEB: A NLP Evaluation Benchmark For Classical Chinese

Bo Zhou; Qianglong Chen; Tianyu Wang; Xiaomi Zhong; Yin Zhang

doi:10.18653/v1/2023.findings-acl.204

Abstract

To evaluate the overall performance of NLP models in a given domain, many evaluation benchmarks have been proposed, such as GLUE, SuperGLUE, and CLUE. Natural language understanding research has traditionally focused on benchmarks for tasks in languages such as Chinese and English, as well as in multilingual settings. However, little attention has been given to classical Chinese, also known as "wen yan wen (文言文)", which has a rich history spanning thousands of years and holds significant cultural and academic value. To support the NLP community, in this paper we introduce the WYWEB evaluation benchmark, which consists of nine NLP tasks in classical Chinese, covering sentence classification, sequence labeling, reading comprehension, and machine translation. We evaluate existing pre-trained language models, all of which struggle with this benchmark. We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on classical Chinese NLU. The GitHub repository is https://github.com/baudzhou/WYWEB.

Anthology ID: 2023.findings-acl.204
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3294–3319
URL: https://aclanthology.org/2023.findings-acl.204
DOI: 10.18653/v1/2023.findings-acl.204
Cite (ACL): Bo Zhou, Qianglong Chen, Tianyu Wang, Xiaomi Zhong, and Yin Zhang. 2023. WYWEB: A NLP Evaluation Benchmark For Classical Chinese. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3294–3319, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): WYWEB: A NLP Evaluation Benchmark For Classical Chinese (Zhou et al., Findings 2023)
PDF: https://preview.aclanthology.org/improve-issue-templates/2023.findings-acl.204.pdf