Pengxiu Lu
Also published as: 芃秀 卢
2025
Lemmatization of Cuneiform Languages Using the ByT5 Model
Pengxiu Lu | Yonglong Huang | Jing Xu | Minxuan Feng | Chao Xu
Proceedings of the Second Workshop on Ancient Language Processing
Lemmatization of cuneiform languages presents a unique challenge due to their complex writing system, which combines syllabic and logographic elements. In this study, we investigate the effectiveness of the ByT5 model in addressing this challenge by developing and evaluating a ByT5-based lemmatization system. Experimental results demonstrate that ByT5 outperforms mT5 in this task, achieving an accuracy of 80.55% on raw lemmas and 82.59% on generalized lemmas, where sense numbers are removed. These findings highlight the potential of ByT5 for lemmatizing cuneiform languages and provide useful insights for future work on ancient text lemmatization.
2024
从句子图到篇章图——基于抽象语义表示的篇章级共指标注体系设计(Discourse-Level Anaphora Annotation System Based on Abstract Semantic Representation)
Yixuan Zhang (张艺璇) | Bin Li (李斌) | Zhixing Xu (许智星) | Pengxiu Lu (卢芃秀)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
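Discourse coreference reflects the dynamic shift of concepts across a discourse and has become an active research topic in recent years. Building on a review of theoretical work on coreference, this paper surveys the related corpora and resolution methods and finds that coreference corpora still have two problems: coreference relations are annotated only coarsely, and integration with whole-sentence semantic representation is largely ignored. Taking a sentence-level semantic annotation scheme (Chinese Abstract Meaning Representation) as its foundation, this paper builds a discourse coreference annotation scheme and constructs a coreference corpus of 100 documents. The scheme covers 52 intra-sentence semantic relations and 8 discourse coreference relations; the discourse coreference semantic graph built by combining the two provides a new framework and data resource for discourse-level semantic analysis.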
Co-authors
- Minxuan Feng (冯敏萱) 1
- Yonglong Huang 1
- Bin Li (李斌) 1
- Zhixing Xu (许智星) 1
- Jing Xu 1