Rong Yan
Also published as: 蓉 闫
2026
CEDAR: A Chinese Evaluation Dataset for Computational Argumentation
Tian Lan | Jiang Li | Rong Yan | Feilong Bao | Weihua Wang | Guanglai Gao | Xiangdong Su
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Computational argumentation has received increasing attention in recent years. However, existing debate datasets neglect some important labels for argument mining, generation, and evaluation. Meanwhile, the lack of comprehensively annotated Chinese oral debate datasets hinders progress in this field. To address these gaps, we introduce a comprehensive Chinese Evaluation Dataset for Computational Argumentation, named CEDAR. Compared to previous datasets, CEDAR includes the essential labels of computational argumentation (claim, stance, evidence) and five additional crucial labels: rhetorical figures, debater roles, modal words, utterance time, and debate results. Moreover, it offers complete transcripts of each debate, including speeches from the Pro and Con sides. Thus, CEDAR not only supports common argument mining and generation tasks, but also provides resources for rhetorical figure detection, argument quality evaluation, and debate result prediction. The dataset covers 600 debates on 318 topics from Chinese debate competitions. Besides providing a dataset for research, we conduct experiments on common computational argumentation tasks and a novel task (rhetorical figure detection), in which we also evaluate LLMs. The experimental results highlight the challenging nature of the dataset. Our corpus is available at https://github.com/VelikayaScarlet/CEDAR.
2024
多机制整合的中文医疗命名实体识别(Infusing multi-schemes for Chinese Medical Named Entity Recognition)
Shanshan Wang (王珊珊) | Kunyuan Zhang (张焜元) | Rong Yan (闫蓉)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
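In online medical care, most patients lack medical training and the pathological features of different disciplines are complex, so medical named entities in doctor-patient dialogue texts tend to be long and span multiple words, which poses new challenges for named entity recognition algorithms. To address this problem, this study fuses multiple dilated convolution mechanisms of different granularities to build the Flat-Lattice-CNN model. The model not only considers the semantic information of characters and words together with their absolute and relative position information, but also extracts co-occurrence dependency features among multiple characters and words spanning different distances, thereby improving recognition accuracy for long medical named entities. Experimental results show that the proposed model yields general performance improvements on the named entity recognition task across the evaluated datasets. In particular, on CTDD, a Chinese medical dataset dominated by long entities, the model improves the F1 score by about 2%.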
EpLSA: Synergy of Expert-prefix Mixtures and Task-Oriented Latent Space Adaptation for Diverse Generative Reasoning
Fujun Zhang | Xiangdong Su | Jiang Li | Rong Yan | Guanglai Gao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Existing models for diverse generative reasoning still struggle to generate multiple unique and plausible results. Through an in-depth examination, we argue that it is critical to leverage a mixture of experts as prefixes to enhance the diversity of generated results, and to perform task-oriented adaptation in the latent space of the generation model to improve the quality of the responses. To this end, we propose EpLSA, a model based on the synergy of expert-prefix mixtures and task-oriented latent space adaptation for diverse generative reasoning. Specifically, we use expert-prefix mixtures to encourage the model to create multiple responses with different semantics, and we design a loss function to address the interference of the expert prefixes with those semantics. Meanwhile, we design a task-oriented adaptation block that adapts the pre-trained encoder within the generation model more effectively to the pre-trained decoder in the latent space, further improving the quality of the generated text. Extensive experiments on three different types of generative reasoning tasks demonstrate that EpLSA outperforms existing baseline models in terms of both the quality and diversity of the generated outputs. Our code is publicly available at https://github.com/IMU-MachineLearningSXD/EpLSA.
2022
基于SoftLexicon和注意力机制的中文因果关系抽取(Chinese Causality Extraction Based on SoftLexicon and Attention Mechanism)
Shilin Cui (崔仕林) | Rong Yan (闫蓉)
Proceedings of the 21st Chinese National Conference on Computational Linguistics
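Existing Chinese causality extraction methods have difficulty identifying the boundaries of causal events and represent text features insufficiently. To address these problems, we propose BiLSTM-TWAM+CRF, a Chinese causality extraction model based on external lexical information and an attention mechanism. The model is the first to use the SoftLexicon method to introduce external lexical information and construct word sets, which resolves the difficulty of identifying causal event boundaries. A Two Way Attention Module (TWAM) then characterizes text features from both local and global perspectives. Experimental results show that the proposed method achieves better extraction performance than current Chinese causality extraction models.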