2023
人工智能生成语言与人类语言对比研究——以ChatGPT为例(A Comparative Study of Language between Artificial Intelligence and Human: A Case Study of ChatGPT)
Zhu Junhui (君辉 朱) | Wang Mengyan (梦焰 王) | Yang Erhong (尔弘 杨) | Nie Jingran (锦燃 聂) | Wang Yujie (誉杰 王) | Yue Yan (岩 岳) | Yang Liner (麟儿 杨)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
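“ChatGPT, a chatbot built on natural language generation technology, can produce answers quickly, but how the language of machine-written answers differs from authentic human language has not yet been studied sufficiently. This study extracts 159 linguistic features and computes their distribution in human and ChatGPT answers to Chinese open-domain questions, trains AI detectors with three machine learning algorithms (random forest, logistic regression, and support vector machine (SVM)), and evaluates model performance. Experimental results show that both random forest and SVM achieve high classification accuracy. Through comparative analysis, the study reveals the strengths and weaknesses of the two kinds of text along five dimensions: descriptive features, character and word commonness, character and word diversity, syntactic complexity, and discourse cohesion. The results show that the differences between the two kinds of text are concentrated mainly in three dimensions: descriptive features, character and word commonness, and character and word diversity.”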
Lexical Complexity Controlled Sentence Generation for Language Learning
Nie Jinran | Yang Liner | Chen Yun | Kong Cunliang | Zhu Junhui | Yang Erhong
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
“Language teachers spend a lot of time developing good examples for language learners. For this reason, we define a new task for language learning, lexical complexity controlled sentence generation, which requires precise control over the lexical complexity in the keywords to examples generation and better fluency and semantic consistency. The challenge of this task is to generate fluent sentences only using words of given complexity levels. We propose a simple but effective approach for this task based on complexity embedding while controlling sentence length and syntactic complexity at the decoding stage. Compared with potential solutions, our approach fuses the representations of the word complexity levels into the model to get better control of lexical complexity. And we demonstrate the feasibility of the approach for both training models from scratch and fine-tuning the pre-trained models. To facilitate the research, we develop two datasets in English and Chinese respectively, on which extensive experiments are conducted. Experimental results show that our approach provides more precise control over lexical complexity, as well as better fluency and diversity.”
CCL23-Eval 任务7总结报告: 汉语学习者文本纠错(Overview of CCL23-Eval Task: Chinese Learner Text Correction)
Hongxiang Chang (鸿翔 常) | Yang Liu (洋 刘) | Meng Xu (萌 徐) | Yingying Wang (莹莹 王) | Cunliang Kong (存良 孔) | Liner Yang (麟儿 杨) | Yang Erhong (尔弘 杨) | Maosong Sun (茂松 孙) | Gaoqi Rao (高琦 饶) | Renfen Hu (韧奋 胡) | Zhenghao Liu (正皓 刘)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
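“The Chinese Learner Text Correction evaluation is a technical evaluation held under the 22nd Chinese National Conference on Computational Linguistics. Targeting text written by learners of Chinese, it sets up two tracks: multi-dimensional Chinese learner text correction and Chinese grammatical error detection. Given the continuing progress of artificial intelligence technology, each track includes an open task and a closed task; the open task permits the use of large models. The evaluation datasets are built on YACLC, a multi-dimensionally annotated corpus of Chinese learner text. An evaluation standard based on multiple reference answers is established and a benchmark evaluation framework is constructed, to further advance research on Chinese learner text correction. In total, 38 teams registered, of which 5 achieved outstanding results and submitted technical reports.”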