Wenhe Feng

Also published as: 文贺



2024

基于隐性句逗号识别的汉语长句机器翻译(Machine translation of Chinese long sentences based on recognition of implicit period and comma)
Wenjuan Zhang (张文娟) | Manjia Li (李熳佳) | Wenhe Feng (冯文贺)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

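“Translating long sentences has long been a difficult problem for machine translation. Based on the observation that a considerable number of commas (intra-sentence punctuation) and periods (inter-sentence punctuation) in Chinese are interchangeable, this paper proposes the concepts of the "implicit period" (a comma that can be converted into a period) and the "implicit comma" (a period that can be converted into a comma), and implements their automatic recognition so that long Chinese sentences can be turned into short ones for Chinese-English machine translation. To this end, we first build an implicit period and comma dataset by combining manual annotation with semi-supervised learning, and implement recognition methods based on pre-trained models, of which the best-performing one, HierarchicalBERT, serves as the model for the subsequent application. We then implement a Chinese-English machine translation method based on implicit period and comma recognition. Experiments with pre-trained machine translation models on the WMT2018 (news) and WMT2023 (literature) test corpora show that, for translating long Chinese sentences into English, the proposed method yields an overall improvement in BLEU over the baseline translation, and that on relatively robust machine translation models the gain becomes more pronounced as sentences get longer.”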

Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese
Jingshen Zhang | Xinglu Chen | Xinying Qiu | Zhimin Wang | Wenhe Feng
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

“Chinese sentence simplification faces challenges due to the lack of large-scale labeled parallel corpora and the prevalence of idioms. To address these challenges, we propose Readability-guided Idiom-aware Sentence Simplification (RISS), a novel framework that combines data augmentation techniques. RISS introduces two key components: (1) Readability-guided Paraphrase Selection (RPS), a method for mining high-quality sentence pairs, and (2) Idiom-aware Simplification (IAS), a model that enhances the comprehension and simplification of idiomatic expressions. By integrating RPS and IAS using multi-stage and multi-task learning strategies, RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets. Furthermore, RISS achieves additional improvements when fine-tuned on a small labeled dataset. Our approach demonstrates the potential for more effective and accessible Chinese text simplification.”

2014

Building Chinese Discourse Corpus with Connective-driven Dependency Tree Structure
Yancui Li | Wenhe Feng | Jing Sun | Fang Kong | Guodong Zhou
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)