Ning Wu


2022

基于语料的“一+形容词+量词+名词”构式语义考察(A Semantic Study of “One-Adjective-Quantifier-Noun” Based on Corpus)
Ning Wu (吴宁) | Zhimin Wang (王治敏)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

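The "numeral + adjective + quantifier (measure word) + noun" construction is used extensively in everyday communication. Based on 5,710 examples from the Beijing Language and Culture University BCC online corpus, this paper examines the "one + adjective + quantifier + noun" (“一形量名”) construction and looks for the key factors that determine whether the construction is acceptable. We study the semantic characteristics of adjectives that can enter the construction under semantic constraints, the restrictions that "physical abstractness" places on the noun component, and the role of the quantifier in forming the construction. The results show that adjectives with semantic features such as high divisibility for measurement enter the construction more readily; more than 90% of the adjectives that enter the construction can be measured by a single varying physical quantity, and these adjectives harmonize with the quantifier in the construction at the same level of meaning. The construction is more accommodating of nouns with a low value of "physical abstractness" ([+easily quantifiable, +low organic activity, +easily generalized shape]). In addition, we find that the presence of a collective quantifier can lower the physical abstractness of the construction as a whole, which makes the "one + adjective + quantifier + noun" construction more likely to be acceptable.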

2020

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang | Nan Duan | Yeyun Gong | Ning Wu | Fenfei Guo | Weizhen Qi | Ming Gong | Linjun Shou | Daxin Jiang | Guihong Cao | Xiaodong Fan | Ruofei Zhang | Rahul Agrawal | Edward Cui | Sining Wei | Taroon Bharti | Ying Qiao | Jiun-Hung Chen | Winnie Wu | Shuguang Liu | Fan Yang | Daniel Campos | Rangan Majumder | Ming Zhou
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks. Compared to GLUE (Wang et al., 2019), which is labeled in English and includes natural language understanding tasks only, XGLUE has three main advantages: (1) it provides two corpora of different sizes for cross-lingual pre-training; (2) it provides 11 diversified tasks that cover both natural language understanding and generation scenarios; (3) for each task, it provides labeled data in multiple languages. We extend a recent cross-lingual pre-trained model, Unicoder (Huang et al., 2019), to cover both understanding and generation tasks, and evaluate it on XGLUE as a strong baseline. We also evaluate the base versions (12-layer) of Multilingual BERT, XLM and XLM-R for comparison.