Meichun Liu


2022

Exploring metaphorical polysemy with Multiple Correspondence analysis: A corpus-based study on the predicative hei ‘black’ in Chinese
Jinmeng Dou | Meichun Liu
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

A distinctive collexeme analysis of near-synonym constructions “ying-dang/ying-gai + verb”
Zhuo Zhang | Meichun Liu | Dingxuan Zhou
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

Robustness of Hybrid Models in Cross-domain Readability Assessment
Ho Hung Lim | Tianyuan Cai | John S. Y. Lee | Meichun Liu
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association

2020

Using Verb Frames for Text Difficulty Assessment
John Lee | Meichun Liu | Tianyuan Cai
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

This paper presents the first investigation into using semantic frames to assess text difficulty. Based on Mandarin VerbNet, a verbal semantic database that adopts a frame-based approach, we examine usage patterns of ten verbs in a corpus of graded Chinese texts. We identify a number of characteristics in texts at advanced grades: more frequent use of non-core frame elements; more frequent omission of some core frame elements; increased preference for noun phrases rather than clauses as verb arguments; and more frequent metaphoric usage. These characteristics can potentially be useful for automatic prediction of text readability.

2019

Parsing Chinese Sentences with Grammatical Relations
Weiwei Sun | Yufei Chen | Xiaojun Wan | Meichun Liu
Computational Linguistics, Volume 45, Issue 1 - March 2019

We report our work on building linguistic resources and data-driven parsers for grammatical relation (GR) analysis of Mandarin Chinese. Chinese, as an analytic language, encodes grammatical information in a highly configurational rather than morphological way. Accordingly, it is possible and reasonable to represent almost all grammatical relations as bilexical dependencies. In this work, we propose to represent grammatical information using general directed dependency graphs. Both only-local and rich long-distance dependencies are explicitly represented. To create high-quality annotations, we take advantage of an existing TreeBank, namely, Chinese TreeBank (CTB), which is grounded in Government and Binding theory. We define a set of linguistic rules to explore CTB’s implicit phrase structural information and build deep dependency graphs. The reliability of this linguistically motivated GR extraction procedure is highlighted by manual evaluation. Based on the converted corpus, data-driven models, both graph-based and transition-based, are explored for Chinese GR parsing. For graph-based parsing, a new perspective, graph merging, is proposed for building flexible dependency graphs: constructing complex graphs via constructing simple subgraphs. Two key problems are discussed in this perspective: (1) how to decompose a complex graph into simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph. For transition-based parsing, we introduce a neural parser based on a list-based transition system. We also discuss several other key problems, including dynamic oracle and beam search for neural transition-based parsing. Evaluation gauges how successful GR parsing for Chinese can be by applying data-driven models. The empirical analysis suggests several directions for future study.

2017

Automatic Difficulty Assessment for Chinese Texts
John Lee | Meichun Liu | Chun Yin Lam | Tak On Lau | Bing Li | Keying Li
Proceedings of the IJCNLP 2017, System Demonstrations

We present a web-based interface that automatically assesses the reading difficulty of Chinese texts. The system performs word segmentation, part-of-speech tagging and dependency parsing on the input text, and then determines the difficulty levels of the vocabulary items and grammatical constructions in the text. Furthermore, the system highlights the words and phrases that must be simplified or rewritten in order to conform to the user-specified target difficulty level. Evaluation results show that the system accurately identifies the vocabulary level of 89.9% of the words, and detects grammar points at 0.79 precision and 0.83 recall.

2004

A Resolution for Polysemy: the case of Mandarin verb ZOU
Yaling Hsu | Meichun Liu
Proceedings of the 16th Conference on Computational Linguistics and Speech Processing