Xiaomin Chu


基于新闻图式结构的篇章功能语用识别方法(Discourse Functional Pragmatics Recognition Based on News Schemata)
Mengqi Du (杜梦琦) | Feng Jiang (蒋峰) | Xiaomin Chu (褚晓敏) | Peifeng Li (李培峰)
Proceedings of the 21st Chinese National Conference on Computational Linguistics


Automated Chinese Essay Scoring from Multiple Traits
Yaqiong He | Feng Jiang | Xiaomin Chu | Peifeng Li
Proceedings of the 29th International Conference on Computational Linguistics

Automatic Essay Scoring (AES) is the task of using the computer to evaluate the quality of essays automatically. Current research on AES focuses on scoring the overall quality or single trait of prompt-specific essays. However, the users not only expect to obtain the overall score but also the instant feedback from different traits to help their writing in the real world. Therefore, we first annotate a mutli-trait dataset ACEA including 1220 argumentative essays from four traits, i.e., essay organization, topic, logic, and language. And then we design a hierarchical multi-task trait scorer HMTS to evaluate the quality of writing by modeling these four traits. Moreover, we propose an inter-sequence attention mechanism to enhance information interaction between different tasks and design the trait-specific features for various tasks in AES. The experimental results on ACEA show that our HMTS can effectively score essays from multiple traits, outperforming several strong models.


Not Just Classification: Recognizing Implicit Discourse Relation on Joint Modeling of Classification and Generation
Feng Jiang | Yaxin Fan | Xiaomin Chu | Peifeng Li | Qiaoming Zhu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Implicit discourse relation recognition (IDRR) is a critical task in discourse analysis. Previous studies only regard it as a classification task and lack an in-depth understanding of the semantics of different relations. Therefore, we first view IDRR as a generation task and further propose a method joint modeling of the classification and generation. Specifically, we propose a joint model, CG-T5, to recognize the relation label and generate the target sentence containing the meaning of relations simultaneously. Furthermore, we design three target sentence forms, including the question form, for the generation model to incorporate prior knowledge. To address the issue that large discourse units are hardly embedded into the target sentence, we also propose a target sentence construction mechanism that automatically extracts core sentences from those large discourse units. Experimental results both on Chinese MCDTB and English PDTB datasets show that our model CG-T5 achieves the best performance against several state-of-the-art systems.


融合全局和局部信息的汉语宏观篇章结构识别(Combining Global and Local Information to Recognize Chinese Macro Discourse Structure)
Yaxin Fan (范亚鑫) | Feng Jiang (蒋峰) | Xiaomin Chu (褚晓敏) | Peifeng Li (李培峰) | Qiaoming Zhu (朱巧明)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


Chinese Paragraph-level Discourse Parsing with Global Backward and Local Reverse Reading
Feng Jiang | Xiaomin Chu | Peifeng Li | Fang Kong | Qiaoming Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Discourse structure tree construction is the fundamental task of discourse parsing and most previous work focused on English. Due to the cultural and linguistic differences, existing successful methods on English discourse parsing cannot be transformed into Chinese directly, especially in paragraph level suffering from longer discourse units and fewer explicit connectives. To alleviate the above issues, we propose two reading modes, i.e., the global backward reading and the local reverse reading, to construct Chinese paragraph level discourse trees. The former processes discourse units from the end to the beginning in a document to utilize the left-branching bias of discourse structure in Chinese, while the latter reverses the position of paragraphs in a discourse unit to enhance the differentiation of coherence between adjacent discourse units. The experimental results on Chinese MCDTB demonstrate that our model outperforms all strong baselines.


Building a Macro Chinese Discourse Treebank
Xiaomin Chu | Feng Jiang | Sheng Xu | Qiaoming Zhu
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Joint Modeling of Structure Identification and Nuclearity Recognition in Macro Chinese Discourse Treebank
Xiaomin Chu | Feng Jiang | Yi Zhou | Guodong Zhou | Qiaoming Zhu
Proceedings of the 27th International Conference on Computational Linguistics

Discourse parsing is a challenging task and plays a critical role in discourse analysis. This paper focus on the macro level discourse structure analysis, which has been less studied in the previous researches. We explore a macro discourse structure presentation schema to present the macro level discourse structure, and propose a corresponding corpus, named Macro Chinese Discourse Treebank. On these bases, we concentrate on two tasks of macro discourse structure analysis, including structure identification and nuclearity recognition. In order to reduce the error transmission between the associated tasks, we adopt a joint model of the two tasks, and an Integer Linear Programming approach is proposed to achieve global optimization with various kinds of constraints.

MCDTB: A Macro-level Chinese Discourse TreeBank
Feng Jiang | Sheng Xu | Xiaomin Chu | Peifeng Li | Qiaoming Zhu | Guodong Zhou
Proceedings of the 27th International Conference on Computational Linguistics

In view of the differences between the annotations of micro and macro discourse rela-tionships, this paper describes the relevant experiments on the construction of the Macro Chinese Discourse Treebank (MCDTB), a higher-level Chinese discourse corpus. Fol-lowing RST (Rhetorical Structure Theory), we annotate the macro discourse information, including discourse structure, nuclearity and relationship, and the additional discourse information, including topic sentences, lead and abstract, to make the macro discourse annotation more objective and accurate. Finally, we annotated 720 articles with a Kappa value greater than 0.6. Preliminary experiments on this corpus verify the computability of MCDTB.