Li Sujian


2021

Unifying Discourse Resources with Dependency Framework
Cheng Yi | Li Sujian | Li Yueyuan
Proceedings of the 20th Chinese National Conference on Computational Linguistics

For text-level discourse analysis, there are various discourse schemes but relatively little labeled data, because discourse research is still immature and annotating the inner logic of a text is labor-intensive. In this paper, we attempt to unify multiple Chinese discourse corpora annotated under different schemes within a discourse dependency framework by designing semi-automatic methods to convert them into dependency structures. We also implement several benchmark dependency parsers and investigate how they can leverage the unified data to improve performance.

2020

LiveQA: A Question Answering Dataset over Sports Live
Liu Qianying | Jiang Sicong | Wang Yizhong | Li Sujian
Proceedings of the 19th Chinese National Conference on Computational Linguistics

In this paper, we introduce LiveQA, a new question answering dataset constructed from play-by-play live broadcasts. It contains 117k multiple-choice questions written by human commentators for over 1,670 NBA games, collected from the Chinese Hupu website. Derived from the characteristics of sports games, LiveQA can potentially test reasoning ability across timeline-based live broadcasts, which is challenging compared with existing datasets. In LiveQA, the questions require understanding the timeline, tracking events, or doing mathematical computations. Our preliminary experiments show that the dataset poses a challenging problem for question answering models: a strong baseline model achieves an accuracy of only 53.1% and cannot beat the dominant option rule. We release the code and data of this paper for future research.