Tianwei He




2023

Symbolization, Prompt, and Classification: A Framework for Implicit Speaker Identification in Novels
Yue Chen | Tianwei He | Hongbin Zhou | Jia-Chen Gu | Heng Lu | Zhen-Hua Ling
Findings of the Association for Computational Linguistics: EMNLP 2023

Speaker identification in novel dialogues can be widely applied to various downstream tasks, such as producing multi-speaker audiobooks and converting novels into scripts. However, existing state-of-the-art methods are limited to handling explicit narrative patterns like "Tom said, '...'", and cannot thoroughly understand long-range contexts or deal with complex cases. To this end, we propose a framework named SPC, which identifies implicit speakers in novels via symbolization, prompt, and classification. First, SPC symbolizes the mentions of candidate speakers to construct a unified label set. Then, by inserting a prompt, we reformulate speaker identification as a classification task to minimize the gap between the training objective of speaker identification and the pre-training task. Two auxiliary tasks are also introduced in SPC to enhance long-range context understanding. Experimental results show that SPC outperforms previous methods by a large margin of 4.8% accuracy on the web novel collection, reducing speaker identification errors by 47%, and also outperforms the emerging ChatGPT. In addition, SPC is more accurate in implicit speaker identification cases that require long-range semantic understanding of the context.
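
The symbolization step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how SPC implements it, so everything here is an assumption made for illustration: the function name `symbolize`, the `[SPK0]`-style placeholder format, and the input shape (a dictionary mapping each candidate speaker to its surface mentions) are all hypothetical. The sketch only shows the general idea of replacing candidate mentions with shared symbols so that every passage uses the same small label set.

```python
import re


def symbolize(text, candidates):
    """Replace candidate speaker mentions with placeholder symbols.

    `candidates` maps a canonical speaker name to its surface mentions.
    Returns the rewritten text and a symbol -> name mapping.
    The "[SPK<i>]" symbol format is an illustrative assumption,
    not the format used in the paper.
    """
    symbol_to_name = {}
    pairs = []  # (mention, symbol)
    for idx, (name, mentions) in enumerate(candidates.items()):
        symbol = f"[SPK{idx}]"
        symbol_to_name[symbol] = name
        pairs.extend((mention, symbol) for mention in mentions)

    if not pairs:
        return text, symbol_to_name

    # Try longer mentions first so "Tom Sawyer" wins over "Tom".
    pairs.sort(key=lambda pair: len(pair[0]), reverse=True)
    lookup = dict(pairs)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(mention) + r"\b" for mention, _ in pairs)
    )
    rewritten = pattern.sub(lambda match: lookup[match.group(0)], text)
    return rewritten, symbol_to_name


if __name__ == "__main__":
    passage = "Tom Sawyer looked at Huck. Tom smiled."
    speakers = {"Tom Sawyer": ["Tom Sawyer", "Tom"], "Huck": ["Huck"]}
    rewritten, mapping = symbolize(passage, speakers)
    print(rewritten)
    print(mapping)
```

With the symbols in place, a classifier would only need to choose among a fixed set of labels rather than among arbitrary character names, which is the motivation the abstract gives for a unified label set.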