Qinglin Zhu
2021
HITSZ-HLT at SemEval-2021 Task 5: Ensemble Sequence Labeling and Span Boundary Detection for Toxic Span Detection
Qinglin Zhu | Zijie Lin | Yice Zhang | Jingyi Sun | Xiang Li | Qihui Lin | Yixue Dang | Ruifeng Xu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
This paper presents the winning system in SemEval-2021 Task 5: Toxic Spans Detection. The task aims to locate the spans within a text that contribute to its toxicity, which is crucial for semi-automated moderation of online discussions. We formalize the task separately as a Sequence Labeling (SL) problem and as a Span Boundary Detection (SBD) problem, and employ three state-of-the-art models. We then integrate the predictions of these models to produce a more credible and complementary result. Our system achieves a char-level score of 70.83%, ranking 1st of 91. In addition, we explore a lexicon-based method, which is strongly interpretable and flexible in practice.
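The abstract reports a char-level score and an integration of several models' predictions without giving details. The following is a minimal sketch of how both could look, under stated assumptions: each prediction is a list of toxic character offsets, the score is an F1 over those offsets for a single post, and the integration is a simple per-character majority vote. The names `char_f1`, `majority_vote`, and `min_votes` are hypothetical, and the voting rule is an illustration, not the authors' ensemble method.

```python
from collections import Counter


def char_f1(pred_offsets, gold_offsets):
    """Character-offset F1 for one post (assumed form of the char-level score)."""
    pred, gold = set(pred_offsets), set(gold_offsets)
    if not gold:
        # Assumption: a post with no gold toxic characters scores 1.0
        # only when nothing is predicted.
        return 1.0 if not pred else 0.0
    if not pred:
        return 0.0
    true_pos = len(pred & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions, min_votes=2):
    """Keep offsets predicted by at least `min_votes` models (hypothetical rule)."""
    counts = Counter(off for pred in predictions for off in set(pred))
    return sorted(off for off, votes in counts.items() if votes >= min_votes)


if __name__ == "__main__":
    # Three hypothetical model outputs for the same post.
    model_outputs = [[0, 1, 2, 3], [2, 3, 4], [3, 4, 5]]
    merged = majority_vote(model_outputs)
    print(merged, char_f1(merged, [3, 4, 5, 6]))
```

A system-level score would then be the mean of `char_f1` over all posts; raising `min_votes` trades recall for precision.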
2020
结合金融领域情感词典和注意力机制的细粒度情感分析(Attention-based Recurrent Network Combined with Financial Lexicon for Aspect-level Sentiment Classification)
Qinglin Zhu (祝清麟) | Bin Liang (梁斌) | Liuyu Han (刘宇瀚) | Yi Chen (陈奕) | Ruifeng Xu (徐睿峰) | Ruibin Mao (毛瑞彬)
Proceedings of the 19th Chinese National Conference on Computational Linguistics
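Entity-level sentiment analysis in the financial domain often lacks sufficient annotated corpora, and general-purpose sentiment analysis models struggle to handle financial text effectively. To address these problems, this paper constructs a million-scale corpus for entity-level sentiment analysis in the financial domain and annotates more than five thousand financial sentiment words as a financial sentiment lexicon. Based on this dataset, we propose a fine-grained sentiment analysis model for financial text that combines the financial sentiment lexicon with an attention mechanism. The model uses two LSTM networks to extract, respectively, word-level semantic information and word-class-level information obtained by categorizing words with the sentiment lexicon, which lets it capture the features of financial-domain words effectively. In addition, so that financial sentiment words in the text receive more attention, we propose an attention mechanism based on the financial sentiment lexicon that captures the sentiment information important to each entity. Experiments on the constructed financial entity-level corpus show that the model outperforms the comparison models.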
Co-authors
- Ruifeng Xu 2
- Bin Liang 1
- Liuyu Han (刘宇瀚) 1
- Yi Chen (陈奕) 1
- Ruibin Mao (毛瑞彬) 1