CroAno : A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset
Baoli Zhang, Zhucong Li, Zhen Gan, Yubo Chen, Jing Wan, Kang Liu, Jun Zhao, Shengping Liu, Yafei Shi
Abstract
In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides basic crowd annotation features such as fast tagging and data management, CroAno provides a systematic solution for improving the label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities and makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We decompose the model's entity prediction errors into six fine-grained entity error types. Users can employ this error system to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. On the two revised datasets, model performance improves by +1.96% and +2.57% F1, respectively.
- Anthology ID:
- 2021.emnlp-demo.32
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Editors:
- Heike Adel, Shuming Shi
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 275–282
- URL:
- https://aclanthology.org/2021.emnlp-demo.32
- DOI:
- 10.18653/v1/2021.emnlp-demo.32
- Cite (ACL):
- Baoli Zhang, Zhucong Li, Zhen Gan, Yubo Chen, Jing Wan, Kang Liu, Jun Zhao, Shengping Liu, and Yafei Shi. 2021. CroAno : A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 275–282, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- CroAno : A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset (Zhang et al., EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2021.emnlp-demo.32.pdf
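
The abstract describes an Inconsistency Detector that locates corpus-level label inconsistency but does not say how it works. As a rough illustration of the idea only, the sketch below flags any mention string that receives more than one label across a corpus. The function name `find_inconsistent_mentions`, the `(sentence_id, mention, label)` tuple layout, and the sample annotations are all assumptions made for this example; this is not CroAno's actual detector.

```python
from collections import defaultdict


def find_inconsistent_mentions(annotations):
    """Group annotated mentions by surface text and keep those with >1 label.

    `annotations` is an iterable of (sentence_id, mention_text, label) tuples.
    This layout is a hypothetical one chosen for illustration.
    Returns {mention_text: {label: [sentence_id, ...]}} for conflicting mentions.
    """
    labels_by_mention = defaultdict(lambda: defaultdict(list))
    for sentence_id, mention, label in annotations:
        labels_by_mention[mention][label].append(sentence_id)

    # A mention is treated as inconsistent if it was tagged with
    # more than one distinct label anywhere in the corpus.
    return {
        mention: dict(by_label)
        for mention, by_label in labels_by_mention.items()
        if len(by_label) > 1
    }


# Made-up sample annotations, not taken from the paper's datasets.
annotations = [
    (0, "北京大学", "ORG"),
    (1, "北京大学", "LOC"),
    (2, "李明", "PER"),
    (3, "李明", "PER"),
]

conflicts = find_inconsistent_mentions(annotations)
for mention, by_label in conflicts.items():
    print(mention, by_label)
```

Grouping by exact surface string is the simplest possible heuristic; it would flag genuinely ambiguous mentions as well as annotation mistakes, which is why a batch-correction interface with a human in the loop, as the abstract describes, would still be needed.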