Abstract
Toxicity detection plays a crucial role in maintaining a peaceful society. Existing methods can be roughly categorized as small language model (SLM) based or large language model (LLM) based. However, SLMs are limited in general knowledge, and LLMs, despite their large amount of knowledge, may carry embedded bias, so detecting toxicity with either type of method alone is not a good idea. In this work, we propose to implant the LLM’s knowledge into SLM-based methods so that we can retain the strengths of both types of models. To this end, we develop a reading comprehension (RC) tree to transfer knowledge between the two models. Specifically, we first construct the RC tree, from an extensive to intensive reading perspective, to capture the local and global information in the text. We then model the samples encoded by the SLM and the knowledge extracted from the LLM as two distributions using the constructed RC tree. Finally, we transfer knowledge via optimal transportation between the two distributions. Extensive experiments demonstrate the effectiveness of our method on real-world and machine-generated datasets.
- Anthology ID:
- 2024.findings-acl.56
- Volume:
- Findings of the Association for Computational Linguistics ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand and virtual meeting
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 947–962
- URL:
- https://aclanthology.org/2024.findings-acl.56
- Cite (ACL):
- Hankun Kang and Tieyun Qian. 2024. Implanting LLM’s Knowledge via Reading Comprehension Tree for Toxicity Detection. In Findings of the Association for Computational Linguistics ACL 2024, pages 947–962, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Implanting LLM’s Knowledge via Reading Comprehension Tree for Toxicity Detection (Kang & Qian, Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.56.pdf