Implanting LLM’s Knowledge via Reading Comprehension Tree for Toxicity Detection

Hankun Kang, Tieyun Qian


Abstract
Toxicity detection plays a crucial role in maintaining a peaceful society. Existing methods can be roughly categorized as small language model (SLM) based and large language model (LLM) based. However, due to SLMs' limited general knowledge and the potential bias embedded in LLMs despite their large amount of knowledge, detecting toxicity with either an SLM-based or an LLM-based method alone is not ideal. In this work, we propose to implant the LLM's knowledge into SLM-based methods so that we can retain the strengths of both types of models. To this end, we develop a reading comprehension (RC) tree to transfer knowledge between the two models. Specifically, we first construct the RC tree, moving from an extensive-reading to an intensive-reading perspective, to capture the local and global information in the text. We then model the samples encoded by the SLM and the knowledge extracted from the LLM as two distributions over the constructed RC tree. We finally transfer knowledge via optimal transport between the two distributions. Extensive experiments demonstrate the effectiveness of our method on real-world and machine-generated datasets.
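The abstract's final step, transferring knowledge via optimal transport between two distributions, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes SLM sample encodings and LLM knowledge vectors as two hypothetical point clouds (`slm_embeddings`, `llm_embeddings`), models each as a uniform discrete distribution, and computes an entropy-regularized transport plan with standard Sinkhorn iterations.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    a, b: source/target marginal weights (each sums to 1).
    cost: pairwise cost matrix between the two point clouds.
    Returns the transport plan coupling the two distributions.
    """
    K = np.exp(-cost / reg)               # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                 # rescale to match target marginal
        u = a / (K @ v)                   # rescale to match source marginal
    return u[:, None] * K * v[None, :]    # plan P = diag(u) K diag(v)

# Hypothetical stand-ins: 4 SLM-encoded samples, 6 LLM knowledge vectors.
rng = np.random.default_rng(0)
slm_embeddings = rng.normal(size=(4, 16))
llm_embeddings = rng.normal(size=(6, 16))

# Squared Euclidean cost between every (sample, knowledge) pair,
# normalized so the Gibbs kernel does not underflow.
cost = ((slm_embeddings[:, None, :] - llm_embeddings[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()

a = np.full(4, 1 / 4)                     # uniform weights over samples
b = np.full(6, 1 / 6)                     # uniform weights over knowledge
plan = sinkhorn(a, b, cost)
print(plan.shape, plan.sum())             # (4, 6), sums to ~1.0
```

The resulting plan says how much mass each SLM-encoded sample should receive from each piece of LLM knowledge; in a training loop, such a plan is typically used to weight a transport (alignment) loss between the two embedding sets. The paper's actual formulation additionally structures both distributions with the RC tree, which this sketch omits.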
Anthology ID:
2024.findings-acl.56
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
947–962
URL:
https://aclanthology.org/2024.findings-acl.56
DOI:
10.18653/v1/2024.findings-acl.56
Cite (ACL):
Hankun Kang and Tieyun Qian. 2024. Implanting LLM’s Knowledge via Reading Comprehension Tree for Toxicity Detection. In Findings of the Association for Computational Linguistics: ACL 2024, pages 947–962, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Implanting LLM’s Knowledge via Reading Comprehension Tree for Toxicity Detection (Kang & Qian, Findings 2024)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2024.findings-acl.56.pdf