A Semi-Supervised Approach to Detect Toxic Comments
Ghivvago Damas Saraiva, Rafael Anchiêta, Francisco Assis Ricarte Neto, Raimundo Moura
Abstract
Toxic comments contain forms of non-acceptable language targeted towards groups or individuals. These types of comments have become a serious concern for government organizations, online communities, and social media platforms. Although there are some approaches to handle non-acceptable language, most of them focus on supervised learning and the English language. In this paper, we treat toxic comment detection as a semi-supervised learning problem over a heterogeneous graph. We evaluate the approach on a Portuguese-language toxic comment dataset, outperforming several graph-based methods and achieving competitive results compared to transformer architectures.
- Anthology ID:
- 2021.ranlp-1.142
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1261–1267
- URL:
- https://aclanthology.org/2021.ranlp-1.142
- Cite (ACL):
- Ghivvago Damas Saraiva, Rafael Anchiêta, Francisco Assis Ricarte Neto, and Raimundo Moura. 2021. A Semi-Supervised Approach to Detect Toxic Comments. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1261–1267, Held Online. INCOMA Ltd.
- Cite (Informal):
- A Semi-Supervised Approach to Detect Toxic Comments (Saraiva et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2021.ranlp-1.142.pdf
- Code:
- rafaelanchieta/toxic
- Data:
- ToLD-Br