Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media

Juliana Isabelle A. Guillermo, Jasper Kyle Catapang, Nathaniel Oco


Abstract
Current automated content moderation systems fail to protect children from harmful YouTube content, particularly in under-resourced, code-switched settings. These systems are often text-only, English-centric, and operate as ’black boxes,’ lacking the multimodal understanding and transparency needed for effective moderation. This thesis proposes a novel hybrid framework for the explainable multimodal detection of harmful content in videos with code-switching. The proposed framework integrates a fine-tuned classifier for accurate, scalable detection with an LLM-powered module that synthesizes the classifier’s internal evidential signals (e.g., text attention and visual heat maps) to generate faithful, human-readable rationales for each decision. As a primary case study, the framework will be developed and validated on an English–Filipino code-switched dataset. Expected contributions include a new dataset publicly available under controlled access (de-identified transcripts, blacked-out frames, extracted feature representations, and metadata via data-sharing agreement) and a blueprint for building more equitable, transparent, and trustworthy AI safety systems.
Anthology ID:
2026.acl-srw.40
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
450–462
Language:
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.40/
DOI:
Bibkey:
Cite (ACL):
Juliana Isabelle A. Guillermo, Jasper Kyle Catapang, and Nathaniel Oco. 2026. Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 450–462, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media (Guillermo et al., ACL 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.40.pdf