Andrew Li


2025

cantnlp@DravidianLangTech-2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection
Sidney Wong | Andrew Li
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

This paper presents the systems and results for the Multimodal Social Media Data Analysis in Dravidian Languages (MSMDA-DL) shared task at the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages (DravidianLangTech-2025). We took a ‘bag-of-sounds’ approach by training our hate speech detection system on the speech (audio) data using transformed Mel spectrogram measures. While our candidate model performed poorly on the test set, our approach offered promising results during training and development for Malayalam and Tamil. Our results show that, with sufficient and well-balanced training data, it is feasible to use both text and speech (audio) data in the development of multimodal hate speech detection systems.

2024

ChatHF: Collecting Rich Human Feedback from Real-time Conversations
Andrew Li | Zhenduo Wang | Ethan Mendes | Duong Minh Le | Wei Xu | Alan Ritter
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We introduce ChatHF, an interactive annotation framework for chatbot evaluation, which integrates configurable annotation within a chat interface. ChatHF can be flexibly configured to accommodate various chatbot evaluation tasks, for example detecting offensive content, identifying incorrect or misleading information in chatbot responses, and flagging chatbot responses that might compromise privacy. It supports post-editing of chatbot outputs, visual inputs, and an optional voice interface. ChatHF is suitable for the collection and annotation of NLP datasets and for Human-Computer Interaction studies, as demonstrated in case studies on image geolocation and assisting older adults with daily activities. ChatHF is publicly accessible at https://chat-hf.com.