Chen Tao


2025

This paper presents MaRSI, an automatic subject indexing method designed to address the limitations of traditional manual indexing and emerging GenAI technologies. Focusing on improving indexing accuracy in cross-lingual contexts and balancing efficiency and accuracy in large-scale datasets, MaRSI mimics human reference learning behavior by constructing semantic indexes from pre-indexed document. It calculates similarity to retrieve relevant references, merges, and reorders their topics to generate index results. Experiments demonstrate that MaRSI outperforms supervised fine-tuning of LLMs on the same dataset, offering advantages in speed, effectiveness, and interpretability.

2022

We present a multi-modal deep learning system for the Multimedia Automatic Misogyny Identification (MAMI) challenge, a SemEval task of identifying and classifying misogynistic messages in online memes. We adapt multi-task learning for the multimodal subtasks of the MAMI challenge to transfer knowledge among the correlated subtasks. We also leverage on ensemble learning for synergistic integration of models individually trained for the subtasks. We finally discuss about errors of the system to provide useful insights for future work.