Chen Tao
2025
RUC Team at SemEval-2025 Task 5: Fast Automated Subject Indexing: A Method Based on Similar Records Matching and Related Subject Ranking
Xia Tian | Yang Xin | Wu Jing | Xiu Heng | Zhang Xin | Li Yu | Gao Tong | Tan Xi | Hu Dong | Chen Tao | Jia Zhi
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
This paper presents MaRSI, an automatic subject indexing method designed to address the limitations of traditional manual indexing and emerging GenAI technologies. Focusing on improving indexing accuracy in cross-lingual contexts and on balancing efficiency and accuracy on large-scale datasets, MaRSI mimics human reference-learning behavior by constructing semantic indexes from pre-indexed documents. It calculates similarity to retrieve relevant references, then merges and reorders their topics to generate indexing results. Experiments demonstrate that MaRSI outperforms supervised fine-tuning of LLMs on the same dataset, offering advantages in speed, effectiveness, and interpretability.
2022
taochen at SemEval-2022 Task 5: Multimodal Multitask Learning and Ensemble Learning
Chen Tao | Jung-jae Kim
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
We present a multimodal deep learning system for the Multimedia Automatic Misogyny Identification (MAMI) challenge, a SemEval task of identifying and classifying misogynistic messages in online memes. We adapt multi-task learning to the multimodal subtasks of the MAMI challenge to transfer knowledge among the correlated subtasks. We also leverage ensemble learning for synergistic integration of models trained individually for the subtasks. Finally, we discuss the system's errors to provide useful insights for future work.