Zhe-Yu Xu

Also published as: Zhe-Yu Xu




2025

NYCU-NLP at SemEval-2025 Task 11: Assembling Small Language Models for Multilabel Emotion Detection and Intensity Prediction
Zhe-Yu Xu | Yu-Hsin Wu | Lung-Hao Lee
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This study describes the design of the NYCU-NLP system for SemEval-2025 Task 11, which focuses on multilingual text-based emotion analysis. We instruction-tuned three small language models (Gemma-2 27B, Mistral-small-3 22B, and Phi-4 14B) and then assembled them as our main system architecture. Our NYCU-NLP system participated in the English Track A for multilabel emotion detection and the English Track B for emotion intensity prediction. Experimental results show that our best-performing submission achieved a macro-averaged F1 score of 0.8225, ranking second among 90 participating teams for Track A, and ranked second among 41 teams for Track B with a Pearson correlation coefficient of 0.8373.
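The assembling step described above can be sketched as a simple probability-averaging ensemble for multilabel detection. This is a minimal illustration, not the authors' implementation: the threshold, label set, and per-model scores below are hypothetical.

```python
# Hypothetical sketch of assembling several instruction-tuned models for
# multilabel emotion detection: average each label's probability across
# models, then apply a decision threshold.

def assemble_multilabel(prob_lists, threshold=0.5):
    """Average per-label probabilities across models and threshold them."""
    n_models = len(prob_lists)
    n_labels = len(prob_lists[0])
    avg = [sum(p[i] for p in prob_lists) / n_models for i in range(n_labels)]
    return [1 if p >= threshold else 0 for p in avg]

# Illustrative scores from three models over five emotion labels for one text.
gemma_scores   = [0.9, 0.2, 0.6, 0.1, 0.4]
mistral_scores = [0.8, 0.3, 0.4, 0.2, 0.6]
phi_scores     = [0.7, 0.1, 0.7, 0.3, 0.5]

print(assemble_multilabel([gemma_scores, mistral_scores, phi_scores]))
# -> [1, 0, 1, 0, 1]
```

Averaging is only one way to combine model outputs; weighted averages or per-label thresholds tuned on validation data are common variants.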

2024

NYCU-NLP at EXALT 2024: Assembling Large Language Models for Cross-Lingual Emotion and Trigger Detection
Tzu-Mi Lin | Zhe-Yu Xu | Jian-Yu Zhou | Lung-Hao Lee
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This study describes the model design of the NYCU-NLP system for the EXALT shared task at the WASSA 2024 workshop. We instruction-tune several large language models and then assemble various model combinations as our main system architecture for cross-lingual emotion and trigger detection in tweets. Experimental results showed that our best-performing submission was an assembly of the Starling (7B) and Llama 3 (8B) models. Our submission ranked sixth among 17 participating systems for the emotion detection subtask, and fifth among 7 systems for the binary trigger detection subtask.
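For single-label emotion detection, a model assembly can also combine outputs by majority vote over the predicted labels. The sketch below is illustrative only; the label sequences and the use of three voters (the paper's best system combines two models) are assumptions for the example.

```python
from collections import Counter

def majority_vote(label_lists):
    """Per-position majority vote over label sequences from several models."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*label_lists)]

# Hypothetical per-tweet emotion labels from three models.
model_a = ["joy", "anger", "neutral"]
model_b = ["joy", "fear",  "neutral"]
model_c = ["joy", "anger", "sadness"]

print(majority_vote([model_a, model_b, model_c]))
# -> ['joy', 'anger', 'neutral']
```

With only two models, ties must be broken by some other signal, such as model confidence or a fixed priority order.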