James Zhang


2025

Stanford MLab at SemEval-2025 Task 11: Track B–Emotion Intensity Detection
Joseph Le | Hannah Cui | James Zhang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

We outline our SemEval 2025 Track B: Emotion Intensity Prediction submission, for which the objective is to predict the intensity of six primary emotions (anger, disgust, fear, joy, sadness, and surprise) on a scale from 0 to 3, with 0 being none and 3 being very strong. We used a BERT-based model fine-tuned for regression that makes use of pretrained embeddings to pick up subtle emotional wording in text. Our pipeline includes tokenization with a BERT tokenizer, training with AdamW optimization, and an ExponentialLR scheduler for learning rate adjustment. Performance is monitored through validation loss and accuracy, measured by the closeness of model outputs to gold labels. Our best-performing model reaches 68.97% validation accuracy with a validation loss of 0.373, demonstrating BERT’s capability in fine-grained emotion intensity prediction. Key findings include that fine-tuning transformer models with regression loss improves prediction accuracy and that early stopping and learning rate scheduling avoid overfitting. Future improvements can include larger datasets, ensemble models, or other architectures such as RoBERTa and T5. This paper shows the potential of pretrained transformers for emotion intensity estimation and lays the groundwork for future computational emotion analysis research.
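
The abstract names an exponential learning rate schedule and an accuracy based on closeness of outputs to gold labels, but gives no formulas. The sketch below is one possible reading, not the authors' implementation. It assumes the learning rate is multiplied by a constant factor `gamma` each epoch (the behavior of PyTorch's `ExponentialLR`) and that a prediction counts as correct when its rounded, clamped value equals the gold intensity. The function names, the rounding rule, and the sample values are all hypothetical.

```python
import math


def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` decay steps: initial_lr * gamma ** epoch."""
    return initial_lr * gamma ** epoch


def to_intensity(score, low=0, high=3):
    """Round a raw regression output half-up, then clamp it to [low, high].

    Assumption: the abstract does not say how continuous outputs are mapped
    to the 0-3 intensity scale; rounding to the nearest level is one option.
    """
    rounded = math.floor(score + 0.5)
    return min(high, max(low, rounded))


def closeness_accuracy(predictions, gold):
    """Fraction of (example, emotion) cells whose rounded prediction matches gold."""
    correct = 0
    total = 0
    for pred_row, gold_row in zip(predictions, gold):
        for pred, label in zip(pred_row, gold_row):
            correct += int(to_intensity(pred) == label)
            total += 1
    return correct / total if total else 0.0


# Hypothetical outputs for one text, ordered as
# anger, disgust, fear, joy, sadness, surprise.
sample_predictions = [[0.2, 1.7, -0.3, 2.9, 3.4, 1.1]]
sample_gold = [[0, 2, 0, 3, 3, 2]]
sample_accuracy = closeness_accuracy(sample_predictions, sample_gold)
```

With these made-up values, five of the six emotion scores round to the gold label, so `sample_accuracy` is 5/6. A stricter or looser notion of closeness (for example, a tolerance on the raw score) would give a different number.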

2023

Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompts
Gangwei Jiang | Caigao Jiang | Siqiao Xue | James Zhang | Jun Zhou | Defu Lian | Ying Wei
Findings of the Association for Computational Linguistics: EMNLP 2023

Continual pre-training has been urgent for adapting a pre-trained model to a multitude of domains and tasks in the fast-evolving world. In practice, a continually pre-trained model is expected to demonstrate not only greater capacity when fine-tuned on pre-trained domains but also a non-decreasing performance on unseen ones. In this work, we first investigate such anytime fine-tuning effectiveness of existing continual pre-training approaches, concluding with unanimously decreased performance on unseen domains. To this end, we propose a prompt-guided continual pre-training method, where we train a hypernetwork to generate domain-specific prompts by both agreement and disagreement losses. The agreement loss maximally preserves the generalization of a pre-trained model to new domains, and the disagreement one guards the exclusiveness of the generated hidden states for each domain. Remarkably, prompts by the hypernetwork alleviate the domain identity when fine-tuning and promote knowledge transfer across domains. Our method achieved improvements of 3.57% and 3.4% on two real-world datasets (including domain shift and temporal shift), respectively, demonstrating its efficacy.

2006

Why Generative Phrase Models Underperform Surface Heuristics
John DeNero | Dan Gillick | James Zhang | Dan Klein
Proceedings on the Workshop on Statistical Machine Translation