Steven Au
2026
SoloSemantics at SemEval-2026 Task 4: Triplet-Tuned MPNet for Story Similarity
Steven Au
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes Team SoloSemantics’ submissions to SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning. We began with lightweight neuro-symbolic knowledge-graph baselines, but a triplet-tuned MPNet bi-encoder produced stronger semantic separation in our experiments. We adopted a shared dense encoder family across both tracks and kept the KG and fusion variants as diagnostic baselines. Team SoloSemantics ranked 22nd on Track A and 9th on Track B. Our reproducibility audit further shows that the KG branch was often too sparse on short summaries to represent abstract narrative relations reliably under the current extraction pipeline.
MIDI-PHOR: Multi-View Distillation for Music Understanding and Captioning
Steven Au
Proceedings of the 4th Workshop on NLP for Music and Audio (NLP4MusA 2026)
A central limitation of current music understanding frameworks is their reliance on audio embeddings, which frequently yields interpretations lacking traceable ties to explicit musical elements such as notes, dynamics, and instrumentation. We address this gap with MIDI-PHOR, a MIDI-first framework that converts symbolic data into structured, queryable representations for reasoning. MIDI-PHOR distills each piece into three complementary views: a symbolic view capturing pitch, meter, and key; a time-series (TS) view that tracks rhythmic salience, texture, and role activity; and an instrument-role graph encoding ensemble interactions. In experiments, its evidence-linked claims reduce hallucinations compared to raw-MIDI baselines, offering a robust, auditable bridge between symbolic data and semantic music understanding.
Clinical Evidence and Patient Reviews: A Linked Dataset for Antidepressant Side Effects
Steven Au
BioNLP 2026
Clinical sources and patient-authored reviews often describe antidepressant side effects in different ways, but these differences are rarely measured directly. We present ClinPeer-AE, a linked dataset for comparing side-effect evidence from PubMed, ClinicalTrials.gov, WebMD, and Drugs.com while preserving source identity. Across five widely prescribed antidepressants, we find low overlap between clinical and peer sources, large differences in relative emphasis, and evidence that many peer-only effects also appear in U.S. Food and Drug Administration Adverse Event Reporting System (FAERS) reports. These findings suggest that patient reviews provide useful context about recurring medication experiences and offer a complementary view of how side effects are described outside formal clinical settings.
2025
Personalized Graph-Based Retrieval for Large Language Models
Steven Au | Cameron Dimacali | Ojasmitha Pedirappagari | Namyong Park | Franck Dernoncourt | Yu Wang | Nikos Kanakaris | Hanieh Deilamsalehy | Ryan A. Rossi | Nesreen K. Ahmed
Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation
2024
UCSC NLP at SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF)
Neng Wan | Steven Au | Esha Ubale | Decker Krogh
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
We describe our submission to SemEval-2024 Task 10: EDiReF, which consists of three subtasks involving emotion in conversation across Hinglish code-mixed and English datasets. The subtasks include classification of speaker emotion in multiparty conversations (Emotion Recognition in Conversation) and reasoning about shifts in speaker emotion state (Emotion Flip Reasoning). We deployed a BERT model for emotion recognition and two GRU-based models for emotion flip reasoning. Our models achieved F1 scores of 0.45, 0.79, and 0.68 for subtasks 1, 2, and 3, respectively.