Guanghui Zhao


2026

Multimodal Large Language Models (MLLMs) show strong medical visual understanding; however, their capability for continuous perception in procedural clinical workflows remains underexplored. We present Perceive-and-Plan, a decomposed in-context learning paradigm for clinical skill keyframe reordering. The method separates visual perception from temporal planning via two stages: (1) structured visual perception with saliency-guided Picture-in-Picture (PiP) composition that magnifies critical regions (head, chest) as color-coded insets, and (2) temporal reasoning with chain-style self-verification via fresh conversation reset and visual-evidence anchoring (BLS Rules R1-R11). Without parameter updates, our system scores 71.43 overall (2nd place, ClinSkill QA 2026), with 0.86 pairwise accuracy and 1.0 rationale coverage. Structured prompting with visual saliency guidance measurably improves MLLMs’ procedural understanding. Our code is published at https://github.com/NanceTide/clinskillqa-perceive-and-plan.
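To make the pairwise accuracy figure concrete, the following is a minimal sketch of one common way to score a predicted keyframe ordering against a reference ordering: the fraction of item pairs whose relative order agrees. The exact metric definition used by ClinSkill QA is not given here, so the function name `pairwise_accuracy` and its formulation are assumptions for illustration only.

```python
from itertools import combinations


def pairwise_accuracy(predicted, reference):
    """Fraction of item pairs ordered the same way in both sequences.

    Assumes both sequences contain the same unique items (hypothetical
    keyframe IDs); this is an illustrative definition, not the official one.
    """
    # Position of each item in the predicted ordering.
    position = {item: index for index, item in enumerate(predicted)}
    # Every pair (a, b) where a precedes b in the reference ordering.
    pairs = list(combinations(reference, 2))
    if not pairs:
        return 1.0  # zero or one item: nothing can be misordered
    correct = sum(1 for a, b in pairs if position[a] < position[b])
    return correct / len(pairs)


if __name__ == "__main__":
    reference = ["f1", "f2", "f3", "f4"]
    predicted = ["f1", "f3", "f2", "f4"]
    print(pairwise_accuracy(predicted, reference))
```

Under this definition, a single adjacent swap in a four-frame sequence misorders one of six pairs, so partial credit degrades gradually rather than dropping to zero as an exact-match score would.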
Detecting DMRS defense levels in emotional support dialogues is challenging due to severe class imbalance and fine-grained clinical distinctions between adjacent levels, issues well documented in psychotherapy-oriented NLP surveys (Na et al., 2025). We present zzucs for PsyDefDetect at BioNLP 2026 (Na et al., 2026a), adopting a data–supervision co-design strategy. SCCR applies stratified resampling to balance support across nine defense levels. CoR–QLoRA encodes clinical rubrics, including task contracts, taxonomy definitions, and boundary cues, into static prompts for 8B model fine-tuning. Ablations show SCCR improves macro-F1 by 4.9 points over random oversampling. Our system from team zzucs, submitted on CodaBench under the display name sly_zzu with submission ID 652647, achieves 0.3585 macro-F1 on the official blind-test leaderboard LB1. It ranks 6th of 21 registered teams with official submissions and surpasses all published 8B baselines, leading the strongest 8B comparator, Ministral-8B, by 4.4 F1 points. The code has been released at https://github.com/jackssdd/zzucs_psydefdetect_code.
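As a rough illustration of what per-class rebalancing looks like, the sketch below resamples each label group to a fixed target size, downsampling large classes and oversampling small ones with replacement. This is a generic baseline written for illustration, not the SCCR procedure itself, whose details are not described here; the function name `balance_by_label`, the `target_per_class` parameter, and the toy data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_label(examples, target_per_class, seed=0):
    """Resample (text, label) pairs so every label has target_per_class items.

    Generic per-class resampling for illustration only; not the authors' SCCR.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for text, label in examples:
        groups[label].append((text, label))

    balanced = []
    for label in sorted(groups):
        group = groups[label]
        if len(group) >= target_per_class:
            # Large class: sample without replacement.
            balanced.extend(rng.sample(group, target_per_class))
        else:
            # Small class: keep every item, then draw extras with replacement.
            balanced.extend(group)
            balanced.extend(rng.choices(group, k=target_per_class - len(group)))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Toy imbalanced data: label 0 is frequent, label 7 is rare.
    data = [(f"utt{i}", 0) for i in range(8)] + [("utt_rare", 7)]
    print(Counter(label for _, label in balance_by_label(data, 4)))
```

Plain duplication of minority examples like this corresponds most closely to the random oversampling baseline mentioned in the ablation.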