SteerForce at SemEval-2026 Task 11: Reducing Content Effects Using Layered Activation Steering
Noah Tratzsch, Asmaa Al-Raian, Mounika Marreddy, Alexander Mehler
Abstract
Large language models exhibit content effects, where surface plausibility interferes with formal logical reasoning. In SemEval-2026 Task 11, this appears as a performance gap between plausibility-aligned and plausibility-conflicting syllogisms, reflecting directional content bias. We address this issue using inference-time activation steering, modeling bias as a geometric deviation between plausibility-driven and validity-driven representations. We introduce a layered steering framework that combines Activation Transport (ACT) with input-adaptive contrastive steering (K-CAST), applied to layers identified through sensitivity analysis. This architecture-aware strategy enables targeted interventions without retraining. On BERT, sequential multi-layer steering improves validity accuracy from 77.1% to 82.3% while reducing bias by 75%. In contrast, for the decoder-only Qwen2.5-1.5B-Instruct, a single mid-to-late layer intervention reduces bias from 0.26 to 0.04 with modest accuracy gains, whereas multi-layer steering offers no additional benefit. These results reveal a fundamental architectural distinction: encoder-based models benefit from distributed low-intensity corrections, while decoder-only instruction-tuned models concentrate reasoning signals within a narrow late-layer band. Our findings demonstrate that effective bias mitigation requires architecture-aware activation steering.
- Anthology ID:
- 2026.semeval-1.143
- Volume:
- Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1050–1055
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.143/
- Cite (ACL):
- Noah Tratzsch, Asmaa Al-Raian, Mounika Marreddy, and Alexander Mehler. 2026. SteerForce at SemEval-2026 Task 11: Reducing Content Effects Using Layered Activation Steering. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 1050–1055, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- SteerForce at SemEval-2026 Task 11: Reducing Content Effects Using Layered Activation Steering (Tratzsch et al., SemEval 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.143.pdf
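
The abstract frames content bias as a geometric deviation between plausibility-driven and validity-driven representations, corrected at inference time. As a rough illustration of that general idea only, the sketch below uses plain difference-of-means steering on toy vectors. It is not ACT or K-CAST, whose details the abstract does not give. All function names, the `alpha` strength parameter, and the toy numbers are hypothetical, and the gap function is one simple reading of "performance gap" that may differ from the official task metric.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of equally sized activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def steering_vector(validity_acts, plausibility_acts):
    """Difference of means (an assumed, generic technique): a direction
    pointing from plausibility-driven toward validity-driven activations."""
    mu_valid = mean_vector(validity_acts)
    mu_plaus = mean_vector(plausibility_acts)
    return [v - p for v, p in zip(mu_valid, mu_plaus)]


def steer(hidden, direction, alpha=1.0):
    """Shift one hidden-state vector along the steering direction.
    alpha is a hypothetical intervention strength."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def content_effect_gap(acc_aligned, acc_conflicting):
    """Absolute accuracy gap between plausibility-aligned and
    plausibility-conflicting items (an assumed simplification)."""
    return abs(acc_aligned - acc_conflicting)


# Toy two-dimensional activations, invented for illustration.
direction = steering_vector(
    validity_acts=[[1.0, 2.0], [3.0, 4.0]],
    plausibility_acts=[[0.0, 0.0], [2.0, 2.0]],
)
steered = steer([0.0, 0.0], direction, alpha=0.5)
```

In a real model the same shift would be applied to the hidden states of a chosen layer during the forward pass. According to the abstract, this was several layers at low intensity for BERT and a single mid-to-late layer for Qwen2.5-1.5B-Instruct.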