Tian Niezing
2026
d’Olle Grieze at SemEval-2026 Task 11: Comparing the Impact of Supervised Fine-Tuning and Activation Steering on Mitigating Content Effect Bias in Syllogistic Reasoning
Twan Huiskens | Tian Niezing | Koen Snelten
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
We investigate the content effect bias in Large Language Models (LLMs) as part of SemEval-2026 Task 11. We compare the impact of supervised fine-tuning (SFT) using low-rank adaptation against activation steering across several model families, including LLaMA, Gemma and Qwen. Our results show that SFT improves accuracy, with LLaMA 8B reaching 98.75% accuracy. Activation steering, by contrast, offers limited effectiveness in mitigating the content effect bias. A logit lens analysis further reveals that fine-tuning successfully shifts the model’s focus toward logical structure, specifically within the later layers.