Camelia Lemnaru


2026

In this paper, we present the solution submitted by TUCNLP at SemEval-2026 Task 11: Disentangling Content and Formal Reasoning in Large Language Models. The task requires predicting the formal validity of categorical syllogisms while minimizing susceptibility to content-driven biases in English and 11 additional languages. We show that a modestly sized model (Qwen3-8B) can achieve near-perfect logical reasoning on the English validity-only subtask, and large reductions in content effect on the multilingual and premise-retrieval variants, when augmented with a multi-stage neuro-symbolic pipeline. In this pipeline, LLM-based content stripping with iterative error correction converts natural language to abstract categorical forms, a classical symbolic parser validates them against the twenty-four Aristotelian syllogistic forms, and asymmetric confidence thresholds mediate between symbolic and neural decisions. Across the four subtasks (ST1 to ST4), our system achieves accuracy ranging from 91.1% to 100% and bias-penalized ranking scores ($\mathcal{M}$) from 31.8 to 100.0, with the main bottleneck being overconfident neural predictions that bypass symbolic verification.
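The symbolic validation and threshold-mediation steps described in this abstract can be illustrated with a minimal sketch. The table of twenty-four valid mood and figure pairs is standard traditional logic (including the weakened forms that assume existential import). Everything else is an assumption made for illustration and is not taken from the paper: the function names `symbolic_valid` and `decide`, the threshold values `tau_valid` and `tau_invalid`, and the rule that a sufficiently confident neural prediction bypasses the symbolic check.

```python
# The 24 valid (mood, figure) pairs of traditional syllogistic logic,
# including the weakened forms that assume existential import.
VALID_FORMS = {
    1: {"AAA", "EAE", "AII", "EIO", "AAI", "EAO"},
    2: {"EAE", "AEE", "EIO", "AOO", "EAO", "AEO"},
    3: {"AAI", "IAI", "AII", "EAO", "OAO", "EIO"},
    4: {"AAI", "AEE", "IAI", "EAO", "EIO", "AEO"},
}


def symbolic_valid(mood: str, figure: int) -> bool:
    """Return True if the (mood, figure) pair is one of the 24 valid forms."""
    return mood.upper() in VALID_FORMS.get(figure, set())


def decide(
    neural_label: bool,
    neural_conf: float,
    mood: str,
    figure: int,
    tau_valid: float = 0.99,    # hypothetical threshold for a neural "valid"
    tau_invalid: float = 0.90,  # hypothetical threshold for a neural "invalid"
) -> bool:
    """Combine a neural prediction with the symbolic check.

    Assumed rule: the neural label is kept only when its confidence reaches
    the threshold for that label; otherwise the symbolic result is used.
    The two thresholds differ, which is what makes the scheme asymmetric.
    """
    threshold = tau_valid if neural_label else tau_invalid
    if neural_conf >= threshold:
        return neural_label
    return symbolic_valid(mood, figure)
```

Under these assumed thresholds, `decide(False, 0.95, "AAA", 1)` returns `False` even though AAA-1 (Barbara) is valid, which mirrors the failure mode named in the abstract: an overconfident neural prediction that bypasses symbolic verification.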

2025

We introduce MorphNLI, a modular step-by-step approach to natural language inference (NLI). To classify a premise-hypothesis pair as entailment, contradiction, or neutral, we use a language model to generate the edits needed to incrementally transform (i.e., morph) the premise into the hypothesis. Then, using an off-the-shelf NLI model, we track how the entailment label progresses across these atomic changes and aggregate the intermediate labels into a final output. We demonstrate the advantages of our method particularly in realistic cross-domain settings, where it always outperforms strong baselines, with relative improvements of up to 12.6%. Further, the approach is explainable, as the atomic edits can be used to understand the overall NLI label.
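The abstract does not state how the intermediate labels are aggregated. As a rough illustration only, the sketch below assumes one plausible rule: walk through the per-edit labels in order and return the first label that is not entailment, falling back to entailment when every step is entailed. The function name `aggregate_labels` and the rule itself are hypothetical and may differ from what MorphNLI actually does.

```python
from typing import Sequence

LABELS = {"entailment", "contradiction", "neutral"}


def aggregate_labels(step_labels: Sequence[str]) -> str:
    """Aggregate per-edit NLI labels into one label (assumed rule).

    Each element is the label an NLI model assigned to one atomic edit,
    i.e. to the pair (text before the edit, text after the edit).
    The first non-entailment step decides the outcome; a chain made only
    of entailment steps (or an empty chain) yields entailment.
    """
    for label in step_labels:
        if label not in LABELS:
            raise ValueError(f"unknown NLI label: {label!r}")
        if label != "entailment":
            return label
    return "entailment"
```

Because the result is tied to a specific step, the edit that produced the first non-entailment label can be shown to the user as the explanation for the overall label.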

2024

Large language models are prone to internalizing social biases due to the characteristics of the data used in their self-supervised training. Considering their recent emergence and wide availability to the general public, it is essential to identify and alleviate these biases to avoid perpetuating stereotypes about underrepresented groups. We present a novel prompt-tuning method for reducing biases in encoder models such as BERT or RoBERTa. Unlike other methods, we train only a small set of additional reusable token embeddings that can be concatenated to any input sequence to reduce bias in the outputs. We apply this method to gender bias by providing a set of templates used for training the prompts. Evaluations on two benchmarks show that our method is on par with the state of the art while having a limited impact on language modeling ability.