Shakila Mahjabin Tonni

2025

Some Odd Adversarial Perturbations and the Notion of Adversarial Closeness
Shakila Mahjabin Tonni | Pedro Faustini | Mark Dras
Proceedings of the 23rd Annual Workshop of the Australasian Language Technology Association

Deep learning models for language are vulnerable to adversarial examples. However, the perturbations introduced can sometimes seem odd or very noticeable to humans, which can make them less effective, a notion captured in some recent investigations as a property of '(non-)suspicion’. In this paper, we focus on three main types of perturbations that may raise suspicion: changes to named entities, inconsistent morphological inflections, and the use of non-English words. We define a notion of adversarial closeness and collect human annotations to construct two new datasets. We then use these datasets to investigate whether these kinds of perturbations have a disproportionate effect on human judgements. Following that, we propose new constraints to include in a constraint-based optimisation approach to adversarial text generation. Our human evaluation shows that these do improve the process by preventing the generation of especially odd or marked texts.

pdf bib abs

CSIRO LT at SemEval-2025 Task 8: Answering Questions over Tabular Data using LLMs
Tomas Turek | Shakila Mahjabin Tonni | Vincent Nguyen | Huichen Yang | Sarvnaz Karimi
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Question Answering over large tables is challenging due to the difficulty of reasoning required in linking information from different parts of a table, such as heading and metadata to the values in the table and information needs. We investigate using Large Language Models (LLM) for tabular reasoning, where, given a pair of a table and a question from the DataBench benchmark, the models generate answers. We experiment with three techniques that enables symbolic reasoning through code execution: a direct code prompting (DCP) approach, ‘DCP_Py’, which uses Python, multi-step code (MSC) prompting ‘MSC_SQL+FS’ using SQL and ReAct prompting, ‘MSR_Py+FS’, which combines multi-step reasoning (MSR), few-shot (FS) learning and Python tools. We also conduct an analysis exploring the impact of answer types, data size, and multi-column dependencies on LLMs’ answer generation performance, including an assessment of the models’ limitations and the underlying challenges of tabular reasoning in LLMs.

Shakila Mahjabin Tonni

2025

2023

Co-authors

Venues