Sunisth Kumar
2026
SciClaimEval: Cross-modal Claim Verification in Scientific Papers
Xanh Ho | Yun-Ang Wu | Sunisth Kumar | Tian Cheng Xia | Florian Boudin | Andre Greiner-Petter | Akiko Aizawa
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We present SciClaimEval, a new scientific dataset for the claim verification task. Unlike existing resources, SciClaimEval features authentic claims, including refuted ones, directly extracted from published papers. To create refuted claims, we introduce a novel approach that modifies the supporting evidence (figures and tables), rather than altering the claims or relying on large language models (LLMs) to fabricate contradictions. The dataset provides cross-modal evidence with diverse representations: figures are available as images, while tables are provided in multiple formats, including images, LaTeX source, HTML, and JSON. SciClaimEval contains 1,664 annotated samples from 180 papers across three domains (machine learning, natural language processing, and medicine), validated through expert annotation. We benchmark 11 multimodal foundation models, both open-source and proprietary, on the dataset. Results show that figure-based verification remains particularly challenging for all models, with a substantial performance gap between the best system and the human baseline.
2025
Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers
Xanh Ho | Sunisth Kumar | Yun-Ang Wu | Florian Boudin | Atsuhiro Takasu | Akiko Aizawa
Findings of the Association for Computational Linguistics: EMNLP 2025
Scientific claim verification against tables typically requires predicting whether a claim is supported or refuted given a table. However, we argue that predicting the final label alone is insufficient: it reveals little about the model’s reasoning and offers limited interpretability. To address this, we reframe table–text alignment as an explanation task, requiring models to identify the table cells essential for claim verification. We build a new dataset by extending the SciTab benchmark with human-annotated cell-level rationales. Annotators verify the claim label and highlight the minimal set of cells needed to support their decision. After the annotation process, we utilize the collected information and propose a taxonomy for handling ambiguous cases. Our experiments show that (i) incorporating table alignment information improves claim verification performance, and (ii) most LLMs, while often predicting correct labels, fail to recover human-aligned rationales, suggesting that their predictions do not stem from faithful reasoning.
Bridging the Gap: Efficient Cross-Lingual NER in Low-Resource Financial Domain
Sunisth Kumar | Mohammed ElKholy | Davide Liu | Alexandre Boulenger
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
We present an innovative and efficient modeling framework for cross-lingual named entity recognition (NER), leveraging the strengths of knowledge distillation and consistency training. Our approach distills knowledge from an XLM-RoBERTa model pre-trained on a high-resource source language (English) to a student model, which then undergoes semi-supervised consistency training with KL divergence loss on a low-resource target language (Arabic). We focus our application on the financial domain, using a small, sourced dataset of financial transactions as seen in SMS messages. Using datasets comprising SMS messages in English and Arabic containing financial transaction information, we aim to transfer NER capabilities from English to Arabic with minimal labeled Arabic samples. The framework generalizes named entity recognition from English to Arabic, achieving F1 scores of 0.74 on the Arabic financial transaction dataset and 0.61 on the WikiANN dataset, surpassing or closely competing with models that have 1.7× and 5.3× more parameters, respectively, while training efficiently on a single T4 GPU. Our experiments show that using a small amount of labeled data for low-resource cross-lingual NER applications is a wiser choice than relying on zero-shot techniques, while also consuming fewer resources. This framework holds significant potential for developing multilingual applications, particularly in regions where digital interactions span English and low-resource languages.