Yevheniia Kryklyvets


Fixing paper assignments

  1. Please select all papers that do not belong to this person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
MAviS: A Multimodal Conversational Assistant For Avian Species
Yevheniia Kryklyvets | Mohammed Irfan Kurpath | Sahal Shaji Mullappilly | Jinxing Zhou | Fahad Shahbaz Khan | Rao Muhammad Anwer | Salman Khan | Hisham Cholakkal
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Fine-grained understanding and species-specific, multimodal question answering are vital for advancing biodiversity conservation and ecological monitoring. However, existing multimodal large language models (MM-LLMs) face challenges when it comes to specialized topics like avian species, making it harder to provide accurate and contextually relevant information in these areas. To address this limitation, we introduce the **MAviS-Dataset**, a large-scale multimodal avian species dataset that integrates image, audio, and text modalities for over 1,000 bird species, comprising both pretraining and instruction-tuning subsets enriched with structured question–answer pairs. Building on the MAviS-Dataset, we introduce **MAviS-Chat**, a multimodal LLM that supports audio, vision, and text designed for fine-grained species understanding, multimodal question answering, and scene-specific description generation. Finally, for quantitative evaluation, we present **MAviS-Bench**, a benchmark of over 25,000 Q&A pairs designed to assess avian species-specific perceptual and reasoning abilities across modalities. Experimental results show that MAviS-Chat outperforms the baseline MiniCPM-o-2.6 by a large margin, achieving state-of-the-art open-source results and demonstrating the effectiveness of our instruction-tuned MAviS-Dataset. Our findings highlight the necessity of domain-adaptive MM-LLMs for ecological applications. Our code, training data, evaluation benchmark, and models are available at https://github.com/yevheniia-uv/MAviS.