Though Large Vision-Language Models (LVLMs) are being actively explored in medicine, their ability to conduct complex real-world telemedicine consultations combining accurate diagnosis with professional dialogue remains underexplored. This paper presents **3MDBench** (**M**edical **M**ultimodal **M**ulti-agent **D**ialogue **Bench**mark), an open-source framework for simulating and evaluating LVLM-driven telemedical consultations. 3MDBench simulates patient variability through a temperament-based Patient Agent and evaluates diagnostic accuracy and dialogue quality via an Assessor Agent. It includes 2996 cases across 34 diagnoses from real-world telemedicine interactions, combining textual and image-based data. The experimental study compares diagnostic strategies for widely used open- and closed-source LVLMs. We demonstrate that multimodal dialogue with internal reasoning improves the F1 score by 6.5% over non-dialogue settings, highlighting the importance of context-aware, information-seeking questioning. Moreover, injecting predictions from a diagnostic convolutional neural network into the LVLM’s context boosts F1 by up to 20%. Source code is available at https://github.com/univanxx/3mdbench.
This paper addresses the problem of detecting adverse drug effects in social media texts. We describe the development of a classification system for this task on Russian tweets. To enlarge the training dataset, we apply a couple of augmentation techniques and analyze their effect in comparison with similar systems presented at the 2021 SMM4H Workshop.
In this paper, we present the adverse drug effect detection system developed during our participation in the Social Media Mining for Health Applications Shared Task 2020. We experimented with a transfer learning approach for English and Russian, with BERT and RoBERTa architectures, and with several strategies for composing the regression head. Our final submissions in both languages exceed the average F1 by a margin of several percent.
This paper is devoted to the problem of automatic text generation from RDF triples. This problem was formalized and proposed as part of the 2020 WebNLG challenge. We describe our approach to the RDF-to-text generation task, based on a neural network model with the Generative Pre-Training (GPT-2) architecture. In particular, we outline a way to convert the base GPT-2 model into a model with language and classification heads, and we discuss the text generation methods. To study the influence of the parameters on end-task performance, we carried out a series of experiments. We report the resulting metrics and conclude with possible directions for improvement.