2025
SzegedAI at ArchEHR-QA 2025: Combining LLMs with traditional methods for grounded question answering
Soma Nagy | Bálint Nyerges | Zsombor Kispéter | Gábor Tóth | András Szlúka | Gábor Kőrösi | Zsolt Szántó | Richárd Farkas
Proceedings of the 24th Workshop on Biomedical Language Processing (Shared Tasks)
In this paper, we present the SzegedAI team’s submissions to the ArchEHR-QA 2025 shared task. Our approaches include multiple prompting techniques for large language models (LLMs), sentence similarity methods, and traditional feature engineering. We aim to explore both modern and traditional solutions to the task. To combine the strengths of these diverse methods, we employed different ensembling strategies.
2023
A Question Answering Benchmark Database for Hungarian
Attila Novák | Borbála Novák | Tamás Zombori | Gergő Szabó | Zsolt Szántó | Richárd Farkas
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
In the research presented in this article, we created a new question answering benchmark database for Hungarian called MILQA. When creating the dataset, we largely followed the principles of the English SQuAD 2.0. However, as in some more recent English question answering datasets, we introduced a number of innovations beyond SQuAD, e.g., yes/no questions, list-like answers consisting of several text spans, long answers, questions requiring calculation, and other question types where the answer cannot simply be copied from the text. For all these non-extractive question types, the pragmatically adequate form of the answer was also added to make the training of generative models possible. We implemented and evaluated a set of baseline retrieval and answer span extraction models on the dataset. BM25 performed better than any vector-based solution for retrieval. Cross-lingual transfer from English significantly improved span extraction models.
2020
ProsperAMnet at the FinSim Task: Detecting hypernyms of financial concepts via measuring the information stored in sparse word representations
Gábor Berend | Norbert Kis-Szabó | Zsolt Szántó
Proceedings of the Second Workshop on Financial Technology and Natural Language Processing
ProsperAMnet at FinCausal 2020, Task 1 & 2: Modeling causality in financial texts using multi-headed transformers
Zsolt Szántó | Gábor Berend
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation
This paper introduces our efforts at the FinCausal shared task for modeling causality in financial utterances. Our approach uses the commonly and successfully applied strategy of fine-tuning a transformer-based language model with a twist: we modified the training and inference mechanism so that our model produces multiple predictions for the same instance. By designing a model that returns k>1 predictions at the same time, we not only obtain more resource-efficient training (as opposed to fine-tuning a pre-trained language model k independent times), but our results also indicate that we can obtain comparable or even better evaluation scores that way. We compare multiple strategies for combining the k predictions of our model. Our submissions ranked third on both subtasks of the shared task.
2017
Universal Dependencies and Morphology for Hungarian - and on the Price of Universality
Veronika Vincze | Katalin Simkó | Zsolt Szántó | Richárd Farkas
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
In this paper, we present how the principles of universal dependencies and morphology have been adapted to Hungarian. We report the most challenging grammatical phenomena and our solutions to them. On the basis of the adapted guidelines, we have converted and manually corrected 1,800 sentences from the Szeged Treebank to universal dependency format. We also introduce experiments on this manually annotated corpus for evaluating automatic conversion and the added value of language-specific, i.e. non-universal, annotations. Our results reveal that converting to universal dependencies is not necessarily trivial; moreover, using language-specific morphological features may have an impact on overall performance.
2014
An Empirical Evaluation of Automatic Conversion from Constituency to Dependency in Hungarian
Katalin Ilona Simkó | Veronika Vincze | Zsolt Szántó | Richárd Farkas
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Special Techniques for Constituent Parsing of Morphologically Rich Languages
Zsolt Szántó | Richárd Farkas
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
Introducing the IMS-Wrocław-Szeged-CIS entry at the SPMRL 2014 Shared Task: Reranking and Morpho-syntax meet Unlabeled Data
Anders Björkelund | Özlem Çetinoğlu | Agnieszka Faleńska | Richárd Farkas | Thomas Mueller | Wolfgang Seeker | Zsolt Szántó
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages