Roman Kovalev
2025
LCTeam at SemEval-2025 Task 3: Multilingual Detection of Hallucinations and Overgeneration Mistakes Using XLM-RoBERTa
Araya Hailemariam | Jose Maldonado Rodriguez | Ezgi Başar | Roman Kovalev | Hanna Shcharbakova
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
In recent years, the tendency of large language models (LLMs) to produce hallucinations has become a subject of academic interest. Hallucinated or overgenerated outputs contain factual inaccuracies that can undermine textual coherence. The Mu-SHROOM shared task sets the goal of developing strategies for detecting hallucinated parts of LLM outputs in a multilingual context. We present an approach applicable across multiple languages, which incorporates the alignment of tokens with hard labels as well as the training of a multilingual XLM-RoBERTa model. With this approach, we achieved 2nd place in Chinese and top-10 positions in 7 other language tracks of the competition.
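The alignment of tokens and hard labels mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hard labels are given as character-offset spans `(start, end)` with an exclusive end, and it uses a whitespace tokenizer as a stand-in for a subword tokenizer. The function names, the example sentence, and the example span are all hypothetical.

```python
import re


def tokenize_with_offsets(text):
    """Split on whitespace and return (token, start, end) triples.

    A stand-in for a subword tokenizer that reports character offsets.
    """
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def align_labels(text, hard_labels):
    """Label a token 1 if its character range overlaps any hard-label span, else 0.

    hard_labels: iterable of (start, end) character spans, end exclusive
    (an assumed format for this sketch).
    """
    tagged = []
    for token, tok_start, tok_end in tokenize_with_offsets(text):
        overlaps = any(tok_start < end and start < tok_end
                       for start, end in hard_labels)
        tagged.append((token, int(overlaps)))
    return tagged


# Hypothetical example: the span covers the word "Berlin".
example_text = "The Eiffel Tower is in Berlin"
example_spans = [(23, 29)]
print(align_labels(example_text, example_spans))
```

The resulting per-token 0/1 labels are the kind of target a token-classification model could be trained on; with a real subword tokenizer, the offsets would come from the tokenizer rather than from a regular expression.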
2024
UOM-Constrained IWSLT 2024 Shared Task Submission - Maltese Speech Translation
Kurt Abela | Md Abdur Razzaq Riyadh | Melanie Galea | Alana Busuttil | Roman Kovalev | Aiden Williams | Claudia Borg
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
This paper presents our IWSLT 2024 shared task submission to the low-resource track. The submission falls under the constrained setup, which implies limited data for training. Following the introduction, the paper consists of a literature review covering previous approaches to speech translation and their application to Maltese, followed by the methodology, evaluation and results, and the conclusion. We present a cascaded submission for the Maltese-to-English language pair, consisting of a pipeline with a DeepSpeech 1 Automatic Speech Recognition (ASR) system, a KenLM model to optimise the transcriptions, and finally an LSTM machine translation model. The submission achieves a BLEU score of 0.5 on the overall test set, and the ASR system achieves a word error rate of 97.15%. Our code is made publicly available.