Maciej Czajka
2026
AMU at RAG4Reports 2026 Task B: A Practical Multilingual RAG Pipeline for Citation-Grounded Reports
Maciej Czajka | Piotr Jabłoński | Mateusz Czajka | Konrad Pierzyński | Krzysztof Jassem
Proceedings of the 1st Workshop on Multilingual Report Generation via Retrieval Augmented Generation (RAG4Reports 2026)
This system paper presents AMU’s submission to RAG4Reports 2026 Task B: a practical multilingual retrieval-augmented generation pipeline for evidence-supported report generation. The system combines full-query retrieval, optional query rewriting, dense retrieval with Qdrant, cross-encoder reranking, diversity-aware context selection, and structured generation. The best submitted run uses BAAI/bge-m3 embeddings, BAAI/bge-reranker-v2-m3 reranking, and gpt-5.1 generation with medium reasoning effort and a partial-coverage prompt strategy. On the official leaderboard, it achieved F1=0.4351, sentence_support=0.8280, and nugget_coverage=0.3403, indicating that the generated reports were well grounded but only partially comprehensive.
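The abstract does not say how diversity-aware context selection is implemented. As an illustration only, the sketch below uses maximal marginal relevance (MMR), a common technique swapped in here as an assumption, not the authors' confirmed method. The function names, the `trade_off` parameter, and the toy inputs are hypothetical; the relevance scores stand in for reranker outputs and the vectors for passage embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_diverse(relevance, embeddings, k, trade_off=0.7):
    """Greedy MMR selection (assumed technique, not taken from the paper).

    relevance  -- one relevance score per passage (e.g. from a reranker)
    embeddings -- one vector per passage, same order as relevance
    k          -- maximum number of passages to keep
    trade_off  -- 1.0 ranks by relevance only; lower values penalise
                  passages similar to ones already selected
    Returns the indices of the selected passages in selection order.
    """
    candidates = list(range(len(relevance)))
    selected = []
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any passage already chosen.
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1.0 - trade_off) * redundancy

        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate passages and one distinct passage, a balanced `trade_off` would be expected to keep one of the duplicates plus the distinct passage, while `trade_off=1.0` falls back to plain relevance ranking.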
2025
PolEval 2025 Task 4: Polish Speech Emotion Recognition Challenge
Iwona Christop | Maciej Czajka
Proceedings of the PolEval 2025 Workshop
This paper introduces the Polish Speech Emotion Recognition Challenge, a shared task aimed at advancing research on cross-lingual emotion recognition in low-resource languages. The challenge’s objective was to develop systems that could recognize emotional states in Polish speech using only multilingual training data, with no access to Polish training examples. The final test set consisted of newly recorded Polish speech samples created specifically for the challenge, ensuring a fully blind evaluation. Participants submitted emotion predictions for six target classes. System performance was assessed using the macro-averaged F1 score as the primary metric.
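For readers unfamiliar with the primary metric, the following is a minimal sketch of macro-averaged F1: the F1 score is computed separately for each class and the per-class scores are averaged with equal weight, so rare classes count as much as frequent ones. The function name and the placeholder labels are hypothetical, and the challenge's official scoring script may handle edge cases (such as classes with no predictions) differently.

```python
from collections import Counter


def macro_f1(gold, pred, labels=None):
    """Unweighted mean of per-class F1 scores.

    gold, pred -- equal-length sequences of class labels
    labels     -- classes to average over; defaults to every label
                  seen in gold or pred
    A class with no true positives, false positives, or false
    negatives contributes an F1 of 0.0 (an assumption of this sketch).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if labels is None:
        labels = sorted(set(gold) | set(pred))

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # true class g was missed

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

Passing an explicit `labels` list covering all six target classes would keep a class that never appears in a small sample from being silently dropped from the average.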