Yuke Li
2025
NYA’s Offline Speech Translation System for IWSLT 2025
Wenxuan Wang | Yingxin Zhang | Yifan Jin | Binbin Du | Yuke Li
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
This paper reports NYA’s submissions to the IWSLT 2025 Offline Speech Translation (ST) task. The task includes three translation directions: English to Chinese, German, and Arabic. Specifically, we adopt a cascaded speech translation architecture comprising automatic speech recognition (ASR) and machine translation (MT) components to participate in the unconstrained training track. For the ASR model, we use the Whisper medium model. For the neural machine translation (NMT) model, a wider and deeper Transformer is adopted as the backbone. Building upon last year’s work, we apply multiple techniques and strategies, such as data augmentation, domain adaptation, and model ensembling, to improve the translation quality of the NMT model. In addition, we adopt X-ALMA as the foundational LLM-based MT model and apply domain-specific supervised fine-tuning to train and optimize it. Finally, by employing COMET-based Minimum Bayes Risk decoding to integrate and select translation candidates from both the NMT and LLM-based MT systems, the translation quality of our ST system is significantly improved, and competitive results are obtained on the evaluation set.
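The final selection step the abstract describes, Minimum Bayes Risk (MBR) decoding over a pooled candidate set, can be sketched as follows. The real system scores candidate pairs with COMET; here a hypothetical token-overlap F1 stands in as the utility function so the example is self-contained.

```python
# Sketch of MBR candidate selection over translations pooled from
# multiple MT systems. `utility` is an illustrative stand-in for the
# COMET score used in the actual system.

def utility(hyp: str, ref: str) -> float:
    """Hypothetical utility: token-overlap F1 between two translations."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = len(set(h) & set(r))
    p, q = overlap / len(h), overlap / len(r)
    return 2 * p * q / (p + q) if p + q else 0.0

def mbr_select(candidates: list[str]) -> str:
    """Return the candidate with the highest expected utility against
    all other candidates (uniform weights over the candidate pool)."""
    def expected(c: str) -> float:
        others = [o for o in candidates if o is not c]
        return sum(utility(c, o) for o in others) / max(len(others), 1)
    return max(candidates, key=expected)

# Candidates as they might come from the NMT and LLM-based MT systems:
pool = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the dog ran away",
]
print(mbr_select(pool))  # picks the candidate closest to the consensus
```

MBR rewards the hypothesis that agrees most with the rest of the pool, so an outlier translation from any single system is unlikely to be chosen.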
DocIE@XLLM25: UIEPrompter: A Unified Training-Free Framework for Universal Document-Level Information Extraction via Structured Prompt
Chengfeng Qiu | Lifeng Zhou | Kaifeng Wei | Yuke Li
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)
We introduce UIEPrompter, a unified, training-free framework that secures 1st place in the ACL 2025 shared competition on universal document-level information extraction. UIEPrompter effectively addresses both named entity recognition and relation extraction without the need for annotated data. Leveraging large language models, UIEPrompter establishes a zero-shot baseline through role-specific prompts, which are then refined via few-shot guidance and constrained output generation prompts to align with the competition schemas. Additionally, by integrating outputs from several large language models, we reduce individual model biases, thereby improving overall performance. Evaluated on the competition evaluation dataset, UIEPrompter showcases outstanding performance in document-level information extraction, ultimately securing first place. The implementation code is available on GitHub.
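The multi-model integration step mentioned in the abstract can be illustrated with a simple majority vote over extracted entities. The model outputs and the vote threshold below are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of combining named-entity predictions from several
# LLMs by majority vote, the kind of output integration the abstract
# describes for reducing individual model bias.
from collections import Counter

def ensemble_entities(model_outputs: list[list[tuple[str, str]]],
                      min_votes: int = 2) -> list[tuple[str, str]]:
    """Keep each (span, type) prediction proposed by at least
    `min_votes` of the models; de-duplicate within each model first."""
    votes = Counter(e for output in model_outputs for e in set(output))
    return sorted(e for e, n in votes.items() if n >= min_votes)

# Hypothetical predictions from three different LLMs:
preds = [
    [("UIEPrompter", "SYSTEM"), ("ACL 2025", "EVENT")],
    [("UIEPrompter", "SYSTEM"), ("GitHub", "ORG")],
    [("UIEPrompter", "SYSTEM"), ("ACL 2025", "EVENT")],
]
print(ensemble_entities(preds))
# [('ACL 2025', 'EVENT'), ('UIEPrompter', 'SYSTEM')]
```

Predictions made by only one model (here, the "GitHub" entity) are filtered out, which is one simple way a multi-model ensemble can suppress idiosyncratic errors.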
Co-authors
- Binbin Du
- Yifan Jin
- Chengfeng Qiu
- Wenxuan Wang
- Kaifeng Wei