Subaru Kimura
2025
KIKIS at WMT 2025 General Translation Task
Koichi Iwakawa | Keito Kudo | Subaru Kimura | Takumi Ito | Jun Suzuki
Proceedings of the Tenth Conference on Machine Translation
We participated in the constrained English–Japanese track of the WMT 2025 General Machine Translation Task. Our system collected the outputs produced by multiple subsystems, each of which consisted of LLM-based translation and reranking models configured differently (e.g., prompting strategies and context sizes), and reranked those outputs. Each subsystem generated multiple segment-level candidates and iteratively selected the most probable one to construct the document translation. We then reranked the document-level outputs from all subsystems to obtain the final translation. For reranking, we adopted a text-based LLM reranking approach with a reasoning model to take long contexts into account. Additionally, we built a bilingual dictionary on the fly from parallel corpora to make the system more robust to rare words.
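The on-the-fly bilingual dictionary described above could be sketched as follows. This is a minimal illustration using raw co-occurrence counts over sentence pairs; the paper does not specify its extraction method, so the function name, the counting scheme, and the example data are all assumptions.

```python
from collections import defaultdict, Counter

def build_dictionary(pairs, min_count=1):
    """Hypothetical sketch: map each source word to the target word it
    co-occurs with most often across parallel sentence pairs. Real systems
    would use proper word alignment rather than raw co-occurrence."""
    cooc = defaultdict(Counter)   # source word -> target word counts
    src_count = Counter()         # source word frequency
    for src, tgt in pairs:
        tgt_tokens = tgt.split()
        for s in src.split():
            src_count[s] += 1
            for t in tgt_tokens:
                cooc[s][t] += 1
    # Keep only source words seen at least min_count times.
    return {s: cooc[s].most_common(1)[0][0]
            for s in cooc if src_count[s] >= min_count}

# Toy parallel corpus (pre-tokenized English–Japanese pairs, invented here):
pairs = [
    ("red apple", "赤い りんご"),
    ("red car", "赤い 車"),
    ("green apple", "緑 の りんご"),
]
dictionary = build_dictionary(pairs)
```

Even this crude statistic recovers plausible entries (e.g. "red" → "赤い"); such entries can then be supplied to the translation prompt when a rare source word is detected.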
2024
Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task
Keito Kudo | Hiroyuki Deguchi | Makoto Morishita | Ryo Fujii | Takumi Ito | Shintaro Ozaki | Koki Natsumi | Kai Sato | Kazuki Yano | Ryosuke Takahashi | Subaru Kimura | Tomomasa Hara | Yusuke Sakai | Jun Suzuki
Proceedings of the Ninth Conference on Machine Translation
We participated in the constrained track for English-Japanese and Japanese-Chinese translations at the WMT 2024 General Machine Translation Task. Our approach was to generate a large number of sentence-level translation candidates and select the most probable translation using minimum Bayes risk (MBR) decoding and document-level large language model (LLM) re-ranking. We first generated hundreds of translation candidates from multiple translation models and retained the top 30 candidates using MBR decoding. In addition, we continually pre-trained LLMs on the target language corpora to leverage document-level information. We utilized LLMs to select the most probable sentence sequentially in context from the beginning of the document.
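The MBR decoding step described above, which scores each candidate against all other candidates as pseudo-references, could be sketched as follows. The token-level F1 utility here is a stand-in for illustration; the paper's actual utility metric (e.g. a neural metric such as COMET) is not specified in this abstract, and the function names are assumptions.

```python
from collections import Counter

def token_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 as a simple stand-in utility function."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def mbr_topk(candidates, k=1, utility=token_f1):
    """Rank candidates by expected utility, treating every other
    candidate as a pseudo-reference, and keep the top k."""
    scored = []
    for i, hyp in enumerate(candidates):
        refs = [c for j, c in enumerate(candidates) if j != i]
        score = sum(utility(hyp, r) for r in refs) / max(len(refs), 1)
        scored.append((score, hyp))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [hyp for _, hyp in scored[:k]]

# Toy example: the candidate agreeing with most others wins.
candidates = ["the cat sat", "the cat sat", "a dog ran"]
best = mbr_topk(candidates, k=1)
```

In the system described, hundreds of candidates would be pruned to the top 30 this way before the document-level LLM reranking stage.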
Co-authors
- Takumi Ito 2
- Keito Kudo 2
- Jun Suzuki 2
- Hiroyuki Deguchi 1
- Ryo Fujii 1