Ke Wu
2026
GR1: Reinforcement-Enhanced LLM for Geoscience Reasoning
Yule Xie | Jiaxin Ding | Cheng Deng | Shiqing Gao | Junran Zhang | Sibo Zhang | Zeyuan Wang | Ke Wu | Xin Ding | Luoyi Fu | Meng Jin | Xinbing Wang
Findings of the Association for Computational Linguistics: ACL 2026
Reinforcement learning (RL) has recently shown remarkable ability to enhance reasoning in large language models (LLMs), yet its potential in scientific domains beyond mathematics remains largely unexplored. Geoscience questions couple broad factual knowledge with multi-step inference and often rely on visual evidence such as maps, cross-sections, and diagrams, making them a challenging but verifiable testbed for RL-based reasoning. To enable this study, we introduce GeoMC-10K, a dataset of 10,000 geoscience multiple-choice questions spanning physical to human geography and high-school to professional levels; over 30% of the questions are image-dependent. To support text-only RL on these multimodal questions, we design GeoM2T, a multi-agent framework that converts multimodal questions into descriptive text while preserving answerability and difficulty. Fine-tuning LLaMA-3.1-8B and Qwen-3-8B with Group Relative Policy Optimization (GRPO), incorporating a factual reward mechanism, yields GR1, which achieves absolute accuracy improvements of 5.9% and 13.3%, respectively, and generalizes to out-of-distribution geoscience benchmarks. Together, GeoMC-10K, GeoM2T, and GR1 establish a scalable benchmark and baseline for RL-enhanced geoscience reasoning.
2023
Long-Form Speech Translation through Segmentation with Finite-State Decoding Constraints on Large Language Models
Arya McCarthy | Hao Zhang | Shankar Kumar | Felix Stahlberg | Ke Wu
Findings of the Association for Computational Linguistics: EMNLP 2023
One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs) to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We overcome the tendency of LLMs to hallucinate by incorporating finite-state constraints during decoding; these eliminate invalid outputs without requiring additional training. We discover that LLMs are adaptable to transcripts containing ASR errors through prompt-tuning or fine-tuning. Relative to a state-of-the-art automatic punctuation baseline, our best LLM improves the average BLEU by 2.9 points for English–German, English–Spanish, and English–Arabic TED talk translation across 9 test sets, just by improving segmentation.
2013
Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation
Vladimir Eidelman | Ke Wu | Ferhan Ture | Philip Resnik | Jimmy Lin
Proceedings of the Eighth Workshop on Statistical Machine Translation
Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce
Vladimir Eidelman | Ke Wu | Ferhan Ture | Philip Resnik | Jimmy Lin
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations