Xiaochun Wei
2026
ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models
Yachuan Liu | Xiaochun Wei | Lin Shi | Xinnuo Li | Bohan Zhang | Paramveer Dhillon | Qiaozhu Mei
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) struggle with ex-ante reasoning, that is, making inferences or predictions without access to future information. Even under explicit temporal cutoffs, they often rely on internalized post-cutoff knowledge. To systematically evaluate this issue, we introduce a benchmark that assesses LLMs’ ex-ante inference ability across four tasks: stock prediction, question answering, Wikipedia event generation, and scientific publication generation. We evaluate models with two measures: a leakage rate metric, which quantifies temporal leakage as models’ reliance on future information beyond cutoff timestamps, and a quality measure, which evaluates task performance. Experimental results show that LLMs frequently violate temporal constraints across all four tasks, revealing persistent challenges in ex-ante reasoning. Our benchmark serves as a rigorous testbed for studying temporal reasoning in time-sensitive contexts and provides complete datasets, results, and evaluation resources to support future research on improving temporal consistency in modern LLMs.
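The abstract does not give a formula for the leakage rate, so the following is only a minimal sketch of one plausible reading: the fraction of model outputs that reference at least one item dated after the cutoff. The `ModelOutput` type, its fields, and both function names are hypothetical and are not taken from the benchmark's released code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ModelOutput:
    """Hypothetical record: one model response and the dates it refers to."""
    cutoff: date                  # temporal cutoff given in the prompt
    referenced_dates: list[date]  # dates of facts/events the response relies on


def has_leak(output: ModelOutput) -> bool:
    """True if the response relies on anything dated after its cutoff."""
    return any(d > output.cutoff for d in output.referenced_dates)


def leakage_rate(outputs: list[ModelOutput]) -> float:
    """Assumed definition: share of responses with at least one leak."""
    if not outputs:
        return 0.0
    return sum(has_leak(o) for o in outputs) / len(outputs)


# Toy data for illustration only; not benchmark results.
cutoff = date(2020, 1, 1)
toy_outputs = [
    ModelOutput(cutoff, [date(2019, 5, 1)]),
    ModelOutput(cutoff, [date(2019, 12, 31), date(2021, 3, 1)]),  # leaks
    ModelOutput(cutoff, []),
    ModelOutput(cutoff, [date(2020, 1, 1)]),  # on the cutoff, not after it
]
toy_rate = leakage_rate(toy_outputs)
```

In this reading a response counts as leaking once, regardless of how many post-cutoff items it uses; a per-item variant would be an equally reasonable assumption. Deciding which dates a response actually relies on is the hard part and is left outside the sketch.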