TripTailor: A Real-World Benchmark for Personalized Travel Planning
Kaimin Wang, Yuanzhe Shen, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
Abstract
The continuous evolution and enhanced reasoning capabilities of large language models (LLMs) have elevated their role in complex tasks, notably in travel planning, where demand for personalized, high-quality itineraries is rising. However, current benchmarks often rely on unrealistic simulated data, failing to reflect the differences between LLM-generated and real-world itineraries. Existing evaluation metrics, which primarily emphasize constraints, fall short of providing a comprehensive assessment of the overall quality of travel plans. To address these limitations, we introduce TripTailor, a benchmark designed specifically for personalized travel planning in real-world scenarios. This dataset features an extensive collection of over 500,000 real-world points of interest (POIs) and nearly 4,000 diverse travel itineraries, complete with detailed information, providing a more authentic evaluation framework. Experiments show that fewer than 10% of the itineraries generated by the latest state-of-the-art LLMs achieve human-level performance. Moreover, we identify several critical challenges in travel planning, including the feasibility, rationality, and personalized customization of the proposed solutions. We hope that TripTailor will drive the development of travel planning agents capable of understanding and meeting user needs while generating practical itineraries.- Anthology ID:
- 2025.findings-acl.503
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2025
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 9705–9723
- Language:
- URL:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.503/
- DOI:
- Cite (ACL):
- Kaimin Wang, Yuanzhe Shen, Changze Lv, Xiaoqing Zheng, and Xuanjing Huang. 2025. TripTailor: A Real-World Benchmark for Personalized Travel Planning. In Findings of the Association for Computational Linguistics: ACL 2025, pages 9705–9723, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- TripTailor: A Real-World Benchmark for Personalized Travel Planning (Wang et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.503.pdf