@inproceedings{wu-etal-2025-x,
title = "{X}-{TURING}: Towards an Enhanced and Efficient {T}uring Test for Long-Term Dialogue Agents",
author = "Wu, Weiqi and
Wu, Hongqiu and
Zhao, Hai",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.293/",
pages = "5874--5889",
ISBN = "979-8-89176-251-0",
abstract = "The Turing test examines whether AIs exhibit human-like behaviour in natural language conversations. The traditional setting limits each participant to one message at a time and requires constant human participation. This fails to reflect a natural conversational style and hinders the evaluation of dialogue agents based on Large Language Models (LLMs) in complex and prolonged interactions. This paper proposes X-Turing, which enhances the original test with a burst dialogue pattern, allowing more dynamic exchanges using consecutive messages. It further reduces human workload by iteratively generating dialogues that simulate the long-term interaction between the agent and a human to compose the majority of the test process. With the pseudo-dialogue history, the agent then engages in a shorter dialogue with a real human, which is paired with a human-human conversation on the same topic to be judged using questionnaires. We introduce the X-Turn Pass-Rate metric to assess the human likeness of LLMs across varying durations. While LLMs like GPT-4 initially perform well, achieving pass rates of 51.9{\%} and 38.9{\%} during 3 turns and 10 turns of dialogues respectively, their performance drops as the dialogue progresses, which underscores the difficulty in maintaining consistency in the long term."
}