Single- vs. Dual-Prompt Dialogue Generation with LLMs for Job Interviews in Human Resources

Joachim De Baer, A. Seza Doğruöz, Thomas Demeester, Chris Develder


Abstract
Optimizing language models for use in conversational agents requires large quantities of example dialogues. Increasingly, these dialogues are synthetically generated using powerful large language models (LLMs), especially in domains where obtaining authentic human data is challenging. One such domain is human resources (HR). In this context, we compare two LLM-based dialogue generation methods for producing HR job interviews, and assess which method generates higher-quality dialogues, i.e., those more difficult to distinguish from genuine human discourse. The first method uses a single prompt to generate the complete interview dialogue. The second method uses two agents that converse with each other. To evaluate dialogue quality under each method, we ask a judge LLM to determine, via pairwise comparisons of interviews, whether AI was used for interview generation. We empirically find that, at the expense of a sixfold increase in token count, interviews generated with the dual-prompt method achieve a win rate 2 to 10 times higher than those generated with the single-prompt method. This difference remains consistent regardless of whether GPT-4o or Llama 3.3 70B is used for either interview generation or quality judging.
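To make the dual-prompt setup and the pairwise judging protocol described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical `chat(system_prompt, turns)` helper wrapping any chat-completion LLM (e.g., GPT-4o or Llama 3.3 70B); the prompt wording, turn count, and data fields (`job_ad`, `candidate_cv`) are placeholders.

```python
# Illustrative sketch only: two separately prompted agents converse to produce
# an interview transcript, and a judge LLM compares two transcripts pairwise.

from typing import Callable, Dict, List

# Hypothetical abstraction: (system_prompt, dialogue_so_far) -> next reply text
Chat = Callable[[str, List[Dict[str, str]]], str]


def dual_prompt_interview(chat: Chat, job_ad: str, candidate_cv: str,
                          turns: int = 10) -> List[Dict[str, str]]:
    """Generate an interview by letting two separately prompted agents converse."""
    interviewer_sys = f"You are an HR interviewer hiring for this vacancy:\n{job_ad}"
    candidate_sys = f"You are a job candidate with this CV:\n{candidate_cv}"
    dialogue: List[Dict[str, str]] = []
    for _ in range(turns):
        # Interviewer sees the dialogue so far and asks the next question.
        question = chat(interviewer_sys, dialogue)
        dialogue.append({"speaker": "interviewer", "text": question})
        # Candidate answers, conditioned on the same dialogue history.
        answer = chat(candidate_sys, dialogue)
        dialogue.append({"speaker": "candidate", "text": answer})
    return dialogue


def judge_pair(chat: Chat, interview_a: str, interview_b: str) -> str:
    """Ask a judge LLM which of two interview transcripts looks AI-generated."""
    judge_sys = ("You compare two job interview transcripts and decide which one "
                 "was more likely generated by AI.")
    prompt = (f"Interview A:\n{interview_a}\n\nInterview B:\n{interview_b}\n\n"
              "Answer with 'A' or 'B'.")
    return chat(judge_sys, [{"speaker": "user", "text": prompt}])
```

By contrast, the single-prompt baseline would request the full transcript in one LLM call; the judge is then applied to pairs of interviews, one produced by each method, to compute the win rates reported above.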
Anthology ID:
2025.gem-1.74
Volume:
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Month:
July
Year:
2025
Address:
Vienna, Austria and virtual meeting
Editors:
Kaustubh Dhole, Miruna Clinciu
Venues:
GEM | WS
Publisher:
Association for Computational Linguistics
Pages:
947–957
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.gem-1.74/
Cite (ACL):
Joachim De Baer, A. Seza Doğruöz, Thomas Demeester, and Chris Develder. 2025. Single- vs. Dual-Prompt Dialogue Generation with LLMs for Job Interviews in Human Resources. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 947–957, Vienna, Austria and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Single- vs. Dual-Prompt Dialogue Generation with LLMs for Job Interviews in Human Resources (De Baer et al., GEM 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.gem-1.74.pdf