LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions

Hongyu Sun; Yusuke Sakai; Haruki Sakajo; Shintaro Ozaki; Kazuki Hayashi; Hidetaka Kamigaito; Taro Watanabe

Abstract

Continuous instruction following closely mirrors real-world tasks by requiring models to solve sequences of interdependent steps, yet existing multi-step instruction datasets suffer from three key limitations: (1) lack of logical coherence across turns, (2) narrow topical breadth and depth, and (3) reliance on rigid templates or heavy manual effort. We introduce LoCt-Pipeline, a novel pipeline that leverages modern LLMs’ reasoning capabilities to assemble rich, topic-related single-instruction data into multi-turn dialogues, producing chains that are logically coherent, progressively deepen in content, and span diverse domains without fixed templates or extensive human annotation. We employed this pipeline to construct LoCt-Instruct for assessing models’ problem-solving abilities. The generated chains serve as a testbed for benchmarking a variety of models, including reasoning-oriented architectures, instruction-tuned variants, and state-of-the-art closed-source LLMs on their capacity to follow and correctly respond to each step. Our results reveal a substantial performance gap between current LLMs and human solvers. These findings highlight the need for more robust continuous instruction following. We publicly release the dataset and end-to-end pipeline.

Anthology ID: 2025.emnlp-main.1734
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 34187–34206
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1734/
Cite (ACL): Hongyu Sun, Yusuke Sakai, Haruki Sakajo, Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, and Taro Watanabe. 2025. LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 34187–34206, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions (Sun et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1734.pdf
Checklist: 2025.emnlp-main.1734.checklist.pdf
