Dhaval C Patel
2025
ReAct Meets Industrial IoT: Language Agents for Data Access
James T Rayfield | Shuxin Lin | Nianjun Zhou | Dhaval C Patel
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
We present a robust framework for deploying domain-specific language agents that can query industrial sensor data using natural language. Grounded in the Reasoning and Acting (ReAct) paradigm, our system introduces three key innovations: (1) integration of the Self-Ask method for compositional, multi-hop reasoning; (2) a multi-agent architecture with Review, Reflect, and Distillation components to improve reliability and fault tolerance; and (3) a long-context prompting strategy leveraging curated in-context examples, which we call the Tiny Trajectory Store, eliminating the need for fine-tuning. We apply our method to Industry 4.0 scenarios, where agents query SCADA systems (e.g., SkySpark) using questions such as, “How much power did B002 AHU 2-1-1 use on 6/14/16 at the POKMAIN site?” To enable systematic evaluation, we introduce IoTBench, a benchmark of 400+ tasks across five industrial sites. Our experiments show that ReAct-style agents enhanced with long-context reasoning (ReActXen) significantly outperform standard prompting baselines across multiple LLMs, including smaller models. This work repositions NLP agents as practical interfaces for industrial automation, bridging natural language understanding and sensor-driven environments.
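To illustrate the kind of ReAct-style loop the abstract describes (Thought, Action, Observation over a SCADA tool), here is a minimal self-contained sketch. The tool name `query_sensor`, the canned reading, and the scripted stand-in for the LLM policy are all hypothetical illustrations, not the paper's implementation:

```python
# Minimal sketch of a ReAct-style agent loop over a sensor-query tool.
# Everything below (tool name, canned reading, scripted policy) is a
# hypothetical stand-in for the LLM-driven system in the paper.

def query_sensor(site: str, asset: str, metric: str, date: str) -> str:
    """Hypothetical SCADA tool: returns a canned sensor reading."""
    return f"{metric} for {asset} at {site} on {date}: 142.7 kWh"

def scripted_policy(step: int) -> dict:
    """Stand-in for the LLM: emits a Thought and an Action per step."""
    trajectory = [
        {"thought": "I need the power usage for B002 AHU 2-1-1.",
         "action": ("query_sensor",
                    {"site": "POKMAIN", "asset": "B002 AHU 2-1-1",
                     "metric": "power", "date": "6/14/16"})},
        {"thought": "I have the reading; answer the question.",
         "action": ("finish", {})},
    ]
    return trajectory[step]

def react_loop(max_steps: int = 5) -> list[str]:
    """Run Thought -> Action -> Observation until a 'finish' action."""
    observations = []
    for step in range(max_steps):
        move = scripted_policy(step)
        name, args = move["action"]
        if name == "finish":
            break
        observations.append(query_sensor(**args))  # Observation
    return observations
```

In the real system, `scripted_policy` would be an LLM call whose prompt contains the prior Thought/Action/Observation trace plus curated in-context trajectories.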
Generalized Embedding Models for Industry 4.0 Applications
Christodoulos Constantinides | Shuxin Lin | Dhaval C Patel
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
In this work, we present the first embedding model specifically designed for Industry 4.0 applications, targeting the semantics of industrial asset operations. Given natural language tasks related to specific assets, our model retrieves relevant items and generalizes to queries involving similar assets, such as identifying sensors relevant to an asset’s failure mode. We systematically construct nine asset-specific datasets using an expert-validated knowledge base reflecting real operational scenarios. To ensure contextually rich embeddings, we augment queries with Large Language Models, generating concise entity descriptions that capture domain-specific nuances. Across five embedding models ranging from BERT (110M) to gte-Qwen (7B), we observe substantial in-domain gains: HIT@1 +54.2%, MAP@100 +50.1%, NDCG@10 +54.7% on average. Ablation studies reveal that (a) LLM-based query augmentation significantly improves embedding quality; (b) contrastive objectives without in-batch negatives are more effective for tasks with many relevant items; and (c) balancing positives and negatives within batches is essential. We further evaluate on a new task and conclude with a case study in which the embedding models are wrapped as tools and provided to a planning agent. The code can be found here.
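For readers unfamiliar with the retrieval metrics reported above, here is a minimal binary-relevance sketch of HIT@k and NDCG@k. The sensor IDs are hypothetical examples, not data from the paper:

```python
import math

def hit_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """1.0 if any relevant item appears in the top-k results, else 0.0."""
    return float(any(i in relevant_ids for i in ranked_ids[:k]))

def ndcg_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Binary-relevance NDCG@k: DCG of the ranking over the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)           # rank is 0-based
              for rank, i in enumerate(ranked_ids[:k])
              if i in relevant_ids)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant_ids), k)))
    return dcg / ideal if ideal else 0.0

# Hypothetical example: sensors retrieved for a failure-mode query.
ranked = ["s3", "s1", "s7", "s2"]   # model's ranking, best first
relevant = {"s1", "s2"}             # sensors actually tied to the failure mode
```

With this ranking, `hit_at_k(ranked, relevant, 1)` is 0.0 (the top hit `s3` is irrelevant) while `hit_at_k(ranked, relevant, 2)` is 1.0, showing why gains at k=1 are the hardest to achieve.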