James T Rayfield


2026

Industrial maintenance platforms contain rich but fragmented evidence, including free-text work orders, heterogeneous operational sensors or indicators, and structured failure knowledge. These sources are often analyzed in isolation, producing alerts or forecasts that do not support conditional decision-making: given this asset history and behavior, what is happening and what action is warranted? We present Condition Insight Agent, a deployed decision-support framework that integrates maintenance language, behavioral abstractions of operational data, and engineering failure semantics to produce evidence-grounded explanations and advisory actions. The system constrains reasoning through deterministic evidence construction and structured failure knowledge, and applies a rule-based verification loop to suppress unsupported conclusions. Case studies from production CMMS deployments show that this verification-first design operates reliably under heterogeneous and incomplete data while preserving human oversight. Our results demonstrate how constrained LLM-based reasoning can function as a governed decision-support layer for industrial maintenance.
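The rule-based verification step described in the abstract above could look roughly like the following. This is a minimal sketch and not the paper's implementation: the `Evidence` and `Conclusion` types, the three rules, the `min_sources` threshold, and all field names are assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    id: str
    source: str  # hypothetical categories: "work_order", "sensor", "failure_knowledge"
    text: str


@dataclass
class Conclusion:
    claim: str
    failure_mode: str
    cited_evidence: list[str] = field(default_factory=list)


def verify(conclusion, evidence, known_failure_modes, min_sources=2):
    """Return the names of violated rules; an empty list means 'supported'."""
    by_id = {e.id: e for e in evidence}
    violations = []
    # Rule 1 (assumed): the failure mode must exist in the structured knowledge.
    if conclusion.failure_mode not in known_failure_modes:
        violations.append("unknown_failure_mode")
    # Rule 2 (assumed): every cited evidence id must actually exist.
    if any(i not in by_id for i in conclusion.cited_evidence):
        violations.append("cites_missing_evidence")
    # Rule 3 (assumed): the claim must rest on enough distinct source types.
    sources = {by_id[i].source for i in conclusion.cited_evidence if i in by_id}
    if len(sources) < min_sources:
        violations.append("insufficient_independent_sources")
    return violations


def filter_supported(conclusions, evidence, known_failure_modes):
    """Split candidate conclusions into supported and suppressed ones."""
    supported, suppressed = [], []
    for c in conclusions:
        violations = verify(c, evidence, known_failure_modes)
        if violations:
            suppressed.append((c, violations))  # keep the reasons for human review
        else:
            supported.append(c)
    return supported, suppressed
```

Keeping the violated rule names alongside each suppressed conclusion is one way to preserve the human oversight the abstract mentions, since a reviewer can see why a candidate was dropped.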

2025

We present a robust framework for deploying domain-specific language agents that can query industrial sensor data using natural language. Grounded in the Reasoning and Acting (ReAct) paradigm, our system introduces three key innovations: (1) integration of the Self-Ask method for compositional, multi-hop reasoning; (2) a multi-agent architecture with Review, Reflect and Distillation components to improve reliability and fault tolerance; and (3) a long-context prompting strategy leveraging curated in-context examples, which we call Tiny Trajectory Store, eliminating the need for fine-tuning. We apply our method to Industry 4.0 scenarios, where agents query SCADA systems (e.g., SkySpark) using questions such as, “How much power did B002 AHU 2-1-1 use on 6/14/16 at the POKMAIN site?” To enable systematic evaluation, we introduce IoTBench, a benchmark of 400+ tasks across five industrial sites. Our experiments show that ReAct-style agents enhanced with long-context reasoning (ReActXen) significantly outperform standard prompting baselines across multiple LLMs including smaller models. This work repositions NLP agents as practical interfaces for industrial automation, bridging natural language understanding and sensor-driven environments.
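The Thought / Action / Observation cycle that ReAct-style agents follow can be sketched as below. This is an illustrative toy, not the ReActXen system: the `power_used` tool, the `Action: name[arg|arg]` syntax, the `scripted_policy` that stands in for an LLM call, and the reading value are all hypothetical placeholders, and no real SCADA or SkySpark API is used.

```python
# Hypothetical in-memory readings standing in for a SCADA backend.
# The value is a made-up placeholder, not real data.
READINGS = {("POKMAIN", "B002 AHU 2-1-1", "2016-06-14"): 123.4}


def power_used(site, asset, date):
    """Hypothetical tool: look up a reading and return it as text."""
    return str(READINGS.get((site, asset, date), "no data"))


TOOLS = {"power_used": power_used}


def scripted_policy(transcript):
    """Stand-in for an LLM call: emits one tool call, then finishes."""
    if "Observation:" not in transcript:
        return ("Thought: I need the power reading for this asset and date.\n"
                "Action: power_used[POKMAIN|B002 AHU 2-1-1|2016-06-14]")
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the reading.\nFinish: {observation}"


def react(question, policy, tools, max_steps=5):
    """Run the loop until the policy emits 'Finish:' or steps run out."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = policy(transcript)
        transcript += step + "\n"
        last = step.strip().splitlines()[-1]
        if last.startswith("Finish:"):
            return last[len("Finish:"):].strip(), transcript
        if last.startswith("Action:"):
            # Parse 'name[arg1|arg2|...]' into a tool name and arguments.
            name, _, rest = last[len("Action:"):].strip().partition("[")
            name = name.strip()
            args = rest.rstrip("]").split("|")
            tool = tools.get(name)
            observation = tool(*args) if tool else f"unknown tool {name}"
            transcript += f"Observation: {observation}\n"
    return None, transcript  # step budget exhausted without an answer


answer, trace = react(
    "How much power did B002 AHU 2-1-1 use on 6/14/16 at the POKMAIN site?",
    scripted_policy,
    TOOLS,
)
print(answer)
```

In a real agent the policy would be a model call whose prompt includes the growing transcript; curated in-context example trajectories, as the abstract describes, would be prepended to that prompt rather than hard-coded as here.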