Oier Ijurco

2026

A Virtual Assistant for Architectural Design in a VR Environment
Ander Salaberria | Oier Ijurco | Markel Ferro | Jiayuan Wang | Iñigo Vilá Muñoz | Roberto de Ioris | Jeremy Barnes | Oier Lopez De Lacalle
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Architectural design relies on 3D modeling procedures, generally carried out in Building Information Modeling (BIM) formats. In this setting, architects and designers collaborate on building designs, iterating over many possible versions until a final design is agreed upon. However, this iteration is complicated by the fact that every change must be introduced manually into the complex BIM files, which lengthens the design process and makes it difficult to quickly prototype changes. To speed up prototyping, we propose VR-Arch, a virtual assistant that allows users to interact with the BIM file in a virtual reality (VR) environment. This framework enables users to 1) make changes directly in the VR environment, 2) make complex queries about the BIM, and 3) combine these to perform more complex actions. All of this is done via voice commands and processed through a ReAct-based agentic system that selects appropriate tools depending on the query context. This multi-tool approach enables real-time, contextualized interaction through natural language, allowing for a faster and more natural prototyping experience.

Reasoning over Object Descriptions Improves Coreference Resolution in Task-Based Dialogue Systems
Oier Ijurco | Oier Lopez de Lacalle
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Task-based dialogue systems assist users in achieving specific goals, such as executing actions or retrieving information, through natural language interactions. Accurate coreference resolution is essential, as it involves identifying object references within the dialogue—a task that becomes increasingly challenging in visually grounded environments characterized by complex scenes and diverse object metadata. However, coreference resolution in task-based dialogue remains limited by poor generalization across domains and heavy reliance on supervised models that often overfit to dataset-specific artifacts. In this work, we propose a unimodal test-time reasoning approach that enables large language models (LLMs) to reason over detailed object metadata and dialogue history to improve coreference resolution. Empirical results on the SIMMC 2.1 dataset demonstrate that LLMs can generate step-by-step reasoning processes that effectively align dialogue context with objects present in the scene. Extensive experiments highlight the models’ ability to link conversations and objects accurately. Moreover, we show that test-time reasoning under few-shot settings generalizes effectively to unseen scenarios and novel objects, outperforming encoder-based supervised methods in cross-domain evaluations. These findings underscore the critical role of structured metadata and careful prompt engineering in enhancing the robustness and generalization of task-oriented dialogue systems.

Co-authors

Jiayuan Wang 1

Roberto de Ioris 1

Venues

EACL 1
LREC 1
