Jiazhou Liang


2026

Conversational AI (ConvAI) agents increasingly maintain structured memory to support long-term, task-oriented interactions. In-context memory approaches append the growing history to the model input, which scales poorly under context-window limits. RAG-based methods retrieve request-relevant information, but most assume flat memory collections and ignore structure. We propose **Semantic XPath**, a **tree-structured memory module** to access and update structured conversational memory. **Semantic XPath** improves performance over flat-RAG baselines by **176.7%** while using only **9.1%** of the tokens required by in-context memory. We also introduce **SemanticXPath Chat**, an end-to-end ConvAI demo system that visualizes the structured memory and query execution details. Overall, this paper demonstrates a candidate for the next generation of long-term, task-oriented ConvAI systems built on structured memory.

2025

Dense Passage Retrieval (DPR) typically relies on Euclidean or cosine distance to measure query–passage relevance in embedding space, which is effective when embeddings lie on a linear manifold. However, our experiments across DPR benchmarks suggest that embeddings often lie on lower-dimensional, non-linear manifolds, especially in out-of-distribution (OOD) settings, where cosine and Euclidean distance fail to capture semantic similarity. To address this limitation, we propose a *manifold-aware* distance metric for DPR (**MA-DPR**) that models the intrinsic manifold structure of passages using a nearest-neighbor graph and measures query–passage distance as the length of the shortest path between them in this graph. We show that MA-DPR outperforms Euclidean and cosine distances by up to **26%** on OOD passage retrieval, with comparable in-distribution performance across various embedding models, while incurring a minimal increase in query inference time. Empirical evidence suggests that manifold-aware distance allows DPR to leverage context from related neighboring passages, making it effective even in the absence of direct semantic overlap. MA-DPR applies to a wide range of dense embedding and retrieval tasks, with potential benefits across many domains.
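
The graph-based distance described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Euclidean edge weights, an undirected k-nearest-neighbor graph, a query attached to its `k` nearest passages, and Dijkstra's algorithm for shortest paths. The function names and the toy 2-D "embeddings" are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knn_graph(passages, k):
    """Undirected k-NN graph over passages (assumption: Euclidean edge weights)."""
    graph = {i: {} for i in range(len(passages))}
    for i, p in enumerate(passages):
        nearest = sorted(
            (euclidean(p, q), j) for j, q in enumerate(passages) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def manifold_distances(query, passages, graph, k):
    """Shortest-path distance from the query to every passage (Dijkstra).

    Assumption: the query enters the graph through its k nearest passages.
    """
    best = [math.inf] * len(passages)
    heap = []
    entry = sorted((euclidean(query, p), j) for j, p in enumerate(passages))[:k]
    for d, j in entry:
        best[j] = d
        heapq.heappush(heap, (d, j))
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < best[v]:
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return best


def rank(distances):
    """Passage indices ordered from closest to farthest."""
    return sorted(range(len(distances)), key=lambda i: distances[i])


# Toy passages laid out along a U-shaped curve: indices follow the curve,
# so passage 6 is far from passage 0 along the curve but close in a straight line.
passages = [(0, 0), (1, 0), (2, 0), (2, 1.1), (2, 2.2), (1, 2.2), (0, 2.2)]
query = (0, 0.5)

graph = build_knn_graph(passages, k=2)
euclid_rank = rank([euclidean(query, p) for p in passages])
manifold_rank = rank(manifold_distances(query, passages, graph, k=2))
```

In this toy layout, straight-line distance places the far end of the curve (passage 6) ahead of passage 2, whereas the graph distance orders passages by their position along the curve. The quadratic neighbor search is for readability only. A real index would likely use an approximate nearest-neighbor structure.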