Kuien Liu


2026

Recent advancements in large language models (LLMs) have empowered autonomous web agents to execute natural language instructions directly on real-world webpages. However, existing agents often struggle with complex tasks involving dynamic interactions and long-horizon execution due to rigid planning strategies and hallucination-prone reasoning. To address these limitations, we propose WebUncertainty, a novel autonomous agent framework designed to tackle dual-level uncertainty in planning and reasoning. Specifically, we design a Task Uncertainty-Driven Adaptive Planning Mechanism that adaptively selects planning modes to navigate unknown environments. Furthermore, we introduce an Action Uncertainty-Driven Monte Carlo tree search (MCTS) Reasoning Mechanism. This mechanism incorporates the Confidence-induced Action Uncertainty (ConActU) strategy to quantify both aleatoric uncertainty (AU) and epistemic uncertainty (EU), thereby optimizing the search process and guiding robust decision-making. Experimental results on the WebArena and WebVoyager benchmarks demonstrate that WebUncertainty achieves superior performance compared to state-of-the-art baselines.
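As a rough illustration of what separating aleatoric from epistemic uncertainty can look like, the sketch below applies the standard entropy-based decomposition to several sampled action distributions. It is not the ConActU strategy itself, whose details the abstract does not give. The function names and the assumption that the agent can produce multiple probability distributions over the same candidate actions are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def decompose_uncertainty(samples):
    """Split total predictive uncertainty into aleatoric and epistemic parts.

    `samples` is a list of probability distributions over the same set of
    candidate actions (hypothetical input, e.g. from repeated LLM queries).
    Returns (total, aleatoric, epistemic), where
      total     = entropy of the averaged distribution,
      aleatoric = average entropy of the individual distributions,
      epistemic = total - aleatoric (disagreement between samples).
    """
    n = len(samples)
    k = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(k)]
    total = entropy(mean)
    aleatoric = sum(entropy(s) for s in samples) / n
    # Clamp to zero to absorb tiny negative values from floating-point error.
    epistemic = max(0.0, total - aleatoric)
    return total, aleatoric, epistemic
```

When every sample agrees on the same spread-out distribution, all of the uncertainty is aleatoric. When each sample is confident but the samples disagree with one another, it is epistemic. A search procedure could, for example, explore more where the epistemic term is high.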

2025

In the retrieval stage of recommendation systems, two-tower models are the predominant paradigm, widely adopted for their efficiency. However, because they rely on collaborative filtering signals, they are limited in modeling similarity among long-tail items. To address this issue, we propose Motivation-aware Retrieval for Long-Tail Recommendation (MotiR). The purchase motivations generated by LLMs represent a condensed abstraction of items’ intrinsic attributes. Integrating them with traditional item features enables the two-tower model to capture semantic-level similarities among long-tail items. Furthermore, a gated network-based adaptive weighting mechanism dynamically adjusts representation weights, emphasizing semantic modeling for long-tail items while preserving the advantages of collaborative signals for popular items. Experimental results on Amazon Books show a 60.5% improvement in Hit@10 over existing methods. Industrial deployment in Taobao&Tmall Group 88VIP scenarios achieves over 4% improvement in CTR and CVR, validating the effectiveness of our method.
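The gated weighting idea can be sketched in a few lines. This is a minimal illustration, not MotiR's actual architecture: it assumes a single scalar gate computed from an item's log-popularity, and the parameters `w` and `b` are hypothetical hand-set values. In a real system the gate would be learned jointly with the towers and would likely take richer inputs.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(collab_vec, semantic_vec, log_popularity, w=-1.0, b=2.0):
    """Blend a collaborative embedding with a semantic (motivation) embedding.

    The gate is the weight given to the semantic embedding. With a negative
    `w` (hypothetical value), rarer items get a gate near 1 and popular items
    a gate near 0, so popular items keep mostly their collaborative signal.
    Returns (gate, fused_vector).
    """
    if len(collab_vec) != len(semantic_vec):
        raise ValueError("embeddings must have the same dimension")
    gate = sigmoid(w * log_popularity + b)
    fused = [gate * s + (1.0 - gate) * c
             for c, s in zip(collab_vec, semantic_vec)]
    return gate, fused
```

With these illustrative parameters, an item with log-popularity 0 relies mostly on its semantic embedding, while one with log-popularity 6 relies almost entirely on its collaborative embedding.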