Shumpei Sano

2025

Search Query Embeddings via User-behavior-driven Contrastive Learning
Sosuke Nishikawa | Jun Hirako | Nobuhiro Kaji | Koki Watanabe | Hiroki Asano | Souta Yamashiro | Shumpei Sano
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

Universal query embeddings that accurately capture the semantic meaning of search queries are crucial for supporting a range of query understanding (QU) tasks within enterprises. However, current embedding approaches often struggle to effectively represent queries due to the shortness of search queries and their tendency for surface-level variations. We propose a user-behavior-driven contrastive learning approach which directly aligns embeddings according to user intent. This approach uses intent-aligned query pairs as positive examples, derived from two types of real-world user interactions: (1) clickthrough data, in which queries leading to clicks on the same URLs are assumed to share the same intent, and (2) session data, in which queries within the same user session are considered to share intent. By incorporating these query pairs into a robust contrastive learning framework, we can construct query embedding models that align with user intent while minimizing reliance on surface-level lexical similarities. Evaluations on real-world QU tasks demonstrated that these models substantially outperformed state-of-the-art text embedding models such as mE5 and SimCSE. Our models have been deployed in our search engine to support QU technologies.

2017

Predicting Causes of Reformulation in Intelligent Assistants
Shumpei Sano | Nobuhiro Kaji | Manabu Sassano
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

Intelligent assistants (IAs) such as Siri and Cortana conversationally interact with users and execute a wide range of actions (e.g., searching the Web, setting alarms, and chatting). IAs can support these actions through the combination of various components such as automatic speech recognition, natural language understanding, and language generation. However, the complexity of these components hinders developers from determining which component causes an error. To remove this hindrance, we focus on reformulation, which is a useful signal of user dissatisfaction, and propose a method to predict the reformulation causes. We evaluate the method using the user logs of a commercial IA. The experimental results have demonstrated that features designed to detect the error of a specific component improve the performance of reformulation cause detection.
