Hiroki Asano


2025

pdf bib
Search Query Embeddings via User-behavior-driven Contrastive Learning
Sosuke Nishikawa | Jun Hirako | Nobuhiro Kaji | Koki Watanabe | Hiroki Asano | Souta Yamashiro | Shumpei Sano
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

Universal query embeddings that accurately capture the semantic meaning of search queries are crucial for supporting a range of query understanding (QU) tasks within enterprises.However, current embedding approaches often struggle to effectively represent queries due to the shortness of search queries and their tendency for surface-level variations.We propose a user-behavior-driven contrastive learning approach which directly aligns embeddings according to user intent.This approach uses intent-aligned query pairs as positive examples, derived from two types of real-world user interactions: (1) clickthrough data, in which queries leading to clicks on the same URLs are assumed to share the same intent, and (2) session data, in which queries within the same user session are considered to share intent.By incorporating these query pairs into a robust contrastive learning framework, we can construct query embedding models that align with user intent while minimizing reliance on surface-level lexical similarities.Evaluations on real-world QU tasks demonstrated that these models substantially outperformed state-of-the-art text embedding models such as mE5 and SimCSE.Our models have been deployed in our search engine to support QU technologies.

2019

pdf bib
The AIP-Tohoku System at the BEA-2019 Shared Task
Hiroki Asano | Masato Mita | Tomoya Mizumoto | Jun Suzuki
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce the AIP-Tohoku grammatical error correction (GEC) system for the BEA-2019 shared task in Track 1 (Restricted Track) and Track 2 (Unrestricted Track) using the same system architecture. Our system comprises two key components: error generation and sentence-level error detection. In particular, GEC with sentence-level grammatical error detection is a novel and versatile approach, and we experimentally demonstrate that it significantly improves the precision of the base model. Our system is ranked 9th in Track 1 and 2nd in Track 2.

2018

pdf bib
Suspicious News Detection Using Micro Blog Text
Tsubasa Tagami | Hiroki Ouchi | Hiroki Asano | Kazuaki Hanawa | Kaori Uchiyama | Kaito Suzuki | Kentaro Inui | Atsushi Komiya | Atsuo Fujimura | Ryo Yamashita | Hitofumi Yanai | Akinori Machino
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2017

pdf bib
Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems
Hiroki Asano | Tomoya Mizumoto | Kentaro Inui
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

In grammatical error correction (GEC), automatically evaluating system outputs requires gold-standard references, which must be created manually and thus tend to be both expensive and limited in coverage. To address this problem, a reference-less approach has recently emerged; however, previous reference-less metrics that only consider the criterion of grammaticality, have not worked as well as reference-based metrics. This study explores the potential of extending a prior grammaticality-based method to establish a reference-less evaluation method for GEC systems. Further, we empirically show that a reference-less metric that combines fluency and meaning preservation with grammaticality provides a better estimate of manual scores than that of commonly used reference-based metrics. To our knowledge, this is the first study that provides empirical evidence that a reference-less metric can replace reference-based metrics in evaluating GEC systems.