Mingming Li


2024

Few-Shot Learning for Cold-Start Recommendation
Mingming Li | Songlin Hu | Fuqing Zhu | Qiannan Zhu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Cold-start is a significant problem in recommender systems. With the development of few-shot learning and meta-learning techniques, many researchers have adopted meta-learning for recommendation, since cold-start is naturally a few-shot scenario. Nevertheless, we argue that recent work leaves a large gap between few-shot learning and recommendation: in recommendation, users are locally dependent rather than globally independent, so the local relationships between users must be modeled explicitly. To accomplish this, we present a novel Few-shot learning method for Cold-Start (FCS) recommendation that consists of three hierarchies. Concretely, the first hierarchy is the global-meta parameters, which capture the global information shared by all users; the second is the local-meta parameters, which learn an adaptive cluster of local (similar) users; the third is the specific parameters of the target user. By combining global and local information, FCS adapts to a new user rapidly from only a few interaction records. Experimental results on two public real-world datasets show that FCS produces stable improvements over the state-of-the-art.
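The sketch below illustrates the three-level hierarchy described in the abstract, assuming a toy linear scorer: global-meta parameters shared by all users, local-meta parameters for each cluster of users, and user-specific parameters obtained by a few gradient steps on the new user's few-shot support records. The cluster-assignment rule, dimensions, and step sizes are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a three-level parameter hierarchy for cold-start
# recommendation (global-meta -> local-meta cluster -> user-specific).
# All concrete choices below are assumptions made for illustration.
import torch
import torch.nn.functional as F

dim, n_clusters = 8, 3
torch.manual_seed(0)

theta_global = torch.randn(dim)                                   # hierarchy 1: global-meta
theta_local = theta_global + 0.1 * torch.randn(n_clusters, dim)   # hierarchy 2: local-meta


def score(theta, item_feats):
    """Linear relevance score of each item under parameters `theta`."""
    return item_feats @ theta


def user_specific_params(support_items, support_ratings, lr=0.1, steps=3):
    """Hierarchy 3: pick the local-meta parameters that best fit this user's
    few-shot records (an assumed assignment rule), then adapt them into
    user-specific parameters with a few gradient steps."""
    with torch.no_grad():
        errs = torch.stack([F.mse_loss(score(t, support_items), support_ratings)
                            for t in theta_local])
    theta = theta_local[errs.argmin()].clone().requires_grad_(True)

    for _ in range(steps):                      # few-shot inner-loop adaptation
        loss = F.mse_loss(score(theta, support_items), support_ratings)
        grad, = torch.autograd.grad(loss, theta)
        theta = (theta - lr * grad).detach().requires_grad_(True)
    return theta.detach()


# Usage: a cold-start user with only four interaction records.
items = torch.randn(4, dim)
ratings = torch.tensor([1.0, 0.0, 1.0, 0.0])
theta_user = user_specific_params(items, ratings)
```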

2023

Adaptive Hyper-parameter Learning for Deep Semantic Retrieval
Mingming Li | Chunyuan Yuan | Huimu Wang | Peng Wang | Jingwei Zhuo | Binbin Wang | Lin Liu | Sulong Xu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

Deep semantic retrieval has achieved remarkable success in online E-commerce applications. Most methods aim to distinguish positive items from negative items for each query by using a margin loss or softmax loss. Despite their decent performance, these methods are highly sensitive to hyper-parameters, i.e., the margin and the temperature τ, which measure the similarity of negative pairs and affect the distribution of items in the metric space. How to adaptively design and choose these parameters for different pairs remains an open challenge. Several recent methods attempt to alleviate this problem by learning each parameter through trainable or statistical approaches in recommendation. We argue that these are not suitable for retrieval scenarios, because queries are unknown in advance and highly diverse. To overcome this limitation, we propose a novel adaptive metric learning method: a simple, universal, hyper-parameter-free learning scheme that improves retrieval performance. Specifically, we first obtain the hyper-parameters adaptively from the similarity statistics of each batch, without any fixed or extra trainable hyper-parameters. Subsequently, we adopt a symmetric metric learning method to mitigate model collapse. Furthermore, the proposed method is general and may also benefit other fields. Extensive experiments on a real-world dataset demonstrate that our method significantly outperforms previous methods, highlighting its superiority and effectiveness. The method has been successfully deployed on an online E-commerce search platform and has brought substantial economic benefits.
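The sketch below shows one way the abstract's idea could look in code: instead of a fixed temperature, derive a per-query temperature from the similarity statistics of the current batch, and apply the in-batch softmax loss symmetrically (query-to-item and item-to-query) to discourage collapse. The specific temperature rule (row-wise standard deviation of similarities) is an assumed stand-in, not the paper's formula.

```python
# Hedged sketch of a batch-adaptive, hyper-parameter-free retrieval loss.
import torch
import torch.nn.functional as F


def _directional_loss(sim, labels, eps=1e-6):
    # Assumed rule: scale each row by the spread of its in-batch similarities,
    # so batches with many close negatives get a sharper softmax.
    tau = sim.std(dim=1, keepdim=True).clamp_min(eps)
    return F.cross_entropy(sim / tau, labels)


def adaptive_symmetric_loss(q_emb, i_emb):
    """In-batch softmax retrieval loss with a batch-derived temperature,
    applied in both directions (query->item and item->query)."""
    sim = q_emb @ i_emb.t()                               # (B, B) cosine similarities
    labels = torch.arange(sim.size(0), device=sim.device)  # aligned pairs on the diagonal
    return 0.5 * (_directional_loss(sim, labels) + _directional_loss(sim.t(), labels))


# Usage with random L2-normalised embeddings of 32 aligned query/item pairs.
q = F.normalize(torch.randn(32, 64), dim=-1)
i = F.normalize(torch.randn(32, 64), dim=-1)
print(adaptive_symmetric_loss(q, i))
```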

2019

Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots
Chunyuan Yuan | Wei Zhou | Mingming Li | Shangwen Lv | Fuqing Zhu | Jizhong Han | Songlin Hu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Multi-turn retrieval-based conversation is an important task for building intelligent dialogue systems. Existing works mainly focus on matching candidate responses with every context utterance at multiple levels of granularity, ignoring the side effects of using excessive context information. Context utterances provide abundant information for extracting matching features, but they also introduce noise and unnecessary information. In this paper, we analyze the side effects of using too many context utterances and propose a multi-hop selector network (MSN) to alleviate the problem. Specifically, MSN first utilizes a multi-hop selector to select the relevant utterances as the context. The model then matches the filtered context with the candidate response to obtain a matching score. Experimental results show that MSN outperforms several state-of-the-art methods on three public multi-turn dialogue datasets.
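The sketch below illustrates the selection-then-matching idea from the abstract under simplifying assumptions: each context utterance is scored against the most recent utterance(s) over several "hops" (using the last 1, 2, ... k turns as the key), utterances whose fused score passes a threshold are kept, and only the filtered context is matched with the candidate response. The real MSN uses word- and utterance-level matching networks; here, averaged utterance embeddings and cosine similarity stand in for illustration, and the hop set and threshold are assumed values.

```python
# Hedged sketch of multi-hop utterance selection followed by context-response matching.
import torch
import torch.nn.functional as F


def multi_hop_select(utt_embs, hops=(1, 2, 3), threshold=0.3):
    """utt_embs: (n_utt, d) one embedding per context utterance.
    Returns a boolean mask over utterances to keep as the filtered context."""
    scores = []
    for k in hops:
        key = utt_embs[-k:].mean(dim=0, keepdim=True)       # last-k utterances as the hop key
        scores.append(F.cosine_similarity(utt_embs, key))    # relevance of every utterance
    fused = torch.stack(scores).max(dim=0).values             # fuse hop scores (assumed: max)
    mask = fused >= threshold
    mask[-1] = True                                            # always keep the latest turn
    return mask


def match_score(utt_embs, response_emb, mask):
    """Match the filtered context with the candidate response."""
    context = utt_embs[mask].mean(dim=0)
    return F.cosine_similarity(context, response_emb, dim=0)


# Usage with random utterance/response embeddings.
ctx = F.normalize(torch.randn(6, 32), dim=-1)     # 6 context utterances
resp = F.normalize(torch.randn(32), dim=-1)       # candidate response
keep = multi_hop_select(ctx)
print(keep, match_score(ctx, resp, keep))
```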