Sifan Liu
2025
MultiConIR: Towards Multi-Condition Information Retrieval
Xuan Lu | Sifan Liu | Bochao Yin | Yongqi Li | Xinghao Chen | Hui Su | Yaohui Jin | Wenjun Zeng | Xiaoyu Shen
Findings of the Association for Computational Linguistics: EMNLP 2025
Multi-condition information retrieval (IR) presents a significant yet underexplored challenge for existing systems. This paper introduces MultiConIR, the first benchmark specifically designed to evaluate retrieval and reranking models under nuanced multi-condition query scenarios across five diverse domains. We systematically assess model capabilities through three critical tasks: complexity robustness, relevance monotonicity, and query format sensitivity. Our extensive experiments on 15 models reveal a critical vulnerability: most retrievers and rerankers exhibit severe performance degradation as query complexity increases. Key deficiencies include a widespread failure to maintain relevance monotonicity and high sensitivity to query style and condition placement. The superior performance of GPT-4o reveals the gap between IR systems and advanced LLMs in handling sophisticated natural language queries. Furthermore, this work delves into the factors contributing to reranker performance deterioration and examines how condition positioning within queries affects similarity assessment, providing crucial insights for advancing IR systems towards complex search scenarios.
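The relevance-monotonicity task the abstract names can be pictured with a short check: given a query and documents ordered by how many of the query's conditions each satisfies, a robust model should assign them strictly decreasing relevance scores. Below is a minimal sketch, assuming a generic score(query, doc) function from any retriever or reranker; the function name and data layout are illustrative, not taken from the paper.

```python
from typing import Callable

def is_monotonic(query: str,
                 docs_by_conditions_met: list[str],
                 score: Callable[[str, str], float]) -> bool:
    """Check relevance monotonicity for one query.

    docs_by_conditions_met[i] satisfies more of the query's conditions
    than docs_by_conditions_met[i + 1]; a model that respects relevance
    monotonicity should score them in strictly decreasing order.
    """
    scores = [score(query, doc) for doc in docs_by_conditions_met]
    return all(a > b for a, b in zip(scores, scores[1:]))
```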
2020
WeChat Neural Machine Translation Systems for WMT20
Fandong Meng | Jianhao Yan | Yijin Liu | Yuan Gao | Xianfeng Zeng | Qinsong Zeng | Peng Li | Ming Chen | Jie Zhou | Sifan Liu | Hao Zhou
Proceedings of the Fifth Conference on Machine Translation
We participate in the WMT 2020 shared news translation task on Chinese→English. Our system is based on the Transformer (Vaswani et al., 2017a) with effective variants and the DTMT (Meng and Zhang, 2019) architecture. In our experiments, we employ data selection, several synthetic data generation approaches (i.e., back-translation, knowledge distillation, and iterative in-domain knowledge transfer), advanced fine-tuning approaches, and a self-BLEU based model ensemble. Our constrained Chinese→English system achieves a case-sensitive BLEU score of 36.9, which is the highest among all submissions.
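The abstract only names the self-BLEU based ensemble; one plausible reading is that candidate systems whose outputs are least similar to the others (lowest average BLEU against the rest) are preferred, so the ensemble stays diverse. A hypothetical sketch of that selection step, using the sacrebleu library; the function names and the selection criterion here are assumptions, not the authors' published method.

```python
import sacrebleu

def avg_self_bleu(candidate: list[str], others: list[list[str]]) -> float:
    """Mean corpus BLEU of one system's outputs, treating each other
    system's outputs in turn as the reference."""
    scores = [sacrebleu.corpus_bleu(candidate, [o]).score for o in others]
    return sum(scores) / len(scores)

def pick_diverse_members(systems: dict[str, list[str]], k: int) -> list[str]:
    """Keep the k systems with the lowest average self-BLEU, i.e. the
    ones whose translations overlap least with the other candidates."""
    ranked = sorted(
        systems,
        key=lambda name: avg_self_bleu(
            systems[name],
            [outs for other, outs in systems.items() if other != name],
        ),
    )
    return ranked[:k]
```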