Yi Guan
2025
Agri-CM3: A Chinese Massive Multi-modal, Multi-level Benchmark for Agricultural Understanding and Reasoning
Haotian Wang | Yi Guan | Fanshu Meng | Chao Zhao | Lian Yan | Yang Yang | Jingchi Jiang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Multi-modal Large Language Models (MLLMs) integrating images, text, and speech can provide farmers with accurate diagnoses and treatment of pests and diseases, enhancing agricultural efficiency and sustainability. However, existing benchmarks lack comprehensive evaluations, particularly in multi-level reasoning, making it challenging to identify model limitations. To address this issue, we introduce Agri-CM3, an expert-validated benchmark assessing MLLMs’ understanding and reasoning in agricultural management. It includes 3,939 images and 15,901 multi-level multiple-choice questions with detailed explanations. Evaluations of 45 MLLMs reveal significant gaps. Even GPT-4o achieves only 63.64% accuracy, falling short in fine-grained reasoning tasks. Analysis across three reasoning levels and seven compositional abilities highlights key challenges in accuracy and cognitive understanding. Our study provides insights for advancing MLLMs in agricultural management, driving their development and application. Code and data are available at https://github.com/HIT-Kwoo/Agri-CM3.
RLKGF: Reinforcement Learning from Knowledge Graph Feedback Without Human Annotations
Lian Yan | Chen Tang | Yi Guan | Haotian Wang | Songyuan Wang | Haifeng Liu | Yang Yang | Jingchi Jiang
Findings of the Association for Computational Linguistics: ACL 2025
Reinforcement Learning from Human Feedback (RLHF) has been shown to effectively align large language models (LLMs) with human knowledge. However, the lack of human preference labels remains a significant bottleneck when applying RLHF to a downstream domain. Humans in RLHF play a critical role in injecting reasoning preferences into LLMs, and we assume the reasoning process underlying human assessments may potentially be replaced by reasoning pathways derived from Knowledge Graphs (KGs). Inspired by this assumption, we propose Reinforcement Learning from Knowledge Graph Feedback (RLKGF), a novel method that leverages KG semantics and structure to derive RL rewards in the absence of manual annotations. Unlike Reinforcement Learning from AI Feedback (RLAIF), RLKGF directly integrates human priors encoded in KGs as the reward model, aligning LLM responses with expert knowledge without additional preference labeling or reward model training. RLKGF structures context-relevant facts into knowledge subgraphs and defines rewards by simulating information flow across semantic and logical connections between question and candidate response entities. Experiments on three public and one private medical dialogue dataset demonstrate that RLKGF significantly outperforms the competitive RLAIF in improving LLM diagnostic accuracy. The code is available at https://github.com/YanPioneer/RLKGF.
2021
AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER
Weile Chen | Huiqiang Jiang | Qianhui Wu | Börje F. Karlsson | Yi Guan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Neural methods have been shown to achieve high performance in Named Entity Recognition (NER), but rely on costly high-quality labeled data for training, which is not always available across languages. While previous works have shown that unlabeled data in a target language can be used to improve cross-lingual model performance, we propose a novel adversarial approach (AdvPicker) to better leverage such data and further improve results. We design an adversarial learning framework in which an encoder learns entity domain knowledge from labeled source-language data and better shared features are captured via adversarial training, where a discriminator selects less language-dependent target-language data via similarity to the source language. Experimental results on standard benchmark datasets demonstrate that the proposed method benefits strongly from this data selection process and outperforms existing state-of-the-art methods, without requiring any additional external resources (e.g., gazetteers or machine translation).
2013
Reserved Self-training: A Semi-supervised Sentiment Classification Method for Chinese Microblogs
Zhiguang Liu | Xishuang Dong | Yi Guan | Jinfeng Yang
Proceedings of the Sixth International Joint Conference on Natural Language Processing
2011
Automatically Generating Questions from Queries for Community-based Question Answering
Shiqi Zhao | Haifeng Wang | Chao Li | Ting Liu | Yi Guan
Proceedings of 5th International Joint Conference on Natural Language Processing
2010
Selecting Optimal Feature Template Subset for CRFs
Xingjun Xu | Guanglu Sun | Yi Guan | Xishuang Dong | Sheng Li
CIPS-SIGHAN Joint Conference on Chinese Language Processing
Complete Syntactic Analysis Bases on Multi-level Chunking
Zhipeng Jiang | Yu Zhao | Yi Guan | Chao Li | Sheng Li
CIPS-SIGHAN Joint Conference on Chinese Language Processing
2007
A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation
Chi-Ho Li | Minghui Li | Dongdong Zhang | Mu Li | Ming Zhou | Yi Guan
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
2006
A Pragmatic Chinese Word Segmentation Approach Based on Mixing Models
Wei Jiang | Yi Guan | Xiao-Long Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 4, December 2006
A Pragmatic Chinese Word Segmentation System
Wei Jiang | Yi Guan | Xiao-Long Wang
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing
Co-authors
- Wei Jiang (3)
- Xishuang Dong (2)
- Jingchi Jiang (2)
- Chao Li (2)
- Sheng Li (2)
- Haotian Wang (2)
- Xiao-Long Wang (2)
- Lian Yan (2)
- Yang Yang (2)
- Weile Chen (1)
- Huiqiang Jiang (1)
- Zhipeng Jiang (1)
- Börje F. Karlsson (1)
- Chi-Ho Li (1)
- Minghui Li (1)
- Mu Li (1)
- Haifeng Liu (1)
- Ting Liu (1)
- Zhiguang Liu (1)
- Fanshu Meng (1)
- Guanglu Sun (1)
- Chen Tang (1)
- Songyuan Wang (1)
- Haifeng Wang (1)
- Qianhui Wu (1)
- Zhiming Xu (1)
- Xingjun Xu (1)
- Jinfeng Yang (1)
- Dongdong Zhang (1)
- Chao Zhao (1)
- Jian Zhao (1)
- Shiqi Zhao (1)
- Yu Zhao (1)
- Ming Zhou (1)