Kunpeng Zhang
2026
Analyze Like a Venture Capitalist: Information-Gain and Knowledge Enhanced Graph Reasoning for Startup Success Prediction
Haoyu Pei | Zhongyang Liu | Xiangyi Xiao | Xiaocong Du | Suting Hong | Kunpeng Zhang | Haipeng Zhang
Findings of the Association for Computational Linguistics: ACL 2026
Most venture capital (VC) investments fail, while a few deliver outsized returns. Predicting startup success requires synthesizing relational evidence across company fundamentals, investor track records, and investment networks through explicit reasoning, which traditional machine learning and graph neural networks lack. Large language models excel at reasoning, but applying them to VC prediction must address: selecting compact evidence subgraphs from large investment networks, one-sided label noise where failures may be latent successes, and grounding decisions in structured VC domain knowledge. We present MIRAGE-VC, an evidence-grounded reasoning framework with three innovations. First, an information-gain-driven retriever distills networks into compact evidence subgraphs. Second, a dual-layer knowledge base grounds reasoning in VC principles. Third, a noise-aware mechanism down-weights mislabeled negatives via improved Positive-Unlabeled (PU) estimation. MIRAGE-VC achieves +5.9% F1 and +22.1% Precision@5 over state-of-the-art baselines. Expert evaluation confirms professional-quality rationales. We further validate our approach on public data with consistent improvements. Code and reasoning results are available at: https://github.com/ZhangDataLab/MIRAGE-VC.git
2023
Evaluating Reading Comprehension Exercises Generated by LLMs: A Showcase of ChatGPT in Education Applications
Changrong Xiao | Sean Xin Xu | Kunpeng Zhang | Yufang Wang | Lei Xia
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Recent advances in pre-trained Large Language Models (LLMs), such as OpenAI’s ChatGPT, have led to transformative changes across fields. For example, intelligent systems in the educational sector that leverage the linguistic capabilities of LLMs show clear potential. Though researchers have recently explored how ChatGPT could assist in student learning, few studies have applied these techniques to real-world classroom settings involving teachers and students. In this study, we implement a reading comprehension exercise generation system that provides high-quality and personalized reading materials for middle school English learners in China. Extensive evaluations of the generated reading passages and corresponding exercise questions, conducted both automatically and manually, demonstrate that the system-generated materials are suitable for students and even surpass the quality of existing human-written ones. By incorporating first-hand feedback and suggestions from experienced educators, this study serves as a pioneering application of ChatGPT, shedding light on the future design and implementation of LLM-based systems in the educational context.
2020
Interpreting Twitter User Geolocation
Ting Zhong | Tianliang Wang | Fan Zhou | Goce Trajcevski | Kunpeng Zhang | Yi Yang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Identifying user geolocation in online social networks is an essential task in many location-based applications. Existing methods rely on the similarity of text and network structure; however, their results lack interpretability, which is crucial for understanding model behavior. In this work, we adopt influence functions to interpret the behavior of GNN-based models by identifying the importance of training users when predicting the locations of the testing users. This methodology helps provide meaningful explanations of prediction results. Furthermore, it is an initial attempt to open up so-called “black-box” GNN-based models by investigating the effect of individual nodes.
2015
Reducing infrequent-token perplexity via variational corpora
Yusheng Xie | Pranjal Daga | Yu Cheng | Kunpeng Zhang | Ankit Agrawal | Alok Choudhary
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)