Yiwen Wang
2026
LitVISTA: A Benchmark for Narrative Orchestration in Literary Text
Mingzhe Lu | Yiwen Wang | Yanbing Liu | Qi You | Chong Liu | Ruize Qin | Haoyu Dong | Wenyu Zhang | JiaRui Zhang | Yue Hu | Yunpeng Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Computational narrative analysis aims to capture rhythm, tension, and emotional dynamics in literary texts. Existing large language models can generate long stories but overly focus on causal coherence, neglecting the complex story arcs and orchestration inherent in human narratives. This suggests a structural misalignment between model- and human-generated narratives. We therefore position narrative analysis as a diagnostic proxy for generation and propose VISTA Space, a high-dimensional framework for narrative orchestration that unifies human and model perspectives while jointly characterizing narrative function and structure in a common space. We further introduce LitVISTA, a structurally annotated benchmark grounded in literary texts, which operationalizes VISTA Space for systematic evaluation of models’ narrative orchestration capabilities. Under an oracle setting with gold event anchors, we evaluate frontier LLMs including GPT, Claude, Grok, and Gemini. Results reveal systematic deficiencies, as current models struggle to jointly capture narrative function and structure and fail to form an integrated global view of literary narrative orchestration. End-to-end analysis further shows that failures are dominated by anchor identification and localization errors. Even advanced thinking modes yield mixed and often limited gains for literary narrative understanding.
2024
OpenResearcher: Unleashing AI for Accelerated Scientific Research
Yuxiang Zheng | Shichao Sun | Lin Qiu | Dongyu Ru | Cheng Jiayang | Xuefeng Li | Jifan Lin | Binjie Wang | Yun Luo | Renjie Pan | Yang Xu | Qingkai Min | Zizhao Zhang | Yiwen Wang | Wenjie Li | Pengfei Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
The rapid growth of scientific literature imposes significant challenges for researchers endeavoring to stay updated with the latest advancements in their fields and delve into new areas. We introduce OpenResearcher, an innovative platform that leverages Artificial Intelligence (AI) techniques to accelerate the research process by answering diverse questions from researchers. OpenResearcher is built based on Retrieval-Augmented Generation (RAG) to integrate Large Language Models (LLMs) with up-to-date, domain-specific knowledge. Moreover, we develop various tools for OpenResearcher to understand researchers’ queries, search from the scientific literature, filter retrieved information, provide accurate and comprehensive answers, and self-refine these answers. OpenResearcher can flexibly use these tools to balance efficiency and effectiveness. As a result, OpenResearcher enables researchers to save time and increase their potential to discover new insights and drive scientific breakthroughs. Demo, video, and code are available at: https://github.com/GAIR-NLP/OpenResearcher.
2021
Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models
Yiwen Wang | Jennifer Hu | Roger Levy | Peng Qian
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Prior work has shown that structural supervision helps English language models learn generalizations about syntactic phenomena such as subject-verb agreement. However, it remains unclear if such an inductive bias would also improve language models’ ability to learn grammatical dependencies in typologically different languages. Here we investigate this question in Mandarin Chinese, which has a logographic, largely syllable-based writing system; different word order; and sparser morphology than English. We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and Transformer-parameterized generative parsing models on two Mandarin Chinese datasets of different sizes. We evaluate the models’ ability to learn different aspects of Mandarin grammar that assess syntactic and semantic relationships. We find suggestive evidence that structural supervision helps with representing syntactic state across intervening content and improves performance in low-data settings, suggesting that the benefits of hierarchical inductive biases in acquiring dependency relationships may extend beyond English.