Yixin Zhu


2023

PersLEARN: Research Training through the Lens of Perspective Cultivation
Yu-Zhe Shi | Shiqian Li | Xinyi Niu | Qiao Xu | Jiawen Liu | Yifan Xu | Shiyu Gu | Bingru He | Xinyang Li | Xinyu Zhao | Zijian Zhao | Yidong Lyu | Zhen Li | Sijia Liu | Lin Qiu | Jinhao Ji | Lecheng Ruan | Yuxi Ma | Wenjuan Han | Yixin Zhu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Scientific research is inherently shaped by its authors’ perspectives, influenced by various factors such as their personality, community, or society. Junior researchers often face challenges in identifying the perspectives reflected in the existing literature and struggle to develop their own viewpoints. In response to this issue, we introduce PersLEARN, a tool designed to facilitate the cultivation of scientific perspectives, starting from a basic seed idea and progressing to a well-articulated framework. By interacting with a prompt-based model, researchers can develop their perspectives explicitly. Our human study reveals that scientific perspectives developed by students using PersLEARN exhibit a superior level of logical coherence and depth compared to those that did not. Furthermore, our pipeline outperforms baseline approaches across multiple domains of literature from various perspectives. These results suggest that PersLEARN could help foster a greater appreciation of diversity in scientific perspectives as an essential component of research training.

Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section
Hongyi Zheng | Yixin Zhu | Lavender Jiang | Kyunghyun Cho | Eric Oermann
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Recent advances in large language models have led to renewed interest in natural language processing in healthcare using the free text of clinical notes. One distinguishing characteristic of clinical notes is their long time span over multiple long documents. The unique structure of clinical notes creates a new design choice: when the context length for a language model predictor is limited, which part of clinical notes should we choose as the input? Existing studies either choose the inputs with domain knowledge or simply truncate them. We propose a framework to analyze the sections with high predictive power. Using MIMIC-III, we show that: 1) predictive power distribution is different between nursing notes and discharge notes and 2) combining different types of notes could improve performance when the context length is large. Our findings suggest that a carefully selected sampling function could enable more efficient information extraction from clinical notes.

2021

GRICE: A Grammar-based Dataset for Recovering Implicature and Conversational rEasoning
Zilong Zheng | Shuwen Qiu | Lifeng Fan | Yixin Zhu | Song-Chun Zhu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021