Xiangchen Song
2026
Mechanistic Interpretability Should Prioritize Feature Consistency in Sparse Autoencoders
Xiangchen Song | Aashiq Muhamed | Yujia Zheng | Lingjing Kong | Zeyu Tang | Mona T. Diab | Virginia Smith | Kun Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Sparse Autoencoders (SAEs) are a prominent tool in mechanistic interpretability (MI) for decomposing neural network activations into interpretable features. However, the aspiration to identify a canonical set of features is challenged by the observed inconsistency of learned SAE features across different training runs, which undermines reproducibility and complicates model comparison. We study run-to-run feature consistency in SAEs and argue that it should be reported as a standard evaluation axis alongside reconstruction and sparsity. We propose the Pairwise Dictionary Mean Correlation Coefficient (PW-MCC) as an assignment-based metric to quantify consistency and demonstrate that high levels are achievable (PW-MCC ≈ 0.80 for TopK SAEs on LLM activations) with appropriate architectural choices. Our contributions include: (i) theoretical grounding for strong consistency in the idealized setting of TopK SAEs; (ii) synthetic validation using a model organism, which verifies PW-MCC as a reliable proxy for ground-truth recovery; and (iii) empirical analysis on LLM activations, where PW-MCC correlates with the similarity of automatically generated natural-language feature explanations.
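As a rough illustration of what an assignment-based consistency metric of this kind could look like, the sketch below matches the feature directions of two dictionaries one-to-one and averages the similarity of the matched pairs. It is not the authors' implementation: the function name `pairwise_mcc`, the use of absolute cosine similarity between dictionary rows, and the Hungarian matching via SciPy are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def pairwise_mcc(dict_a: np.ndarray, dict_b: np.ndarray) -> float:
    """Assumed sketch of a pairwise dictionary consistency score.

    dict_a, dict_b: arrays of shape (n_features, d_model) whose rows are
    the feature directions learned in two separate training runs.
    """
    # Normalize each feature direction to unit length.
    a = dict_a / np.linalg.norm(dict_a, axis=1, keepdims=True)
    b = dict_b / np.linalg.norm(dict_b, axis=1, keepdims=True)

    # Absolute cosine similarity between every pair of features
    # (assumption: sign flips are treated as the same feature).
    sim = np.abs(a @ b.T)

    # One-to-one assignment that maximizes the total similarity.
    rows, cols = linear_sum_assignment(sim, maximize=True)

    # Mean similarity over the matched pairs.
    return float(sim[rows, cols].mean())
```

Under these assumptions, a dictionary compared with a row-permuted copy of itself scores 1 up to floating-point error, while two unrelated random dictionaries score lower.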
Advancing Reasoning in Diffusion Language Models with Denoising Process Rewards
Shaoan Xie | Lingjing Kong | Xiangchen Song | Xinshuai Dong | Guangyi Chen | Eric P. Xing | Kun Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Diffusion-based large language models offer a non-autoregressive alternative for text generation, but enabling them to perform complex reasoning remains challenging. Reinforcement learning has recently emerged as an effective post-training strategy for improving their performance; however, existing methods rely primarily on outcome-based rewards, which provide no direct supervision over the denoising process and often result in poorly structured reasoning that is difficult to interpret and inconsistently supports the final prediction. To address this limitation, we introduce denoising process reward, a process-level reinforcement signal defined over the denoising trajectory of diffusion language models. This reward is obtained by estimating the contribution of intermediate denoising intervals to the final task outcome, encouraging the model to favor reasoning trajectories that consistently guide generation toward correct predictions. We further propose an efficient stochastic estimator that reuses standard training rollouts, enabling practical process-level supervision at scale. Experiments on challenging reasoning benchmarks demonstrate that our approach yields consistent improvements in reasoning stability, interpretability, and overall task performance.
2021
COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation
Qingyun Wang | Manling Li | Xuan Wang | Nikolaus Parulian | Guangxing Han | Jiawei Ma | Jingxuan Tu | Ying Lin | Ranran Haoran Zhang | Weili Liu | Aabhas Chauhan | Yingjun Guan | Bangzheng Li | Ruisong Li | Xiangchen Song | Yi Fung | Heng Ji | Jiawei Han | Shih-Fu Chang | James Pustejovsky | Jasmine Rah | David Liem | Ahmed ELsayed | Martha Palmer | Clare Voss | Cynthia Schneider | Boyan Onyshkevych
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations
To combat COVID-19, both clinicians and scientists need to digest the vast amount of relevant biomedical knowledge in the literature to understand the disease mechanism and the related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements (entities, relations, and events) from scientific literature. We then exploit the constructed multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence. All of the data, KGs, reports.
ChemNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision
Xuan Wang | Vivian Hu | Xiangchen Song | Shweta Garg | Jinfeng Xiao | Jiawei Han
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Scientific literature analysis needs fine-grained named entity recognition (NER) to provide a wide range of information for scientific discovery. For example, chemistry research needs to study dozens to hundreds of distinct, fine-grained entity types, making consistent and accurate annotation difficult even for crowds of domain experts. On the other hand, domain-specific ontologies and knowledge bases (KBs) can be easily accessed, constructed, or integrated, which makes distant supervision realistic for fine-grained chemistry NER. In distant supervision, training labels are generated by matching mentions in a document with the concepts in the KBs. However, this kind of KB-matching suffers from two major challenges: incomplete annotation and noisy annotation. We propose ChemNER, an ontology-guided, distantly-supervised method for fine-grained chemistry NER to tackle these challenges. It leverages the chemistry type ontology structure to generate distant labels with novel methods of flexible KB-matching and ontology-guided multi-type disambiguation. It significantly improves the distant label generation for the subsequent sequence labeling model training. We also provide an expert-labeled chemistry NER dataset with 62 fine-grained chemistry types (e.g., chemical compounds and chemical reactions). Experimental results show that ChemNER is highly effective, substantially outperforming state-of-the-art NER methods (with a .25 absolute F1 score improvement).
Co-authors
- Jiawei Han 2
- Lingjing Kong 2
- Xuan Wang 2
- Kun Zhang 2
- Shih-Fu Chang 1
- Aabhas Chauhan 1
- Guangyi Chen 1
- Mona Diab 1
- Xinshuai Dong 1
- Ahmed Elsayed 1
- Yi Fung 1
- Shweta Garg 1
- Yingjun Guan 1
- Guangxing Han 1
- Vivian Hu 1
- Heng Ji 1
- Bangzheng Li 1
- Manling Li 1
- Ruisong Li 1
- David Liem 1
- Ying Lin 1
- Weili Liu 1
- Jiawei Ma 1
- Aashiq Muhamed 1
- Boyan Onyshkevych 1
- Martha Palmer 1
- Nikolaus Parulian 1
- James Pustejovsky 1
- Jasmine Rah 1
- Cynthia Schneider 1
- Virginia Smith 1
- Zeyu Tang 1
- Jingxuan Tu 1
- Clare Voss 1
- Qingyun Wang 1
- Jinfeng Xiao 1
- Shaoan Xie 1
- Eric Xing 1
- Ranran Haoran Zhang 1
- Yujia Zheng 1