Jiahao Sun
2026
GraphSynth: Resolving the Diversity-Reliability Trade-off with Probabilistic Factor Graphs
Zehua Cheng | Wei Dai | Jiahao Sun | Thomas Lukasiewicz
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Zehua Cheng | Wei Dai | Jiahao Sun | Thomas Lukasiewicz
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The large language models offer a scaleable solution for the generation of synthetic data faced with a trade-off between maintaining the diversity of generation and achieving factually accurate results. This paper introduces Graphsynth, a framework which leverages a probabilistic factor graph modeling the universe of attributes. The framework leverages a high-level schema mapping compiled into efficient hard masks during the decoding phase for maintaining the syntactic truth and a span-synchronized verifier for dismissing logical contradictions at the decode time. The experiments conducted on biomedical, legal, and generic domains show that the method outperforms the state-of-the-art baselines with a structural integrity approaching perfection, a coverage of around 94% attributes on the factor graph solution, and a boost in performance on downstream tasks such as +17.9% on TruthfulQA.
2025
On Weaponization-Resistant Large Language Models with Prospect Theoretic Alignment
Zehua Cheng | Manying Zhang | Jiahao Sun | Wei Dai
Proceedings of the 31st International Conference on Computational Linguistics
Zehua Cheng | Manying Zhang | Jiahao Sun | Wei Dai
Proceedings of the 31st International Conference on Computational Linguistics
Large language models (LLMs) have made significant advancements, but their increasing capabilities present serious risks of misuse, particularly in open-weight models where direct access to the model’s parameters is possible. Current safeguards, designed for closed-weight API models, are inadequate for open-weight models, as minimal fine-tuning can bypass these protections. Preserving the integrity of open-weight LLMs before deployment has thus become a critical challenge. We argue that these vulnerabilities stem from the overemphasis on maximizing the LLM’s log-likelihood during training, which amplifies data biases, especially with large datasets. To address these issues, we introduce Kahneman and Tversky’s Prospect Theoretic Integrity Preserving Alignment (KT-IPA), a framework that prioritizes maximizing generative utility rather than a singular optimization metric. This approach strengthens LLMs against misuse and weaponization while maintaining high performance, even after extensive fine-tuning. Our results demonstrate that integrating prospect theory into LLM training enhances robustness, security, and responsible innovation in this rapidly evolving field. Our codes are available on https://anonymous.4open.science/r/KT-IPA-40B7