Tian Wang

2025

pdf bib abs
InfoPO: On Mutual Information Maximization for Large Language Model Alignment
Teng Xiao | Zhen Ge | Sujay Sanghavi | Tian Wang | Julian Katz-Samuels | Marc Versage | Qingjun Cui | Trishul Chilimbi
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

We study the post-training of large language models (LLMs) with human preference data. Recently, direct preference optimization and its variants have shown considerable promise in aligning language models, eliminating the need for reward models and online sampling. Despite these benefits, these methods rely on explicit assumptions about the Bradley-Terry (BT) model, which makes them prone to overfitting and results in suboptimal performance, particularly on reasoning-heavy tasks. To address these challenges, we propose a principled preference fine-tuning algorithm called InfoPO, which effectively and efficiently aligns large language models using preference data. InfoPO eliminates the reliance on the BT model and prevents the likelihood of the chosen response from decreasing. Extensive experiments confirm that InfoPO consistently outperforms established baselines on widely used open benchmarks, particularly in reasoning tasks.

2020

pdf bib abs
Item-based Collaborative Filtering with BERT
Tian Wang | Yuyangzi Fu
Proceedings of the 3rd Workshop on e-Commerce and NLP

In e-commerce, recommender systems have become an indispensable part of helping users explore the available inventory. In this work, we present a novel approach for item-based collaborative filtering, by leveraging BERT to understand items, and score relevancy between different items. Our proposed method could address problems that plague traditional recommender systems such as cold start, and “more of the same” recommended content. We conducted experiments on a large-scale real-world dataset with full cold-start scenario, and the proposed approach significantly outperforms the popular Bi-LSTM model.

2019

pdf bib abs
Mimic and Rephrase: Reflective Listening in Open-Ended Dialogue
Justin Dieter | Tian Wang | Arun Tejasvi Chaganty | Gabor Angeli | Angel X. Chang
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Reflective listening–demonstrating that you have heard your conversational partner–is key to effective communication. Expert human communicators often mimic and rephrase their conversational partner, e.g., when responding to sentimental stories or to questions they don’t know the answer to. We introduce a new task and an associated dataset wherein dialogue agents similarly mimic and rephrase a user’s request to communicate sympathy (I’m sorry to hear that) or lack of knowledge (I do not know that). We study what makes a rephrasal response good against a set of qualitative metrics. We then evaluate three models for generating responses: a syntax-aware rule-based system, a seq2seq LSTM neural models with attention (S2SA), and the same neural model augmented with a copy mechanism (S2SA+C). In a human evaluation, we find that S2SA+C and the rule-based system are comparable and approach human-generated response quality. In addition, experiences with a live deployment of S2SA+C in a customer support setting suggest that this generation task is a practical contribution to real world conversational agents.

Tian Wang

Fixing paper assignments

2025

2020

2019

2016

Co-authors

Venues