Yuanyuan Shi
2025
Uncovering the Bigger Picture: Comprehensive Event Understanding Via Diverse News Retrieval
Yixuan Tang | Yuanyuan Shi | Yiqun Sun | Anthony Kum Hoe Tung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Access to diverse perspectives is essential for understanding real-world events, yet most news retrieval systems prioritize textual relevance, leading to redundant results and limited viewpoint exposure. We propose NEWSCOPE, a two-stage framework for diverse news retrieval that enhances event coverage by explicitly modeling semantic variation at the sentence level. The first stage retrieves topically relevant content using dense retrieval, while the second stage applies sentence-level clustering and diversity-aware re-ranking to surface complementary information. To evaluate retrieval diversity, we introduce three interpretable metrics, namely Average Pairwise Distance, Positive Cluster Coverage, and Information Density Ratio, and construct two paragraph-level benchmarks: LocalNews and DSGlobal. Experiments show that NEWSCOPE consistently outperforms strong baselines, achieving significantly higher diversity without compromising relevance. Our results demonstrate the effectiveness of fine-grained, interpretable modeling in mitigating redundancy and promoting comprehensive event understanding. The data and code are available at https://github.com/tangyixuan/NEWSCOPE.
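As a rough illustration of what a metric like Average Pairwise Distance could measure, the sketch below averages the distance over all pairs of retrieved items. The abstract does not define the metric, so the choice of cosine distance, the function names, and the toy embeddings are assumptions made for illustration and may differ from the NEWSCOPE implementation.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: treat a zero vector as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def average_pairwise_distance(embeddings):
    """Mean cosine distance over all unordered pairs of embeddings.

    Hypothetical helper: the distance function is an assumption,
    not taken from the paper.
    """
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0  # fewer than two items: no pairs to compare
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings: a redundant result list versus a varied one.
redundant = [[1.0, 0.0], [1.0, 0.0]]
varied = [[1.0, 0.0], [0.0, 1.0]]
print(average_pairwise_distance(redundant))
print(average_pairwise_distance(varied))
```

Under these assumptions, a result list made of near-duplicate paragraphs scores close to zero, while a list covering different aspects of an event scores higher.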
2016
Product Review Summarization by Exploiting Phrase Properties
Naitong Yu | Minlie Huang | Yuanyuan Shi | Xiaoyan Zhu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
We propose a phrase-based approach for generating product review summaries. The main idea of our method is to leverage phrase properties to choose a subset of optimal phrases for generating the final summary. Specifically, we exploit two phrase properties, popularity and specificity. Popularity describes how popular the phrase is in the original reviews. Specificity describes how descriptive a phrase is in comparison to generic comments. We formalize the phrase selection procedure as an optimization problem and solve it using integer linear programming (ILP). An aspect-based bigram language model is used for generating the final summary with the selected phrases. Experiments show that our summarizer outperforms the other baselines.
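The phrase selection step can be pictured as a small 0/1 optimization problem. The sketch below is a stand-in, not the paper's formulation: it assumes each phrase is scored by popularity times specificity, assumes a single word-budget constraint, and uses brute-force enumeration in place of an ILP solver. The function name, the scores, and the example phrases are all hypothetical.

```python
from itertools import combinations


def select_phrases(phrases, budget):
    """Pick the subset of phrases with the highest total score.

    phrases: list of (text, popularity, specificity, length_in_words).
    budget:  maximum total length of the selected phrases.
    Assumption: score = popularity * specificity; the abstract does
    not state the actual objective or constraints.
    """
    best_score, best_subset = 0.0, ()
    for size in range(1, len(phrases) + 1):
        for subset in combinations(phrases, size):
            total_length = sum(p[3] for p in subset)
            if total_length > budget:
                continue  # violates the length constraint
            score = sum(p[1] * p[2] for p in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return [p[0] for p in best_subset], best_score


# Made-up phrases and scores, for illustration only.
candidates = [
    ("battery lasts two days", 0.9, 0.8, 4),
    ("good", 0.95, 0.1, 1),
    ("screen scratches easily", 0.5, 0.9, 3),
]
selected, score = select_phrases(candidates, budget=7)
print(selected)
```

Enumeration is exponential in the number of phrases, so it only works for toy inputs; an ILP solver expresses the same binary decision variables and constraints at realistic scale.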
Co-authors
- Minlie Huang 1
- Yiqun Sun 1
- Yixuan Tang 1
- Anthony Kum Hoe Tung 1
- Naitong Yu 1