Mingyang Ma
2026
E-ABSA20K: A Dataset and Propose-and-Verify for Aspect-Based Sentiment Analysis in Long E-commerce Reviews
Tong Sun | Mingyang Ma | Cheng Yu
Findings of the Association for Computational Linguistics: ACL 2026
Aspect-Based Sentiment Analysis (ABSA) is critical for extracting actionable product insights from e-commerce reviews. However, most public ABSA benchmarks are restricted to short texts and a limited range of domains, and therefore underrepresent the challenges posed by real-world reviews, where multiple aspects co-occur, colloquial and noisy expressions are common, and evidence must often be aggregated across sentences in long contexts. We introduce E-ABSA20K, a multi-domain dataset of 20K reviews from four product categories (Women’s Bags, Dresses, Cosmetics, and Furniture), annotated with review-level sentiment quads. Compared to existing benchmarks, E-ABSA20K contains substantially longer and more aspect-dense reviews, averaging 63.9 words and 6.0 quads per review. We further propose a two-stage propose-and-verify framework for review-level quadruple extraction (target, aspect, opinion, sentiment). The first stage generates high-recall candidates under strict schema constraints, while the second stage conducts explicit grounding, scope, and modality verification, followed by review-level consolidation to mitigate hallucinations and scope leakage in long reviews. Experiments across multiple Qwen3 model sizes demonstrate that our approach consistently outperforms single-stage prompting (with and without chain-of-thought) as well as competitive ABSA extraction baselines, improving quad-level micro-F1 and robustness on discourse-hard cases such as comparisons and conditionals.
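As an illustration of the quad-level micro-F1 metric mentioned in the abstract, the following is a minimal sketch, not the authors' evaluation code. It assumes that a predicted quad counts as correct only when all four fields (target, aspect, opinion, sentiment) exactly match a gold quad in the same review; the abstract does not state the matching rule, and the function name `quad_micro_f1` and the example quads are hypothetical.

```python
# Hypothetical sketch: the exact-match criterion is an assumption,
# not something the abstract specifies.
Quad = tuple[str, str, str, str]  # (target, aspect, opinion, sentiment)


def quad_micro_f1(gold: list[set[Quad]], pred: list[set[Quad]]) -> float:
    """Micro-averaged F1 over all reviews, pooling counts before dividing."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same reviews")
    # A true positive is a predicted quad identical to a gold quad
    # of the same review.
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# One review with two gold quads; one of the two predictions matches.
gold = [{("bag", "strap", "sturdy", "positive"),
         ("bag", "zipper", "gets stuck", "negative")}]
pred = [{("bag", "strap", "sturdy", "positive"),
         ("bag", "color", "nice", "positive")}]
score = quad_micro_f1(gold, pred)
```

Because the counts are pooled across reviews before precision and recall are computed, reviews with many quads weigh more heavily than short ones, which matters for a dataset averaging 6.0 quads per review.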
2024
RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
Mohammed Abdul Khaliq | Paul Yu-Chun Chang | Mingyang Ma | Bernhard Pflugfelder | Filip Miletić
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
The escalating challenge of misinformation, particularly in political discourse, requires advanced fact-checking solutions; this is even clearer in the more complex scenario of multimodal claims. We tackle this issue using a multimodal large language model in conjunction with retrieval-augmented generation (RAG), and introduce two novel reasoning techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). They fact-check multimodal claims by extracting both textual and image content, retrieving external information, and reasoning about which subsequent questions to answer based on prior evidence. We achieve a weighted F1-score of 0.85, surpassing a baseline reasoning technique by 0.14 points. Human evaluation confirms that the vast majority of our generated fact-check explanations contain all information from gold standard data.
2022
User Satisfaction Modeling with Domain Adaptation in Task-oriented Dialogue Systems
Yan Pan | Mingyang Ma | Bernhard Pflugfelder | Georg Groh
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue
User Satisfaction Estimation (USE) is crucial in helping measure the quality of a task-oriented dialogue system. However, the complex nature of implicit responses poses challenges in detecting user satisfaction, and most datasets are limited in size or not available to the public due to user privacy policies. Unlike task-oriented dialogue, large-scale annotated chitchat with emotion labels is publicly available. Therefore, we present a novel user satisfaction model with domain adaptation (USMDA) to utilize this chitchat. We adopt a dialogue Transformer encoder to capture contextual features from the dialogue, and we reduce domain discrepancy to learn dialogue-related invariant features. Moreover, USMDA jointly learns satisfaction signals in the chitchat context with user satisfaction estimation, and user actions in task-oriented dialogue with dialogue action recognition. Experimental results on two benchmarks show that our proposed framework for the USE task outperforms existing unsupervised domain adaptation methods. To the best of our knowledge, this is the first work to study user satisfaction estimation with unsupervised domain adaptation from chitchat to task-oriented dialogue.