2025
NovAScore: A New Automated Metric for Evaluating Document Level Novelty
Lin Ai | Ziwei Gong | Harshsaiprasad Deshpande | Alexander Johnson | Emmy Phung | Ahmad Emami | Julia Hirschberg
Proceedings of the 31st International Conference on Computational Linguistics
The rapid expansion of online content has intensified the issue of information redundancy, underscoring the need for solutions that can identify genuinely new information. Despite this challenge, the research community has seen a decline in focus on novelty detection, particularly with the rise of large language models (LLMs). Additionally, previous approaches have relied heavily on human annotation, which is time-consuming, costly, and particularly challenging when annotators must compare a target document against a vast number of historical documents. In this work, we introduce NovAScore (Novelty Evaluation in Atomicity Score), an automated metric for evaluating document-level novelty. NovAScore aggregates the novelty and salience scores of atomic information, providing high interpretability and a detailed analysis of a document’s novelty. With its dynamic weight adjustment scheme, NovAScore offers enhanced flexibility and an additional dimension to assess both the novelty level and the importance of information within a document. Our experiments show that NovAScore strongly correlates with human judgments of novelty, achieving a 0.626 Point-Biserial correlation on the TAP-DLND 1.0 dataset and a 0.920 Pearson correlation on an internal human-annotated dataset.
PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent
Jiateng Liu | Lin Ai | Zizhou Liu | Payam Karisani | Zheng Hui | Yi Fung | Preslav Nakov | Julia Hirschberg | Heng Ji
Proceedings of the 31st International Conference on Computational Linguistics
Propaganda plays a critical role in shaping public opinion and fueling disinformation. While existing research primarily focuses on identifying propaganda techniques, it lacks the ability to capture the broader motives and the impacts of such content. To address these challenges, we introduce PropaInsight, a conceptual framework grounded in foundational social science research, which systematically dissects propaganda into techniques, arousal appeals, and underlying intent. PropaInsight offers a more granular understanding of how propaganda operates across different contexts. Additionally, we present PropaGaze, a novel dataset that combines human-annotated data with high-quality synthetic data generated through a meticulously designed pipeline. Our experiments show that off-the-shelf LLMs struggle with propaganda analysis, but PropaGaze significantly improves performance. Fine-tuned Llama-7B-Chat achieves 203.4% higher text span IoU in technique identification and 66.2% higher BertScore in appeal analysis compared to 1-shot GPT-4-Turbo. Moreover, PropaGaze complements limited human-annotated data in data-sparse and cross-domain scenarios, demonstrating its potential for comprehensive and generalizable propaganda analysis.
Beyond Silent Letters: Amplifying LLMs in Emotion Recognition with Vocal Nuances
Zehui Wu | Ziwei Gong | Lin Ai | Pengyuan Shi | Kaan Donbekci | Julia Hirschberg
Findings of the Association for Computational Linguistics: NAACL 2025
Akan Cinematic Emotions (ACE): A Multimodal Multi-party Dataset for Emotion Recognition in Movie Dialogues
David Sasu | Zehui Wu | Ziwei Gong | Run Chen | Pengyuan Shi | Lin Ai | Julia Hirschberg | Natalie Schluter
Findings of the Association for Computational Linguistics: ACL 2025
In this paper, we introduce the Akan Cinematic Emotions (AkaCE) dataset, the first multimodal emotion dialogue dataset for an African language, addressing the significant lack of resources for low-resource languages in emotion recognition research. AkaCE, developed for the Akan language, contains 385 emotion-labeled dialogues and 6162 utterances across audio, visual, and textual modalities, along with word-level prosodic prominence annotations. The presence of prosodic labels in this dataset also makes it the first prosodically annotated African language dataset. We demonstrate the quality and utility of AkaCE through experiments using state-of-the-art emotion recognition methods, establishing solid baselines for future research. We hope AkaCE inspires further work on inclusive, linguistically and culturally diverse NLP resources.
2024
Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension
Lin Ai | Zheng Hui | Zizhou Liu | Julia Hirschberg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Defending Against Social Engineering Attacks in the Age of LLMs
Lin Ai | Tharindu Sandaruwan Kumarage | Amrita Bhattacharjee | Zizhou Liu | Zheng Hui | Michael S. Davinroy | James Cook | Laura Cassani | Kirill Trapeznikov | Matthias Kirchner | Arslan Basharat | Anthony Hoogs | Joshua Garland | Huan Liu | Julia Hirschberg
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
A Survey on Open Information Extraction from Rule-based Model to Large Language Model
Liu Pai | Wenyang Gao | Wenjie Dong | Lin Ai | Ziwei Gong | Songfang Huang | Li Zongsheng | Ehsan Hoque | Julia Hirschberg | Yue Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
Open Information Extraction (OpenIE) represents a crucial NLP task aimed at deriving structured information from unstructured text, unrestricted by relation type or domain. This survey paper provides an overview of OpenIE technologies spanning from 2007 to 2024, emphasizing a chronological perspective absent in prior surveys. It examines the evolution of task settings in OpenIE to align with the advances in recent technologies. The paper categorizes OpenIE approaches into rule-based, neural, and pre-trained large language models, discussing each within a chronological framework. Additionally, it highlights prevalent datasets and evaluation metrics currently in use. Building on this extensive review, this paper systematically reviews the evolution of task settings, data, evaluation metrics, and methodologies in the era of large language models, highlighting their mutual influence, comparing their capabilities, and examining their implications for open challenges and future research directions.