2024
DiffuCOMET: Contextual Commonsense Knowledge Diffusion
Silin Gao | Mete Ismayilzada | Mengjie Zhao | Hiromi Wakaki | Yuki Mitsufuji | Antoine Bosselut
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Inferring contextually-relevant and diverse commonsense to understand narratives remains challenging for knowledge models. In this work, we develop a series of knowledge models, DiffuCOMET, that leverage diffusion to learn to reconstruct the implicit semantic connections between narrative contexts and relevant commonsense knowledge. Across multiple diffusion steps, our method progressively refines a representation of commonsense facts that is anchored to a narrative, producing contextually-relevant and diverse commonsense inferences for an input context. To evaluate DiffuCOMET, we introduce new metrics for commonsense inference that more closely measure knowledge diversity and contextual relevance. Our results on two different benchmarks, ComFact and WebNLG+, show that knowledge generated by DiffuCOMET achieves a better trade-off between commonsense diversity, contextual relevance and alignment to known gold references, compared to baseline knowledge models.
On the Language Encoder of Contrastive Cross-modal Models
Mengjie Zhao | Junya Ono | Zhi Zhong | Chieh-Hsin Lai | Yuhta Takida | Naoki Murata | Wei-Hsiang Liao | Takashi Shibuya | Hiromi Wakaki | Yuki Mitsufuji
Findings of the Association for Computational Linguistics: ACL 2024
Contrastive cross-modal models such as CLIP and CLAP aid various vision-language (VL) and audio-language (AL) tasks. However, there has been limited investigation of and improvement in their language encoder – the central component of encoding natural language descriptions of image/audio into vector representations. We extensively evaluate how unsupervised and supervised sentence embedding training affect language encoder quality and cross-modal task performance. In VL pretraining, we found that sentence embedding training enhances language encoder quality and aids in cross-modal tasks, improving contrastive VL models such as CyCLIP. Sentence embedding training benefits AL tasks when the amount of training data is large. We analyze the representation spaces to understand the strengths of sentence embedding training, and find that it improves text-space uniformity, at the cost of decreased cross-modal alignment.
Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning
Zhouhang Xie | Bodhisattwa Prasad Majumder | Mengjie Zhao | Yoshinori Maeda | Keiichi Yamada | Hiromi Wakaki | Julian McAuley
Findings of the Association for Computational Linguistics: ACL 2024
We consider the task of building a dialogue system that can motivate users to adopt positive lifestyle changes, Motivational Interviewing (MI). Addressing such a task requires a system that can infer how to motivate the user effectively. We propose DIIR, a framework that is capable of learning and applying conversation strategies in the form of natural language inductive rules from expert demonstrations. Automatic and human evaluation on instruction-following large language models show natural language strategy descriptions discovered by DIIR can improve active listening skills, reduce unsolicited advice, and promote more collaborative and less authoritative conversations, outperforming in-context demonstrations that are over 50 times longer.
Proceedings of the 6th Workshop on NLP for Conversational AI (NLP4ConvAI 2024)
Elnaz Nouri | Abhinav Rastogi | Georgios Spithourakis | Bing Liu | Yun-Nung Chen | Yu Li | Alon Albalak | Hiromi Wakaki | Alexandros Papangelis
Proceedings of the 6th Workshop on NLP for Conversational AI (NLP4ConvAI 2024)
2023
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives
Silin Gao | Beatriz Borges | Soyoung Oh | Deniz Bayazit | Saya Kanno | Hiromi Wakaki | Yuki Mitsufuji | Antoine Bosselut
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Sustaining coherent and engaging narratives requires dialogue or storytelling agents to understand how the personas of speakers or listeners ground the narrative. Specifically, these agents must infer personas of their listeners to produce statements that cater to their interests. They must also learn to maintain consistent speaker personas for themselves throughout the narrative, so that their counterparts feel involved in a realistic conversation or story. However, personas are diverse and complex: they entail large quantities of rich interconnected world knowledge that is challenging to robustly represent in general narrative systems (e.g., a singer is good at singing, and may have attended conservatoire). In this work, we construct a new large-scale persona commonsense knowledge graph, PeaCoK, containing ~100K human-validated persona facts. Our knowledge graph schematizes five dimensions of persona knowledge identified in previous studies of human interactive behaviours, and distils facts in this schema from both existing commonsense knowledge graphs and large-scale pretrained language models. Our analysis indicates that PeaCoK contains rich and precise world persona inferences that help downstream systems generate more consistent and engaging narratives.
2022
ComFact: A Benchmark for Linking Contextual Commonsense Knowledge
Silin Gao | Jena D. Hwang | Saya Kanno | Hiromi Wakaki | Yuki Mitsufuji | Antoine Bosselut
Findings of the Association for Computational Linguistics: EMNLP 2022
Understanding rich narratives, such as dialogues and stories, often requires natural language processing systems to access relevant knowledge from commonsense knowledge graphs. However, these systems typically retrieve facts from KGs using simple heuristics that disregard the complex challenges of identifying situationally-relevant commonsense knowledge (e.g., contextualization, implicitness, ambiguity). In this work, we propose the new task of commonsense fact linking, where models are given contexts and trained to identify situationally-relevant commonsense knowledge from KGs. Our novel benchmark, ComFact, contains ~293k in-context relevance annotations for commonsense triplets across four stylistically diverse dialogue and storytelling datasets. Experimental results confirm that heuristic fact linking approaches are imprecise knowledge extractors. Learned fact linking models demonstrate across-the-board performance improvements (~34.6% F1) over these heuristics. Furthermore, improved knowledge retrieval yielded average downstream improvements of 9.8% for a dialogue response generation task. However, fact linking models still significantly underperform humans, suggesting our benchmark is a promising testbed for research in commonsense augmentation of NLP systems.
2021
Fundamental Exploration of Evaluation Metrics for Persona Characteristics of Text Utterances
Chiaki Miyazaki | Saya Kanno | Makoto Yoda | Junya Ono | Hiromi Wakaki
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue
To maintain utterance quality of a persona-aware dialog system, inappropriate utterances for the persona should be thoroughly filtered. When evaluating the appropriateness of a large number of arbitrary utterances to be registered in the utterance database of a retrieval-based dialog system, evaluation metrics that require a reference (or a “correct” utterance) for each evaluation target cannot be used. In addition, practical utterance filtering requires the ability to select utterances based on the intensity of persona characteristics. Therefore, we are developing metrics that can be used to capture the intensity of persona characteristics and can be computed without references tailored to the evaluation targets. To this end, we explore existing metrics and propose two new metrics: persona speaker probability and persona term salience. Experimental results show that our proposed metrics show weak to moderate correlations between scores of persona characteristics based on human judgments and outperform other metrics overall in filtering inappropriate utterances for particular personas.
2011
The Semi-Automatic Construction of Part-Of-Speech Taggers for Specific Languages by Statistical Methods
Tomohiro Yamasaki | Hiromi Wakaki | Masaru Suzuki
Proceedings of the 2nd Workshop on South Southeast Asian Natural Language Processing (WSSANLP)
Topic Models with Logical Constraints on Words
Hayato Kobayashi | Hiromi Wakaki | Tomohiro Yamasaki | Masaru Suzuki
Proceedings of Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing
2009
Abbreviation Generation for Japanese Multi-Word Expressions
Hiromi Wakaki | Hiroko Fujii | Masaru Suzuki | Mika Fukui | Kazuo Sumita
Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE 2009)