The Need for Grounding in LLM-based Dialogue Systems
Kristiina Jokinen
Proceedings of the Workshop: Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning (NeusymBridge) @ LREC-COLING-2024
Grounding is a pertinent part of the design of LLM-based dialogue systems. Although research on grounding has a long tradition, the paradigm shift caused by LLMs has brought the concept to the foreground, in particular in the context of cognitive robotics. To avoid generation of irrelevant or false information, the system needs to ground its utterances in real-world events, and to avoid the statistical parrot effect, the system needs to construct a shared understanding of the dialogue context and of the partner’s intents. Grounding and construction of the shared context enable cooperation between the participants, and thus support trustworthy interaction. This paper discusses grounding using neural LLM technology. It aims to bridge neural and symbolic computing on the cognitive architecture level, so as to contribute to a better understanding of how conversational reasoning and collaboration can be linked to LLM implementations to support trustworthy and flexible interaction.
From Data to Dialogue: Leveraging the Structure of Knowledge Graphs for Conversational Exploratory Search
Phillip Schneider
Nils Rehtanz
Kristiina Jokinen
Florian Matthes
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation
Cognitive States and Types of Nods
Taiga Mori
Kristiina Jokinen
Yasuharu Den
Proceedings of the 2nd Workshop on People in Vision, Language, and the Mind
In this paper we will study how different types of nods are related to the cognitive states of the listener. The distinction is made between nods with movement starting upwards (up-nods) and nods with movement starting downwards (down-nods) as well as between single or repetitive nods. The data is from Japanese multiparty conversations, and the results accord with the previous findings indicating that up-nods are related to the change in the listener’s cognitive state after hearing the partner’s contribution, while down-nods convey the meaning that the listener’s cognitive state is not changed.
The AICO Multimodal Corpus – Data Collection and Preliminary Analyses
Kristiina Jokinen
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper describes data collection and the first explorative research on the AICO Multimodal Corpus. The corpus contains eye-gaze, Kinect, and video recordings of human-robot and human-human interactions, and was collected to study cooperation, engagement and attention of human participants in task-based as well as in chatty type interactive situations. In particular, the goal was to enable comparison between human-human and human-robot interactions, besides studying multimodal behaviour and attention in the different dialogue activities. The robot partner was a humanoid Nao robot, and it was expected that its agent-like behaviour would render human-robot interactions similar to human-human interaction but also highlight important differences due to the robot’s limited conversational capabilities. The paper reports on the preliminary studies on the corpus, concerning the participants’ eye-gaze and gesturing behaviours, which were chosen as objective measures to study differences in their multimodal behaviour patterns with a human and a robot partner.
Analysis of Body Behaviours in Human-Human and Human-Robot Interactions
Taiga Mori
Kristiina Jokinen
Yasuharu Den
Proceedings of LREC2020 Workshop "People in language, vision and the mind" (ONION2020)
We conducted a preliminary comparison of human-robot (HR) interaction with human-human (HH) interaction conducted in English and in Japanese. The comparison showed that body gestures increased in HR, while hand and head gestures decreased in HR. Hand gestures were composed of more diverse and complex forms, trajectories and functions in HH than in HR. Moreover, English speakers produced 6 times more hand gestures than Japanese speakers in HH. Regarding head gestures, even though there was no difference in their frequency between English speakers and Japanese speakers in HH, Japanese speakers produced slightly more nodding during the robot’s speaking than English speakers in HR. Furthermore, the positions of nods differed depending on the language. Concerning body gestures, participants produced them mostly to regulate appropriate distance with the robot in HR. Additionally, English speakers produced slightly more body gestures than Japanese speakers.
Researching Less-Resourced Languages – the DigiSami Corpus
Kristiina Jokinen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Kristiina Jokinen
Manfred Stede
David DeVault
Annie Louis
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Double Topic Shifts in Open Domain Conversations: Natural Language Interface for a Wikipedia-based Robot Application
Kristiina Jokinen
Graham Wilcock
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)
The paper describes topic shifting in dialogues with a robot that provides information from Wikipedia. The work focuses on a double topical construction of dialogue coherence which refers to discourse coherence on two levels: the evolution of dialogue topics via the interaction between the user and the robot system, and the creation of discourse topics via the content of the Wikipedia article itself. The user selects topics that are of interest to her, and the system builds a list of potential topics, anticipated to be the next topic, by the links in the article and by the keywords extracted from the article. The described system deals with Wikipedia articles, but could easily be adapted to other digital information providing systems.
What topic do you want to hear about? A bilingual talking robot using English and Japanese Wikipedias
Graham Wilcock
Kristiina Jokinen
Seiichi Yamamoto
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
We demonstrate a bilingual robot application, WikiTalk, that can talk fluently in both English and Japanese about almost any topic using information from English and Japanese Wikipedias. The English version of the system has been demonstrated previously, but we now present a live demo with a Nao robot that speaks English and Japanese and switches language on request. The robot supports the verbal interaction with face-tracking, nodding and communicative gesturing. One of the key features of the WikiTalk system is that the robot can switch from the current topic to related topics during the interaction in order to navigate around Wikipedia following the user’s individual interests.
Sentiment analysis on conversational texts
Birgitta Ojamaa
Päivi Kristiina Jokinen
Kadri Muischnek
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015)
Multilingual WikiTalk: Wikipedia-based talking robots that switch languages
Graham Wilcock
Kristiina Jokinen
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Towards automatic annotation of communicative gesturing
Kristiina Jokinen
Graham Wilcock
Proceedings of the Third Workshop on Vision and Language
Open-domain Interaction and Online Content in the Sami Language
Kristiina Jokinen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper presents data collection and collaborative community events organised within the project Digital Natives on the North Sami language. The project is one of the collaboration initiatives on endangered Finno-Ugric languages, supported by the larger framework between the Academy of Finland and the Hungarian Academy of Sciences. The goal of the project is to improve digital visibility and viability of the targeted Finno-Ugric languages, as well as to develop language technology tools and resources in order to assist automatic language processing and experimenting with multilingual interactive applications.
Open-Domain Information Access with Talking Robots
Kristiina Jokinen
Graham Wilcock
Proceedings of the SIGDIAL 2013 Conference
Explorations in the Speakers’ Interaction Experience and Self-Assessments
Kristiina Jokinen
Proceedings of COLING 2012: Posters
Multimodal Signals and Holistic Interaction Structuring
Kristiina Jokinen
Graham Wilcock
Proceedings of COLING 2012: Posters
Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data
Kristiina Jokinen
Silvi Tenjes
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
In this paper we describe the goals of the Estonian corpus collection and analysis activities, and introduce the recent collection of Estonian First Encounters data. The MINT project aims at deepening our understanding of the conversational properties and practices in human interactions. We especially investigate conversational engagement and cooperation, and discuss some observations on the participants' views concerning the interaction they have been engaged in.
Constructive Interaction for Talking about Interesting Topics
Kristiina Jokinen
Graham Wilcock
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The paper discusses mechanisms for topic management in conversations, concentrating on interactions where the interlocutors react to each other's presentation of new information and construct a shared context in which to exchange information about interesting topics. This is illustrated with a robot simulator that can talk about unrestricted (open-domain) topics that the human interlocutor shows interest in. Wikipedia is used as the source of information from which the robotic agent draws its world knowledge.
Multimodal Corpus of Multi-party Conversations in Second Language
Shota Yamasaki
Hirohisa Furukawa
Masafumi Nishida
Kristiina Jokinen
Seiichi Yamamoto
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
We developed a dialogue-based tutoring system for teaching English to Japanese students and plan to transfer the current software tutoring agent into an embodied robot in the hope that the robot will enrich conversation by allowing more natural interactions in small group learning situations. To enable smooth communication between an intelligent agent and the user, the agent must have realistic models of when to take turns, when to interrupt, and how to catch the partner's attention. To develop realistic models applicable to computer assisted language learning systems, we also need to consider the differences between the mother tongue and second language that affect communication style. We collected a multimodal corpus of multi-party conversations in English as the second language to investigate the differences in communication styles. We describe our multimodal corpus and explore features of communication style, e.g. filled pauses, and non-verbal information, such as eye-gaze, which show different characteristics between the mother tongue and second language.
Feedback in Nordic First-Encounters: a Comparative Study
Costanza Navarretta
Elisabeth Ahlsén
Jens Allwood
Kristiina Jokinen
Patrizia Paggio
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The paper compares how feedback is expressed via speech and head movements in comparable corpora of first encounters in three Nordic languages: Danish, Finnish and Swedish. The three corpora have been collected following common guidelines, and they have been annotated according to the same scheme in the NOMCO project. The results of the comparison show that in this data the most frequent feedback-related head movement is Nod in all three languages. Two types of Nods were distinguished in all corpora: Down-nods and Up-nods; the participants from the three countries use Down- and Up-nods with different frequency. In particular, Danes use Down-nods more frequently than Finns and Swedes, while Swedes use Up-nods more frequently than Finns and Danes. Finally, Finns use single Nods more often than repeated Nods, differing from the Swedish and Danish participants. The differences in the frequency of both Down-nods and Up-nods in the Danish, Finnish and Swedish interactions are interesting given that Nordic countries are not only geographically near, but are also considered to be very similar culturally. Finally, a comparison of feedback-related words in the Danish and Swedish corpora shows that Swedes and Danes use common feedback words corresponding to yes and no with similar frequency.
Creating Comparable Multimodal Corpora for Nordic Languages
Costanza Navarretta
Elisabeth Ahlsén
Jens Allwood
Kristiina Jokinen
Patrizia Paggio
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)
The NOMCO Multimodal Nordic Resource - Goals and Characteristics
Patrizia Paggio
Jens Allwood
Elisabeth Ahlsén
Kristiina Jokinen
Costanza Navarretta
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper presents the multimodal corpora that are being collected and annotated in the Nordic NOMCO project. The corpora will be used to study communicative phenomena such as feedback, turn management and sequencing. They already include video material for Swedish, Danish, Finnish and Estonian, and several social activities are represented. The data will make it possible to verify empirically how gestures (head movements, facial displays, hand gestures and body postures) and speech interact in all the three mentioned aspects of communication. The data are being annotated following the MUMIN annotation scheme, which provides attributes concerning the shape and the communicative functions of head movements, face expressions, body posture and hand gestures. After having described the corpora, the paper discusses how they will be used to study the way feedback is expressed in speech and gestures, and reports results from two pilot studies where we investigated the function of head gestures, both single and repeated, in combination with feedback expressions. The annotated corpora will be valuable sources for research on intercultural communication as well as for interaction in the individual languages.
Non-verbal Signals for Turn-taking and Feedback
Kristiina Jokinen
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper concerns non-verbal communication, and describes especially the use of eye-gaze to signal turn-taking and feedback in conversational settings. Eye-gaze supports smooth interaction by providing signals that the interlocutors interpret with respect to such conversational functions as taking turns and giving feedback. New possibilities to study the effect of eye-gaze on the interlocutors' communicative behaviour have appeared with eye-tracking technology, which in the past years has matured to the level where its use to study naturally occurring dialogues has become easier and more reliable. It enables the tracking of eye-fixations and gaze-paths, and thus allows analysis of the persons' turn-taking and feedback behaviour through the analysis of their focus of attention. In this paper, experiments on the interlocutors' non-verbal communication in conversational settings using the eye-tracker are reported, and results of classifying turn-taking using eye-gaze and gesture information are presented. A hybrid method that combines signal level analysis with human interpretation is also discussed.
Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)
Kristiina Jokinen
Eckhard Bick
Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)
Quality of Service and Communicative Competence in NLG Evaluation
Kristiina Jokinen
Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07)
Proceedings of the Tenth European Workshop on Natural Language Generation (ENLG-05)
Graham Wilcock
Kristiina Jokinen
Chris Mellish
Ehud Reiter
Proceedings of the Tenth European Workshop on Natural Language Generation (ENLG-05)
User Expertise Modeling and Adaptivity in a Speech-Based E-Mail System
Kristiina Jokinen
Kari Kanto
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
Introduction: Dialogue Systems: Interaction, Adaptation and Styles of Management
Kristiina Jokinen
Björn Gambäck
William Black
Roberta Catizone
Yorick Wilks
Proceedings of the 2003 EACL Workshop on Dialogue Systems: interaction, adaptation and styles of management
Distributed Dialogue Management in a Blackboard Architecture
Antti Kerminen
Kristiina Jokinen
Proceedings of the 2003 EACL Workshop on Dialogue Systems: interaction, adaptation and styles of management
Adaptive Dialogue Systems - Interaction with Interact
Kristiina Jokinen
Antti Kerminen
Tommi Lagus
Jukka Kuusisto
Graham Wilcock
Markku Turunen
Jaakko Hakulinen
Krista Jauhiainen
Proceedings of the Third SIGdial Workshop on Discourse and Dialogue
Confidence-Based Adaptivity in Response Generation for a Spoken Dialogue System
Kristiina Jokinen
Graham Wilcock
Proceedings of the Second SIGdial Workshop on Discourse and Dialogue
Clustering dialogue knowledge with self-organizing maps
Mauri Kaipainen
Kristiina Jokinen
Timo Koskenniemi
Antti Kerminen
Kari Kanto
Proceedings of the 13th Nordic Conference of Computational Linguistics (NODALIDA 2001)
Context Management with Topics for Spoken Dialogue Systems
Kristiina Jokinen
Hideki Tanaka
Akio Yokoo
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics
Context Management with Topics for Spoken Dialogue Systems
Kristiina Jokinen
Hideki Tanaka
Akio Yokoo
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1
Planning Dialogue Contributions With New Information
Kristiina Jokinen
Hideki Tanaka
Akio Yokoo
Natural Language Generation
Goal Formulation based on Communicative Principles
Kristiina Jokinen
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics