Alexandros Potamianos


2023

Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users
Yohan Jo | Xinyan Zhao | Arijit Biswas | Nikoletta Basiou | Vincent Auvray | Nikolaos Malandrakis | Angeliki Metallinou | Alexandros Potamianos
Findings of the Association for Computational Linguistics: EMNLP 2023

While most task-oriented dialogues assume conversations between the agent and one user at a time, dialogue systems are increasingly expected to communicate with multiple users simultaneously who make decisions collaboratively. To facilitate development of such systems, we release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. To collect this dataset, each user utterance from MultiWOZ 2.2 was replaced with a small chat between two users that is semantically and pragmatically consistent with the original user utterance, thus resulting in the same dialogue state and system response. These dialogues reflect interesting dynamics of collaborative decision-making in task-oriented scenarios, e.g., social chatter and deliberation. Supported by this data, we propose the novel task of multi-user contextual query rewriting: to rewrite a task-oriented chat between two users as a concise task-oriented query that retains only task-relevant information and that is directly consumable by the dialogue system. We demonstrate that in multi-user dialogues, using predicted rewrites substantially improves dialogue state tracking without modifying existing dialogue systems that are trained for single-user dialogues. Further, this method surpasses training a medium-sized model directly on multi-user dialogues and generalizes to unseen domains.

A Zero-Shot Approach for Multi-User Task-Oriented Dialog Generation
Shiv Surya | Yohan Jo | Arijit Biswas | Alexandros Potamianos
Proceedings of the 16th International Natural Language Generation Conference

Prior art investigating task-oriented dialog and automatic generation of such dialogs has focused on single-user dialogs between a single user and an agent. However, there is limited study on adapting such AI agents to multi-user conversations (involving multiple users and an agent). Multi-user conversations are richer than single-user conversations, containing social banter and collaborative decision making. The most significant challenge impeding such studies is the lack of suitable multi-user task-oriented dialogs with annotations of user belief states and system actions. One potential solution is multi-user dialog generation from single-user data. Many single-user dialog datasets already contain dialog state information (intents, slots), thus making them suitable candidates. In this work, we propose a novel approach for expanding single-user task-oriented dialogs (e.g. MultiWOZ) to multi-user dialogs in a zero-shot setting.

2021

UDALM: Unsupervised Domain Adaptation through Language Modeling
Constantinos Karouzos | Georgios Paraskevopoulos | Alexandros Potamianos
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

In this work we explore Unsupervised Domain Adaptation (UDA) of pretrained language models for downstream tasks. We introduce UDALM, a fine-tuning procedure, using a mixed classification and Masked Language Model loss, that can adapt to the target domain distribution in a robust and sample-efficient manner. Our experiments show that performance of models trained with the mixed loss scales with the amount of available target data and the mixed loss can be effectively used as a stopping criterion during UDA training. Furthermore, we discuss the relationship between A-distance and the target error and explore some limitations of the Domain Adversarial Training approach. Our method is evaluated on twelve domain pairs of the Amazon Reviews Sentiment dataset, yielding 91.74% accuracy, which is a 1.11% absolute improvement over the state-of-the-art.
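
As a rough illustration of what a mixed classification and Masked Language Model loss could look like, the sketch below combines the two terms with a convex weight. The function name `mixed_loss` and the weight `lam` are hypothetical, and the abstract does not say how the two terms are balanced, so the convex combination is an assumption rather than the authors' exact formulation.

```python
def mixed_loss(clf_loss: float, mlm_loss: float, lam: float) -> float:
    """Convex combination of a classification loss and an MLM loss.

    clf_loss: supervised loss computed on labeled source-domain examples.
    mlm_loss: masked language model loss on unlabeled target-domain text.
    lam:      weight in [0, 1] on the classification term (assumed, not
              taken from the abstract).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * clf_loss + (1.0 - lam) * mlm_loss


# Example: a batch with classification loss 2.0 and MLM loss 4.0.
print(mixed_loss(2.0, 4.0, lam=0.25))
```

Since the abstract reports that the mixed loss can serve as a stopping criterion, a training loop could track this single scalar on held-out data instead of a labeled target-domain metric.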

2019

Attention-based Conditioning Methods for External Knowledge Integration
Katerina Margatina | Christos Baziotis | Alexandros Potamianos
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we present a novel approach for incorporating external knowledge in Recurrent Neural Networks (RNNs). We propose the integration of lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution enforces the contribution of the most salient words for the task at hand. We introduce three methods, namely attentional concatenation, feature-based gating and affine transformation. Experiments on six benchmark datasets show the effectiveness of our methods. Attentional feature-based gating yields consistent performance improvement across tasks. Our approach is implemented as a simple add-on module for RNN-based models with minimal computational overhead and can be adapted to any deep neural architecture.
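
The three conditioning methods named above might be sketched as follows for a single hidden state `h` and a lexicon feature vector `c`. All function names, parameter names, and the exact parameterization (a sigmoid gate, and a linear scale and shift for the affine variant) are assumptions made for illustration; the abstract names the methods but does not define them.

```python
import math


def _linear(W, b, v):
    """Compute W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def concat_condition(h, c):
    """Attentional concatenation: append lexicon features to the hidden state."""
    return list(h) + list(c)


def gate_condition(h, c, W, b):
    """Feature-based gating: scale each hidden unit by a sigmoid gate of c."""
    gates = [1.0 / (1.0 + math.exp(-z)) for z in _linear(W, b, c)]
    return [hi * gi for hi, gi in zip(h, gates)]


def affine_condition(h, c, W_gamma, b_gamma, W_beta, b_beta):
    """Affine transformation: feature-dependent scale and shift of h."""
    gamma = _linear(W_gamma, b_gamma, c)
    beta = _linear(W_beta, b_beta, c)
    return [g * hi + bt for g, hi, bt in zip(gamma, h, beta)]


h = [1.0, 2.0]   # toy hidden state
c = [0.0]        # toy one-dimensional lexicon feature
print(concat_condition(h, c))
print(gate_condition(h, c, W=[[1.0], [1.0]], b=[0.0, 0.0]))
```

In each case the conditioned representation, rather than the raw hidden state, would then feed the attention scoring function.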

SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression
Christos Baziotis | Ion Androutsopoulos | Ioannis Konstas | Alexandros Potamianos
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compression, where the first and last sequences are the input and reconstructed sentences, respectively, while the middle sequence is the compressed sentence. Constraining the length of the latent word sequences forces the model to distill important information from the input. A pretrained language model, acting as a prior over the latent sequences, encourages the compressed sentences to be human-readable. Continuous relaxations enable us to sample from categorical distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. The proposed model does not require parallel text-summary pairs, achieving promising results in unsupervised sentence compression on benchmark datasets.
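
One common continuous relaxation of categorical sampling is the Gumbel-softmax, sketched below in plain Python. The abstract only says "continuous relaxations", so treating it as Gumbel-softmax is an assumption, and the function name `gumbel_softmax` and the temperature `tau` are illustrative.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a relaxed one-hot sample over the vocabulary.

    Adds Gumbel(0, 1) noise to each logit, divides by the temperature tau,
    and applies a softmax. Lower tau gives samples closer to one-hot.
    """
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


sample = gumbel_softmax([1.0, 2.0, 3.0], tau=0.5, rng=random.Random(0))
print(sample)
```

Because the output is a differentiable function of the logits, an autodiff framework can pass gradients through the sampled "word", which is what removes the need for reinforcement learning in this kind of setup.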

Cross-Topic Distributional Semantic Representations Via Unsupervised Mappings
Eleftheria Briakou | Nikos Athanasiou | Alexandros Potamianos
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a word based on different topics. First, a separate DSM is trained for each topic and then each of the topic-based DSMs is aligned to a common vector space. Our unsupervised mapping approach is motivated by the hypothesis that words preserving their relative distances in different topic semantic sub-spaces constitute robust semantic anchors that define the mappings between them. Aligned cross-topic representations achieve state-of-the-art results for the task of contextual word similarity. Furthermore, evaluation on NLP downstream tasks shows that multiple topic-based embeddings outperform single-prototype models.

An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
Alexandra Chronopoulou | Christos Baziotis | Alexandros Potamianos
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

A growing number of state-of-the-art transfer learning methods employ language models pretrained on large generic corpora. In this paper we present a conceptually simple and effective transfer learning approach that addresses the problem of catastrophic forgetting. Specifically, we combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during the training process. This preserves language regularities captured by language models, while enabling sufficient adaptation for solving the target task. Our method does not require pretraining or finetuning separate components of the network and we train our models end-to-end in a single step. We present results on a variety of challenging affective and text classification tasks, surpassing well-established transfer learning methods with a greater level of complexity.
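
A minimal sketch of combining a task loss with an auxiliary language model loss whose weight changes over training is shown below. The exponential decay schedule and the names `aux_weight`, `gamma0`, and `decay` are assumptions; the abstract states only that the auxiliary objective is adjusted during training.

```python
def aux_weight(step: int, gamma0: float = 0.2, decay: float = 0.5) -> float:
    """Weight on the auxiliary LM loss, shrinking geometrically per step."""
    return gamma0 * (decay ** step)


def combined_loss(task_loss: float, lm_loss: float, step: int,
                  gamma0: float = 0.2, decay: float = 0.5) -> float:
    """Task loss plus a decaying share of the language model loss."""
    return task_loss + aux_weight(step, gamma0, decay) * lm_loss


# The LM term matters most early on and fades as training proceeds.
for step in range(3):
    print(step, combined_loss(1.0, 2.0, step, gamma0=0.5, decay=0.5))
```

Under this schedule the language model term regularizes early updates, which is one way to limit catastrophic forgetting while still letting the task objective dominate later.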

2018

NTUA-SLP at IEST 2018: Ensemble of Neural Transfer Methods for Implicit Emotion Classification
Alexandra Chronopoulou | Aikaterini Margatina | Christos Baziotis | Alexandros Potamianos
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper we present our approach to tackle the Implicit Emotion Shared Task (IEST) organized as part of WASSA 2018 at EMNLP 2018. Given a tweet, from which a certain word has been removed, we are asked to predict the emotion of the missing word. In this work, we experiment with neural Transfer Learning (TL) methods. Our models are based on LSTM networks, augmented with a self-attention mechanism. We use the weights of various pretrained models for initializing specific layers of our networks. We leverage a large collection of unlabeled Twitter messages for pretraining word2vec word embeddings and a set of diverse language models. Moreover, we utilize a sentiment analysis dataset for pretraining a model, which encodes emotion-related information. The submitted model consists of an ensemble of the aforementioned TL models. Our team ranked 3rd out of 30 participants, achieving an F1 score of 0.703.

Neural Activation Semantic Models: Computational lexical semantic models of localized neural activations
Nikos Athanasiou | Elias Iosif | Alexandros Potamianos
Proceedings of the 27th International Conference on Computational Linguistics

Neural activation models have been proposed in the literature that use a set of example words for which fMRI measurements are available in order to find a mapping between word semantics and localized neural activations. Successful mappings let us expand to the full lexicon of concrete nouns using the assumption that similarity of meaning implies similar neural activation patterns. In this paper, we propose a computational model that estimates semantic similarity in the neural activation space and investigates the relative performance of this model for various natural language processing tasks. Despite the simplicity of the proposed model and the very small number of example words used to bootstrap it, the neural activation semantic model performs surprisingly well compared to state-of-the-art word embeddings. Specifically, the neural activation semantic model performs better than the state-of-the-art for the task of semantic similarity estimation between very similar or very dissimilar words, while performing well on other tasks such as entailment and word categorization. These are strong indications that neural activation semantic models can not only shed some light on human cognition but also contribute to computational models for certain tasks.

NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning
Christos Baziotis | Athanasiou Nikolaos | Alexandra Chronopoulou | Athanasia Kolovou | Georgios Paraskevopoulos | Nikolaos Ellinas | Shrikanth Narayanan | Alexandros Potamianos
Proceedings of the 12th International Workshop on Semantic Evaluation

In this paper we present deep-learning models that were submitted to the SemEval-2018 Task 1 competition: “Affect in Tweets”. We participated in all subtasks for English tweets. We propose a Bi-LSTM architecture equipped with a multi-layer self-attention mechanism. The attention mechanism improves the model performance and allows us to identify salient words in tweets, as well as gain insight into the models, making them more interpretable. Our model utilizes a set of word2vec word embeddings trained on a large collection of 550 million Twitter messages, augmented by a set of word affective features. Due to the limited amount of task-specific training data, we opted for a transfer learning approach by pretraining the Bi-LSTMs on the dataset of SemEval 2017, Task 4A. The proposed approach ranked 1st in Subtask E “Multi-Label Emotion Classification”, 2nd in Subtask A “Emotion Intensity Regression” and achieved competitive results in other subtasks.

NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention
Christos Baziotis | Athanasiou Nikolaos | Athanasia Kolovou | Georgios Paraskevopoulos | Nikolaos Ellinas | Alexandros Potamianos
Proceedings of the 12th International Workshop on Semantic Evaluation

In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 “Multilingual Emoji Prediction”. We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism that conditions the weight of each word on a “context vector”, which is taken as the aggregation of a tweet’s meaning. Moreover, we initialize the embedding layer of our model with word2vec word embeddings, pretrained on a dataset of 550 million English tweets. Finally, our model does not rely on hand-crafted features or lexicons and is trained end-to-end with back-propagation. We ranked 2nd out of 48 teams.

NTUA-SLP at SemEval-2018 Task 3: Tracking Ironic Tweets using Ensembles of Word and Character Level Attentive RNNs
Christos Baziotis | Athanasiou Nikolaos | Pinelopi Papalampidi | Athanasia Kolovou | Georgios Paraskevopoulos | Nikolaos Ellinas | Alexandros Potamianos
Proceedings of the 12th International Workshop on Semantic Evaluation

In this paper we present two deep-learning systems that competed at SemEval-2018 Task 3 “Irony detection in English tweets”. We design and ensemble two independent models, based on recurrent neural networks (Bi-LSTM), which operate at the word and character level, in order to capture both the semantic and syntactic information in tweets. Our models are augmented with a self-attention mechanism, in order to identify the most informative words. The embedding layer of our word-level model is initialized with word2vec word embeddings, pretrained on a collection of 550 million English tweets. We did not utilize any handcrafted features, lexicons or external datasets as prior information and our models are trained end-to-end using back propagation on constrained data. Furthermore, we provide visualizations of tweets with annotations for the salient tokens of the attention layer that can help to interpret the inner workings of the proposed models. We ranked 2nd out of 42 teams in Subtask A and 2nd out of 31 teams in Subtask B. However, post-task-completion enhancements of our models achieve state-of-the-art results ranking 1st for both subtasks.

2017

Tweester at SemEval-2017 Task 4: Fusion of Semantic-Affective and pairwise classification models for sentiment analysis in Twitter
Athanasia Kolovou | Filippos Kokkinos | Aris Fergadis | Pinelopi Papalampidi | Elias Iosif | Nikolaos Malandrakis | Elisavet Palogiannidi | Haris Papageorgiou | Shrikanth Narayanan | Alexandros Potamianos
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we describe our submission to SemEval-2017 Task 4: Sentiment Analysis in Twitter. Specifically, the proposed system participated in both the tweet polarity classification (two-, three- and five-class) and tweet quantification (two- and five-class) tasks.

Structural Attention Neural Networks for improved sentiment analysis
Filippos Kokkinos | Alexandros Potamianos
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We introduce a tree-structured attention neural network for sentences and small phrases and apply it to the problem of sentiment classification. Our model expands the current recursive models by incorporating structural information around a node of a syntactic tree using both bottom-up and top-down information propagation. Also, the model utilizes structural attention to identify the most salient representations during the construction of the syntactic tree.

2016

Tweester at SemEval-2016 Task 4: Sentiment Analysis in Twitter Using Semantic-Affective Model Adaptation
Elisavet Palogiannidi | Athanasia Kolovou | Fenia Christopoulou | Filippos Kokkinos | Elias Iosif | Nikolaos Malandrakis | Haris Papageorgiou | Shrikanth Narayanan | Alexandros Potamianos
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

A semantic-affective compositional approach for the affective labelling of adjective-noun and noun-noun pairs
Elisavet Palogiannidi | Elias Iosif | Polychronis Koutsakis | Alexandros Potamianos
Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

The SpeDial datasets: datasets for Spoken Dialogue Systems analytics
José Lopes | Arodami Chorianopoulou | Elisavet Palogiannidi | Helena Moniz | Alberto Abad | Katerina Louka | Elias Iosif | Alexandros Potamianos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The SpeDial consortium is sharing two datasets that were used during the SpeDial project. By sharing them with the community we are providing a resource to shorten the development cycle of new Spoken Dialogue Systems (SDSs). The datasets include audio recordings and several manual annotations, i.e., miscommunication, anger, satisfaction, repetition, gender and task success. The datasets were created with data from real users and cover two different languages: English and Greek. Detectors for miscommunication, anger and gender were trained for both systems. The detectors were particularly accurate in tasks where humans have high annotator agreement such as miscommunication and gender. As expected due to the subjectivity of the task, the anger detector had a less satisfactory performance. Nevertheless, we proved that the automatic detection of situations that can lead to problems in SDSs is possible and can be a promising direction to reduce the duration of the SDS development cycle.

Cognitively Motivated Distributional Representations of Meaning
Elias Iosif | Spiros Georgiladakis | Alexandros Potamianos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Although meaning is at the core of human cognition, state-of-the-art distributional semantic models (DSMs) are often agnostic to the findings in the area of semantic cognition. In this work, we present a novel type of DSM motivated by the dual-processing cognitive perspective that is triggered by lexico-semantic activations in the short-term human memory. The proposed model is shown to perform better than state-of-the-art models for computing semantic similarity between words. The fusion of different types of DSMs is also investigated, achieving results that are comparable to or better than the state-of-the-art. The corpora used, along with a set of tools and large repositories of vectorial word representations, are made publicly available for four languages (English, German, Italian, and Greek).

Affective Lexicon Creation for the Greek Language
Elisavet Palogiannidi | Polychronis Koutsakis | Elias Iosif | Alexandros Potamianos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Starting from the English affective lexicon ANEW (Bradley and Lang, 1999a) we have created the first Greek affective lexicon. It contains human ratings for the three continuous affective dimensions of valence, arousal and dominance for 1034 words. The Greek affective lexicon is compared with affective lexica in English, Spanish and Portuguese. The lexicon is automatically expanded by selecting a small number of manually annotated words to bootstrap the process of estimating affective ratings of unknown words. We experimented with the parameters of the semantic-affective model in order to investigate their impact on its performance, which reaches 85% binary classification accuracy (positive vs. negative ratings). We share the Greek affective lexicon that consists of 1034 words and the automatically expanded Greek affective lexicon that contains 407K words.

Crossmodal Network-Based Distributional Semantic Models
Elias Iosif | Alexandros Potamianos
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Despite the recent success of distributional semantic models (DSMs) in various semantic tasks, they remain disconnected from real-world perceptual cues since they typically rely on linguistic features. Text data constitute the dominant source of features for the majority of such models, although there is evidence from cognitive science that cues from other modalities contribute to the acquisition and representation of semantic knowledge. In this work, we propose the crossmodal extension of a two-tier text-based model, where semantic representations are encoded in the first layer, while the second layer is used for computing similarity between words. We exploit text- and image-derived features for performing computations at each layer, as well as various approaches for their crossmodal fusion. It is shown that the crossmodal model performs better (from 0.68 to 0.71 correlation coefficient) than the unimodal one for the task of similarity computation between words.

2015

Feeling is Understanding: From Affective to Semantic Spaces
Elias Iosif | Alexandros Potamianos
Proceedings of the 11th International Conference on Computational Semantics

Fusion of Compositional Network-based and Lexical Function Distributional Semantic Models
Spiros Georgiladakis | Elias Iosif | Alexandros Potamianos
Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics

2014

SemEval-2014 Task 2: Grammar Induction for Spoken Dialogue Systems
Ioannis Klasinas | Elias Iosif | Katerina Louka | Alexandros Potamianos
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

SAIL: Sentiment Analysis using Semantic Similarity and Contrast Features
Nikolaos Malandrakis | Michael Falcone | Colin Vaz | Jesse James Bisogni | Alexandros Potamianos | Shrikanth Narayanan
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

tucSage: Grammar Rule Induction for Spoken Dialogue Systems via Probabilistic Candidate Selection
Arodami Chorianopoulou | Georgia Athanasopoulou | Elias Iosif | Ioannis Klasinas | Alexandros Potamianos
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Word Semantic Similarity for Morphologically Rich Languages
Kalliopi Zervanou | Elias Iosif | Alexandros Potamianos
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this work, we investigate the role of morphology in the performance of semantic similarity for morphologically rich languages, such as German and Greek. The challenge in processing languages with richer morphology than English lies in reducing estimation error while addressing the semantic distortion introduced by a stemmer or a lemmatiser. For this purpose, we propose a methodology for selective stemming, based on a semantic distortion metric. The proposed algorithm is tested on the task of similarity estimation between words using two types of corpus-based similarity metrics: co-occurrence-based and context-based. The performance on morphologically rich languages is boosted by stemming with the context-based metric, unlike English, where the best results are obtained by the co-occurrence-based metric. A key finding is that the estimation error reduction is different when a word is used as a feature, rather than when it is used as a target word.

Low-Dimensional Manifold Distributional Semantic Models
Georgia Athanasopoulou | Elias Iosif | Alexandros Potamianos
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

DeepPurple: Lexical, String and Affective Feature Fusion for Sentence-Level Semantic Similarity Estimation
Nikolaos Malandrakis | Elias Iosif | Vassiliki Prokopi | Alexandros Potamianos | Shrikanth Narayanan
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

SAIL: A hybrid approach to sentiment analysis
Nikolaos Malandrakis | Abe Kazemzadeh | Alexandros Potamianos | Shrikanth Narayanan
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

Semantic Similarity Computation for Abstract and Concrete Nouns Using Network-based Distributional Semantic Models
Elias Iosif | Alexandros Potamianos | Maria Giannoudaki | Kalliopi Zervanou
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

2012

DeepPurple: Estimating Sentence Semantic Similarity using N-gram Regression Models and Web Snippets
Nikos Malandrakis | Elias Iosif | Alexandros Potamianos
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks
Elias Iosif | Alexandros Potamianos
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We investigate the creation of corpora from web-harvested data following a scalable approach that has linear query complexity. Individual web queries are posed for a lexicon that includes thousands of nouns and the retrieved data are aggregated. A lexical network is constructed, in which the lexicon nouns are linked according to their context-based similarity. We introduce the notion of semantic neighborhoods, which are exploited for the computation of semantic similarity. Two types of normalization are proposed and evaluated on the semantic tasks of: (i) similarity judgement, and (ii) noun categorization and taxonomy creation. The created corpus along with a set of tools and noun similarities are made publicly available.

Associative and Semantic Features Extracted From Web-Harvested Corpora
Elias Iosif | Maria Giannoudaki | Eric Fosler-Lussier | Alexandros Potamianos
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We address the problem of automatic classification of associative and semantic relations between words, and particularly those that hold between nouns. Lexical relations such as synonymy and hypernymy/hyponymy constitute the fundamental types of semantic relations. Associative relations are harder to define, since they include a long list of diverse relations, e.g., "Cause-Effect", "Instrument-Agency". Motivated by findings from the literature of psycholinguistics and corpus linguistics, we propose features that take advantage of general linguistic properties. For evaluation we merged three datasets assembled and validated by cognitive scientists. A proposed priming coefficient that measures the degree of asymmetry in the order of appearance of the words in text achieves the best classification results, followed by context-based similarity metrics. The web-based features achieve classification accuracy that exceeds 85%.

Up from Limited Dialog Systems!
Giuseppe Riccardi | Philipp Cimiano | Alexandros Potamianos | Christina Unger
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)

2010

BabyExp: Constructing a Huge Multimodal Resource to Acquire Commonsense Knowledge Like Children Do
Massimo Poesio | Marco Baroni | Oswald Lanz | Alessandro Lenci | Alexandros Potamianos | Hinrich Schütze | Sabine Schulte im Walde | Luca Surian
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

There is by now widespread agreement that the most realistic way to construct the large-scale commonsense knowledge repositories required by natural language and artificial intelligence applications is by letting machines learn such knowledge from large quantities of data, as humans do. A lot of attention has consequently been paid to the development of increasingly sophisticated machine learning algorithms for knowledge extraction. However, the nature of the input that humans are exposed to while learning commonsense knowledge has received much less attention. The BabyExp project is collecting very dense audio and video recordings of the first three years of a baby's life. The corpus constructed in this way will be transcribed with automated techniques and made available to the research community. Moreover, techniques to extract commonsense conceptual knowledge incrementally from these multimodal data are also being explored within the project. The current paper describes BabyExp in general, and presents pilot studies on the feasibility of the automated audio and video transcriptions.