Tom Mitchell

Also published as: Tom M. Mitchell

2024

pdf bib
Proceedings of the First Workshop on Natural Language Processing for Human Resources (NLP4HR 2024)
Estevam Hruschka | Thom Lake | Naoki Otani | Tom Mitchell
Proceedings of the First Workshop on Natural Language Processing for Human Resources (NLP4HR 2024)

2023

pdf abs
The Internal State of an LLM Knows When It’s Lying
Amos Azaria | Tom Mitchell
Findings of the Association for Computational Linguistics: EMNLP 2023

While Large Language Models (LLMs) have shown exceptional performance in various tasks, one of their most prominent drawbacks is generating inaccurate or false information with a confident tone. In this paper, we provide evidence that the LLM’s internal state can be used to reveal the truthfulness of statements. This includes both statements provided to the LLM, and statements that the LLM itself generates. Our approach is to train a classifier that outputs the probability that a statement is truthful, based on the hidden layer activations of the LLM as it reads or generates the statement. Experiments demonstrate that given a set of test sentences, of which half are true and half false, our trained classifier achieves an average of 71% to 83% accuracy labeling which sentences are true versus false, depending on the LLM base model. Furthermore, we explore the relationship between our classifier’s performance and approaches based on the probability assigned to the sentence by the LLM. We show that while LLM-assigned sentence probability is related to sentence truthfulness, this probability is also dependent on sentence length and the frequencies of words in the sentence, resulting in our trained classifier providing a more reliable approach to detecting truthfulness, highlighting its potential to enhance the reliability of LLM-generated content and its practical applicability in real-world scenarios.

pdf
Zero-shot Triplet Extraction by Template Infilling
Bosung Kim | Hayate Iso | Nikita Bhutani | Estevam Hruschka | Ndapa Nakashole | Tom Mitchell
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)
Estevam Hruschka | Tom Mitchell | Sajjadur Rahman | Dunja Mladenić | Marko Grobelnik
Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)

2022

pdf bib
Proceedings of the 2nd Workshop on Deriving Insights from User-Generated Text
Estevam Hruschka | Tom Mitchell | Dunja Mladenic | Marko Grobelnik | Nikita Bhutani
Proceedings of the 2nd Workshop on Deriving Insights from User-Generated Text

pdf abs
Towards General Natural Language Understanding with Probabilistic Worldbuilding
Abulhair Saparov | Tom M. Mitchell
Transactions of the Association for Computational Linguistics, Volume 10

We introduce the Probabilistic Worldbuilding Model (PWM), a new fully symbolic Bayesian model of semantic parsing and reasoning, as a first step in a research program toward more domain- and task-general NLU and AI. Humans create internal mental models of their observations that greatly aid in their ability to understand and reason about a large variety of problems. In PWM, the meanings of sentences, acquired facts about the world, and intermediate steps in reasoning are all expressed in a human-readable formal language, with the design goal of interpretability. PWM is Bayesian, designed specifically to be able to generalize to new domains and new tasks. We derive and implement an inference algorithm that reads sentences by parsing and abducing updates to its latent world model that capture the semantics of those sentences, and evaluate it on two out-of-domain question-answering datasets: (1) ProofWriter and (2) a new dataset we call FictionalGeoQA, designed to be more representative of real language but still simple enough to focus on evaluating reasoning ability, while being robust against heuristics. Our method outperforms baselines on both, thereby demonstrating its value as a proof-of-concept.

2021

pdf abs
Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules
Forough Arabshahi | Jennifer Lee | Antoine Bosselut | Yejin Choi | Tom Mitchell
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users’ commands, a task trivial for humans due to their common sense. In this paper, we propose a zero-shot commonsense reasoning system for conversational agents in an attempt to achieve this. Our reasoner uncovers unstated presumptions from user commands satisfying a general template of if-(state), then-(action), because-(goal). Our reasoner uses a state-of-the-art transformer-based generative commonsense knowledge base (KB) as its source of background knowledge for reasoning. We propose a novel and iterative knowledge query mechanism to extract multi-hop reasoning chains from the neural KB which uses symbolic logic rules to significantly reduce the search space. Similar to any KBs gathered to date, our commonsense KB is prone to missing knowledge. Therefore, we propose to conversationally elicit the missing knowledge from human users with our novel dynamic question generation strategy, which generates and presents contextualized queries to human users. We evaluate the model with a user study with human users that achieves a 35% higher success rate compared to SOTA.

2020

pdf abs
Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations
Toby Jia-Jun Li | Tom Mitchell | Brad Myers
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We show SUGILITE, an intelligent task automation agent that can learn new tasks and relevant associated concepts interactively from the user’s natural language instructions and demonstrations, using the graphical user interfaces (GUIs) of third-party mobile apps. This system provides several interesting features: (1) it allows users to teach new task procedures and concepts through verbal instructions together with demonstration of the steps of a script using GUIs; (2) it supports users in clarifying their intents for demonstrated actions using GUI-grounded verbal instructions; (3) it infers parameters of tasks and their possible values in utterances using the hierarchical structures of the underlying app GUIs; and (4) it generalizes taught concepts to different contexts and task domains. We describe the architecture of the SUGILITE system, explain the design and implementation of its key features, and show a prototype in the form of a conversational assistant on Android.

2019

pdf abs
Relating Simple Sentence Representations in Deep Neural Networks and the Brain
Sharmistha Jat | Hao Tang | Partha Talukdar | Tom Mitchell
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

What is the relationship between sentence representations learned by deep recurrent models against those encoded by the brain? Is there any correspondence between hidden layers of these recurrent models and brain regions when processing sentences? Can these deep models be used to synthesize brain data which can then be utilized in other extrinsic tasks? We investigate these questions using sentences with simple syntax and semantics (e.g., The bone was eaten by the dog.). We consider multiple neural network architectures, including recently proposed ELMo and BERT. We use magnetoencephalography (MEG) brain recording data collected from human subjects when they were reading these simple sentences. Overall, we find that BERT’s activations correlate the best with MEG brain data. We also find that the deep network representation can be used to generate brain data from new sentences to augment existing brain data. To the best of our knowledge, this is the first work showing that the MEG brain recording when reading a word in a sentence can be used to distinguish earlier words in the sentence. Our exploration is also the first to use deep neural network representations to generate synthetic brain data and to show that it helps in improving subsequent stimuli decoding task accuracy.

pdf abs
Understanding language-elicited EEG data by predicting it from a fine-tuned language model
Dan Schwartz | Tom Mitchell
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Electroencephalography (EEG) recordings of brain activity taken while participants read or listen to language are widely used within the cognitive neuroscience and psycholinguistics communities as a tool to study language comprehension. Several time-locked stereotyped EEG responses to word-presentations – known collectively as event-related potentials (ERPs) – are thought to be markers for semantic or syntactic processes that take place during comprehension. However, the characterization of each individual ERP in terms of what features of a stream of language trigger the response remains controversial. Improving this characterization would make ERPs a more useful tool for studying language comprehension. We take a step towards better understanding the ERPs by finetuning a language model to predict them. This new approach to analysis shows for the first time that all of the ERPs are predictable from embeddings of a stream of language. Prior work has only found two of the ERPs to be predictable. In addition to this analysis, we examine which ERPs benefit from sharing parameters during joint training. We find that two pairs of ERPs previously identified in the literature as being related to each other benefit from joint training, while several other pairs of ERPs that benefit from joint training are suggestive of potential relationships. Extensions of this analysis that further examine what kinds of information in the model embeddings relate to each ERP have the potential to elucidate the processes involved in human language comprehension.

pdf abs
Competence-based Curriculum Learning for Neural Machine Translation
Emmanouil Antonios Platanios | Otilia Stretcu | Graham Neubig | Barnabas Poczos | Tom Mitchell
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Current state-of-the-art NMT systems use large neural networks that are not only slow to train, but also often require many heuristics and optimization tricks, such as specialized learning rate schedules and large batch sizes. This is undesirable as it requires extensive hyperparameter tuning. In this paper, we propose a curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance. Our framework consists of a principled way of deciding which training samples are shown to the model at different times during training, based on the estimated difficulty of a sample and the current competence of the model. Filtering training samples in this manner prevents the model from getting stuck in bad local optima, making it converge faster and reach a better solution than the common approach of uniformly sampling training examples. Furthermore, the proposed method can be easily applied to existing NMT models by simply modifying their input data pipelines. We show that our framework can help improve the training time and the performance of both recurrent neural network models and Transformers, achieving up to a 70% decrease in training time, while at the same time obtaining accuracy improvements of up to 2.2 BLEU.

pdf abs
Look-up and Adapt: A One-shot Semantic Parser
Zhichu Lu | Forough Arabshahi | Igor Labutov | Tom Mitchell
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Computing devices have recently become capable of interacting with their end users via natural language. However, they can only operate within a limited “supported” domain of discourse and fail drastically when faced with an out-of-domain utterance, mainly due to the limitations of their semantic parser. In this paper, we propose a semantic parser that generalizes to out-of-domain examples by learning a general strategy for parsing an unseen utterance through adapting the logical forms of seen utterances, instead of learning to generate a logical form from scratch. Our parser maintains a memory consisting of a representative subset of the seen utterances paired with their logical forms. Given an unseen utterance, our parser works by looking up a similar utterance from the memory and adapting its logical form until it fits the unseen utterance. Moreover, we present a data generation strategy for constructing utterance-logical form pairs from different domains. Our results show an improvement of up to 68.8% on one-shot parsing under two different evaluation settings compared to the baselines.

pdf abs
Learning to Ask for Conversational Machine Learning
Shashank Srivastava | Igor Labutov | Tom Mitchell
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Natural language has recently been explored as a new medium of supervision for training machine learning models. Here, we explore learning classification tasks using language in a conversational setting – where the automated learner does not simply receive language input from a teacher, but can proactively engage the teacher by asking questions. We present a reinforcement learning framework, where the learner’s actions correspond to question types and the reward for asking a question is based on how the teacher’s response changes performance of the resulting machine learning model on the learning task. In this framework, learning good question-asking strategies corresponds to asking sequences of questions that maximize the cumulative (discounted) reward, and hence quickly lead to effective classifiers. Empirical analysis across three domains shows that learned question-asking strategies expedite classifier training by asking appropriate questions at different points in the learning process. The approach allows learning classifiers from a blend of strategies, including learning from observations, explanations and clarifications.

To ensure readability, text is often written and presented with due formatting. These text formatting devices help the writer to effectively convey the narrative. At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information. There have been a number of linguistic theories on discourse structure of text. However, these theories only consider unformatted text. Multimedia text contains rich formatting features that can be leveraged for various NLP tasks. In this article, we study some of these discourse features in multimedia text and what communicative function they fulfill in the context. As a case study, we use these features to harvest structured subject knowledge of geometry from textbooks. We conclude that the discourse and text layout features provide information that is complementary to lexical semantic information. Finally, we show that the harvested structured knowledge can be used to improve an existing solver for geometry problems, making it more accurate as well as more explainable.

2018

pdf abs
Zero-shot Learning of Classifiers from Natural Language Quantification
Shashank Srivastava | Igor Labutov | Tom Mitchell
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Humans can efficiently learn new concepts using language. We present a framework through which a set of explanations of a concept can be used to learn a classifier without access to any labeled examples. We use semantic parsing to map explanations to probabilistic assertions grounded in latent class labels and observed attributes of unlabeled data, and leverage the differential semantics of linguistic quantifiers (e.g., ‘usually’ vs ‘always’) to drive model training. Experiments on three domains show that the learned classifiers outperform previous approaches for learning with limited data, and are comparable with fully supervised classifiers trained from a small number of labeled examples.

pdf abs
Contextual Parameter Generation for Universal Neural Machine Translation
Emmanouil Antonios Platanios | Mrinmaya Sachan | Graham Neubig | Tom Mitchell
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system, but instead introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., weights in a neural network). This parameter generator accepts source and target language embeddings as input, and generates the parameters for the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and also perform zero-shot translation. We further show it is able to surpass state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings are able to uncover interesting relationships between languages.

pdf abs
Learning to Learn Semantic Parsers from Natural Language Supervision
Igor Labutov | Bishan Yang | Tom Mitchell
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

As humans, we often rely on language to learn language. For example, when corrected in a conversation, we may learn from that correction, over time improving our language fluency. Inspired by this observation, we propose a learning algorithm for training semantic parsers from supervision (feedback) expressed in natural language. Our algorithm learns a semantic parser from users’ corrections such as “no, what I really meant was before his job, not after”, by also simultaneously learning to parse this natural language feedback in order to leverage it as a form of supervision. Unlike supervision with gold-standard logical forms, our method does not require the user to be familiar with the underlying logical formalism, and unlike supervision from denotation, it does not require the user to know the correct answer to their query. This makes our learning algorithm naturally scalable in settings where existing conversational logs are available and can be leveraged as training data. We construct a novel dataset of natural language feedback in a conversational setting, and show that our method is effective at learning a semantic parser from such natural language supervision.

pdf abs
LIA: A Natural Language Programmable Personal Assistant
Igor Labutov | Shashank Srivastava | Tom Mitchell
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present LIA, an intelligent personal assistant that can be programmed using natural language. Our system demonstrates multiple competencies towards learning from human-like interactions. These include the ability to be taught reusable conditional procedures, the ability to be taught new knowledge about the world (concepts in an ontology) and the ability to be taught how to ground that knowledge in a set of sensors and effectors. Building such a system highlights design questions regarding the overall architecture that such an agent should have, as well as questions about parsing and grounding language in situational contexts. We outline key properties of this architecture, and demonstrate a prototype that embodies them in the form of a personal assistant on an Android device.

2017

pdf abs
A Probabilistic Generative Grammar for Semantic Parsing
Abulhair Saparov | Vijay Saraswat | Tom Mitchell
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

We present a generative model of natural language sentences and demonstrate its application to semantic parsing. In the generative process, a logical form sampled from a prior, and conditioned on this logical form, a grammar probabilistically generates the output sentence. Grammar induction using MCMC is applied to learn the grammar given a set of labeled sentences with corresponding logical forms. We develop a semantic parser that finds the logical form with the highest posterior probability exactly. We obtain strong results on the GeoQuery dataset and achieve state-of-the-art F1 on Jobs.

pdf abs
A Joint Sequential and Relational Model for Frame-Semantic Parsing
Bishan Yang | Tom Mitchell
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We introduce a new method for frame-semantic parsing that significantly improves the prior state of the art. Our model leverages the advantages of a deep bidirectional LSTM network which predicts semantic role labels word by word and a relational network which predicts semantic roles for individual text expressions in relation to a predicate. The two networks are integrated into a single model via knowledge distillation, and a unified graphical model is employed to jointly decode frames and semantic roles during inference. Experiments on the standard FrameNet data show that our model significantly outperforms existing neural and non-neural approaches, achieving a 5.7 F1 gain over the current state of the art, for full frame structure extraction.

pdf abs
Joint Concept Learning and Semantic Parsing from Natural Language Explanations
Shashank Srivastava | Igor Labutov | Tom Mitchell
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Natural language constitutes a predominant medium for much of human learning and pedagogy. We consider the problem of concept learning from natural language explanations, and a small number of labeled examples of the concept. For example, in learning the concept of a phishing email, one might say ‘this is a phishing email because it asks for your bank account number’. Solving this problem involves both learning to interpret open ended natural language statements, and learning the concept itself. We present a joint model for (1) language interpretation (semantic parsing) and (2) concept learning (classification) that does not require labeling statements with logical forms. Instead, the model prefers discriminative interpretations of statements in context of observable features of the data as a weak signal for parsing. On a dataset of email-related concepts, our approach yields across-the-board improvements in classification performance, with a 30% relative improvement in F1 score over competitive methods in the low data regime.

pdf abs
Merging knowledge bases in different languages
Jerónimo Hernández-González | Estevam R. Hruschka Jr. | Tom M. Mitchell
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing

Recently, different systems which learn to populate and extend a knowledge base (KB) from the web in different languages have been presented. Although a large set of concepts should be learnt independently from the language used to read, there are facts which are expected to be more easily gathered in local language (e.g., culture or geography). A system that merges KBs learnt in different languages will benefit from the complementary information as long as common beliefs are identified, as well as from redundancy present in web pages written in different languages. In this paper, we deal with the problem of identifying equivalent beliefs (or concepts) across language specific KBs, assuming that they share the same ontology of categories and relations. In a case study with two KBs independently learnt from different inputs, namely web pages written in English and web pages written in Portuguese respectively, we report on the results of two methodologies: an approach based on personalized PageRank and an inference technique to find out common relevant paths through the KBs. The proposed inference technique efficiently identifies relevant paths, outperforming the baseline (a dictionary-based classifier) in the vast majority of tested categories.

pdf abs
Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
Bishan Yang | Tom Mitchell
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper focuses on how to take advantage of external knowledge bases (KBs) to improve recurrent neural networks for machine reading. Traditional methods that exploit knowledge from KBs encode knowledge as discrete indicator features. Not only do these features generalize poorly, but they require task-specific feature engineering to achieve good performance. We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. To effectively integrate background knowledge with information from the currently processed text, our model employs an attention mechanism with a sentinel to adaptively decide whether to attend to background knowledge and which information from KBs is useful. Experimental results show that our model achieves accuracies that surpass the previous state-of-the-art results for both entity extraction and event extraction on the widely used ACE2005 dataset.

2016

pdf
Joint Extraction of Events and Entities within a Document Context
Bishan Yang | Tom M. Mitchell
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua
Derry Tanti Wijaya | Tom M. Mitchell
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2015

pdf
A Compositional and Interpretable Semantic Space
Alona Fyshe | Leila Wehbe | Partha P. Talukdar | Brian Murphy | Tom M. Mitchell
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
“A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce”: Learning State Changing Verbs from Wikipedia Revision History
Derry Tanti Wijaya | Ndapandula Nakashole | Tom Mitchell
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf
Efficient and Expressive Knowledge Base Completion Using Subgraph Feature Extraction
Matt Gardner | Tom Mitchell
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf
A Knowledge-Intensive Model for Prepositional Phrase Attachment
Ndapandula Nakashole | Tom M. Mitchell
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf abs
Learning a Compositional Semantics for Freebase with an Open Predicate Vocabulary
Jayant Krishnamurthy | Tom M. Mitchell
Transactions of the Association for Computational Linguistics, Volume 3

We present an approach to learning a model-theoretic semantics for natural language tied to Freebase. Crucially, our approach uses an open predicate vocabulary, enabling it to produce denotations for phrases such as “Republican front-runner from Texas” whose semantics cannot be represented using the Freebase schema. Our approach directly converts a sentence’s syntactic CCG parse into a logical form containing predicates derived from the words in the sentence, assigning each word a consistent semantics across sentences. This logical form is evaluated against a learned probabilistic database that defines a distribution over denotations for each textual predicate. A training phase produces this probabilistic database using a corpus of entity-linked text and probabilistic matrix factorization with a novel ranking objective function. We evaluate our approach on a compositional question answering task where it outperforms several competitive baselines. We also compare our approach against manually annotated Freebase queries, finding that our open predicate vocabulary enables us to answer many questions that Freebase cannot.

Tom Mitchell

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2009

2004

Co-authors

Venues