Alex Hedges
2021
Perhaps PTLMs Should Go to School – A Task to Assess Open Book and Closed Book QA
Manuel Ciosici | Joe Cecil | Dong-Ho Lee | Alex Hedges | Marjorie Freedman | Ralph Weischedel
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Our goal is to deliver a new task and leaderboard to stimulate research on question answering and pre-trained language models (PTLMs) to understand a significant instructional document, e.g., an introductory college textbook or a manual. PTLMs have shown great success in many question-answering tasks, given significant supervised training, but much less so in zero-shot settings. We propose a new task that includes two college-level introductory texts in the social sciences (American Government 2e) and humanities (U.S. History), hundreds of true/false statements based on review questions written by the textbook authors, validation/development tests based on the first eight chapters of the textbooks, blind tests based on the remaining textbook chapters, and baseline results given state-of-the-art PTLMs. Since the questions are balanced, random performance should be ~50%. T5, fine-tuned with BoolQ, achieves the same performance, suggesting that the textbook’s content is not pre-represented in the PTLM. Taking the exam closed-book, but having read the textbook (i.e., adding the textbook to T5’s pre-training), yields at best a minor improvement (56%), suggesting that the PTLM may not have “understood” the textbook (or perhaps misunderstood the questions). Performance is better (~60%) when the exam is taken open-book (i.e., allowing the machine to automatically retrieve a paragraph and use it to answer the question).
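For illustration, a minimal sketch (not the authors' code) of how zero-shot true/false probing of a seq2seq PTLM might look with the Hugging Face transformers library. The `t5-base` checkpoint and the prompt wording are placeholder assumptions; the paper's baseline was T5 fine-tuned on BoolQ, and its open-book setting retrieves the passage automatically.

```python
# Hypothetical sketch of closed-book vs. open-book true/false QA with T5.
# "t5-base" is a stand-in checkpoint, not the BoolQ-fine-tuned model from the paper.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def answer_true_false(statement: str, passage: str = "") -> str:
    # Closed book: no passage. Open book: prepend a retrieved textbook paragraph.
    prompt = f"question: Is the following statement true or false? {statement}"
    if passage:
        prompt += f" context: {passage}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output = model.generate(**inputs, max_new_tokens=4)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Closed-book query; an open-book query would pass a retrieved paragraph.
print(answer_true_false("The U.S. Constitution was ratified in 1788."))
```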
Machine-Assisted Script Curation
Manuel Ciosici | Joseph Cummings | Mitchell DeHaven | Alex Hedges | Yash Kankanampati | Dong-Ho Lee | Ralph Weischedel | Marjorie Freedman
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations
We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring. Scripts produced with MASC include (1) English descriptions of sub-events that comprise a larger, complex event; (2) event types for each of those events; (3) a record of entities expected to participate in multiple sub-events; and (4) temporal sequencing between the sub-events. MASC automates portions of the script creation process with suggestions for event types, links to Wikidata, and sub-events that may have been forgotten. We illustrate how these automations are useful to the script writer with a few case-study scripts.
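As a reading aid, a hypothetical sketch (not MASC's actual schema) of the script record the abstract describes, with the four components it lists: sub-event descriptions, event types, shared participants, and temporal ordering.

```python
# Illustrative data structure only; field names are assumptions from the abstract.
from dataclasses import dataclass, field

@dataclass
class SubEvent:
    description: str  # English description of the sub-event
    event_type: str   # suggested event type, e.g., a Wikidata identifier

@dataclass
class Script:
    name: str
    sub_events: list[SubEvent] = field(default_factory=list)
    # Entities expected to participate in multiple sub-events,
    # mapped to the indices of the sub-events they appear in.
    shared_participants: dict[str, list[int]] = field(default_factory=dict)
    # Temporal sequencing as (earlier_index, later_index) pairs.
    ordering: list[tuple[int, int]] = field(default_factory=list)
```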