2022
The CLAMS Platform at Work: Processing Audiovisual Data from the American Archive of Public Broadcasting
Marc Verhagen, Kelley Lynch, Kyeongmin Rim, James Pustejovsky
Proceedings of the Thirteenth Language Resources and Evaluation Conference
The Computational Linguistics Applications for Multimedia Services (CLAMS) platform provides access to computational content analysis tools for multimedia material. The version we present here is a robust update of an initial prototype implementation from 2019. The platform now sports a variety of image, video, audio and text processing tools that interact via a common multi-modal representation language named MMIF (Multi-Media Interchange Format). We describe the overall architecture, the MMIF format, some of the tools included in the platform, the process to set up and run a workflow, visualizations included in CLAMS, and evaluate aspects of the platform on data from the American Archive of Public Broadcasting, showing how CLAMS can add metadata to mass-digitized multimedia collections, metadata that are typically only available implicitly in now largely unsearchable digitized media in archives and libraries.
SemEval-2022 Task 9: R2VQ – Competence-based Multimodal Question Answering
Jingxuan Tu, Eben Holderness, Marco Maru, Simone Conia, Kyeongmin Rim, Kelley Lynch, Richard Brutti, Roberto Navigli, James Pustejovsky
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
In this task, we identify a challenge that reflects the linguistic and cognitive competencies that humans have when speaking and reasoning. In particular, given the intuition that textual and visual information mutually inform each other for semantic reasoning, we formulate a Competence-based Question Answering challenge, designed to involve rich semantic annotation and aligned text-video objects. The task is to answer questions from a collection of cooking recipes and videos, where each question belongs to a “question family” reflecting a specific reasoning competence. The data and task results are publicly available.
Competence-based Question Generation
Jingxuan Tu, Kyeongmin Rim, James Pustejovsky
Proceedings of the 29th International Conference on Computational Linguistics
Models of natural language understanding often rely on question answering and logical inference benchmark challenges to evaluate the performance of a system. While informative, such task-oriented evaluations do not assess the broader semantic abilities that humans have as part of their linguistic competence when speaking and interpreting language. We define competence-based (CB) question generation, and focus on queries over lexical semantic knowledge involving implicit argument and subevent structure of verbs. We present a method to generate such questions and a dataset of English cooking recipes we use for implementing the generation method. Our primary experiment shows that even large pretrained language models perform poorly on CB questions until they are provided with additional contextualized semantic information. The data and the source code are available at https://github.com/brandeis-llc/CompQG.
2020
Reproducing Neural Ensemble Classifier for Semantic Relation Extraction in Scientific Papers
Kyeongmin Rim, Jingxuan Tu, Kelley Lynch, James Pustejovsky
Proceedings of the Twelfth Language Resources and Evaluation Conference
Within the natural language processing (NLP) community, shared tasks play an important role. They define a common goal and allow the comparison of different methods on the same data. SemEval-2018 Task 7 involves the identification and classification of relations in abstracts from computational linguistics (CL) publications. In this paper we describe an attempt to reproduce the methods and results from the top performing system for SemEval-2018 Task 7. We describe challenges we encountered in the process, report on the results of our system, and discuss the ways that our attempt at reproduction can inform best practices.
Interchange Formats for Visualization: LIF and MMIF
Kyeongmin Rim, Kelley Lynch, Marc Verhagen, Nancy Ide, James Pustejovsky
Proceedings of the Twelfth Language Resources and Evaluation Conference
Promoting interoperable computational linguistics (CL) and natural language processing (NLP) application platforms and interchangeable data formats has contributed to improving the discoverability and accessibility of openly available NLP software. In this paper, we discuss the enhanced data visualization capabilities that are also enabled by inter-operating NLP pipelines and interchange formats. For adding openly available visualization tools and graphical annotation tools to the Language Applications Grid (LAPPS Grid) and Computational Linguistics Applications for Multimedia Services (CLAMS) toolboxes, we have developed interchange formats that can carry annotations and metadata for text and audiovisual source data. We describe those data formats and present case studies where we successfully adopt open-source visualization tools and combine them with CL tools.
2019
Computational Linguistics Applications for Multimedia Services
Kyeongmin Rim, Kelley Lynch, James Pustejovsky
Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
We present Computational Linguistics Applications for Multimedia Services (CLAMS), a platform that provides access to computational content analysis tools for archival multimedia material that appears in different media, such as text, audio, image, and video. The primary goals of CLAMS are: (1) to develop an interchange format between multimodal metadata generation tools to ensure interoperability between tools; (2) to provide users with a portable, user-friendly workflow engine to chain selected tools to extract meaningful analyses; and (3) to create a public software development kit (SDK) for developers that eases deployment of analysis tools within the CLAMS platform. CLAMS is designed to help archives and libraries enrich the metadata associated with their mass-digitized multimedia collections, which would otherwise be largely unsearchable.
2018
Bridging the LAPPS Grid and CLARIN
Erhard Hinrichs, Nancy Ide, James Pustejovsky, Jan Hajič, Marie Hinrichs, Mohammad Fazleh Elahi, Keith Suderman, Marc Verhagen, Kyeongmin Rim, Pavel Straňák, Jozef Mišutka
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Silvio Ricardo Cordeiro, Shereen Oraby, Umashanthi Pavalanathan, Kyeongmin Rim
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
2017
Communicating and Acting: Understanding Gesture in Simulation Semantics
Nikhil Krishnaswamy, Pradyumna Narayana, Isaac Wang, Kyeongmin Rim, Rahul Bangar, Dhruva Patil, Gururaj Mulay, Ross Beveridge, Jaime Ruiz, Bruce Draper, James Pustejovsky
IWCS 2017 — 12th International Conference on Computational Semantics — Short papers