Miriam R. L. Petruck

Also published as: Miriam R L Petruck, Miriam R.L. Petruck

2023

Adverbs, Surprisingly
Dmitry Nikolaev | Collin Baker | Miriam R. L. Petruck | Sebastian Padó
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word meaning, as in FrameNet, provides a promising approach to adverb analysis, given its ability to describe ambiguity, semantic roles, and null instantiation.

2022

A Gamified Approach to Frame Semantic Role Labeling
Emily Amspoker | Miriam R L Petruck
Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)

Much research has investigated the possibility of creating games with a purpose (GWAPs), i.e., online games whose purpose is gathering information to address the insufficient amount of data for training and testing of large language models (Von Ahn and Dabbish, 2008). Based on such work, this paper reports on the development of a game for frame semantic role labeling, where players have fun while using semantic frames as prompts for short story writing. This game will generate more annotations for FrameNet and original content for annotation, supporting FrameNet’s goal of characterizing the English language in terms of Frame Semantics.

Comparing Distributional and Curated Approaches for Cross-lingual Frame Alignment
Collin F. Baker | Michael Ellsworth | Miriam R. L. Petruck | Arthur Lorenzi
Proceedings of the Workshop on Dimensions of Meaning: Distributional and Curated Semantics (DistCurate 2022)

Despite advances in statistical approaches to the modeling of meaning, many questions about the ideal way of exploiting both knowledge-based (e.g., FrameNet, WordNet) and data-based methods (e.g., BERT) remain unresolved. This workshop focuses on these questions with three session papers that run the gamut from highly distributional methods (Lekkas et al., 2022), to highly curated methods (Gamonal, 2022), and techniques with statistical methods producing structured semantics (Lawley and Schubert, 2022). In addition, we begin the workshop with a small comparison of cross-lingual techniques for frame semantic alignment for one language pair (Spanish and English). None of the distributional techniques consistently aligns the 1-best frame match from English to Spanish, all failing in at least one case. Predicting which techniques will align which frames cross-linguistically is not possible from any known characteristic of the alignment technique or the frames. Although distributional techniques are a rich source of semantic information for many tasks, at present curated, knowledge-based semantics remains the only technique that can consistently align frames across languages.

2021

Sister Help: Data Augmentation for Frame-Semantic Role Labeling
Ayush Pancholy | Miriam R L Petruck | Swabha Swayamdipta
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

While FrameNet is widely regarded as a rich resource of semantics in natural language processing, a major criticism concerns its lack of coverage and the relative paucity of its labeled data compared to other commonly used lexical resources such as PropBank and VerbNet. This paper reports on a pilot study to address these gaps. We propose a data augmentation approach, which uses existing frame-specific annotation to automatically annotate other lexical units of the same frame which are unannotated. Our rule-based approach defines the notion of a sister lexical unit and generates frame-specific augmented data for training. We present experiments on frame-semantic role labeling which demonstrate the importance of this data augmentation: we obtain a large improvement to prior results on frame identification and argument identification for FrameNet, utilizing both full-text and lexicographic annotations under FrameNet. Our findings on data augmentation highlight the value of automatic resource creation for improved models in frame-semantic parsing.

FrameNet and Typology
Michael Ellsworth | Collin Baker | Miriam R. L. Petruck
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

FrameNet and the Multilingual FrameNet project have produced multilingual semantic annotations of parallel texts that yield extremely fine-grained typological insights. Moreover, frame semantic annotation of a wide cross-section of languages would provide information on the limits of Frame Semantics (Fillmore 1982, Fillmore 1985). Multilingual semantic annotation offers critical input for research on linguistic diversity and recurrent patterns in computational typology. Drawing on results from FrameNet annotation of parallel texts, this paper proposes frame semantic annotation as a new component to complement the state of the art in computational semantic typology.

2020

Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet
Tiago T. Torrent | Collin F. Baker | Oliver Czulo | Kyoko Ohara | Miriam R. L. Petruck
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

While natural language understanding (NLU) is advancing rapidly, today’s technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency, interpretability, and generalization. This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). According to ECL, natural language is inherently executable (like programming languages), driven by mental simulation and metaphoric mappings over hierarchical compositions of structures and schemata learned through embodied interaction. This position paper argues that the use of grounding by metaphoric reasoning and simulation will greatly benefit NLU systems, and proposes a system architecture along with a roadmap towards realizing this vision.

2019

SemEval-2019 Task 2: Unsupervised Lexical Frame Induction
Behrang QasemiZadeh | Miriam R. L. Petruck | Regina Stodden | Laura Kallmeyer | Marie Candito
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper presents Unsupervised Lexical Frame Induction, Task 2 of the International Workshop on Semantic Evaluation in 2019. Given a set of prespecified syntactic forms in context, the task requires that verbs and their arguments be clustered to resemble semantic frame structures. Results are useful in identifying polysemous words, i.e., those whose frame structures are not easily distinguished, as well as discerning semantic relations of the arguments. Evaluation of unsupervised frame induction methods fell into two tracks: Task A) Verb Clustering based on FrameNet 1.7; and B) Argument Clustering, with B.1) based on FrameNet’s core frame elements, and B.2) on VerbNet 3.2 semantic roles. The shared task attracted nine teams, of whom three reported promising results. This paper describes the task and its data, reports on methods and resources that these systems used, and offers a comparison to human annotation.

Meaning Representation of Null Instantiated Semantic Roles in FrameNet
Miriam R L Petruck
Proceedings of the First International Workshop on Designing Meaning Representations

Humans have the unique ability to infer information about participants in a scene, even if they are not mentioned in a text about that scene. Computer systems cannot do so without explicit information about those participants. This paper addresses the linguistic phenomenon of null-instantiated frame elements, i.e., implicit semantic roles, and their representation in FrameNet (FN). It motivates FN’s annotation practice, and illustrates three types of null-instantiated arguments that FrameNet tracks, noting that other lexical resources do not record such semantic-pragmatic information, despite its need in natural language understanding (NLU), and the elaborate efforts to create new datasets. It challenges the community to appeal to FN data to develop more sophisticated techniques for recognizing implicit semantic roles, and creating needed datasets. Although the annotation of null-instantiated roles was lexicographically motivated, FN provides useful information for text processing, and therefore must be considered in the design of any meaning representation for natural language understanding.

2018

Frame Semantics across Languages: Towards a Multilingual FrameNet
Collin F. Baker | Michael Ellsworth | Miriam R. L. Petruck | Swabha Swayamdipta
Proceedings of the 27th International Conference on Computational Linguistics: Tutorial Abstracts

Representing Spatial Relations in FrameNet
Miriam R. L. Petruck | Michael J. Ellsworth
Proceedings of the First International Workshop on Spatial Language Understanding

While humans use natural language to express spatial relations between and across entities in the world with great facility, natural language systems have a facility that depends on that human facility. This position paper presents an approach to representing spatial relations in language, and advocates its adoption for representing the meaning of spatial language. This work shows the importance of axis-orientation systems for capturing the complexity of spatial relations, which FrameNet encodes with semantic types.

Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Agata Savary | Carlos Ramisch | Jena D. Hwang | Nathan Schneider | Melanie Andresen | Sameer Pradhan | Miriam R. L. Petruck
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

2016

Representing Support Verbs in FrameNet
Miriam R. L. Petruck | Michael Ellsworth
Proceedings of the 12th Workshop on Multiword Expressions

MetaNet: Repository, Identification System, and Applications
Miriam R L Petruck | Ellen K Dodge
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

The ubiquity of metaphor in language (Lakoff and Johnson 1980) has served as impetus for cognitive linguistic approaches to the study of language, mind, and the study of mind (e.g. Thibodeau & Boroditsky 2011). While native speakers use metaphor naturally and easily, the treatment and interpretation of metaphor in computational systems remains challenging because such systems have not succeeded in developing ways to recognize the semantic elements that define metaphor. This tutorial demonstrates MetaNet's frame-based semantic analyses, and their informing of MetaNet's automatic metaphor identification system. Participants will gain a complete understanding of the theoretical basis and the practical workings of MetaNet, and acquire relevant information about the Frame Semantics basis of that knowledge base and the way that FrameNet handles the widespread phenomenon of metaphor in language. The tutorial is geared to researchers and practitioners of language technology, not necessarily experts in metaphor analysis or knowledgeable about either FrameNet or MetaNet, but who are interested in natural language processing tasks that involve automatic metaphor processing, or could benefit from exposure to tools and resources that support frame-based deep semantic analyses of language, including metaphor as a widespread phenomenon in human language.

2015

Getting the Roles Right: Using FrameNet in NLP
Collin F. Baker | Nathan Schneider | Miriam R. L. Petruck | Michael Ellsworth
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts

Robust Semantic Analysis of Multiword Expressions with FrameNet
Miriam R. L. Petruck | Valia Kordoni
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

This tutorial will give participants a solid understanding of the linguistic features of multiword expressions (MWEs), focusing on the semantics of such expressions and their importance for natural language processing and language technology, with particular attention to the way that FrameNet (framenet.icsi.berkeley.edu) handles this widespread phenomenon. Our target audience includes researchers and practitioners of language technology, not necessarily experts in MWEs or knowledgeable about FrameNet, who are interested in NLP tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication.

NLP research has been interested in automatic processing of multiword expressions, with reports on and tasks relating to such efforts presented at workshops and conferences for at least ten years (e.g. ACL 2003, LREC 2008, COLING 2010, EACL 2014). Overcoming the challenge of automatically processing MWEs remains elusive in part because of the difficulty in recognizing, acquiring, and interpreting such forms.

Indeed, the phenomenon manifests in a range of linguistic forms (as Sag et al. (2001), among many others, have documented), including: noun + noun compounds (e.g. fish knife, health hazard, etc.); adjective + noun compounds (e.g. political agenda, national interest, etc.); particle verbs (e.g. shut up, take out, etc.); prepositional verbs (e.g. look into, talk into, etc.); VP idioms, such as kick the bucket, and pull someone’s leg, along with less obviously idiomatic forms like answer the door, mention someone’s name, etc.; expressions that have their own mini-grammars, such as names with honorifics and terms of address (e.g. Rabbi Lord Jonathan Sacks), kinship terms (e.g. second cousin once removed), and time expressions (e.g. January 9, 2015); support verb constructions (e.g. verbs: take a bath, make a promise, etc.; and prepositions: in doubt, under review, etc.).

Linguists address issues of polysemy, compositionality, idiomaticity, and continuity for each type included here. While native speakers use these forms with ease, the treatment and interpretation of MWEs in computational systems requires considerable effort due to the very issues that concern linguists.
