Daniela Stepanov


2020

pdf
Controlled Crowdsourcing for High-Quality QA-SRL Annotation
Paul Roit | Ayal Klein | Daniela Stepanov | Jonathan Mamou | Julian Michael | Gabriel Stanovsky | Luke Zettlemoyer | Ido Dagan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Question-answer driven Semantic Role Labeling (QA-SRL) was proposed as an attractive open and natural flavour of SRL, potentially attainable from laymen. Recently, a large-scale crowdsourced QA-SRL corpus and a trained parser were released. Trying to replicate the QA-SRL annotation for new texts, we found that the resulting annotations were lacking in quality, particularly in coverage, making them insufficient for further research and evaluation. In this paper, we present an improved crowdsourcing protocol for complex semantic annotation, involving worker selection and training, and a data consolidation phase. Applying this protocol to QA-SRL yielded high-quality annotation with drastically higher coverage, producing a new gold evaluation dataset. We believe that our annotation protocol and gold standard will facilitate future replicable research of natural semantic annotations.

pdf
QANom: Question-Answer driven SRL for Nominalizations
Ayal Klein | Jonathan Mamou | Valentina Pyatkin | Daniela Stepanov | Hangfeng He | Dan Roth | Luke Zettlemoyer | Ido Dagan
Proceedings of the 28th International Conference on Computational Linguistics

We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom. This scheme extends the QA-SRL formalism (He et al., 2015), modeling the relations between nominalizations and their arguments via natural language question-answer pairs. We construct the first QANom dataset using controlled crowdsourcing, analyze its quality and compare it to expertly annotated nominal-SRL annotations, as well as to other QA-driven annotations. In addition, we train a baseline QANom parser for identifying nominalizations and labeling their arguments with question-answer pairs. Finally, we demonstrate the extrinsic utility of our annotations for downstream tasks using both indirect supervision and zero-shot settings.

2018

pdf
HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments
Akari Asai | Sara Evensen | Behzad Golshan | Alon Halevy | Vivian Li | Andrei Lopatenko | Daniela Stepanov | Yoshihiko Suhara | Wang-Chiew Tan | Yinzhan Xu
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)