Gerold Schneider


Hypothesis Engineering for Zero-Shot Hate Speech Detection
Janis Goldzycher | Gerold Schneider
Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022)

Standard approaches to hate speech detection rely on sufficient available hate speech annotations. Extending previous work that repurposes natural language inference (NLI) models for zero-shot text classification, we propose a simple approach that combines multiple hypotheses to improve English NLI-based zero-shot hate speech detection. We first conduct an error analysis for vanilla NLI-based zero-shot hate speech detection and then develop four strategies based on this analysis. The strategies use multiple hypotheses to predict various aspects of an input text and combine these predictions into a final verdict. We find that the zero-shot baseline used for the initial error analysis already outperforms commercial systems and fine-tuned BERT-based hate speech detection models on HateCheck. The combination of the proposed strategies further increases the zero-shot accuracy of 79.4% on HateCheck by 7.9 percentage points (pp), and the accuracy of 69.6% on ETHOS by 10.0pp.
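
The idea of combining several hypotheses into one verdict can be illustrated with a minimal sketch. It is not the authors' implementation: the hypothesis wordings, the threshold, the combination rule, and the names `zero_shot_verdict` and `make_toy_scorer` are all assumptions made for illustration, and the toy scorer stands in for a real NLI model that would return an entailment probability for a (premise, hypothesis) pair.

```python
from typing import Callable, Dict

# Assumed interface: a scorer returns P(entailment) for (premise, hypothesis).
EntailmentScorer = Callable[[str, str], float]

# Hypothetical hypotheses; the paper's actual wordings may differ.
MAIN = "This text contains hate speech."
ABOUT_GROUP = "This text is about a group of people."
COUNTER = "This text condemns hate speech."


def zero_shot_verdict(text: str, scorer: EntailmentScorer,
                      threshold: float = 0.5) -> bool:
    """Combine a main hypothesis with two supporting ones.

    Assumed rule: flag the text only if the main hypothesis is entailed,
    the text is about a group, and it is not counter speech.
    """
    main = scorer(text, MAIN) >= threshold
    about_group = scorer(text, ABOUT_GROUP) >= threshold
    counter = scorer(text, COUNTER) >= threshold
    return main and about_group and not counter


def make_toy_scorer(scores: Dict[str, float]) -> EntailmentScorer:
    """Build a stand-in scorer that looks up fixed scores per hypothesis."""
    def scorer(premise: str, hypothesis: str) -> float:
        # The premise is ignored here; a real NLI model would use it.
        return scores.get(hypothesis, 0.0)
    return scorer
```

In practice, `scorer` would wrap an NLI model; the supporting hypotheses then act as filters that can overturn a positive prediction from the main hypothesis.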

Scaling Native Language Identification with Transformer Adapters
Ahmet Yavuz Uluslu | Gerold Schneider
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)


Using Multilingual Resources to Evaluate CEFRLex for Learner Applications
Johannes Graën | David Alfter | Gerold Schneider
Proceedings of the Twelfth Language Resources and Evaluation Conference

The Common European Framework of Reference for Languages (CEFR) defines six levels of learner proficiency, and links them to particular communicative abilities. The CEFRLex project aims at compiling lexical resources that link single words and multi-word expressions to particular CEFR levels. The resources are thought to reflect second language learner needs as they are compiled from CEFR-graded textbooks and other learner-directed texts. In this work, we investigate the applicability of CEFRLex resources for building language learning applications. Our main concerns were that vocabulary in language learning materials might be sparse, i.e. that not all vocabulary items that belong to a particular level would also occur in materials for that level, and, on the other hand, that vocabulary items might be used in lower-level materials if required by the topic (e.g. with a simpler paraphrasing or translation). Our results indicate that the English CEFRLex resource is in accordance with external resources that we jointly employ as a gold standard. Together with other values obtained from monolingual and parallel corpora, we can indicate which entries need to be adjusted to obtain values that are even more in line with this gold standard. We expect that this finding also holds for the other languages


NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills
Gerold Schneider | Johannes Graën
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning


Crossing the border twice: Reimporting prepositions to alleviate L1-specific transfer errors
Johannes Graën | Gerold Schneider
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition

Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts
Gerold Schneider | Eva Pettersson | Michael Percillier
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language


Measuring the Public Accountability of New Modes of Governance
Bruno Wueest | Gerold Schneider | Michael Amsler
Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science


UZH in BioNLP 2013
Gerold Schneider | Simon Clematide | Tilia Ellendorff | Don Tuggener | Fabio Rinaldi | Gintarė Grigonytė
Proceedings of the BioNLP Shared Task 2013 Workshop

Exploiting Synergies Between Open Resources for German Dependency Parsing, POS-tagging, and Morphological Analysis
Rico Sennrich | Martin Volk | Gerold Schneider
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013


Dependency parsing for interaction detection in pharmacogenomics
Gerold Schneider | Fabio Rinaldi | Simon Clematide
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We give an overview of our approach to the extraction of interactions between pharmacogenomic entities such as drugs, genes and diseases, and suggest classes of interaction types driven by data from PharmGKB and partly following the top-level ontology of WordNet and biomedical types from BioNLP. Our text mining approach to the extraction of interactions is based on syntactic analysis. We use syntactic analyses to explore domain events and to suggest a set of interaction labels for the pharmacogenomics domain.


An Incremental Model for the Coreference Resolution Task of BioNLP 2011
Don Tuggener | Manfred Klenner | Gerold Schneider | Simon Clematide | Fabio Rinaldi
Proceedings of BioNLP Shared Task 2011 Workshop


UZurich in the BioNLP 2009 Shared Task
Kaarel Kaljurand | Gerold Schneider | Fabio Rinaldi
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task


Dependency-Based Relation Mining for Biomedical Literature
Fabio Rinaldi | Gerold Schneider | Kaarel Kaljurand | Michael Hess
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe techniques for the automatic detection of relationships among domain entities (e.g. genes, proteins, diseases) mentioned in the biomedical literature. Our approach is based on the adaptive selection of candidate interaction sentences, which are then parsed using our own dependency parser. Specific syntax-based filters are used to limit the number of possible candidate interacting pairs. The approach has been implemented as a demonstrator over a corpus of 2000 richly annotated MedLine abstracts, and later tested by participation in a text mining competition. In both cases, the results obtained have proved the adequacy of the proposed approach to the task of interaction detection.


Pro3Gres Parser in the CoNLL Domain Adaptation Shared Task
Gerold Schneider | Kaarel Kaljurand | Fabio Rinaldi | Tobias Kuhn
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)


Answering Questions in the Genomics Domain
Fabio Rinaldi | James Dowdall | Gerold Schneider | Andreas Persidis
Proceedings of the Conference on Question Answering in Restricted Domains

Fast, Deep-Linguistic Statistical Dependency Parsing
Gerold Schneider | Fabio Rinaldi | James Dowdall
Proceedings of the Workshop on Recent Advances in Dependency Grammar

A robust and hybrid deep-linguistic theory applied to large-scale parsing
Gerold Schneider | James Dowdall | Fabio Rinaldi
Proceedings of the 3rd workshop on RObust Methods in Analysis of Natural Language Data (ROMAND 2004)


A low-complexity, broad-coverage probabilistic Dependency Parser for English
Gerold Schneider
Proceedings of the HLT-NAACL 2003 Student Research Workshop


Using Syntactic Analysis to Increase Efficiency in Visualizing Text Collections
James Henderson | Paola Merlo | Ivan Petroff | Gerold Schneider
COLING 2002: The 19th International Conference on Computational Linguistics