Gerold Schneider


2024

Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset
Janis Goldzycher | Paul Röttger | Gerold Schneider
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Hate speech detection models are only as good as the data they are trained on. Datasets sourced from social media suffer from systematic gaps and biases, leading to unreliable models with simplistic decision boundaries. Adversarial datasets, collected by exploiting model weaknesses, promise to fix this problem. However, adversarial data collection can be slow and costly, and individual annotators have limited creativity. In this paper, we introduce GAHD, a new German Adversarial Hate speech Dataset comprising ca. 11k examples. During data collection, we explore new strategies for supporting annotators to create more diverse adversarial examples more efficiently, and we provide a manual analysis of annotator disagreements for each strategy. Our experiments show that the resulting dataset is challenging even for state-of-the-art hate speech detection models, and that training on GAHD clearly improves model robustness. Further, we find that mixing multiple support strategies is most advantageous. We make GAHD publicly available at https://github.com/jagol/gahd.

The Influence of Automatic Speech Recognition on Linguistic Features and Automatic Alzheimer’s Disease Detection from Spontaneous Speech
Jonathan Heitz | Gerold Schneider | Nicolas Langer
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Alzheimer’s disease (AD) represents a major problem for society and a heavy burden for those affected. The study of changes in speech offers a potential means for large-scale AD screening that is non-invasive and inexpensive. Automatic Speech Recognition (ASR) is necessary for a fully automated system. We compare different ASR systems in terms of Word Error Rate (WER) using a publicly available benchmark dataset of speech recordings of AD patients and controls. Furthermore, this study is the first to quantify how popular linguistic features change when replacing manual transcriptions with ASR output. This contributes to the understanding of linguistic features in the context of AD detection. Moreover, we investigate how ASR affects AD classification performance by implementing two popular approaches: A fine-tuned BERT model, and Random Forest on popular linguistic features. Our results show best classification performance when using manual transcripts, but the degradation when using ASR is not dramatic. Performance stays strong, achieving an AUROC of 0.87. Our BERT-based approach is affected more strongly by ASR transcription errors than the simpler and more explainable approach based on linguistic features.

2023

Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data
Janis Goldzycher | Moritz Preisig | Chantal Amrhein | Gerold Schneider
The 7th Workshop on Online Abuse and Harms (WOAH)

Most research on hate speech detection has focused on English, where a sizeable amount of labeled training data is available. However, to expand hate speech detection into more languages, approaches that require minimal training data are needed. In this paper, we test whether natural language inference (NLI) models, which perform well in zero- and few-shot settings, can benefit hate speech detection performance in scenarios where only a limited amount of labeled data is available in the target language. Our evaluation on five languages demonstrates large performance improvements of NLI fine-tuning over direct fine-tuning in the target language. However, the effectiveness of previous work that proposed intermediate fine-tuning on English data is hard to match. Only in settings where the English training data does not match the test domain can our customised NLI formulation outperform intermediate fine-tuning on English. Based on our extensive experiments, we propose a set of recommendations for hate speech detection in languages where minimal labeled training data is available.

2022

Hypothesis Engineering for Zero-Shot Hate Speech Detection
Janis Goldzycher | Gerold Schneider
Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022)

Standard approaches to hate speech detection rely on sufficient available hate speech annotations. Extending previous work that repurposes natural language inference (NLI) models for zero-shot text classification, we propose a simple approach that combines multiple hypotheses to improve English NLI-based zero-shot hate speech detection. We first conduct an error analysis for vanilla NLI-based zero-shot hate speech detection and then develop four strategies based on this analysis. The strategies use multiple hypotheses to predict various aspects of an input text and combine these predictions into a final verdict. We find that the zero-shot baseline used for the initial error analysis already outperforms commercial systems and fine-tuned BERT-based hate speech detection models on HateCheck. The combination of the proposed strategies further increases the zero-shot accuracy of 79.4% on HateCheck by 7.9 percentage points (pp), and the accuracy of 69.6% on ETHOS by 10.0pp.

Scaling Native Language Identification with Transformer Adapters
Ahmet Yavuz Uluslu | Gerold Schneider
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)

2020

Using Multilingual Resources to Evaluate CEFRLex for Learner Applications
Johannes Graën | David Alfter | Gerold Schneider
Proceedings of the Twelfth Language Resources and Evaluation Conference

The Common European Framework of Reference for Languages (CEFR) defines six levels of learner proficiency and links them to particular communicative abilities. The CEFRLex project aims at compiling lexical resources that link single words and multi-word expressions to particular CEFR levels. The resources are thought to reflect second language learner needs, as they are compiled from CEFR-graded textbooks and other learner-directed texts. In this work, we investigate the applicability of CEFRLex resources for building language learning applications. Our main concerns were that vocabulary in language learning materials might be sparse, i.e. that not all vocabulary items that belong to a particular level would also occur in materials for that level, and, on the other hand, that vocabulary items might be used in lower-level materials if required by the topic (e.g. with a simpler paraphrasing or translation). Our results indicate that the English CEFRLex resource is in accordance with external resources that we jointly employ as a gold standard. Together with other values obtained from monolingual and parallel corpora, we can indicate which entries need to be adjusted to obtain values that are even more in line with this gold standard. We expect that this finding also holds for the other languages.

2018

NLP Corpus Observatory – Looking for Constellations in Parallel Corpora to Improve Learners’ Collocational Skills
Gerold Schneider | Johannes Graën
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

2017

Crossing the border twice: Reimporting prepositions to alleviate L1-specific transfer errors
Johannes Graën | Gerold Schneider
Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition

Comparing Rule-based and SMT-based Spelling Normalisation for English Historical Texts
Gerold Schneider | Eva Pettersson | Michael Percillier
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language

2014

Measuring the Public Accountability of New Modes of Governance
Bruno Wueest | Gerold Schneider | Michael Amsler
Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science

2013

UZH in BioNLP 2013
Gerold Schneider | Simon Clematide | Tilia Ellendorff | Don Tuggener | Fabio Rinaldi | Gintarė Grigonytė
Proceedings of the BioNLP Shared Task 2013 Workshop

Exploiting Synergies Between Open Resources for German Dependency Parsing, POS-tagging, and Morphological Analysis
Rico Sennrich | Martin Volk | Gerold Schneider
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

2012

Dependency parsing for interaction detection in pharmacogenomics
Gerold Schneider | Fabio Rinaldi | Simon Clematide
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We give an overview of our approach to the extraction of interactions between pharmacogenomic entities like drugs, genes and diseases, and suggest classes of interaction types driven by data from PharmGKB, partly following the top-level ontology WordNet and biomedical types from BioNLP. Our text mining approach to the extraction of interactions is based on syntactic analysis. We use syntactic analyses to explore domain events and to suggest a set of interaction labels for the pharmacogenomics domain.

2011

An Incremental Model for the Coreference Resolution Task of BioNLP 2011
Don Tuggener | Manfred Klenner | Gerold Schneider | Simon Clematide | Fabio Rinaldi
Proceedings of BioNLP Shared Task 2011 Workshop

2009

UZurich in the BioNLP 2009 Shared Task
Kaarel Kaljurand | Gerold Schneider | Fabio Rinaldi
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

2008

Dependency-Based Relation Mining for Biomedical Literature
Fabio Rinaldi | Gerold Schneider | Kaarel Kaljurand | Michael Hess
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe techniques for the automatic detection of relationships among domain entities (e.g. genes, proteins, diseases) mentioned in the biomedical literature. Our approach is based on the adaptive selection of candidate interaction sentences, which are then parsed using our own dependency parser. Specific syntax-based filters are used to limit the number of possible candidate interacting pairs. The approach has been implemented as a demonstrator over a corpus of 2000 richly annotated MedLine abstracts, and later tested by participation in a text mining competition. In both cases, the results obtained have proved the adequacy of the proposed approach to the task of interaction detection.

2007

Pro3Gres Parser in the CoNLL Domain Adaptation Shared Task
Gerold Schneider | Kaarel Kaljurand | Fabio Rinaldi | Tobias Kuhn
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2004

Answering Questions in the Genomics Domain
Fabio Rinaldi | James Dowdall | Gerold Schneider | Andreas Persidis
Proceedings of the Conference on Question Answering in Restricted Domains

Fast, Deep-Linguistic Statistical Dependency Parsing
Gerold Schneider | Fabio Rinaldi | James Dowdall
Proceedings of the Workshop on Recent Advances in Dependency Grammar

A robust and hybrid deep-linguistic theory applied to large-scale parsing
Gerold Schneider | James Dowdall | Fabio Rinaldi
Proceedings of the 3rd workshop on RObust Methods in Analysis of Natural Language Data (ROMAND 2004)

2003

A low-complexity, broad-coverage probabilistic Dependency Parser for English
Gerold Schneider
Proceedings of the HLT-NAACL 2003 Student Research Workshop

2002

Using Syntactic Analysis to Increase Efficiency in Visualizing Text Collections
James Henderson | Paola Merlo | Ivan Petroff | Gerold Schneider
COLING 2002: The 19th International Conference on Computational Linguistics