Judita Preiss


2023

Automatic Named Entity Obfuscation in Speech
Judita Preiss
Findings of the Association for Computational Linguistics: ACL 2023

Sharing data containing personal information often requires its anonymization, even when consent for sharing was obtained from the data originator. While approaches exist for automated anonymization of text, the area is not as thoroughly explored in speech. This work focuses on identifying named entities, finding replacements for them, and inserting the replacements, synthesized using voice cloning, into the original audio, thereby retaining prosodic information while reducing the likelihood of deanonymization. The approach employs a novel named entity recognition (NER) system built directly on speech by training HuBERT (Hsu et al., 2021) using the English speech NER dataset (Yadav et al., 2020). Name substitutes are found using a masked language model and are synthesized using text-to-speech voice cloning (Eren and team, 2021), upon which the substitute named entities are re-inserted into the original text. The approach is prototyped on a sample of the LibriSpeech corpus (Panayotov et al., 2015) with each step evaluated individually.

2021

Predicting Informativeness of Semantic Triples
Judita Preiss
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Many automatic semantic relation extraction tools extract subject-predicate-object triples from unstructured text. However, a large quantity of these triples merely represent background knowledge. We explore using full texts of biomedical publications to create a training corpus of informative and important semantic triples based on the notion that the main contributions of an article are summarized in its abstract. This corpus is used to train a deep learning classifier to identify important triples, and we suggest that an importance ranking for semantic triples could also be generated.

2018

HiDE: a Tool for Unrestricted Literature Based Discovery
Judita Preiss | Mark Stevenson
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

As the quantity of publications increases daily, researchers are forced to narrow their attention to their own specialism and are therefore less likely to make new connections with other areas. Literature based discovery (LBD) supports the identification of such connections. A number of LBD tools are available; however, they often suffer from limitations such as constraining possible searches or not producing results in real time. We introduce HiDE (Hidden Discovery Explorer), an online knowledge browsing tool which allows fast access to hidden knowledge generated from all abstracts in Medline. HiDE is fast enough to allow users to explore the full range of hidden connections generated by an LBD system. The tool employs a novel combination of two approaches to LBD: a graph-based approach which allows hidden knowledge to be generated on a large scale, and an inference algorithm to identify the most promising (most likely to be non-trivial) information. Available at https://skye.shef.ac.uk/kdisc
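
As a rough illustration of what "hidden connections" means in a graph-based setting, the sketch below finds terms that are linked to a starting term only through an intermediate term. It is a generic toy example of the LBD idea, not HiDE's algorithm: the function name `hidden_connections`, the pair-based input format, and the toy term pairs are all assumptions made for illustration.

```python
from collections import defaultdict


def hidden_connections(pairs, source):
    """Return {C: {B, ...}} for terms C reachable from `source` only via some B.

    `pairs` is an iterable of (term, term) links, e.g. co-occurrences
    extracted from abstracts (hypothetical input format).
    """
    # Build an undirected graph of directly linked terms.
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)

    direct = graph[source]
    hidden = defaultdict(set)
    for b in direct:
        for c in graph[b]:
            # Keep C only if it is not the source and not already directly linked.
            if c != source and c not in direct:
                hidden[c].add(b)
    return dict(hidden)


# Toy links, invented for illustration only.
PAIRS = [
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "Raynaud's disease"),
    ("fish oil", "platelet aggregation"),
    ("platelet aggregation", "Raynaud's disease"),
    ("fish oil", "omega-3"),
]

result = hidden_connections(PAIRS, "fish oil")
```

Here `result` maps each candidate hidden term to the intermediate terms that connect it to the source; a real system would additionally need a ranking or filtering step to surface the non-trivial candidates.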

2014

Seeking Informativeness in Literature Based Discovery
Judita Preiss
Proceedings of BioNLP 2014

2013

Unsupervised Domain Tuning to Improve Word Sense Disambiguation
Judita Preiss | Mark Stevenson
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

DALE: A Word Sense Disambiguation System for Biomedical Documents Trained using Automatically Labeled Examples
Judita Preiss | Mark Stevenson
Proceedings of the 2013 NAACL HLT Demonstration Session

Distinguishing Common and Proper Nouns
Judita Preiss | Mark Stevenson
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

2012

Scaling up WSD with Automatically Generated Examples
Weiwei Cheng | Judita Preiss | Mark Stevenson
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

University_Of_Sheffield: Two Approaches to Semantic Text Similarity
Sam Biggins | Shaabi Mohammed | Sam Oakley | Luke Stringer | Mark Stevenson | Judita Preiss
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

Identifying Comparable Corpora Using LDA
Judita Preiss
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2009

Refining the most frequent sense baseline
Judita Preiss | Jon Dehdari | Josh King | Dennis Mehay
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

HMMs, GRs, and N-Grams as Lexical Substitution Techniques – Are They Portable to Other Languages?
Judita Preiss | Andrew Coonce | Brittany Baker
Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning

2007

A System for Large-Scale Acquisition of Verbal, Nominal and Adjectival Subcategorization Frames from Corpora
Judita Preiss | Ted Briscoe | Anna Korhonen
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2004

WSD for subcategorization acquisition task description
Judita Preiss | Anna Korhonen
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

Probabilistic WSD in Senseval-3
Judita Preiss
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

Can Anaphoric Definite Descriptions be Replaced by Pronouns?
Judita Preiss | Caroline Gasperin | Ted Briscoe
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Intermediate Parsing for Anaphora Resolution? Implementing the Lappin and Leass non-coreference filters
Judita Preiss | Ted Briscoe
Proceedings of the 2003 EACL Workshop on The Computational Treatment of Anaphora

Using Grammatical Relations to Compare Parsers
Judita Preiss
10th Conference of the European Chapter of the Association for Computational Linguistics

Improving Subcategorization Acquisition Using Word Sense Disambiguation
Anna Korhonen | Judita Preiss
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

Improving Subcategorization Acquisition with WSD
Judita Preiss | Anna Korhonen
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

Subcategorization Acquisition as an Evaluation Method for WSD
Judita Preiss | Anna Korhonen | Ted Briscoe
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems
Judita Preiss | David Yarowsky
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

Disambiguating Noun and Verb Senses Using Automatically Acquired Selectional Preferences
Diana McCarthy | John Carroll | Judita Preiss
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

Anaphora Resolution with Word Sense Disambiguation
Judita Preiss
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems