John Judge


2022

pdf
Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization
Alex Mei | Anisha Kabir | Rukmini Bapat | John Judge | Tony Sun | William Yang Wang
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Neural text summarization has shown great potential in recent years. However, current state-of-the-art summarization models are limited by their maximum input length, posing a challenge to summarizing longer texts comprehensively. As part of a layered summarization architecture, we introduce PureText, a simple yet effective pre-processing layer that removes low- quality sentences in articles to improve existing summarization models. When evaluated on popular datasets like WikiHow and Reddit TIFU, we show up to 3.84 and 8.57 point ROUGE-1 absolute improvement on the full test set and the long article subset, respectively, for state-of-the-art summarization models such as BertSum and BART. Our approach provides downstream models with higher-quality sentences for summarization, improving overall model performance, especially on long text articles.

pdf
Mitigating Covertly Unsafe Text within Natural Language Systems
Alex Mei | Anisha Kabir | Sharon Levy | Melanie Subbiah | Emily Allaway | John Judge | Desmond Patton | Bruce Bimber | Kathleen McKeown | William Yang Wang
Findings of the Association for Computational Linguistics: EMNLP 2022

An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences. However, the degree of explicitness of a generated statement that can cause physical harm varies. In this paper, we distinguish types of text that can lead to physical harm and establish one particularly underexplored category: covertly unsafe text. Then, we further break down this category with respect to the system’s information and discuss solutions to mitigate the generation of text in each of these subcategories. Ultimately, our work defines the problem of covertly unsafe language that causes physical harm and argues that this subtle yet dangerous issue needs to be prioritized by stakeholders and regulators. We highlight mitigation strategies to inspire future researchers to tackle this challenging problem and help improve safety within smart systems.

2015

pdf
Tapadóir
Eimear Maguire | John Judge | Teresa Lynn
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf
Tapadóir
Eimear Maguire | John Judge | Teresa Lynn
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

2014

pdf bib
Proceedings of the First Celtic Language Technology Workshop
John Judge | Teresa Lynn | Monica Ward | Brian Ó Raghallaigh
Proceedings of the First Celtic Language Technology Workshop

pdf
Active Learning for Post-Editing Based Incrementally Retrained MT
Aswarth Abhilash Dara | Josef van Genabith | Qun Liu | John Judge | Antonio Toral
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

pdf
The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics. This paper documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.

2008

pdf
Linguistically Light Lexical Extensions for Ontologies
Brian Davis | Siegfried Handschuh | Alexander Troussov | John Judge | Mikhail Sogrin
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The identification of class instances within unstructured text for either the purposes of Ontology population or semantic annotation are usually limited to term mentions of Proper Noun and Personal Noun or fixed Key Phrases within Text Analytics or Ontology based Information Extraction(OBIE) applications. These systems do not generalize to cope with compound nominal classes of multi word expressions. Computational Linguistics’ approaches involving deep analysis tend to suffer from idiomaticity and overgeneration problems while the shallower “words with spaces” approach frequently employed in Information Extraction(IE) and Industrial Text Analytics systems lacks flexibility and is prone to lexical proliferation. We outline a representation for encoding light linguistic features of Compound Nominal term mentions of Concepts within an Ontology as well as a lightweight semantic annotator which complies the above linguistic information into efficient Dictionary formats to drive large scale identification and semantic annotation of the aforementioned concepts.

2006

pdf
QuestionBank: Creating a Corpus of Parse-Annotated Questions
John Judge | Aoife Cahill | Josef van Genabith
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics