Johan Bos

2024

pdf abs
Gaining More Insight into Neural Semantic Parsing with Challenging Benchmarks
Xiao Zhang | Chunliu Wang | Rik van Noord | Johan Bos
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

The Parallel Meaning Bank (PMB) serves as a corpus for semantic processing with a focus on semantic parsing and text generation. Currently, we witness an excellent performance of neural parsers and generators on the PMB. This might suggest that such semantic processing tasks have by and large been solved. We argue that this is not the case and that performance scores from the past on the PMB are inflated by non-optimal data splits and test sets that are too easy. In response, we introduce several changes. First, instead of the prior random split, we propose a more systematic splitting approach to improve the reliability of the standard test data. Second, except for the standard test set, we also propose two challenge sets: one with longer texts including discourse structure, and one that addresses compositional generalization. We evaluate five neural models for semantic parsing and meaning-to-text generation. Our results show that model performance declines (in some cases dramatically) on the challenge sets, revealing the limitations of neural models when confronting such challenges.

2023

pdf abs
Discourse Representation Structure Parsing for Chinese
Chunliu Wang | Xiao Zhang | Johan Bos
Proceedings of the 4th Natural Logic Meets Machine Learning Workshop

Previous work has predominantly focused on monolingual English semantic parsing. We, instead, explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations. We describe the pipeline of automatically collecting the linearized Chinese meaning representation data for sequential-to-sequential neural networks. We further propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance, where we aim to study Chinese parsing difficulties. Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs. Realizing Chinese parsing through machine translation and an English parser yields slightly lower performance than training a model directly on Chinese data.

In the last five years, there has been a significant focus in Natural Language Processing (NLP) on developing larger Pretrained Language Models (PLMs) and introducing benchmarks such as SuperGLUE and SQuAD to measure their abilities in language understanding, reasoning, and reading comprehension. These PLMs have achieved impressive results on these benchmarks, even surpassing human performance in some cases. This has led to claims of superhuman capabilities and the provocative idea that certain tasks have been solved. In this position paper, we take a critical look at these claims and ask whether PLMs truly have superhuman abilities and what the current benchmarks are really evaluating. We show that these benchmarks have serious limitations affecting the comparison between humans and PLMs and provide recommendations for fairer and more transparent benchmarks.

pdf abs
Pre-Trained Language-Meaning Models for Multilingual Parsing and Generation
Chunliu Wang | Huiyuan Lai | Malvina Nissim | Johan Bos
Findings of the Association for Computational Linguistics: ACL 2023

Pre-trained language models (PLMs) have achieved great success in NLP and have recently been used for tasks in computational semantics. However, these tasks do not fully benefit from PLMs since meaning representations are not explicitly included. We introduce multilingual pre-trained language-meaning models based on Discourse Representation Structures (DRSs), including meaning representations besides natural language texts in the same model, and design a new strategy to reduce the gap between the pre-training and fine-tuning objectives. Since DRSs are language neutral, cross-lingual transfer learning is adopted to further improve the performance of non-English tasks. Automatic evaluation results show that our approach achieves the best performance on both the multilingual DRS parsing and DRS-to-text generation tasks. Correlation analysis between automatic metrics and human judgements on the generation task further validates the effectiveness of our model. Human inspection reveals that out-of-vocabulary tokens are the main cause of erroneous results.

pdf abs
The Sequence Notation: Catching Complex Meanings in Simple Graphs
Johan Bos
Proceedings of the 15th International Conference on Computational Semantics

Current symbolic semantic representations proposed to capture the semantics of human language have served well to give us insight in how meaning is expressed. But they are either too complicated for large-scale annotation tasks or lack expressive power to play a role in inference tasks. What we propose is a meaning representation system that it is interlingual, model-theoretic, and variable-free. It divides the labour involved in representing meaning along three levels: concept, roles, and contexts. As natural languages are expressed as sequences of phonemes or words, the meaning representations that we propose are likewise sequential. However, the resulting meaning representations can also be visualised as directed acyclic graphs.

2022

pdf abs
Transparent Semantic Parsing with Universal Dependencies Using Graph Transformations
Wessel Poelman | Rik van Noord | Johan Bos
Proceedings of the 29th International Conference on Computational Linguistics

Even though many recent semantic parsers are based on deep learning methods, we should not forget that rule-based alternatives might offer advantages over neural approaches with respect to transparency, portability, and explainability. Taking advantage of existing off-the-shelf Universal Dependency parsers, we present a method that maps a syntactic dependency tree to a formal meaning representation based on Discourse Representation Theory. Rather than using lambda calculus to manage variable bindings, our approach is novel in that it consists of using a series of graph transformations. The resulting UD semantic parser shows good performance for English, German, Italian and Dutch, with F-scores over 75%, outperforming a neural semantic parser for the lower-resourced languages. Unlike neural semantic parsers, our UD semantic parser does not hallucinate output, is relatively easy to port to other languages, and is completely transparent.

This paper describes the continuation of a project that aims at establishing an interoperable annotation schema for quantification phenomena as part of the ISO suite of standards for semantic annotation, known as the Semantic Annotation Framework. After a break, caused by the Covid-19 pandemic, the project was relaunched in early 2022 with a second working draft of an annotation scheme, which is discussed in this paper. Keywords: semantic annotation, quantification, interoperability, annotation schema, ISO standard

2021

pdf abs
Evaluating Text Generation from Discourse Representation Structures
Chunliu Wang | Rik van Noord | Arianna Bisazza | Johan Bos
Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)

We present an end-to-end neural approach to generate English sentences from formal meaning representations, Discourse Representation Structures (DRSs). We use a rather standard bi-LSTM sequence-to-sequence model, work with a linearized DRS input representation, and evaluate character-level and word-level decoders. We obtain very encouraging results in terms of reference-based automatic metrics such as BLEU. But because such metrics only evaluate the surface level of generated output, we develop a new metric, ROSE, that targets specific semantic phenomena. We do this with five DRS generation challenge sets focusing on tense, grammatical number, polarity, named entities and quantities. The aim of these challenge sets is to assess the neural generator’s systematicity and generalization to unseen inputs.

pdf abs
Input Representations for Parsing Discourse Representation Structures: Comparing English with Chinese
Chunliu Wang | Rik van Noord | Arianna Bisazza | Johan Bos
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Neural semantic parsers have obtained acceptable results in the context of parsing DRSs (Discourse Representation Structures). In particular models with character sequences as input showed remarkable performance for English. But how does this approach perform on languages with a different writing system, like Chinese, a language with a large vocabulary of characters? Does rule-based tokenisation of the input help, and which granularity is preferred: characters, or words? The results are promising. Even with DRSs based on English, good results for Chinese are obtained. Tokenisation offers a small advantage for English, but not for Chinese. Overall, characters are preferred as input, both for English and Chinese.

pdf abs
Universal Discourse Representation Structure Parsing
Jiangming Liu | Shay B. Cohen | Mirella Lapata | Johan Bos
Computational Linguistics, Volume 47, Issue 2 - June 2021

We consider the task of crosslingual semantic parsing in the style of Discourse Representation Theory (DRT) where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide learning in other languages. We introduce 𝕌niversal Discourse Representation Theory (𝕌DRT), a variant of DRT that explicitly anchors semantic representations to tokens in the linguistic input. We develop a semantic parsing framework based on the Transformer architecture and utilize it to obtain semantic resources in multiple languages following two learning schemes. The many-to-one approach translates non-English text to English, and then runs a relatively accurate English parser on the translated text, while the one-to-many approach translates gold standard English to non-English text and trains multiple parsers (one per language) on the translations. Experimental results on the Parallel Meaning Bank show that our proposal outperforms strong baselines by a wide margin and can be used to construct (silver-standard) meaning banks for 99 languages.

pdf bib
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
Sina Zarrieß | Johan Bos | Rik van Noord | Lasha Abzianidze
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

2020

pdf abs
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT
Rik van Noord | Antonio Toral | Johan Bos
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing. Character representations can easily be added in a sequence-to-sequence model in either one encoder or as a fully separate encoder, with improvements that are robust to different language models, languages and data sets. For English, these improvements are larger than adding individual sources of linguistic information or adding non-contextual embeddings. A new method of analysis based on semantic tags demonstrates that the character-level representations improve performance across a subset of selected semantic phenomena.

The 2020 Shared Task at the Conference for Computational Language Learning (CoNLL) was devoted to Meaning Representation Parsing (MRP) across frameworks and languages. Extending a similar setup from the previous year, five distinct approaches to the representation of sentence meaning in the form of directed graphs were represented in the English training and evaluation data for the task, packaged in a uniform graph abstraction and serialization; for four of these representation frameworks, additional training and evaluation data was provided for one additional language per framework. The task received submissions from eight teams, of which two do not participate in the official ranking because they arrived after the closing deadline or made use of additional training data. All technical information regarding the task, including system submissions, official results, and links to supporting resources and software are available from the task web site at: http://mrp.nlpl.eu

pdf bib abs
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs
Lasha Abzianidze | Johan Bos | Stephan Oepen
Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing

Discourse Representation Theory (DRT) is a formal account for representing the meaning of natural language discourse. Meaning in DRT is modeled via a Discourse Representation Structure (DRS), a meaning representation with a model-theoretic interpretation, which is usually depicted as nested boxes. In contrast, a directed labeled graph is a common data structure used to encode semantics of natural language texts. The paper describes the procedure of dressing up DRSs as directed labeled graphs to include DRT as a new framework in the 2020 shared task on Cross-Framework and Cross-Lingual Meaning Representation Parsing. Since one of the goals of the shared task is to encourage unified models for several semantic graph frameworks, the conversion procedure was biased towards making the DRT graph framework somewhat similar to other graph-based meaning representation frameworks.

pdf bib abs
Separating Argument Structure from Logical Structure in AMR
Johan Bos
Proceedings of the Second International Workshop on Designing Meaning Representations

The AMR (Abstract Meaning Representation) formalism for representing meaning of natural language sentences puts emphasis on predicate-argument structure and was not designed to deal with scope and quantifiers. By extending AMR with indices for contexts and formulating constraints on these contexts, a formalism is derived that makes correct predictions for inferences involving negation and bound variables. The attractive core predicate-argument structure of AMR is preserved. The resulting framework is similar to the meaning representations of Discourse Representation Theory employed in the Parallel Meaning Bank.

pdf abs
MAGPIE: A Large Corpus of Potentially Idiomatic Expressions
Hessel Haagsma | Johan Bos | Malvina Nissim
Proceedings of the Twelfth Language Resources and Evaluation Conference

Given the limited size of existing idiom corpora, we aim to enable progress in automatic idiom processing and linguistic analysis by creating the largest-to-date corpus of idioms for English. Using a fixed idiom list, automatic pre-extraction, and a strictly controlled crowdsourced annotation procedure, we show that it is feasible to build a high-quality corpus comprising more than 50K instances, an order of a magnitude larger than previous resources. Crucial ingredients of crowdsourcing were the selection of crowdworkers, clear and comprehensive instructions, and an interface that breaks down the task in small, manageable steps. Analysis of the resulting corpus revealed strong effects of genre on idiom distribution, providing new evidence for existing theories on what influences idiom usage. The corpus also contains rich metadata, and is made publicly available.

2019

Large crowdsourced datasets are widely used for training and evaluating neural models on natural language inference (NLI). Despite these efforts, neural models have a hard time capturing logical inferences, including those licensed by phrase replacements, so-called monotonicity reasoning. Since no large dataset has been developed for monotonicity reasoning, it is still unclear whether the main obstacle is the size of datasets or the model architectures themselves. To investigate this issue, we introduce a new dataset, called HELP, for handling entailments with lexical and logical phenomena. We add it to training data for the state-of-the-art neural models and evaluate them on test sets for monotonicity phenomena. The results showed that our data augmentation improved the overall accuracy. We also find that the improvement is better on monotonicity inferences with lexical replacements than on downward inferences with disjunction and modification. This suggests that some types of inferences can be improved by our data augmentation while others are immune to it.

pdf abs
Linguistic Information in Neural Semantic Parsing with Multiple Encoders
Rik van Noord | Antonio Toral | Johan Bos
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

Recently, sequence-to-sequence models have achieved impressive performance on a number of semantic parsing tasks. However, they often do not exploit available linguistic resources, while these, when employed correctly, are likely to increase performance even further. Research in neural machine translation has shown that employing this information has a lot of potential, especially when using a multi-encoder setup. We employ a range of semantic and syntactic resources to improve performance for the task of Discourse Representation Structure Parsing. We show that (i) linguistic features can be beneficial for neural semantic parsing and (ii) the best method of adding these features is by using multiple encoders.

pdf bib
Proceedings of the IWCS Shared Task on Semantic Parsing
Lasha Abzianidze | Rik van Noord | Hessel Haagsma | Johan Bos
Proceedings of the IWCS Shared Task on Semantic Parsing

pdf bib abs
The First Shared Task on Discourse Representation Structure Parsing
Lasha Abzianidze | Rik van Noord | Hessel Haagsma | Johan Bos
Proceedings of the IWCS Shared Task on Semantic Parsing

The paper presents the IWCS 2019 shared task on semantic parsing where the goal is to produce Discourse Representation Structures (DRSs) for English sentences. DRSs originate from Discourse Representation Theory and represent scoped meaning representations that capture the semantics of negation, modals, quantification, and presupposition triggers. Additionally, concepts and event-participants in DRSs are described with WordNet synsets and the thematic roles from VerbNet. To measure similarity between two DRSs, they are represented in a clausal form, i.e. as a set of tuples. Participant systems were expected to produce DRSs in this clausal form. Taking into account the rich lexical information, explicit scope marking, a high number of shared variables among clauses, and highly-constrained format of valid DRSs, all these makes the DRS parsing a challenging NLP task. The results of the shared task displayed improvements over the existing state-of-the-art parser.

pdf bib abs
Thirty Musts for Meaning Banking
Lasha Abzianidze | Johan Bos
Proceedings of the First International Workshop on Designing Meaning Representations

Meaning banking—creating a semantically annotated corpus for the purpose of semantic parsing or generation—is a challenging task. It is quite simple to come up with a complex meaning representation, but it is hard to design a simple meaning representation that captures many nuances of meaning. This paper lists some lessons learned in nearly ten years of meaning annotation during the development of the Groningen Meaning Bank (Bos et al., 2017) and the Parallel Meaning Bank (Abzianidze et al., 2017). The paper’s format is rather unconventional: there is no explicit related work, no methodology section, no results, and no discussion (and the current snippet is not an abstract but actually an introductory preface). Instead, its structure is inspired by work of Traum (2000) and Bender (2013). The list starts with a brief overview of the existing meaning banks (Section 1) and the rest of the items are roughly divided into three groups: corpus collection (Section 2 and 3, annotation methods (Section 4–11), and design of meaning representations (Section 12–30). We hope this overview will give inspiration and guidance in creating improved meaning banks in the future

pdf abs
CCGweb: a New Annotation Tool and a First Quadrilingual CCG Treebank
Kilian Evang | Lasha Abzianidze | Johan Bos
Proceedings of the 13th Linguistic Annotation Workshop

We present the first open-source graphical annotation tool for combinatory categorial grammar (CCG), and the first set of detailed guidelines for syntactic annotation with CCG, for four languages: English, German, Italian, and Dutch. We also release a parallel pilot CCG treebank based on these guidelines, with 4x100 adjudicated sentences, 10K single-annotator fully corrected sentences, and 82K single-annotator partially corrected sentences.

Monotonicity reasoning is one of the important reasoning skills for any intelligent natural language inference (NLI) model in that it requires the ability to capture the interaction between lexical and syntactic structures. Since no test set has been developed for monotonicity reasoning with wide coverage, it is still unclear whether neural models can perform monotonicity reasoning in a proper way. To investigate this issue, we introduce the Monotonicity Entailment Dataset (MED). Performance by state-of-the-art NLI models on the new test set is substantially worse, under 55%, especially on downward reasoning. In addition, analysis using a monotonicity-driven data augmentation method showed that these models might be limited in their generalization ability in upward and downward reasoning.

2018

pdf abs
The Other Side of the Coin: Unsupervised Disambiguation of Potentially Idiomatic Expressions by Contrasting Senses
Hessel Haagsma | Malvina Nissim | Johan Bos
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

Disambiguation of potentially idiomatic expressions involves determining the sense of a potentially idiomatic expression in a given context, e.g. determining that make hay in ‘Investment banks made hay while takeovers shone.’ is used in a figurative sense. This enables automatic interpretation of idiomatic expressions, which is important for applications like machine translation and sentiment analysis. In this work, we present an unsupervised approach for English that makes use of literalisations of idiom senses to improve disambiguation, which is based on the lexical cohesion graph-based method by Sporleder and Li (2009). Experimental results show that, while literalisation carries novel information, its performance falls short of that of state-of-the-art unsupervised methods.

pdf
Evaluating Scoped Meaning Representations
Rik van Noord | Lasha Abzianidze | Hessel Haagsma | Johan Bos
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf abs
What can we learn from Semantic Tagging?
Mostafa Abdou | Artur Kulmizev | Vinit Ravishankar | Lasha Abzianidze | Johan Bos
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We investigate the effects of multi-task learning using the recently introduced task of semantic tagging. We employ semantic tagging as an auxiliary task for three different NLP tasks: part-of-speech tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the learning what to share setting where negative transfer between tasks is less likely. Our findings show considerable improvements for all tasks, particularly in the learning what to share setting which shows consistent gains across all tasks.

pdf abs
Exploring Neural Methods for Parsing Discourse Representation Structures
Rik van Noord | Lasha Abzianidze | Antonio Toral | Johan Bos
Transactions of the Association for Computational Linguistics, Volume 6

Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on formal semantics. We present a sequence-to-sequence neural semantic parser that is able to produce Discourse Representation Structures (DRSs) for English sentences with high accuracy, outperforming traditional DRS parsers. To facilitate the learning of the output, we represent DRSs as a sequence of flat clauses and introduce a method to verify that produced DRSs are well-formed and interpretable. We compare models using characters and words as input and see (somewhat surprisingly) that the former performs better than the latter. We show that eliminating variable names from the output using De Bruijn indices increases parser performance. Adding silver training data boosts performance even further.

2017

pdf abs
The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing
Rik van Noord | Johan Bos
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We evaluate a semantic parser based on a character-based sequence-to-sequence model in the context of the SemEval-2017 shared task on semantic parsing for AMRs. With data augmentation, super characters, and POS-tagging we gain major improvements in performance compared to a baseline character-level model. Although we improve on previous character-based neural semantic parsing models, the overall accuracy is still lower than a state-of-the-art AMR parser. An ensemble combining our neural semantic parser with an existing, traditional parser, yields a small gain in performance.

pdf abs
The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
Lasha Abzianidze | Johannes Bjerva | Kilian Evang | Hessel Haagsma | Rik van Noord | Pierre Ludmann | Duc-Duy Nguyen | Johan Bos
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.

pdf abs
Meaning Banking beyond Events and Roles
Johan Bos
Proceedings of the Workshop Computational Semantics Beyond Events and Roles

In this talk I will discuss the analysis of several semantic phenomena that need meaning representations that can describe attributes of propositional contexts. I will do this in a version of Discourse Representation Theory, using a universal semantic tagset developed as part of a project that aims to produce a large meaning bank (a semantically-annotated corpus) for four languages (English, Dutch, German and Italian).

pdf bib
Towards Universal Semantic Tagging
Lasha Abzianidze | Johan Bos
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers

pdf
Indexicals and Compositionality: Inside-Out or Outside-In?
Johan Bos
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers

pdf
Dealing with Co-reference in Neural Semantic Parsing
Rik van Noord | Johan Bos
Proceedings of the 2nd Workshop on Semantic Deep Learning (SemDeep-2)

2016

pdf abs
Cross-lingual Learning of an Open-domain Semantic Parser
Kilian Evang | Johan Bos
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a method for learning semantic CCG parsers by projecting annotations via a parallel corpus. The method opens an avenue towards cheaply creating multilingual semantic parsers mapping open-domain text to formal meaning representations. A first cross-lingually learned Dutch (from English) semantic parser obtains f-scores ranging from 42.99% to 69.22% depending on the level of label informativity taken into account, compared to 58.40% to 78.88% for the underlying source-language system. These are promising numbers compared to state-of-the-art semantic parsing in open domains.

pdf abs
Semantic Tagging with Deep Residual Networks
Johannes Bjerva | Barbara Plank | Johan Bos
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a novel semantic tagging task, semtagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations, and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically on the new task of semantic tagging, as well as on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3).

pdf
The Meaning Factory at SemEval-2016 Task 8: Producing AMRs with Boxer
Johannes Bjerva | Johan Bos | Hessel Haagsma
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

pdf bib
Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images
Manuela Hürlimann | Johan Bos
Proceedings of the 5th Workshop on Vision and Language

pdf
Squib: Expressive Power of Abstract Meaning Representations
Johan Bos
Computational Linguistics, Volume 42, Issue 3 - September 2016

2015

pdf
Uncovering Noun-Noun Compound Relations by Gamification
Johan Bos | Malvina Nissim
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015)

pdf
Open-Domain Semantic Parsing with Boxer
Johan Bos
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015)

pdf
Adding Semantics to Data-Driven Paraphrasing
Ellie Pavlick | Johan Bos | Malvina Nissim | Charley Beller | Benjamin Van Durme | Chris Callison-Burch
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014)
Johan Bos | Anette Frank | Roberto Navigli
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014)

pdf
RoBox: CCG with Structured Perceptron for Supervised Semantic Parsing of Robotic Spatial Commands
Kilian Evang | Johan Bos
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf
The Meaning Factory: Formal Semantics for Recognizing Textual Entailment and Determining Semantic Similarity
Johannes Bjerva | Johan Bos | Rob van der Goot | Malvina Nissim
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib abs
Is there a place for logic in recognizing textual entailment
Johan Bos
Linguistic Issues in Language Technology, Volume 9, 2014 - Perspectives on Semantic Representations for Textual Inference

From a purely theoretical point of view, it makes sense to approach recognizing textual entailment (RTE) with the help of logic. After all, entailment matters are all about logic. In practice, only few RTE systems follow the bumpy road from words to logic. This is probably because it requires a combination of robust, deep semantic analysis and logical inference—and why develop something with this complexity if you perhaps can get away with something simpler? In this article, with the help of an RTE system based on Combinatory Categorial Grammar, Discourse Representation Theory, and first-order theorem proving, we make an empirical assessment of the logic-based approach. High precision paired with low recall is a key characteristic of this system. The bottleneck in achieving high recall is the lack of a systematic way to produce relevant background knowledge. There is a place for logic in RTE, but it is (still) overshadowed by the knowledge acquisition problem.

2013

pdf bib
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Tutorials)
Johan Bos | Keith Hall
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Tutorials)

pdf
Parsimonious Semantic Representations with Projection Pointers
Noortje J. Venhuizen | Johan Bos | Harm Brouwer
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers

pdf
Scope Disambiguation as a Tagging Task
Kilian Evang | Johan Bos
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

pdf
Gamification for Word Sense Labeling
Noortje J. Venhuizen | Valerio Basile | Kilian Evang | Johan Bos
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

pdf bib
Aligning Formal Meaning Representations with Surface Strings for Wide-Coverage Text Generation
Valerio Basile | Johan Bos
Proceedings of the 14th European Workshop on Natural Language Generation

pdf bib
The Groningen Meaning Bank
Johan Bos
Proceedings of the Joint Symposium on Semantic Processing. Textual Inference and Structures in Corpora

pdf
Elephant: Sequence Labeling for Word and Sentence Segmentation
Kilian Evang | Valerio Basile | Grzegorz Chrupała | Johan Bos
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

pdf bib
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)
Eneko Agirre | Johan Bos | Mona Diab | Suresh Manandhar | Yuval Marton | Deniz Yuret
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf
UGroningen: Negation detection with Discourse Representation Structures
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf abs
Developing a large semantically annotated corpus
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the starting ingredients for fulfilling this enterprise. The result is a semantic resource that anyone can edit and that integrates various phenomena, including predicate-argument structure, scope, tense, thematic roles, rhetorical relations and presuppositions, into a single semantic formalism: Discourse Representation Theory. Taking texts rather than sentences as the units of annotation results in deep semantic representations that incorporate discourse structure and dependencies. To manage the various (possibly conflicting) annotations provided by experts and non-experts, we introduce a method that stores ``Bits of Wisdom'' in a database as stand-off annotations.

pdf
Predicting the 2011 Dutch Senate Election Results with Twitter
Erik Tjong Kim Sang | Johan Bos
Proceedings of the Workshop on Semantic Analysis in Social Media

pdf
A platform for collaborative semantic annotation
Valerio Basile | Johan Bos | Kilian Evang | Noortje Venhuizen
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

Whats the best way to assess the performance of a semantic component in an NLP system? Tradition in NLP evaluation tells us that comparing output against a gold standard is a good idea. To define a gold standard, one first needs to decide on the representation language, and in many cases a first-order language seems a good compromise between expressive power and efficiency. Secondly, one needs to decide how to represent the various semantic phenomena, in particular the depth of analysis of quantification, plurals, eventualities, thematic roles, scope, anaphora, presupposition, ellipsis, comparatives, superlatives, tense, aspect, and time-expressions. Hence it will be hard to come up with an annotation scheme unless one permits different level of semantic granularity. The alternative is a theory-neutral black-box type evaluation where we just look at how systems react on various inputs. For this approach, we can consider the well-known task of recognising textual entailment, or the lesser-known task of textual model checking. The disadvantage of black-box methods is that it is difficult to come up with natural data that cover specific semantic phenomena.