David Weir

Also published as: D. J. Weir, David J. Weir, David Wei


2021

Data Augmentation for Hypernymy Detection
Thomas Kober | Julie Weeds | Lorenzo Bertolini | David Weir
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The automatic detection of hypernymy relationships represents a challenging problem in NLP. The successful application of state-of-the-art supervised approaches using distributed representations has generally been impeded by the limited availability of high-quality training data. We have developed two novel data augmentation techniques which generate new training examples from existing ones. First, we combine the linguistic principles of hypernym transitivity and intersective modifier-noun composition to generate additional pairs of vectors, such as “small dog - dog” or “small dog - animal”, for which a hypernymy relationship can be assumed. Second, we use generative adversarial networks (GANs) to generate pairs of vectors for which the hypernymy relation can also be assumed. We furthermore present two complementary strategies for extending an existing dataset by leveraging linguistic resources such as WordNet. Using an evaluation across 3 different datasets for hypernymy detection and 2 different vector spaces, we demonstrate that both of the proposed automatic data augmentation and dataset extension strategies substantially improve classifier performance.

Representing Syntax and Composition with Geometric Transformations
Lorenzo Bertolini | Julie Weeds | David Weir | Qiwei Peng
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Structure-aware Sentence Encoder in Bert-Based Siamese Network
Qiwei Peng | David Weir | Julie Weeds
Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)

Recently, impressive performance on various natural language understanding tasks has been achieved by explicitly incorporating syntactic and semantic information into pre-trained models, such as BERT and RoBERTa. However, this approach depends on problem-specific fine-tuning, and as widely noted, BERT-like models exhibit weak performance, and are inefficient, when applied to unsupervised similarity comparison tasks. Sentence-BERT (SBERT) has been proposed as a general-purpose sentence embedding method, suited to both similarity comparison and downstream tasks. In this work, we show that by incorporating structural information into SBERT, the resulting model outperforms SBERT and previous general sentence encoders on unsupervised semantic textual similarity (STS) datasets and transfer classification tasks.

2020

Leveraging HTML in Free Text Web Named Entity Recognition
Colin Ashby | David Weir
Proceedings of the 28th International Conference on Computational Linguistics

HTML tags are typically discarded in free text Named Entity Recognition from Web pages. We investigate whether these discarded tags might be used to improve NER performance. We compare Text+Tags sentences with their Text-Only equivalents, over five datasets, two free text segmentation granularities and two NER models. We find an increased F1 performance for Text+Tags of between 0.9% and 13.2% over all datasets, variants and models. This performance increase, over datasets of varying entity types, HTML density and construction quality, indicates our method is flexible and adaptable. These findings imply that a similar technique might be of use in other Web-aware NLP tasks, including the enrichment of deep language models.

2017

Improving Semantic Composition with Offset Inference
Thomas Kober | Julie Weeds | Jeremy Reffin | David Weir
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Count-based distributional semantic models suffer from sparsity due to unobserved but plausible co-occurrences in any text collection. This problem is amplified for models such as Anchored Packed Trees (APTs) that take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition.

One Representation per Word - Does it make Sense for Composition?
Thomas Kober | Julie Weeds | John Wilkie | Jeremy Reffin | David Weir
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications

In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf single-vector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vector models perform as well as or better than multi-sense vector models despite arguably less clean elementary representations. Our findings furthermore show that simple composition functions such as pointwise addition are able to recover sense-specific information from a single-sense vector model remarkably well.

When a Red Herring is Not a Red Herring: Using Compositional Methods to Detect Non-Compositional Phrases
Julie Weeds | Thomas Kober | Jeremy Reffin | David Weir
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Non-compositional phrases such as red herring and weakly compositional phrases such as spelling bee are an integral part of natural language (Sag, 2002). They are also the phrases that are difficult, or even impossible, for good compositional distributional models of semantics. Compositionality detection therefore provides a good testbed for compositional methods. We compare an integrated compositional distributional approach, using sparse high dimensional representations, with the ad-hoc compositional approach of applying simple composition operations to state-of-the-art neural embeddings.

2016

A critique of word similarity as a method for evaluating distributional semantic models
Miroslav Batchkarov | Thomas Kober | Jeremy Reffin | Julie Weeds | David Weir
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP

Improving Sparse Word Representations with Distributional Inference for Semantic Composition
Thomas Kober | Julie Weeds | Jeremy Reffin | David Weir
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics
David Weir | Julie Weeds | Jeremy Reffin | Thomas Kober
Computational Linguistics, Volume 42, Issue 4 - December 2016

2015

Optimising Agile Social Media Analysis
Thomas Kober | David Weir
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

2014

Distributional Composition using Higher-Order Dependency Vectors
Julie Weeds | David Weir | Jeremy Reffin
Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)

Learning to Predict Distributions of Words Across Domains
Danushka Bollegala | David Weir | John Carroll
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Learning to Distinguish Hypernyms and Co-Hyponyms
Julie Weeds | Daoud Clarke | Jeremy Reffin | David Weir | Bill Keller
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Method51 for Mining Insight from Social Media Datasets
Simon Wibberley | David Weir | Jeremy Reffin
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations

2013

Language Technology for Agile Social Media Science
Simon Wibberley | David Weir | Jeremy Reffin
Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

2011

Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification
Danushka Bollegala | David Weir | John Carroll
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Dependency Parsing Schemata and Mildly Non-Projective Dependency Parsing
Carlos Gómez-Rodríguez | John Carroll | David Weir
Computational Linguistics, Volume 37, Issue 3 - September 2011

Algebraic Approaches to Compositional Distributional Semantics
Daoud Clarke | David Weir | Rudi Lutz
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

2010

Semantic Composition with Quotient Algebras
Daoud Clarke | Rudi Lutz | David Weir
Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics

2009

Optimal Reduction of Rule Length in Linear Context-Free Rewriting Systems
Carlos Gómez-Rodríguez | Marco Kuhlmann | Giorgio Satta | David Weir
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Parsing Mildly Non-Projective Dependency Structures
Carlos Gómez-Rodríguez | David Weir | John Carroll
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

A Deductive Approach to Dependency Parsing
Carlos Gómez-Rodríguez | John Carroll | David Weir
Proceedings of ACL-08: HLT

2007

Modelling control in generation
Roger Evans | David Weir | John Carroll | Daniel Paiva | Anja Belz
Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07)

2005

The Distributional Similarity of Sub-Parses
Julie Weeds | David Weir | Bill Keller
Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment

Co-occurrence Retrieval: A Flexible Framework for Lexical Distributional Similarity
Julie Weeds | David Weir
Computational Linguistics, Volume 31, Number 4, December 2005

2004

Characterising Measures of Lexical Distributional Similarity
Julie Weeds | David Weir | Diana McCarthy
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

A General Framework for Distributional Similarity
Julie Weeds | David Weir
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing

2002

Evaluation of LTAG Parsing with Supertag Compaction
Olga Shaumyan | John Carroll | David Weir
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

Class-Based Probability Estimation Using a Semantic Hierarchy
Stephen Clark | David Weir
Computational Linguistics, Volume 28, Number 2, June 2002

2001

D-Tree Substitution Grammars
Owen Rambow | K. Vijay-Shanker | David Weir
Computational Linguistics, Volume 27, Number 1, March 2001

Class-Based Probability Estimation Using a Semantic Hierarchy
Stephen Clark | David Weir
Second Meeting of the North American Chapter of the Association for Computational Linguistics

2000

Engineering a Wide-Coverage Lexicalized Grammar
John Carroll | Nicolas Nicolov | Olga Shaumyan | Martine Smets | David Weir
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

A Class-based Probabilistic approach to Structural Disambiguation
Stephen Clark | David Weir
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

An Iterative Approach to Estimating Frequencies over a Semantic Hierarchy
Stephen Clark | David Weir
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

Parsing with an Extended Domain of Locality
John Carroll | Nicolas Nicolov | Olga Shaumyan | Martine Smets | David Weir
Ninth Conference of the European Chapter of the Association for Computational Linguistics

1998

The LexSys project
John Carroll | Nicolas Nicolov | Olga Shaumyan | Martine Smets | David Weir
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

A Structure-sharing Parser for Lexicalized Grammars
Roger Evans | David Weir
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

A structure-sharing parser for lexicalized grammars
Roger Evans | David Weir
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

1997

Encoding Frequency Information in Lexicalized Grammars
John Carroll | David Weir
Proceedings of the Fifth International Workshop on Parsing Technologies

We address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. We consider systematically a number of alternative probabilistic frameworks, evaluating their adequacy from both a theoretical and empirical perspective using data from existing large treebanks. We also propose three orthogonal approaches for backing off probability estimates to cope with the large number of parameters involved.

Automaton-based Parsing for Lexicalised Grammars
Roger Evans | David Weir
Proceedings of the Fifth International Workshop on Parsing Technologies

In wide-coverage lexicalized grammars many of the elementary structures have substructures in common. This means that during parsing some of the computation associated with different structures is duplicated. This paper explores ways in which the grammar can be precompiled into finite state automata so that some of this shared structure results in shared computation at run-time.

1995

Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
Roger Evans | Gerald Gazdar | David Weir
33rd Annual Meeting of the Association for Computational Linguistics

D-Tree Grammars
Owen Rambow | K. Vijay-Shanker | David Weir
33rd Annual Meeting of the Association for Computational Linguistics

A Tractable Extension of Linear Indexed Grammars
Bill Keller | David Weir
Seventh Conference of the European Chapter of the Association for Computational Linguistics

Parsing D-Tree Grammars
K. Vijay-Shanker | David Weir | Owen Rambow
Proceedings of the Fourth International Workshop on Parsing Technologies

1993

Parsing Some Constrained Grammar Formalisms
K. Vijay-Shanker | David J. Weir
Computational Linguistics, Volume 19, Number 4, December 1993

1992

Linear Context-Free Rewriting Systems and Deterministic Tree-Walking Transducers
David J. Weir
30th Annual Meeting of the Association for Computational Linguistics

1990

Polynomial Time Parsing of Combinatory Categorial Grammars
K. Vijay-Shanker | David J. Weir
28th Annual Meeting of the Association for Computational Linguistics

Multicomponent Tree Adjoining Grammars
David Weir
Proceedings of the First International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+1)

Parallel TAG Parsing on the Connection Machine
Michael Palis | David Wei
Proceedings of the First International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+1)

1989

Recognition of Combinatory Categorial Grammars and Linear Indexed Grammars
K. Vijay-Shanker | David J. Weir
Proceedings of the First International Workshop on Parsing Technologies

1988

Combinatory Categorial Grammars: Generative Power and Relationship to Linear Context-Free Rewriting Systems
David J. Weir | Aravind K. Joshi
26th Annual Meeting of the Association for Computational Linguistics

1987

Characterizing Structural Descriptions Produced by Various Grammatical Formalisms
K. Vijay-Shanker | David J. Weir | Aravind K. Joshi
25th Annual Meeting of the Association for Computational Linguistics

1986

The Relationship Between Tree Adjoining Grammars And Head Grammars
D. J. Weir | K. Vijay-Shanker | A. K. Joshi
24th Annual Meeting of the Association for Computational Linguistics

Tree Adjoining and Head Wrapping
K. Vijay-Shanker | David J. Weir | Aravind K. Joshi
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics