Gerald Penn
2025
Inside-Outside Algorithm for Probabilistic Product-Free Lambek Categorial Grammar
Jinman Zhao | Gerald Penn
Proceedings of the 31st International Conference on Computational Linguistics
The inside-outside algorithm is widely used in statistical models related to context-free grammars. It plays a key role in the EM estimation of probabilistic context-free grammars. In this work, we introduce an inside-outside algorithm for Probabilistic Lambek Categorial Grammar (PLCG).
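As a point of reference, the sketch below implements the classical inside-outside computation for a PCFG in Chomsky normal form, the context-free setting that the paper generalizes to PLCG. The toy grammar encoding (the `binary` and `lexical` dictionaries) is an illustrative assumption, not the paper's data structures.

```python
# Minimal inside-outside for a CNF PCFG. Grammar encoding (hypothetical):
#   binary[(A, B, C)] = P(A -> B C),  lexical[(A, w)] = P(A -> w)
from collections import defaultdict

def inside_outside(words, binary, lexical, start="S"):
    n = len(words)
    inside = defaultdict(float)   # inside[(i, j, A)] = P(A =>* words[i:j])
    # Inside pass: width-1 spans from lexical rules, wider spans bottom-up.
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                inside[(i, i + 1, A)] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    inside[(i, j, A)] += p * inside[(i, k, B)] * inside[(k, j, C)]
    # Outside pass: top-down, seeded by the start symbol over the whole span.
    outside = defaultdict(float)
    outside[(0, n, start)] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    out = outside[(i, j, A)]
                    if out == 0.0:
                        continue
                    outside[(i, k, B)] += p * out * inside[(k, j, C)]
                    outside[(k, j, C)] += p * out * inside[(i, k, B)]
    return inside, outside

# Expected rule counts for EM are proportional to inside * outside / P(sentence).
```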
Tiny Budgets, Big Gains: Parameter Placement Strategy in Parameter Super-Efficient Fine-Tuning
Jinman Zhao | Xueyan Zhang | Jiaru Li | Jingcheng Niu | Yulan Hu | Erxue Min | Gerald Penn
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
In this work, we propose FoRA-UA, a novel method that, using only 1–5% of the standard LoRA’s parameters, achieves state-of-the-art performance across a wide range of tasks. Specifically, we explore scenarios with extremely limited parameter budgets and derive two key insights: (1) fixed-size sparse frequency representations approximate small matrices more accurately; and (2) with a fixed number of trainable parameters, introducing a smaller intermediate representation to approximate larger matrices results in lower construction error. These findings form the foundation of our FoRA-UA method. By inserting a small intermediate parameter set, we achieve greater model compression without sacrificing performance. We evaluate FoRA-UA across diverse tasks, including natural language understanding (NLU), natural language generation (NLG), instruction tuning, and image classification, demonstrating strong generalisation and robustness under extreme compression.
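The first insight can be checked numerically. The sketch below is an illustration under our own assumptions, not the FoRA-UA code: it compares a rank-r truncated SVD against keeping the same number of 2-D DCT coefficients, i.e., a fixed-size sparse frequency representation (index-storage overhead for the sparse coefficients is ignored).

```python
# Equal-budget comparison: rank-r SVD vs. the k largest 2-D DCT coefficients.
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))     # stand-in for a small weight update
r = 2
budget = r * (64 + 64)                # parameter count of the rank-r factors

# Rank-r SVD baseline.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_svd = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]

# Sparse-DCT approximation with the same number of stored values.
C = dctn(W, norm="ortho")
keep = np.argsort(np.abs(C).ravel())[-budget:]   # largest coefficients
C_sparse = np.zeros_like(C).ravel()
C_sparse[keep] = C.ravel()[keep]
W_dct = idctn(C_sparse.reshape(C.shape), norm="ortho")

print("SVD error:", np.linalg.norm(W - W_svd))
print("DCT error:", np.linalg.norm(W - W_dct))   # lower on this toy input
```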
Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity
Lei Yu | Jingcheng Niu | Zining Zhu | Xi Chen | Gerald Penn
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
In this paper, we introduce DiscoGP, a novel framework for extracting self-contained modular units, or sheaves, within neural language models (LMs). Sheaves extend the concept of functional circuits, a unit widely explored in interpretability research, by considering not only subsets of edges in an LM’s computation graph but also the model’s weight parameters. Our framework identifies sheaves through a gradient-based pruning algorithm that operates on both of these in such a way as to reduce the original LM to a sparse skeleton that preserves certain core capabilities. Experimental results demonstrate that, across a range of linguistic and reasoning tasks, DiscoGP extracts sheaves that preserve 93-100% of a model’s performance on the identified task while comprising only 1-7% of the original weights and connections. Furthermore, our analysis reveals that, compared to previously identified LM circuits, the sheaves discovered by DiscoGP exhibit superior modularity and functional fidelity. Extending our method to the neuron level also unveils novel insights into the inner workings of LLMs.
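A schematic picture of this kind of gradient-based mask pruning, sketched in PyTorch under our own assumptions (DiscoGP's actual objective and mask parameterization may differ): frozen weights are multiplied by learnable 0/1 masks trained with a straight-through estimator plus a sparsity penalty, and the surviving entries form the extracted unit.

```python
# Learnable-mask pruning sketch: optimize masks, not weights.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.weight = linear.weight.detach()        # frozen original weights
        self.bias = linear.bias.detach() if linear.bias is not None else None
        self.logits = nn.Parameter(torch.zeros_like(self.weight))  # mask params

    def forward(self, x):
        soft = torch.sigmoid(self.logits)
        hard = (soft > 0.5).float()
        # Straight-through estimator: hard 0/1 mask forward, soft gradient back.
        mask = hard + soft - soft.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)

def sparsity_penalty(masked_modules):
    # Drives most mask entries toward zero; weighted against the task loss.
    return sum(torch.sigmoid(m.logits).mean() for m in masked_modules)

# Training loop sketch: minimize task_loss + lam * sparsity_penalty(modules),
# then read off the surviving mask entries as the extracted "sheaf".
```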
Multi-Agent Based Character Simulation for Story Writing
Tian Yu | Ken Shi | Zixin Zhao | Gerald Penn
Proceedings of the Fourth Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2025)
This work proposes a novel multi-agent story-generation system that writes stories from a narrative plan. Traditional approaches tend to generate a section of text directly from its outline. Our system, by contrast, divides this elaboration process into role-play and rewrite steps, where the former step enacts the story in chronological order with LLM-backed character agents, and the latter step refines the role-play result to align with a narrative plan. We show that the stories produced by our system are preferable to two other LLM-based story-generation approaches. We attribute this advancement to the benefits of incorporating a character-based simulation strategy.
An Analysis of Scoring Methods for Reranking in Large Language Model Story Generation
Megan Deering | Gerald Penn
Proceedings of the Fourth Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2025)
Outline-conditioned story generation using Large Language Models (LLMs) offers a promising approach for automating narrative creation. Some outline-conditioned story generation methods use automatic scoring during the generation process in order to improve the story quality. However, current research has shown that automatic scoring is not ideal for assessing story quality. This paper evaluates three proposed automatic story-scoring methods to improve the reranking of outputs during the generation process. These scoring methods leverage different prompting strategies and fine-tuning techniques to enhance the accuracy and relevance of the assessments. By experimenting with these approaches within a beam search framework, we aim to identify the most effective methods for optimizing story-generation outcomes. While we have found no significant overall difference between these methods in terms of their agreement with human ratings during story generation, the overall story ratings by human evaluators are average. These findings motivate the need for improved automatic scoring techniques and datasets while also indicating that simpler, more easily implementable scoring methods for reranking perform comparably to more complex approaches.
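For concreteness, a minimal outline of reranking inside beam search, the framework the paper uses to compare scoring methods. `generate_candidates` and `score_story` are hypothetical stand-ins for an LLM sampler and one of the proposed scoring methods, so this is a sketch rather than runnable end-to-end code.

```python
# Score-based reranking inside beam search (stand-in functions are hypothetical).
def beam_search_rerank(outline, beam_width=3, expansions=6):
    beams = [("", 0.0)]                    # (story so far, cumulative score)
    for section in outline:
        candidates = []
        for story, score in beams:
            for cont in generate_candidates(story, section, n=expansions):
                extended = story + cont
                candidates.append((extended, score + score_story(extended)))
        # Rerank all expansions and keep the top `beam_width` continuations.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return max(beams, key=lambda c: c[1])[0]
```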
An Efficient Parser for Bounded-Order Product-Free Lambek Categorial Grammar via Term Graph
Jinman Zhao | Gerald Penn
Proceedings of the 18th International Conference on Parsing Technologies (IWPT, SyntaxFest 2025)
Lambek Categorial Grammar (LCG) parsing has been proved to be an NP-complete problem. However, in the bounded-order case, the complexity can be reduced to polynomial time. (CITATION) first introduced the term graph, a simple graphical representation for LCG parsing, but his algorithm for using it remained largely inscrutable. (CITATION) later proposed a polynomial algorithm for bounded-order LCG parsing based on cyclic linear logic, yet both approaches remain largely theoretical, with no open-source implementations available. In this work, we combine the term-graph representation with insights from cyclic linear logic to develop a novel parsing algorithm for bounded-order LCG. Furthermore, we release our parser as an open-source tool.
CCG Revisited: A Multilingual Empirical Study of the Kuhlmann-Satta Algorithm
Paul He | Gerald Penn
Proceedings of the 18th International Conference on Parsing Technologies (IWPT, SyntaxFest 2025)
We revisit the polynomial-time CCG parsing algorithm introduced by Kuhlmann & Satta (2014), and provide a publicly available implementation of it. We evaluate its empirical performance against a naive CKY-style parser across the Parallel Meaning Bank (PMB) corpus. The fast parser is slightly slower on average relative to the size of the PMB, but the trend improves as a function of sentence length, and the PMB is large enough to witness an inversion. Our analysis quantifies this crossover and highlights the importance of derivational context decomposition in practical parsing scenarios.
Similarity, Transformation and the Newly Found Invariance of Influence Functions
Andrew Yuan Liu | Gerald Penn
Proceedings of the Society for Computation in Linguistics 2025
Semantic Masking in a Needle-in-a-haystack Test for Evaluating Large Language Model Long-Text Capabilities
Ken Shi | Gerald Penn
Proceedings of the First Workshop on Writing Aids at the Crossroads of AI, Cognitive Science and NLP (WRAICOGS 2025)
In this paper, we introduce the concept of Semantic Masking, where semantically coherent surrounding text (the haystack) interferes with the retrieval and comprehension of specific information (the needle) embedded within it. We propose the Needle-in-a-Haystack-QA Test, an evaluation pipeline that assesses LLMs’ long-text capabilities through question answering, explicitly accounting for the Semantic Masking effect. We conduct experiments to demonstrate that Semantic Masking significantly impacts LLM performance more than text length does. By accounting for Semantic Masking, we provide a more accurate assessment of LLMs’ true proficiency in utilizing extended contexts, paving the way for future research to develop models that are not only capable of handling longer inputs but are also adept at navigating complex semantic landscapes.
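A schematic version of such a pipeline, under our own assumptions rather than the paper's code; `query_llm` is a hypothetical stand-in for any chat-model API call.

```python
# Needle-in-a-Haystack-QA sketch: embed a needle at a chosen depth, then ask
# a question whose answer is only in the needle.
def build_test(haystack_sentences, needle, question, depth=0.5):
    pos = int(len(haystack_sentences) * depth)
    doc = haystack_sentences[:pos] + [needle] + haystack_sentences[pos:]
    return " ".join(doc) + f"\n\nQuestion: {question}\nAnswer:"

def run_trial(haystack_sentences, needle, question, answer, depth):
    reply = query_llm(build_test(haystack_sentences, needle, question, depth))
    return answer.lower() in reply.lower()   # simple containment check

# Varying the *semantic coherence* of the haystack (on-topic vs. off-topic
# sentences) while holding length fixed isolates the masking effect.
```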
2024
ConTempo: A Unified Temporally Contrastive Framework for Temporal Relation Extraction
Jingcheng Niu | Saifei Liao | Victoria Ng | Simon De Montigny | Gerald Penn
Findings of the Association for Computational Linguistics: ACL 2024
The task of temporal relation extraction (TRE) involves identifying and extracting temporal relations between events from narratives. We identify two primary issues with TRE systems. First, by formulating TRE as a simple text classification task where every temporal relation is independent, it is hard to enhance the TRE model’s representation of the meaning of temporal relations, and its facility with the underlying temporal calculus. We address this issue by proposing a novel Temporally Contrastive learning model (ConTempo) that increases the model’s awareness of the meaning of temporal relations by leveraging their symmetric or antisymmetric properties. Second, the reusability of innovations has been limited due to incompatibilities in model architectures. Therefore, we propose a unified framework and show that ConTempo is compatible with all three main branches of TRE research. Our results demonstrate that the performance gains of ConTempo are more pronounced, with the total combination achieving state-of-the-art performance on the widely used MATRES and TBD corpora. We furthermore identified and corrected a large number of annotation errors present in the test set of MATRES, after which the performance increase brought by ConTempo becomes more apparent.
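One way such a symmetry-aware signal could look, sketched under our own assumptions (the ConTempo loss itself may differ): a consistency term that ties the model's distribution for a swapped event pair to the converse-permuted distribution for the original pair, e.g. rel(e1, e2) = BEFORE forces rel(e2, e1) = AFTER.

```python
# Converse-consistency loss sketch for temporal relation classification.
import torch
import torch.nn.functional as F

CONVERSE = {0: 1, 1: 0, 2: 2, 3: 3}  # BEFORE<->AFTER; EQUAL, VAGUE symmetric

def temporal_consistency_loss(logits_fwd, logits_rev):
    # logits_fwd: scores for (e1, e2); logits_rev: scores for (e2, e1).
    perm = torch.tensor([CONVERSE[i] for i in range(logits_fwd.size(-1))])
    # The reversed pair's distribution should match the converse-permuted one.
    log_p_fwd = F.log_softmax(logits_fwd[..., perm], dim=-1)
    p_rev = F.softmax(logits_rev, dim=-1)
    return F.kl_div(log_p_fwd, p_rev, reduction="batchmean")
```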
LLM-supertagger: Categorial Grammar Supertagging via Large Language Models
Jinman Zhao | Gerald Penn
Findings of the Association for Computational Linguistics: EMNLP 2024
Supertagging is an essential task in categorial grammar parsing and is crucial for dissecting sentence structures. Our research explores the capacity of Large Language Models (LLMs) in supertagging for both Combinatory Categorial Grammar (CCG) and Lambek Categorial Grammar (LCG). We also present a simple method that significantly boosts LLMs, enabling them to outperform LSTM and encoder-based models and achieve state-of-the-art performance. This advancement highlights LLMs’ potential in classification tasks, showcasing their adaptability beyond generative capabilities. Our findings demonstrate the evolving utility of LLMs in natural language processing, particularly in complex tasks like supertagging.
A Generative Model for Lambek Categorial Sequents
Jinman Zhao | Gerald Penn
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
In this work, we introduce a generative model, PLC+, for generating Lambek Categorial Grammar (LCG) sequents. We also introduce a simple method to numerically estimate the model’s parameters from an annotated corpus. Then we compare our model with probabilistic context-free grammars (PCFGs) and show that PLC+ simultaneously assigns a higher probability to a common corpus, and has greater coverage.
LCGbank: A Corpus of Syntactic Analyses Based on Proof Nets
Aditya Bhargava | Timothy A. D. Fowler | Gerald Penn
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
In syntactic parsing, *proof nets* are graphical structures that have the advantageous property of invariance to spurious ambiguities. Semantically-equivalent derivations correspond to a single proof net. Recent years have seen fresh interest in statistical syntactic parsing with proof nets, including the development of methods based on neural networks. However, training of statistical parsers requires corpora that provide ground-truth syntactic analyses. Unfortunately, there has been a paucity of corpora in formalisms for which proof nets are applicable, such as Lambek categorial grammar (LCG), a formalism related to combinatory categorial grammar (CCG). To address this, we leverage CCGbank and the relationship between LCG and CCG to develop LCGbank, an English-language corpus of syntactic analyses based on LCG proof nets. In contrast to CCGbank, LCGbank eschews type-changing and uses only categorial rules; the syntactic analyses thus provide fully compositional semantics, exploiting the transparency between syntax and semantics that so characterizes categorial grammars.
Proceedings of TextGraphs-17: Graph-based Methods for Natural Language Processing
Dmitry Ustalov | Yanjun Gao | Alexander Panchenko | Elena Tutubalina | Irina Nikishina | Arti Ramesh | Andrey Sakhovskiy | Ricardo Usbeck | Gerald Penn | Marco Valentino
Proceedings of TextGraphs-17: Graph-based Methods for Natural Language Processing
2023
Decomposed scoring of CCG dependencies
Aditya Bhargava | Gerald Penn
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
In statistical parsing with CCG, the standard evaluation method is based on predicate-argument structure and evaluates dependencies labelled in part by lexical categories. When a predicate has multiple argument slots that can be filled, the same lexical category is used for the label of multiple dependencies. In this paper, we show that this evaluation can result in disproportionate penalization of supertagging errors and obfuscate the truly erroneous dependencies. Enabled by the compositional nature of CCG lexical categories, we propose *decomposed scoring* based on subcategorial labels to address this. To evaluate our scoring method, we engage fellow categorial grammar researchers in two English-language judgement tasks: (1) directly ranking the outputs of the standard and experimental scoring methods; and (2) determining which of two sentences has the better parse in cases where the two scoring methods disagree on their ranks. Overall, the judges prefer decomposed scoring in each task; but there is substantial disagreement among the judges in 24% of the given cases, pointing to potential issues with parser evaluations in general.
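A toy contrast between the two scoring regimes, with a hypothetical dependency encoding (the paper's exact label format may differ):

```python
# Standard CCG dependency scoring labels a dependency with the full lexical
# category plus argument slot; decomposed scoring keeps only the relevant
# subcategorial part. Field names here are illustrative assumptions.
def standard_label(dep):
    return (dep["head"], dep["arg"], dep["category"], dep["slot"])

def decomposed_label(dep):
    # e.g. for category "(S\NP)/NP", slot 2, keep just the filled argument "NP"
    return (dep["head"], dep["arg"], dep["subcategory"])

def f1(gold, pred, label):
    g, p = {label(d) for d in gold}, {label(d) for d in pred}
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# With standard labels, one wrong supertag invalidates *every* dependency that
# mentions that category; decomposed labels penalize only the affected slot.
```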
Discourse Information for Document-Level Temporal Dependency Parsing
Jingcheng Niu | Victoria Ng | Erin Rees | Simon De Montigny | Gerald Penn
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
In this study, we examine the benefits of incorporating discourse information into document-level temporal dependency parsing. Specifically, we evaluate the effectiveness of integrating both high-level discourse profiling information, which describes the discourse function of sentences, and surface-level sentence position information into temporal dependency graph (TDG) parsing. Unexpectedly, our results suggest that simple sentence position information, particularly when encoded using our novel sentence-position embedding method, performs the best, perhaps because it does not rely on noisy model-generated feature inputs. Our proposed system surpasses the current state-of-the-art TDG parsing systems in performance. Furthermore, we aim to broaden the discussion on the relationship between temporal dependency parsing and discourse analysis, given the substantial similarities shared between the two tasks. We argue that discourse analysis results should not be merely regarded as an additional input feature for temporal dependency parsing. Instead, adopting advanced discourse analysis techniques and research insights can lead to more effective and comprehensive approaches to temporal information extraction tasks.
2022
Using Roark-Hollingshead Distance to Probe BERT’s Syntactic Competence
Jingcheng Niu | Wenjie Lu | Eric Corlett | Gerald Penn
Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Probing BERT’s general ability to reason about syntax is no simple endeavour, primarily because of the uncertainty surrounding how large language models represent syntactic structure. Many prior accounts of BERT’s agility as a syntactic tool (Clark et al., 2013; Lau et al., 2014; Marvin and Linzen, 2018; Chowdhury and Zamparelli, 2018; Warstadt et al., 2019, 2020; Hu et al., 2020) have therefore confined themselves to studying very specific linguistic phenomena, and there has still been no definitive answer as to whether BERT “knows” syntax. The advent of perturbed masking (Wu et al., 2020) would then seem to be significant, because this is a parameter-free probing method that directly samples syntactic trees from BERT’s embeddings. These sampled trees outperform a right-branching baseline, thus providing preliminary evidence that BERT’s syntactic competence bests a simple baseline. This baseline is underwhelming, however, and our reappraisal below suggests that this result, too, is inconclusive. We propose RH Probe, an encoder-decoder probing architecture that operates on two probing tasks. We find strong empirical evidence confirming the existence of important syntactic information in BERT, but this information alone appears not to be enough to reproduce syntax in its entirety. Our probe makes crucial use of a conjecture made by Roark and Hollingshead (2008) that a particular lexical annotation that we shall call RH distance is a sufficient encoding of unlabelled binary syntactic trees, and we prove this conjecture.
Does BERT Rediscover a Classical NLP Pipeline?
Jingcheng Niu | Wenjie Lu | Gerald Penn
Proceedings of the 29th International Conference on Computational Linguistics
Does BERT store surface knowledge in its bottom layers, syntactic knowledge in its middle layers, and semantic knowledge in its upper layers? In re-examining Jawahar et al. (2019) and Tenney et al.’s (2019a) probes into the structure of BERT, we have found that the pipeline-like separation that they asserted lacks conclusive empirical support. BERT’s structure is, however, linguistically founded, although perhaps in a way that is more nuanced than can be explained by layers alone. We introduce a novel probe, called GridLoc, through which we can also take into account token positions, training rounds, and random seeds. Using GridLoc, we are able to detect other, stronger regularities that suggest that pseudo-cognitive appeals to layer depth may not be the preferable mode of explanation for BERT’s inner workings.
A Taxonomical NLP Blueprint to Support Financial Decision Making through Information-Centred Interactions
Siavash Kazemian | Cosmin Munteanu | Gerald Penn
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
Investment management professionals (IMPs) often make decisions after manual analysis of text transcripts of central banks’ conferences or companies’ earning calls. Their current software tools, while interactive, largely leave users unassisted in using these transcripts. A key component to designing speech and NLP techniques for this community is to qualitatively characterize their perceptions of AI as well as their legitimate needs so as to (1) better apply existing NLP methods, (2) direct future research and (3) correct IMPs’ perceptions of what AI is capable of. This paper presents such a study, through a contextual inquiry with eleven IMPs, uncovering their information practices when using such transcripts. We then propose a taxonomy of user requirements and usability criteria to support IMP decision making, and validate the taxonomy through participatory design workshops with four IMPs. Our investigation suggests that: (1) IMPs view visualization methods and natural language processing algorithms primarily as time-saving tools that are incapable of enhancing either discovery or interpretation and (2) their existing software falls well short of the state of the art in both visualization and NLP.
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing
Dmitry Ustalov | Yanjun Gao | Alexander Panchenko | Marco Valentino | Mokanarangan Thayaparan | Thien Huu Nguyen | Gerald Penn | Arti Ramesh | Abhik Jana
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing
2021
Feature Structures in the Wild: A Case Study in Mixing Traditional Linguistic Knowledge Representation with Neural Language Models
Gerald Penn | Ken Shi
Proceedings of the ESSLLI 2021 Workshop on Computing Semantics with Types, Frames and Related Structures
Reanalyzing the Most Probable Sentence Problem: A Case Study in Explicating the Role of Entropy in Algorithmic Complexity
Eric Corlett | Gerald Penn
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
When working with problems in natural language processing, we can find ourselves in situations where the traditional measurements of descriptive complexity are ineffective at describing the behaviour of our algorithms. It is easy to see why — the models we use are often general frameworks into which difficult-to-define tasks can be embedded. These frameworks can have more power than we typically use, and so complexity measures such as worst-case running time can drastically overestimate the cost of running our algorithms. In particular, they can make an apparently tractable problem seem NP-complete. Using empirical studies to evaluate performance is a necessary but incomplete method of dealing with this mismatch, since these studies no longer act as a guarantee of good performance. In this paper we use statistical measures such as entropy to give an updated analysis of the complexity of the NP-complete Most Probable Sentence problem for pCFGs, which can then be applied to word sense disambiguation and inference tasks. We can bound both the running time and the error in a simple search algorithm, allowing for a much faster search than the NP-completeness of this problem would suggest.
The Chinese Remainder Theorem for Compact, Task-Precise, Efficient and Secure Word Embeddings
Patricia Thaine | Gerald Penn
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
The growing availability of powerful mobile devices and other edge devices, together with increasing regulatory and security concerns about the exchange of personal information across networks of these devices has challenged the Computational Linguistics community to develop methods that are at once fast, space-efficient, accurate and amenable to secure encoding schemes such as homomorphic encryption. Inspired by recent work that restricts floating point precision to speed up neural network training in hardware-based SIMD, we have developed a method for compressing word vector embeddings into integers using the Chinese Remainder Theorem that speeds up addition by up to 48.27% and at the same time compresses GloVe word embedding libraries by up to 25.86%. We explore the practicality of this simple approach by investigating the trade-off between precision and performance in two NLP tasks: compositional semantic relatedness and opinion target sentiment classification. We find that in both tasks, lowering floating point number precision results in negligible changes to performance.
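The core number-theoretic trick is easy to demonstrate. The sketch below packs several small integers, standing in for quantized embedding components, into one integer with the Chinese Remainder Theorem and recovers them exactly; the moduli shown are illustrative choices, not the paper's.

```python
# CRT packing/unpacking of small integers (requires Python 3.8+ for pow(a, -1, m)).
from math import prod

def crt_pack(residues, moduli):
    # Find x with x = r_i (mod m_i) for all i; moduli must be pairwise coprime.
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)     # pow(Mi, -1, m) = modular inverse
    return x % M

def crt_unpack(x, moduli):
    return [x % m for m in moduli]

moduli = [251, 253, 255, 256]            # pairwise coprime
vals = [17, 200, 3, 129]                 # e.g. quantized vector components
packed = crt_pack(vals, moduli)
assert crt_unpack(packed, moduli) == vals
# Adding packed integers corresponds to componentwise modular addition, which
# is the source of the speed-up for embedding arithmetic.
```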
Proof Net Structure for Neural Lambek Categorial Parsing
Aditya Bhargava | Gerald Penn
Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021)
In this paper, we present the first statistical parser for Lambek categorial grammar (LCG), a grammatical formalism for which the graphical proof method known as *proof nets* is applicable. Our parser incorporates proof net structure and constraints into a system based on self-attention networks via novel model elements. Our experiments on an English LCG corpus show that incorporating term graph structure is helpful to the model, improving both parsing accuracy and coverage. Moreover, we derive novel loss functions by expressing proof net constraints as differentiable functions of our model output, enabling us to train our parser without ground-truth derivations.
A Generative Process for Lambek Categorial Proof Nets
Jinman Zhao | Gerald Penn
Proceedings of the 17th Meeting on the Mathematics of Language
Statistically Evaluating Social Media Sentiment Trends towards COVID-19 Non-Pharmaceutical Interventions with Event Studies
Jingcheng Niu | Erin Rees | Victoria Ng | Gerald Penn
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
In the midst of a global pandemic, understanding the public’s opinion of their government’s policy-level, non-pharmaceutical interventions (NPIs) is a crucial component of the health-policy-making process. Prior work on CoViD-19 NPI sentiment analysis by the epidemiological community has proceeded without a method for properly attributing sentiment changes to events, an ability to distinguish the influence of various events across time, a coherent model for predicting the public’s opinion of future events of the same sort, or even a means of conducting significance tests. We argue here that this urgently needed evaluation method does already exist. In the financial sector, event studies of the fluctuations in a publicly traded company’s stock price are commonplace for determining the effects of earnings announcements, product placements, etc. The same method is suitable for analysing temporal sentiment variation in the light of policy-level NPIs. We provide a case study of Twitter sentiment towards policy-level NPIs in Canada. Our results confirm a generally positive connection between the announcements of NPIs and Twitter sentiment, and we document a promising correlation between the results of this study and a public-health survey of popular compliance with NPIs.
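For readers unfamiliar with event studies, the sketch below shows the standard abnormal-return machinery transplanted to a daily sentiment series, under textbook assumptions rather than the paper's exact specification.

```python
# Event study on a daily sentiment series: abnormal sentiment is the deviation
# from a mean estimated on a pre-event window, cumulated over the event window
# and t-tested against zero.
import numpy as np
from scipy import stats

def event_study(sentiment, event_day, est_window=30, event_window=5):
    est = sentiment[event_day - est_window:event_day]
    expected, sd = est.mean(), est.std(ddof=1)
    abnormal = sentiment[event_day:event_day + event_window] - expected
    car = abnormal.sum()                       # cumulative abnormal sentiment
    t = car / (sd * np.sqrt(event_window))     # simple t-statistic
    p = 2 * (1 - stats.t.cdf(abs(t), df=est_window - 1))
    return car, t, p

# Usage: event_study(daily_sentiment_array, event_day=120) for an NPI announced
# on day 120 of the series.
```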
Structural Realization with GGNNs
Jinman Zhao | Gerald Penn | Huan Ling
Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15)
In this paper, we define an abstract task called structural realization that generates words given a prefix of words and a partial representation of a parse tree. We also present a method for solving instances of this task using a Gated Graph Neural Network (GGNN). We evaluate it with standard accuracy measures, as well as with respect to perplexity, in which its comparison to previous work on language modelling serves to quantify the information added to a lexical selection task by the presence of syntactic knowledge. That the addition of parse-tree-internal nodes to this neural model should improve the model, with respect both to accuracy and to more conventional measures such as perplexity, may seem unsurprising, but previous attempts have not met with nearly as much success. We have also learned that transverse links through the parse tree compromise the model’s accuracy at generating adjectival and nominal parts of speech.
2020
Grammaticality and Language Modelling
Jingcheng Niu | Gerald Penn
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems
Ever since Pereira (2000) provided evidence against Chomsky’s (1957) conjecture that statistical language modelling is incommensurable with the aims of grammaticality prediction as a research enterprise, a new area of research has emerged that regards statistical language models as “psycholinguistic subjects” and probes their ability to acquire syntactic knowledge. The advent of The Corpus of Linguistic Acceptability (CoLA; Warstadt et al., 2019) has earned acceptability judgements a spot on the leaderboard, and the polemic between Lau et al. (2017) and Sprouse et al. (2018) has raised fundamental questions about the nature of grammaticality and how acceptability judgements should be elicited. All the while, we are told that neural language models continue to improve. That is not an easy claim to test at present, however, because there is almost no agreement on how to measure their improvement when it comes to grammaticality and acceptability judgements. The GLUE leaderboard bundles CoLA together with a Matthews correlation coefficient (MCC), probably because CoLA’s seminal publication used it to compute inter-rater reliabilities. Researchers working in this area have used other accuracy and correlation scores, often driven by a need to reconcile and compare various discrete and continuous variables with each other. The score that we will advocate for in this paper, the point biserial correlation, in fact compares a discrete variable (for us, acceptability judgements) to a continuous variable (for us, neural language model probabilities). The only previous work in this area to choose the PBC that we are aware of is Sprouse et al. (2018a), and that paper actually applied it backwards (with some justification) so that the language model probability was treated as the discrete binary variable by setting a threshold. With the PBC in mind, we will first reappraise some recent work in syntactically targeted linguistic evaluations (Hu et al., 2020), arguing that while their experimental design sets a new high watermark for this topic, their results may not prove what they have claimed. We then turn to the task-independent assessment of language models as grammaticality classifiers. Prior to the introduction of the GLUE leaderboard, the vast majority of this assessment was essentially anecdotal, and we find the use of the MCC in this regard to be problematic. We conduct several studies with PBCs to compare several popular language models. We also study the effects of several variables such as normalization and data homogeneity on PBC.
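The advocated statistic is directly available in SciPy; the sketch below computes it on toy values (binary acceptability judgements against continuous language-model scores).

```python
# Point-biserial correlation: a binary variable against a continuous one.
import numpy as np
from scipy import stats

acceptable = np.array([1, 1, 0, 1, 0, 0, 1, 0])        # binary judgements
logprob = np.array([-12.1, -15.3, -44.0, -18.2, -39.5,
                    -41.7, -14.8, -35.0])               # LM sentence scores

r, p = stats.pointbiserialr(acceptable, logprob)
print(f"r_pb = {r:.3f}, p = {p:.4f}")
# Equivalent to Pearson's r with one dichotomous variable; unlike an MCC-based
# evaluation, no threshold on the LM probabilities is needed.
```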
Temporal Histories of Epidemic Events (THEE): A Case Study in Temporal Annotation for Public Health
Jingcheng Niu | Victoria Ng | Gerald Penn | Erin E. Rees
Proceedings of the Twelfth Language Resources and Evaluation Conference
We present a new temporal annotation standard, THEE-TimeML, and a corpus TheeBank enabling precise temporal information extraction (TIE) for event-based surveillance (EBS) systems in the public health domain. Current EBS must estimate the occurrence time of each event based on coarse document metadata such as document publication time. Because of the complicated language and narration style of news articles, estimated case outbreak times are often inaccurate or even erroneous. Thus, it is necessary to create annotation standards and corpora to facilitate the development of TIE systems in the public health domain to address this problem. We will discuss the adaptations that have proved necessary for this domain as we present THEE-TimeML and TheeBank. Finally, we document the corpus annotation process, and demonstrate the immediate benefit to public health applications brought by the annotations.
FAB: The French Absolute Beginner Corpus for Pronunciation Training
Sean Robertson | Cosmin Munteanu | Gerald Penn
Proceedings of the Twelfth Language Resources and Evaluation Conference
We introduce the French Absolute Beginner (FAB) speech corpus. The corpus is intended for the development and study of Computer-Assisted Pronunciation Training (CAPT) tools for absolute beginner learners. Data were recorded during two experiments focusing on using a CAPT system in paired role-play tasks. The setting grants FAB three distinguishing features from other non-native corpora: the experimental setting is ecologically valid, closing the gap between training and deployment; it features a label set based on teacher feedback, allowing for context-sensitive CAPT; and data have been primarily collected from absolute beginners, a group often ignored. Participants did not read prompts, but instead recalled and modified dialogues that were modelled in videos. Unable to distinguish modelled words solely from viewing videos, speakers often uttered unintelligible or out-of-L2 words. The corpus is split into three partitions: one from an experiment with minimal feedback; another with explicit, word-level feedback; and a third with supplementary read-and-record data. A subset of words in the first partition has been labelled as more or less native, with inter-annotator agreement reported. In the explicit feedback partition, labels are derived from the experiment’s online feedback. The FAB corpus is scheduled to be made freely available by the end of 2020.
Supertagging with CCG primitives
Aditya Bhargava | Gerald Penn
Proceedings of the 5th Workshop on Representation Learning for NLP
In CCG and other highly lexicalized grammars, supertagging a sentence’s words with their lexical categories is a critical step for efficient parsing. Because of the high degree of lexicalization in these grammars, the lexical categories can be very complex. Existing approaches to supervised CCG supertagging treat the categories as atomic units, even when the categories are not simple; when they encounter words with categories unseen during training, their guesses are accordingly unsophisticated. In this paper, we make use of the primitives and operators that constitute the lexical categories of categorial grammars. Instead of opaque labels, we treat lexical categories themselves as linear sequences. We present an LSTM-based model that replaces standard word-level classification with prediction of a sequence of primitives, similarly to LSTM decoders. Our model obtains state-of-the-art word accuracy for single-task English CCG supertagging, increases parser coverage and F1, and is able to produce novel categories. Analysis shows a synergistic effect between this decomposed view and incorporation of prediction history.
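The decomposed view is easy to illustrate: a supertag such as (S[dcl]\NP)/NP becomes a sequence of primitives and operators for a decoder to predict step by step. The tokenizer below is our own illustrative assumption, not the paper's model.

```python
# Decompose a CCG lexical category string into primitives and operators.
import re

TOKEN = re.compile(r"[A-Z]+(?:\[[a-z]+\])?|[\\/()]")

def decompose(category: str):
    return TOKEN.findall(category)

print(decompose("(S[dcl]\\NP)/NP"))
# -> ['(', 'S[dcl]', '\\', 'NP', ')', '/', 'NP']
```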
2019
Rationally Reappraising ATIS-based Dialogue Systems
Jingcheng Niu | Gerald Penn
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
The Air Travel Information Service (ATIS) corpus has been the most common benchmark for evaluating Spoken Language Understanding (SLU) tasks for more than three decades since it was released. Recent state-of-the-art neural models have obtained F1-scores near 98% on the task of slot filling. We developed a rule-based grammar for the ATIS domain that achieves a 95.82% F1-score on our evaluation set. In the process, we furthermore discovered numerous shortcomings in the ATIS corpus annotation, which we have fixed. This paper presents a detailed account of these shortcomings, our proposed repairs, our rule-based grammar and the neural slot-filling architectures associated with ATIS. We also rationally reappraise the motivations for choosing a neural architecture in view of this account. Fixing the annotation errors results in a relative error reduction of between 19.4% and 52% across all architectures. We nevertheless argue that neural models must play a different role in ATIS dialogues because of the latter’s lack of variety.
Proceedings of the 16th Meeting on the Mathematics of Language
Philippe de Groote | Frank Drewes | Gerald Penn
Proceedings of the 16th Meeting on the Mathematics of Language
2017
Vowel and Consonant Classification through Spectral Decomposition
Patricia Thaine | Gerald Penn
Proceedings of the First Workshop on Subword and Character Level Models in NLP
We consider two related problems in this paper. Given an undeciphered alphabetic writing system or mono-alphabetic cipher, determine: (1) which of its letters are vowels and which are consonants; and (2) whether the writing system is a vocalic alphabet or an abjad. We are able to show that a very simple spectral decomposition based on character co-occurrences provides nearly perfect performance with respect to answering both question types.
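A compact reconstruction of this style of spectral decomposition, under stated assumptions (the paper's exact matrix and decomposition may differ): build a letter-adjacency co-occurrence matrix and split letters by sign in a leading singular vector, exploiting the tendency of vowels and consonants to alternate.

```python
# Spectral vowel/consonant classification from character co-occurrences.
import numpy as np

def classify_letters(text):
    text = text.lower()
    letters = sorted(set(c for c in text if c.isalpha()))
    idx = {c: i for i, c in enumerate(letters)}
    M = np.zeros((len(letters), len(letters)))
    for a, b in zip(text, text[1:]):
        if a.isalpha() and b.isalpha():
            M[idx[a], idx[b]] += 1
    U, s, Vt = np.linalg.svd(M)
    # On a large enough corpus, the second left singular vector tends to
    # separate the two alternating letter classes.
    return {c: ("A" if U[i, 1] > 0 else "B") for c, i in idx.items()}

# Usage: classify_letters(open("corpus.txt").read()); the labels "A"/"B"
# recover vowels vs. consonants up to relabelling.
```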
2016
Evaluating Sentiment Analysis in the Context of Securities Trading
Siavash Kazemian | Shunan Zhao | Gerald Penn
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2014
Unsupervised Sentence Enhancement for Automatic Summarization
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Evaluating Sentiment Analysis Evaluation: A Case Study in Securities Trading
Siavash Kazemian | Shunan Zhao | Gerald Penn
Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
2013
Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The mathematics of language learning
András Kornai | Gerald Penn | James Rogers | Anssi Yli-Jyrä
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Tutorials)
Why Letter Substitution Puzzles are Not Hard to Solve: A Case Study in Entropy and Probabilistic Search-Complexity
Eric Corlett | Gerald Penn
Proceedings of the 13th Meeting on the Mathematics of Language (MoL 13)
2012
Flexible Structural Analysis of Near-Meet-Semilattices for Typed Unification-Based Grammar Design
Rouzbeh Farahmand | Gerald Penn
Proceedings of COLING 2012
On Panini and the Generative Capacity of Contextualized Replacement Systems
Gerald Penn | Paul Kiparsky
Proceedings of COLING 2012: Posters
Evaluating Distributional Models of Semantics for Syntactically Invariant Inference
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Unsupervised Detection of Downward-Entailing Operators By Maximizing Classification Certainty
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH
Miriam Butt | Sheelagh Carpendale | Gerald Penn | Jelena Prokić | Michael Cysouw
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH
Ecological Validity and the Evaluation of Speech Summarization Quality
Anthony McCallum | Cosmin Munteanu | Gerald Penn | Xiaodan Zhu
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization
2011
Indexing Spoken Documents with Hierarchical Semantic Structures: Semantic Tree-to-string Alignment Models
Xiaodan Zhu | Colin Cherry | Gerald Penn
Proceedings of 5th International Joint Conference on Natural Language Processing
2010
The Quantitative Study of Writing Systems
Gerald Penn
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Conférences invitées
Imposing Hierarchical Browsing Structures onto Spoken Documents
Xiaodan Zhu | Colin Cherry | Gerald Penn
Coling 2010: Posters
Utilizing Extra-Sentential Context for Parsing
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Ron Kaplan | Jill Burstein | Mary Harper | Gerald Penn
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Entity-Based Local Coherence Modelling Using Topological Fields
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Accurate Context-Free Parsing with Combinatory Categorial Grammar
Timothy A. D. Fowler | Gerald Penn
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
An Exact A* Method for Deciphering Letter-Substitution Ciphers
Eric Corlett | Gerald Penn
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices
Matthew Skala | Victoria Krakovna | János Kramár | Gerald Penn
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
2009
Topological Field Parsing of German
Jackie Chi Kit Cheung | Gerald Penn
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
Summarizing multiple spoken documents: finding evidence from untranscribed audio
Xiaodan Zhu | Gerald Penn | Frank Rudzicz
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data
Cosmin Munteanu | Gerald Penn | Xiaodan Zhu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
2008
A Critical Reassessment of Evaluation Baselines for Speech Summarization
Gerald Penn | Xiaodan Zhu
Proceedings of ACL-08: HLT
Interactive Visualization for Computational Linguistics
Christopher Collins | Gerald Penn | Sheelagh Carpendale
Tutorial Abstracts of ACL-08: HLT
Proceedings of the Workshop on Parsing German
Sandra Kübler | Gerald Penn
Proceedings of the Workshop on Parsing German
2006
Quantitative Methods for Classifying Writing Systems
Gerald Penn | Travis Choma
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Comparing the roles of textual, acoustic and spoken-language features on spontaneous-conversation summarization
Xiaodan Zhu | Gerald Penn
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Control Strategies for Parsing with Freer Word-Order Languages
Gerald Penn | Stefan Banjevic | Michael Demko
Proceedings of the Third Workshop on Constraints and Language Processing
2004
Optimizing Typed Feature Structure Grammar Parsing through Non-Statistical Indexing
Cosmin Munteanu | Gerald Penn
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
Head-Driven Parsing for Word Lattices
Christopher Collins | Bob Carpenter | Gerald Penn
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
Balancing Clarity and Efficiency in Typed Feature Logic Through Delaying
Gerald Penn
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
2003
AVM Description Compilation using Types as Modes
Gerald Penn
10th Conference of the European Chapter of the Association for Computational Linguistics
Topological Parsing
Gerald Penn | Mohammad Haji-Abdolhosseini
10th Conference of the European Chapter of the Association for Computational Linguistics
Book Reviews: Linguistic Evolution through Language Acquisition: Formal and Computational Models edited by Ted Briscoe; Implementing Typed Feature Structure Grammars by Ann Copestake
Michael A. Arbib | Gerald Penn
Computational Linguistics, Volume 29, Number 3, September 2003: Special Issue on the Web as Corpus
A Tabulation-Based Parsing Method that Reduces Copying
Gerald Penn | Cosmin Munteanu
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics
2002
Generalized Encoding of Description Spaces and its Application to Typed Feature Structures
Gerald Penn
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
A Web-based Instructional Platform for Constraint-Based Grammar Formalisms and Parsing
W. Detmar Meurers | Gerald Penn | Frank Richter
Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics
2001
Tractability and Structural Closures in Attribute Logic Type Signatures
Gerald Penn
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics
2000
Book Reviews: The Mathematics of Syntactic Structure: Trees and their Logics
Gerald Penn
Computational Linguistics, Volume 26, Number 2, June 2000
1998
Parametric Types for Typed Attribute-Value Logic
Gerald Penn
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics
Parametric Types for Typed Attribute-Value Logic
Gerald Penn
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2
1997
Head-Driven Generation and Indexing in ALE
Gerald Penn
Computational Environments for Grammar Development and Linguistic Engineering
Co-authors
- Jingcheng Niu 10
- Jackie Chi Kit Cheung 8
- Jinman Zhao 7
- Xiaodan Zhu 7
- Cosmin Munteanu 6
- Aditya Bhargava 4
- Eric Corlett 4
- Victoria Ng 4
- Siavash Kazemian 3
- Ken Shi 3
- Sheelagh Carpendale 2
- Colin Cherry 2
- Christopher Collins 2
- Simon De Montigny 2
- Timothy A. D. Fowler 2
- Yanjun Gao 2
- Wenjie Lu 2
- Alexander Panchenko 2
- Arti Ramesh 2
- Erin Rees 2
- Patricia Thaine 2
- Dmitry Ustalov 2
- Marco Valentino 2
- Shunan Zhao 2
- Michael A. Arbib 1
- Stefan Banjevic 1
- Jill Burstein 1
- Miriam Butt 1
- Bob Carpenter 1
- Xi Chen 1
- Travis Choma 1
- Michael Cysouw 1
- Megan Deering 1
- Michael Demko 1
- Frank Drewes 1
- Rouzbeh Farahmand 1
- Mohammad Haji-Abdolhosseini 1
- Mary Harper 1
- Paul He 1
- Yulan Hu 1
- Abhik Jana 1
- Ronald M. Kaplan 1
- Paul Kiparsky 1
- András Kornai 1
- Victoria Krakovna 1
- János Kramár 1
- Sandra Kübler 1
- Jiaru Li 1
- Saifei Liao 1
- Huan Ling 1
- Andrew Yuan Liu 1
- Anthony McCallum 1
- Detmar Meurers 1
- Erxue Min 1
- Thien Huu Nguyen 1
- Irina Nikishina 1
- Jelena Prokić 1
- Erin E. Rees 1
- Frank Richter 1
- Sean Robertson 1
- James Rogers 1
- Frank Rudzicz 1
- Andrey Sakhovskiy 1
- Matthew Skala 1
- Mokanarangan Thayaparan 1
- Richmond H. Thomason 1
- Elena Tutubalina 1
- Ricardo Usbeck 1
- Anssi Yli-Jyrä 1
- Lei Yu 1
- Tian Yu 1
- Xueyan Zhang 1
- Zixin Zhao 1
- Zining Zhu 1
- Philippe de Groote 1