Jinman Zhao


2021

Structural Realization with GGNNs
Jinman Zhao | Gerald Penn | Huan Ling
Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15)

In this paper, we define an abstract task called structural realization that generates words given a prefix of words and a partial representation of a parse tree. We also present a method for solving instances of this task using a Gated Graph Neural Network (GGNN). We evaluate it with standard accuracy measures, as well as with respect to perplexity, in which its comparison to previous work on language modelling serves to quantify the information added to a lexical selection task by the presence of syntactic knowledge. That the addition of parse-tree-internal nodes to this neural model should improve the model, with respect both to accuracy and to more conventional measures such as perplexity, may seem unsurprising, but previous attempts have not met with nearly as much success. We have also learned that transverse links through the parse tree compromise the model’s accuracy at generating adjectival and nominal parts of speech.

A Generative Process for Lambek Categorial Proof Nets
Jinman Zhao | Gerald Penn
Proceedings of the 17th Meeting on the Mathematics of Language

2018

Generalizing Word Embeddings using Bag of Subwords
Jinman Zhao | Sidharth Mudgal | Yingyu Liang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We approach the problem of generalizing pre-trained word embeddings beyond fixed-size vocabularies without using additional contextual information. We propose a subword-level word vector generation model that views words as bags of character n-grams. The model is simple, fast to train, and provides good vectors for rare or unseen words. Experiments show that our model achieves state-of-the-art performance on the English word similarity task and on joint prediction of part-of-speech tags and morphosyntactic attributes in 23 languages, suggesting that the model captures the relationship between words' textual representations and their embeddings.
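
The "bag of character n-grams" view described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `char_ngrams` and `bag_of_subwords_vector`, the `<` and `>` boundary markers, the default n-gram range of 3 to 6, and the choice to average (rather than sum or otherwise combine) the n-gram vectors are all assumptions made for illustration.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word`, shortest first.

    Assumption: the word is wrapped in '<' and '>' so that prefixes
    and suffixes produce n-grams distinct from word-internal ones.
    """
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def bag_of_subwords_vector(word, ngram_vectors, dim, n_min=3, n_max=6):
    """Build a word vector by averaging the vectors of its known n-grams.

    `ngram_vectors` is a hypothetical lookup table (dict) mapping an
    n-gram string to a list of `dim` floats. N-grams missing from the
    table are skipped; if none are known, a zero vector is returned.
    """
    known = [
        ngram_vectors[g]
        for g in char_ngrams(word, n_min, n_max)
        if g in ngram_vectors
    ]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


if __name__ == "__main__":
    # Toy table with made-up values, for demonstration only.
    table = {"<ca": [1.0, 0.0], "at>": [0.0, 1.0]}
    print(char_ngrams("cat", 3, 3))
    print(bag_of_subwords_vector("cat", table, dim=2, n_min=3, n_max=3))
```

Because the vector is composed only from the word's spelling, a word that never appeared in the pre-trained vocabulary still receives a vector, provided that at least some of its n-grams are in the table.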