Shawn Tan


2021

pdf bib
Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle
Yikang Shen | Shawn Tan | Alessandro Sordoni | Siva Reddy | Aaron Courville
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Syntax is fundamental to our thinking about language. Failing to capture the structure of input language could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model (left-to-right). To train the incremental parser and avoid exposure bias, we also propose a novel dynamic oracle, so that SOM is more robust to wrong parsing decisions. Experiments show that SOM can achieve strong results in language modeling, incremental parsing, and syntactic generalization tests while using fewer parameters than other models.

2020

pdf bib
Recursive Top-Down Production for Sentence Generation with Latent Trees
Shawn Tan | Yikang Shen | Alessandro Sordoni | Aaron Courville | Timothy J. O’Donnell
Findings of the Association for Computational Linguistics: EMNLP 2020

We model the recursive production property of context-free grammars for natural and synthetic languages. To this end, we present a dynamic programming algorithm that marginalises over latent binary tree structures with N leaves, allowing us to compute the likelihood of a sequence of N tokens under a latent tree model, which we maximise to train a recursive neural function. We demonstrate performance on two synthetic tasks: SCAN, where it outperforms previous models on the LENGTH split, and English question formation, where it performs comparably to decoders with the ground-truth tree structure. We also present experimental results on German-English translation on the Multi30k dataset, and qualitatively analyse the induced tree structures our model learns for the SCAN tasks and the German-English translation task.