John Bauer


Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs
John Bauer | Chloé Kiddon | Eric Yeh | Alex Shan | Christopher D. Manning
Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023)

Searching and manipulating dependency graphs can be time-consuming and difficult to get right. We document Semgrex, a system for searching dependency graphs, and introduce Ssurgeon, a system for manipulating the output of Semgrex. The compact language used by these systems allows for easy command-line or API processing of dependencies. Additionally, integration with publicly released toolkits in Java and Python allows for searching text relations and attributes over natural text.


The Stanford CoreNLP Natural Language Processing Toolkit
Christopher Manning | Mihai Surdeanu | John Bauer | Jenny Finkel | Steven Bethard | David McClosky
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

A Gold Standard Dependency Corpus for English
Natalia Silveira | Timothy Dozat | Marie-Catherine de Marneffe | Samuel Bowman | Miriam Connor | John Bauer | Chris Manning
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present a gold standard annotation of syntactic dependencies in the English Web Treebank corpus using the Stanford Dependencies formalism. This resource addresses the lack of a gold standard dependency treebank for English, as well as the limited availability of gold standard syntactic annotations for English informal text genres. We also present experiments on the use of this resource, both for training dependency parsers and for evaluating the quality of different versions of the Stanford Parser, which includes a converter tool to produce dependency annotation from constituency trees. We show that training a dependency parser on a mix of newswire and web data leads to better performance on that type of data without hurting performance on newswire text, and therefore gold standard annotations for non-canonical text can be a valuable resource for parsing. Furthermore, the systematic annotation effort has informed both the SD formalism and its implementation in the Stanford Parser’s dependency converter. In response to the challenges encountered by annotators in the EWT corpus, the formalism has been revised and extended, and the converter has been improved.


Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task
Spence Green | Daniel Cer | Kevin Reschke | Rob Voigt | John Bauer | Sida Wang | Natalia Silveira | Julia Neidert | Christopher D. Manning
Proceedings of the Eighth Workshop on Statistical Machine Translation

Parsing with Compositional Vector Grammars
Richard Socher | John Bauer | Christopher D. Manning | Andrew Y. Ng
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)


Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French
Spence Green | Marie-Catherine de Marneffe | John Bauer | Christopher D. Manning
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing