2024
BinaryAlign: Word Alignment as Binary Sequence Labeling
Gaetan Latouche | Marc-André Carbonneau | Benjamin Swanson
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Real-world deployments of word alignment are almost certain to cover both high- and low-resource languages. However, the state of the art for this task recommends a different model class depending on the availability of gold alignment training data for a particular language pair. We propose BinaryAlign, a novel word alignment technique based on binary sequence labeling that outperforms existing approaches in both scenarios, offering a unifying approach to the task. Additionally, we vary the specific choice of multilingual foundation model, perform stratified error analysis over alignment error type, and explore the performance of BinaryAlign on non-English language pairs. We make our source code publicly available.
2023
Generating Video Game Scripts with Style
Gaetan Lopez Latouche | Laurence Marcotte | Ben Swanson
Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)
While modern language models can generate a scripted scene in the format of a play, movie, or video game cutscene, the quality of machine-generated text remains behind that of human authors. In this work, we focus on one aspect of this quality gap: generating text in the style of an arbitrary and unseen character. We propose the Style Adaptive Semiparametric Scriptwriter (SASS), which leverages an adaptive weighted style memory to generate dialog lines in accordance with a character’s speaking patterns. Using the LIGHT dataset as well as a new corpus of scripts from twenty-three AAA video games, we show that SASS not only outperforms similar models but in some cases can also be used in conjunction with them to yield further improvement.
2021
Story Centaur: Large Language Model Few Shot Learning as a Creative Writing Tool
Ben Swanson | Kory Mathewson | Ben Pietrzak | Sherol Chen | Monica Dinalescu
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
Few-shot learning with large language models has the potential to give individuals without formal machine learning training access to a wide range of text-to-text models. We consider how this applies to creative writers and present Story Centaur, a user interface for prototyping few-shot models and a set of recombinable web components that deploy them. Story Centaur’s goal is to expose creative writers to few-shot learning with a simple but powerful interface that lets them compose their own co-creation tools that further their own unique artistic directions. We build out several examples of such tools, and in the process probe the boundaries and issues surrounding generation with large language models.
2020
Usnea: An Authorship Tool for Interactive Fiction using Retrieval Based Semantic Parsing
Ben Swanson | Boris Smus
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
The reader of a choose-your-own-adventure novel and the user of a modern virtual assistant have a subtle similarity; both may, through the right lens, be viewed as engaging with a work of Interactive Fiction. This literary form emerged in the 1970s and has grown like a vine along the branch of modern technology, one guided by the advances of the other. In this work we weave together threads from the Interactive Fiction community and neural semantic parsing for dialog systems, defining the data model and necessary algorithms for a novel type of Interactive Fiction and open-sourcing its accompanying authoring tool. Specifically, our work integrates retrieval-based semantic parsing predicates into the branching story structures well known to the Interactive Fiction community, relaxing the relatively strict lexical options of preexisting systems.
2014
Natural Language Generation with Vocabulary Constraints
Ben Swanson | Elif Yamangil | Eugene Charniak
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications
Data Driven Language Transfer Hypotheses
Ben Swanson | Eugene Charniak
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
2013
A Context Free TAG Variant
Ben Swanson | Elif Yamangil | Eugene Charniak | Stuart Shieber
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Exploring Syntactic Representations for Native Language Identification
Ben Swanson
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications
Extracting the Native Language Signal for Second Language Acquisition
Ben Swanson | Eugene Charniak
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2012
Correction Detection and Error Type Selection as an ESL Educational Aid
Ben Swanson | Elif Yamangil
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Native Language Detection with Tree Substitution Grammars
Benjamin Swanson | Eugene Charniak
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)