2023
Semantic Accuracy in Natural Language Generation: A Thesis Proposal
Patricia Schmidtova
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
With the fast-growing popularity of current large pre-trained language models (LLMs), it is necessary to dedicate efforts to making them more reliable. In this thesis proposal, we aim to improve the reliability of natural language generation systems (NLG) by researching the semantic accuracy of their outputs. We look at this problem from the outside (evaluation) and from the inside (interpretability). We propose a novel method for evaluating semantic accuracy and discuss the importance of working towards a unified and objective benchmark for NLG metrics. We also review interpretability approaches which could help us pinpoint the sources of inaccuracies within the models and explore potential mitigation strategies.
2022
THEaiTRobot: An Interactive Tool for Generating Theatre Play Scripts
Rudolf Rosa | Patrícia Schmidtová | Alisa Zakhtarenko | Ondrej Dusek | Tomáš Musil | David Mareček | Saad Ul Islam | Marie Novakova | Klara Vosecka | Daniel Hrbek | David Kostak
Proceedings of the 15th International Conference on Natural Language Generation: System Demonstrations
We present a free online demo of THEaiTRobot, an open-source bilingual tool for interactively generating theatre play scripts, in two versions. THEaiTRobot 1.0 uses the GPT-2 language model with minimal adjustments. THEaiTRobot 2.0 uses two models created by fine-tuning GPT-2 on purposefully collected and processed datasets and several other components, generating play scripts in a hierarchical fashion (title → synopsis → script). The underlying tool is used in the THEaiTRE project to generate scripts for plays, which are then performed on stage by a professional theatre.
GPT-2-based Human-in-the-loop Theatre Play Script Generation
Rudolf Rosa | Patrícia Schmidtová | Ondřej Dušek | Tomáš Musil | David Mareček | Saad Obaid | Marie Nováková | Klára Vosecká | Josef Doležal
Proceedings of the 4th Workshop of Narrative Understanding (WNU2022)
We experiment with adapting generative language models for the generation of long coherent narratives in the form of theatre plays. Since fully automatic generation of whole plays is not currently feasible, we created an interactive tool that allows a human user to steer the generation somewhat while minimizing intervention. We pursue two approaches to long-text generation: a flat generation with summarization of context, and a hierarchical text-to-text two-stage approach, where a synopsis is generated first and then used to condition generation of the final script. Our preliminary results and discussions with theatre professionals show improvements over vanilla language model generation, but also identify important limitations of our approach.