Frank Drewes


2023

ADCluster: Adaptive Deep Clustering for Unsupervised Learning from Unlabeled Documents
Arezoo Hatefi | Xuan-Son Vu | Monowar Bhuyan | Frank Drewes
Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)

2022

Dynamic Topic Modeling by Clustering Embeddings from Pretrained Language Models: A Research Proposal
Anton Eklund | Mona Forsman | Frank Drewes
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop

A new trend in topic modeling research is to perform Neural Topic Modeling by Clustering document Embeddings (NTM-CE) created with a pretrained language model. Studies have evaluated static NTM-CE models and found that they perform comparably to, or even better than, other topic models. An important extension of static topic modeling is making the models dynamic, allowing the study of topic evolution over time, as well as detecting emerging and disappearing topics. In this research proposal, we present two research questions to understand dynamic topic modeling with NTM-CE theoretically and practically. To answer these, we propose four phases with the aim of establishing evaluation methods for dynamic topic modeling, finding NTM-CE-specific properties, and creating a framework for dynamic NTM-CE. For evaluation, we propose to use both quantitative measurements of coherence and human evaluation supported by our recently developed tool.

Improved N-Best Extraction with an Evaluation on Language Data
Johanna Björklund | Frank Drewes | Anna Jonsson
Computational Linguistics, Volume 48, Issue 1 - March 2022

We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. Compared with the original algorithm, the modifications increase the laziness of the evaluation strategy, which makes the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software Betty and compared with the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit Tiburon. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that Betty outperforms Tiburon on all tested data sets with respect to running time, while Tiburon seems to be the more memory-efficient choice.

2021

Proceedings of the 17th Meeting on the Mathematics of Language
Henrik Björklund | Frank Drewes
Proceedings of the 17th Meeting on the Mathematics of Language

Bridging Perception, Memory, and Inference through Semantic Relations
Johanna Björklund | Adam Dahlgren Lindström | Frank Drewes
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

There is a growing consensus that surface form alone does not enable models to learn meaning and gain language understanding. This warrants an interest in hybrid systems that combine the strengths of neural and symbolic methods. We favour triadic systems consisting of neural networks, knowledge bases, and inference engines. The network provides perception, that is, the interface between the system and its environment. The knowledge base provides explicit memory and thus immediate access to established facts. Finally, inference capabilities are provided by the inference engine, which reflects on the perception, supported by memory, to reason and discover new facts. In this work, we probe six popular language models for semantic relations and outline a future line of research to study how the constituent subsystems can be jointly realised and integrated.

2020

Probing Multimodal Embeddings for Linguistic Properties: the Visual-Semantic Case
Adam Dahlgren Lindström | Johanna Björklund | Suna Bensch | Frank Drewes
Proceedings of the 28th International Conference on Computational Linguistics

Semantic embeddings have advanced the state of the art for countless natural language processing tasks, and various extensions to multimodal domains, such as visual-semantic embeddings, have been proposed. While the power of visual-semantic embeddings comes from the distillation and enrichment of information through machine learning, their inner workings are poorly understood and there is a shortage of analysis tools. To address this problem, we generalize the notion of probing tasks to the visual-semantic case. To this end, we (i) discuss the formalization of probing tasks for embeddings of image-caption pairs, (ii) define three concrete probing tasks within our general framework, (iii) train classifiers to probe for those properties, and (iv) compare various state-of-the-art embeddings under the lens of the proposed probing tasks. Our experiments reveal an up to 16% increase in accuracy on visual-semantic embeddings compared to the corresponding unimodal embeddings, which suggests that the text and image dimensions represented in the former do complement each other.

2019

A Survey of Recent Advances in Efficient Parsing for Graph Grammars
Frank Drewes
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing

Bottom-Up Unranked Tree-to-Graph Transducers for Translation into Semantic Graphs
Johanna Björklund | Shay B. Cohen | Frank Drewes | Giorgio Satta
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing

We propose a formal model for translating unranked syntactic trees, such as dependency trees, into semantic graphs. These tree-to-graph transducers can serve as a formal basis of transition systems for semantic parsing which recently have been shown to perform very well, yet hitherto lack formalization. Our model features “extended” rules and an arc-factored normal form, comes with an efficient translation algorithm, and can be equipped with weights in a straightforward manner.

Proceedings of the 16th Meeting on the Mathematics of Language
Philippe de Groote | Frank Drewes | Gerald Penn
Proceedings of the 16th Meeting on the Mathematics of Language

Parsing Weighted Order-Preserving Hyperedge Replacement Grammars
Henrik Björklund | Frank Drewes | Petter Ericson
Proceedings of the 16th Meeting on the Mathematics of Language

2018

Weighted DAG Automata for Semantic Graphs
David Chiang | Frank Drewes | Daniel Gildea | Adam Lopez | Giorgio Satta
Computational Linguistics, Volume 44, Issue 1 - April 2018

Graphs have a variety of uses in natural language processing, particularly as representations of linguistic meaning. A deficit in this area of research is a formal framework for creating, combining, and using models involving graphs that parallels the frameworks of finite automata for strings and finite tree automata for trees. A possible starting point for such a framework is the formalism of directed acyclic graph (DAG) automata, defined by Kamimura and Slutzki and extended by Quernheim and Knight. In this article, we study the latter in depth, demonstrating several new results, including a practical recognition algorithm that can be used for inference and learning with models defined on DAG automata. We also propose an extension to graphs with unbounded node degree and show that our results carry over to the extended formalism.

2017

DAG Automata for Meaning Representation
Frank Drewes
Proceedings of the 15th Meeting on the Mathematics of Language

Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing (FSMNLP 2017)
Frank Drewes
Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing (FSMNLP 2017)

Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages
Martin Berglund | Henrik Björklund | Frank Drewes
Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms

Contextual Hyperedge Replacement Grammars for Abstract Meaning Representations
Frank Drewes | Anna Jonsson
Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms

2016

EM-Training for Weighted Aligned Hypergraph Bimorphisms
Frank Drewes | Kilian Gebhardt | Heiko Vogler
Proceedings of the SIGFSM Workshop on Statistical NLP and Weighted Automata

2013

On the Parameterized Complexity of Linear Context-Free Rewriting Systems
Martin Berglund | Henrik Björklund | Frank Drewes
Proceedings of the 13th Meeting on the Mathematics of Language (MoL 13)

2012

Proceedings of the Workshop on Applications of Tree Automata Techniques in Natural Language Processing
Frank Drewes | Marco Kuhlmann
Proceedings of the Workshop on Applications of Tree Automata Techniques in Natural Language Processing

2011

Incremental Construction of Millstream Configurations Using Graph Transformation
Suna Bensch | Frank Drewes | Helmut Jürgensen | Brink van der Merwe
Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing

2010

Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing
Frank Drewes | Marco Kuhlmann
Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing

Millstream Systems – a Formal Model for Linking Language Modules by Interfaces
Suna Bensch | Frank Drewes
Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing