Karo Moilanen


2022

Topic Modeling With Topological Data Analysis
Ciarán Byrne | Danijela Horak | Karo Moilanen | Amandla Mabona
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Recent unsupervised topic modelling approaches that use clustering techniques on word, token or document embeddings can extract coherent topics. A common limitation of such approaches is that they reveal nothing about inter-topic relationships which are essential in many real-world application domains. We present an unsupervised topic modelling method which harnesses Topological Data Analysis (TDA) to extract a topological skeleton of the manifold upon which contextualised word embeddings lie. We demonstrate that our approach, which performs on par with a recent baseline, is able to construct a network of coherent topics together with meaningful relationships between them.

2020

Pointing to Select: A Fast Pointer-LSTM for Long Text Classification
Jinhua Du | Yan Huang | Karo Moilanen
Proceedings of the 28th International Conference on Computational Linguistics

Recurrent neural networks (RNNs) suffer from well-known limitations and complications which include slow inference and vanishing gradients when processing long sequences in text classification. Recent studies have attempted to accelerate RNNs via various ad hoc mechanisms to skip irrelevant words in the input. However, word skipping approaches proposed to date effectively stop at each or a given time step to decide whether or not a given input word should be skipped, breaking the coherence of input processing in RNNs. Furthermore, current methods cannot change skip rates during inference and are consequently unable to support different skip rates in demanding real-world conditions. To overcome these limitations, we propose Pointer-LSTM, a novel LSTM framework which relies on a pointer network to select important words for target prediction. The model maintains a coherent input process for the LSTM modules and makes it possible to change the skip rate during inference. Our evaluation on four public data sets demonstrates that Pointer-LSTM (a) is 1.1x∼3.5x faster than the standard LSTM architecture; (b) is more accurate than Leap-LSTM (the state-of-the-art LSTM skipping model) at high skip rates; and (c) reaches robust accuracy levels even when the skip rate is changed during inference.

2019

AIG Investments.AI at the FinSBD Task: Sentence Boundary Detection through Sequence Labelling and BERT Fine-tuning
Jinhua Du | Yan Huang | Karo Moilanen
Proceedings of the First Workshop on Financial Technology and Natural Language Processing

2009

Multi-entity Sentiment Scoring
Karo Moilanen | Stephen Pulman
Proceedings of the International Conference RANLP-2009

2008

The Good, the Bad, and the Unknown: Morphosyllabic Sentiment Tagging of Unseen Words
Karo Moilanen | Stephen Pulman
Proceedings of ACL-08: HLT, Short Papers