Pekka Marttinen

2023

pdf abs
Reader: Model-based language-instructed reinforcement learning
Nicola Dainese | Pekka Marttinen | Alexander Ilin
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We explore how we can build accurate world models, which are partially specified by language, and how we can plan with them in the face of novelty and uncertainty. We propose the first model-based reinforcement learning approach to tackle the environment Read To Fight Monsters (Zhong et al., 2019), a grounded policy learning problem. In RTFM an agent has to reason over a set of rules and a goal, both described in a language manual, and the observations, while taking into account the uncertainty arising from the stochasticity of the environment, in order to generalize successfully its policy to test episodes. We demonstrate the superior performance and sample efficiency of our model-based approach to the existing model-free SOTA agents in eight variants of RTFM. Furthermore, we show how the agent’s plans can be inspected, which represents progress towards more interpretable agents.

pdf abs
Patient Outcome and Zero-shot Diagnosis Prediction with Hypernetwork-guided Multitask Learning
Shaoxiong Ji | Pekka Marttinen
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Multitask deep learning has been applied to patient outcome prediction from text, taking clinical notes as input and training deep neural networks with a joint loss function of multiple tasks. However, the joint training scheme of multitask learning suffers from inter-task interference, and diagnosis prediction among the multiple tasks has the generalizability issue due to rare diseases or unseen diagnoses. To solve these challenges, we propose a hypernetwork-based approach that generates task-conditioned parameters and coefficients of multitask prediction heads to learn task-specific prediction and balance the multitask learning. We also incorporate semantic task information to improve the generalizability of our task-conditioned multitask model. Experiments on early and discharge notes extracted from the real-world MIMIC database show our method can achieve better performance on multitask patient outcome prediction than strong baselines in most cases. Besides, our method can effectively handle the scenario with limited information and improve zero-shot prediction on unseen diagnosis categories.

2021

pdf
Medical Code Assignment with Gated Convolution and Note-Code Interaction
Shaoxiong Ji | Shirui Pan | Pekka Marttinen
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf abs
Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text
Shaoxiong Ji | Erik Cambria | Pekka Marttinen
Proceedings of the 3rd Clinical Natural Language Processing Workshop

Medical code assignment, which predicts medical codes from clinical texts, is a fundamental task of intelligent medical information systems. The emergence of deep models in natural language processing has boosted the development of automatic assignment methods. However, recent advanced neural architectures with flat convolutions or multi-channel feature concatenation ignore the sequential causal constraint within a text sequence and may not learn meaningful clinical text representations, especially for lengthy clinical notes with long-term sequential dependency. This paper proposes a Dilated Convolutional Attention Network (DCAN), integrating dilated convolutions, residual connections, and label attention, for medical code assignment. It adopts dilated convolutions to capture complex medical patterns with a receptive field which increases exponentially with dilation size. Experiments on a real-world clinical dataset empirically show that our model improves the state of the art.

Pekka Marttinen

2023

2021

2020

Co-authors

Venues