Pekka Marttinen


2023

Patient Outcome and Zero-shot Diagnosis Prediction with Hypernetwork-guided Multitask Learning
Shaoxiong Ji | Pekka Marttinen
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Multitask deep learning has been applied to patient outcome prediction from text, taking clinical notes as input and training deep neural networks with a joint loss function of multiple tasks. However, the joint training scheme of multitask learning suffers from inter-task interference, and diagnosis prediction among the multiple tasks has the generalizability issue due to rare diseases or unseen diagnoses. To solve these challenges, we propose a hypernetwork-based approach that generates task-conditioned parameters and coefficients of multitask prediction heads to learn task-specific prediction and balance the multitask learning. We also incorporate semantic task information to improve the generalizability of our task-conditioned multitask model. Experiments on early and discharge notes extracted from the real-world MIMIC database show our method can achieve better performance on multitask patient outcome prediction than strong baselines in most cases. Besides, our method can effectively handle the scenario with limited information and improve zero-shot prediction on unseen diagnosis categories.

2021

Medical Code Assignment with Gated Convolution and Note-Code Interaction
Shaoxiong Ji | Shirui Pan | Pekka Marttinen
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text
Shaoxiong Ji | Erik Cambria | Pekka Marttinen
Proceedings of the 3rd Clinical Natural Language Processing Workshop

Medical code assignment, which predicts medical codes from clinical texts, is a fundamental task of intelligent medical information systems. The emergence of deep models in natural language processing has boosted the development of automatic assignment methods. However, recent advanced neural architectures with flat convolutions or multi-channel feature concatenation ignore the sequential causal constraint within a text sequence and may not learn meaningful clinical text representations, especially for lengthy clinical notes with long-term sequential dependency. This paper proposes a Dilated Convolutional Attention Network (DCAN), integrating dilated convolutions, residual connections, and label attention, for medical code assignment. It adopts dilated convolutions to capture complex medical patterns with a receptive field which increases exponentially with dilation size. Experiments on a real-world clinical dataset empirically show that our model improves the state of the art.
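
As a rough illustration of the receptive-field claim in the abstract above, the sketch below computes how many input tokens a stack of 1-D dilated convolutions can see. It assumes stride 1, a single fixed kernel size, and dilations that double at each layer; these are illustrative assumptions rather than settings taken from the paper, and the function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in tokens) of stacked 1-D convolutions with stride 1.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def doubling_dilations(num_layers):
    """Assumed schedule: dilation doubles per layer (1, 2, 4, ...)."""
    return [2 ** layer for layer in range(num_layers)]


if __name__ == "__main__":
    kernel_size = 3  # illustrative choice, not from the paper
    for num_layers in range(1, 6):
        flat = receptive_field(kernel_size, [1] * num_layers)
        dilated = receptive_field(kernel_size, doubling_dilations(num_layers))
        print(f"layers={num_layers}  flat={flat}  dilated={dilated}")
```

Under these assumptions, the flat (undilated) stack grows linearly with depth, while the doubling schedule gives 1 + (kernel_size - 1) * (2^num_layers - 1) tokens, which is why a few dilated layers can span long clinical notes.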