Martine De Cock


2021

Private Text Classification with Convolutional Neural Networks
Samuel Adams | David Melanson | Martine De Cock
Proceedings of the Third Workshop on Privacy in Natural Language Processing

Text classifiers are regularly applied to personal texts, leaving users of these classifiers vulnerable to privacy breaches. We propose a solution for privacy-preserving text classification that is based on Convolutional Neural Networks (CNNs) and Secure Multiparty Computation (MPC). Our method enables the inference of a class label for a personal text in such a way that (1) the owner of the personal text does not have to disclose their text to anyone in an unencrypted manner, and (2) the owner of the text classifier does not have to reveal the trained model parameters to the text owner or to anyone else. To demonstrate the feasibility of our protocol for practical private text classification, we implemented it in the PyTorch-based MPC framework CrypTen, using a well-known additive secret sharing scheme in the honest-but-curious setting. Our runtime tests indicate that the privacy-preserving text classifier is fast enough to be used in practice.

2012

Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation
Els Lefever | Véronique Hoste | Martine De Cock
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Wikipedia pages typically contain inter-language links to the corresponding pages in other languages. These links, however, are often incomplete. This paper describes a set of experiments investigating the viability of discovering such missing inter-language links for ambiguous nouns by means of a cross-lingual Word Sense Disambiguation approach. The input for the inter-language link detection system is a set of Dutch pages for a given ambiguous noun, and the output of the system is a set of links to the corresponding pages in three target languages (viz. French, Spanish and Italian). The experimental results show that, although the task is very challenging, the system succeeds in detecting missing inter-language links between Wikipedia documents for a manually labeled test set. The final goal of the system is to provide a human editor with a list of possible missing links that should be manually verified.

2011

ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation
Els Lefever | Véronique Hoste | Martine De Cock
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2006

Supporting temporal question answering: strategies for offline data collection
David Ahn | Steven Schockaert | Martine De Cock | Etienne Kerre
Proceedings of the Fifth International Workshop on Inference in Computational Semantics (ICoS-5)