Milen Kouylekov


2018

OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora
Pierre Lison | Jörg Tiedemann | Milen Kouylekov
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2015

Semantic Parsing for Textual Entailment
Elisabeth Lien | Milen Kouylekov
Proceedings of the 14th International Conference on Parsing Technologies

2014

UIO-Lien: Entailment Recognition using Minimal Recursion Semantics
Elisabeth Lien | Milen Kouylekov
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Semantic Technologies for Querying Linguistic Annotations: An Experiment Focusing on Graph-Structured Data
Milen Kouylekov | Stephan Oepen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

With growing interest in the creation and search of linguistic annotations that form general graphs (in contrast to formally simpler, rooted trees), there also is an increased need for infrastructures that support the exploration of such representations, for example logical-form meaning representations or semantic dependency graphs. In this work, we heavily lean on semantic technologies and in particular the data model of the Resource Description Framework (RDF) to represent, store, and efficiently query very large collections of text annotated with graph-structured representations of sentence meaning.

RDF Triple Stores and a Custom SPARQL Front-End for Indexing and Searching (Very) Large Semantic Networks
Milen Kouylekov | Stephan Oepen
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations

2013

Celi: EDITS and Generic Text Pair Classification
Milen Kouylekov | Luca Dini | Alessio Bosca | Marco Trevisan
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

Linguagrid: a network of Linguistic and Semantic Services for the Italian Language.
Alessio Bosca | Luca Dini | Milen Kouylekov | Marco Trevisan
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In order to handle the increasing amount of textual information available on the web today and exploit the knowledge latent in this mass of unstructured data, a wide variety of linguistic knowledge and resources (Language Identification, Morphological Analysis, Entity Extraction, etc.) is crucial. In the last decade LRaaS (Language Resource as a Service) emerged as a novel paradigm for publishing and sharing these heterogeneous software resources over the Web. In this paper we present an overview of Linguagrid, a recent initiative that implements an open network of linguistic and semantic Web Services for the Italian language, as well as a new approach for enabling customizable corpus-based linguistic services on the Linguagrid LRaaS infrastructure. A corpus ingestion service in fact allows users to upload corpora of documents and to generate classification/clustering models tailored to their needs by means of standard machine learning techniques applied to the textual contents and metadata from the corpora. The models so generated can then be accessed through proper Web Services and exploited to process and classify new textual contents.

CELI: An Experiment with Cross Language Textual Entailment
Milen Kouylekov | Luca Dini | Alessio Bosca | Marco Trevisan
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2011

Is it Worth Submitting this Run? Assess your RTE System with a Good Sparring Partner
Milen Kouylekov | Yashar Mehdad | Matteo Negri
Proceedings of the TextInfer 2011 Workshop on Textual Entailment

2010

Mining Wikipedia for Large-scale Repositories of Context-Sensitive Entailment Rules
Milen Kouylekov | Yashar Mehdad | Matteo Negri
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper focuses on the central role played by lexical information in the task of Recognizing Textual Entailment. In particular, the usefulness of lexical knowledge extracted from several widely used static resources, represented in the form of entailment rules, is compared with a method to extract lexical information from Wikipedia as a dynamic knowledge resource. The proposed acquisition method aims at maximizing two key features of the resulting entailment rules: coverage (i.e. the proportion of rules successfully applied over a dataset of TE pairs), and context sensitivity (i.e. the proportion of rules applied in appropriate contexts). Evaluation results show that Wikipedia can be effectively used as a source of lexical entailment rules, featuring both higher coverage and context sensitivity with respect to other resources.

An Open-Source Package for Recognizing Textual Entailment
Milen Kouylekov | Matteo Negri
Proceedings of the ACL 2010 System Demonstrations

FBK_NK: A WordNet-Based System for Multi-Way Classification of Semantic Relations
Matteo Negri | Milen Kouylekov
Proceedings of the 5th International Workshop on Semantic Evaluation

2009

Question Answering over Structured Data: an Entailment-Based Approach to Question Analysis
Matteo Negri | Milen Kouylekov
Proceedings of the International Conference RANLP-2009

2008

The QALL-ME Benchmark: a Multilingual Resource of Annotated Spoken Requests for Question Answering
Elena Cabrio | Milen Kouylekov | Bernardo Magnini | Matteo Negri | Laura Hasler | Constantin Orasan | David Tomás | Jose Luis Vicedo | Guenter Neumann | Corinna Weber
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents the QALL-ME benchmark, a multilingual resource of annotated spoken requests in the tourism domain, freely available for research purposes. The languages currently involved in the project are Italian, English, Spanish and German. It introduces a semantic annotation scheme for spoken information access requests, specifically derived from Question Answering (QA) research. In addition to pragmatic and semantic annotations, we propose three QA-based annotation levels: the Expected Answer Type, the Expected Answer Quantifier and the Question Topical Target of a request, to fully capture the content of a request and extract the sought-after information. The QALL-ME benchmark is developed under the EU-FP6 QALL-ME project, which aims at the realization of a shared and distributed infrastructure for Question Answering (QA) systems on mobile devices (e.g. mobile phones). Questions are formulated by the users in free natural language input, and the system returns the actual sequence of words which constitutes the answer from a collection of information sources (e.g. documents, databases). Within this framework, the benchmark has the twofold purpose of training machine-learning-based applications for QA, and testing their actual performance with a rapid turnaround in a controlled laboratory setting.

Entailment-based Question Answering for Structured Data
Bogdan Sacaleanu | Constantin Orasan | Christian Spurk | Shiyan Ou | Oscar Ferrandez | Milen Kouylekov | Matteo Negri
Coling 2008: Companion volume: Demonstrations

2006

Investigating a Generic Paraphrase-Based Approach for Relation Extraction
Lorenza Romano | Milen Kouylekov | Idan Szpektor | Ido Dagan | Alberto Lavelli
11th Conference of the European Chapter of the Association for Computational Linguistics

Building a Large-Scale Repository of Textual Entailment Rules
Milen Kouylekov | Bernardo Magnini
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Entailment rules are rules where the left hand side (LHS) specifies some knowledge which entails the knowledge expressed in the RHS of the rule, with some degree of confidence. Simple entailment rules can be combined in complex entailment chains, which in turn are at the basis of entailment-based reasoning, which has been recently proposed as a pervasive and application independent approach to Natural Language Understanding. We present the first release of a large-scale repository of entailment rules at the lexical level, which have been derived from a number of available resources, including WordNet and a word similarity database. Experiments on the PASCAL-RTE dataset show that this resource plays a crucial role in recognizing textual entailment.

2004

Multilingual Pattern Libraries for Question Answering: a Case Study for Definition Questions
Hristo Tanev | Milen Kouylekov | Matteo Negri | Bonaventura Coppola | Bernardo Magnini
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Development of Corpora within the CLaRK System: The BulTreeBank Project Experience
Kiril Simov | Alexander Simov | Milen Kouylekov | Krasimira Ivanova | Ilko Grigorov | Hristo Ganev
Demonstrations

2002

Building a Linguistically Interpreted Corpus of Bulgarian: the BulTreeBank
Kiril Simov | Petya Osenova | Milena Slavcheva | Sia Kolkovska | Elisaveta Balabanova | Dimitar Doikoff | Krassimira Ivanova | Alexander Simov | Milen Kouylekov
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

Cascaded Regular Grammars over XML Documents
Kiril Simov | Milen Kouylekov | Alexander Simov
COLING-02: The 2nd Workshop on NLP and XML (NLPXML-2002)