Marius Pasca
Also published as: Marius Paşca, Marius A. Pasca
2020
Interpreting Open-Domain Modifiers: Decomposition of Wikipedia Categories into Disambiguated Property-Value Pairs
Marius Pasca
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
This paper proposes an open-domain method for automatically annotating modifier constituents (20th-century) within Wikipedia categories (20th-century male writers) with properties (date of birth). The annotations offer a semantically-anchored understanding of the role of the constituents in defining the underlying meaning of the categories. In experiments over an evaluation set of Wikipedia categories, the proposed method annotates constituent modifiers as semantically-anchored properties, rather than as mere strings as in a previous method. It does so at a better trade-off between precision and recall.
2019
Wikipedia as a Resource for Text Analysis and Retrieval
Marius Pasca
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
This tutorial examines the role of Wikipedia in tasks related to text analysis and retrieval. Text analysis tasks that take advantage of Wikipedia include coreference resolution, word sense and entity disambiguation, and information extraction. In information retrieval, a better understanding of the structure and meaning of queries helps in matching queries against documents, clustering search results, answer and entity retrieval, and retrieving knowledge panels for queries asking about popular entities.
2017
Acquisition, Representation and Usage of Conceptual Hierarchies
Marius Pasca
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
Through subsumption and instantiation, individual instances (“artificial intelligence”, “the spotted pig”) otherwise spanning a wide range of domains can be brought together and organized under conceptual hierarchies. The hierarchies connect more specific concepts (“computer science subfields”, “gastropubs”) to more general concepts (“academic disciplines”, “restaurants”) through IsA relations. Explicit or implicit properties applicable to, and defining, more general concepts are inherited by their more specific concepts, down to the instances connected to the lower parts of the hierarchies. Subsumption represents a crisp, universally-applicable principle towards consistently representing IsA relations in any knowledge resource. Yet knowledge resources often exhibit significant differences in their scope, representation choices and intended usage, which cause significant differences in their expected usage and impact on various tasks. This tutorial examines the theoretical foundations of subsumption, and its practical embodiment through IsA relations compiled manually or extracted automatically. It addresses IsA relations from their formal definition; through practical choices made in their representation within the larger and more widely-used of the available knowledge resources; to their automatic acquisition from document repositories, as opposed to their manual compilation by human contributors; to their impact in text analysis and information retrieval. As search engines move away from returning a set of links and closer to returning results that more directly answer queries, IsA relations play an increasingly important role towards a better understanding of documents and queries.
The tutorial teaches the audience about definitions, assumptions and practical choices related to modeling and representing IsA relations in existing, human-compiled resources of instances, concepts and resulting conceptual hierarchies; methods for automatically extracting sets of instances within unlabeled or labeled concepts, where the concepts may be considered as a flat set or organized hierarchically; and applications of IsA relations in information retrieval.
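The inheritance of properties along IsA relations described in this abstract can be illustrated with a minimal sketch. It is not taken from the tutorial: the class name `ConceptualHierarchy`, its methods, and the example property "cuisine" are hypothetical and chosen only for illustration. The instance and concept names come from the abstract.

```python
from collections import defaultdict


class ConceptualHierarchy:
    """Toy IsA hierarchy (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps a node (instance or concept) to its direct, more general concepts.
        self.parents = defaultdict(set)
        # Maps a concept to the properties attached directly to it.
        self.properties = defaultdict(set)

    def add_isa(self, child, parent):
        """Record that `child` IsA `parent`."""
        self.parents[child].add(parent)

    def add_property(self, concept, prop):
        """Attach a property directly to a concept."""
        self.properties[concept].add(prop)

    def ancestors(self, node):
        """Return every concept reachable from `node` through IsA relations."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def inherited_properties(self, node):
        """Return the node's own properties plus those of all its ancestors."""
        props = set(self.properties.get(node, ()))
        for ancestor in self.ancestors(node):
            props |= self.properties.get(ancestor, set())
        return props


hierarchy = ConceptualHierarchy()
hierarchy.add_isa("the spotted pig", "gastropubs")
hierarchy.add_isa("gastropubs", "restaurants")
# "cuisine" is an assumed example property, not one named in the abstract.
hierarchy.add_property("restaurants", "cuisine")

print(sorted(hierarchy.ancestors("the spotted pig")))
print(sorted(hierarchy.inherited_properties("the spotted pig")))
```

In this sketch the instance "the spotted pig" picks up the property attached to "restaurants" because it is reachable through the intermediate concept "gastropubs".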
Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier Composition
Ellie Pavlick | Marius Paşca
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
We present a method for populating fine-grained classes (e.g., “1950s American jazz musicians”) with instances (e.g., Charles Mingus). While state-of-the-art methods tend to treat class labels as single lexical units, the proposed method considers each of the individual modifiers in the class label relative to the head. An evaluation on the task of reconstructing Wikipedia category pages demonstrates a >10 point increase in AUC over a strong baseline relying on widely-used Hearst patterns.
2016
Revisiting Taxonomy Induction over Wikipedia
Amit Gupta | Francesco Piccinno | Mikhail Kozhevnikov | Marius Paşca | Daniele Pighin
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Guided by multiple heuristics, a unified taxonomy of entities and categories is distilled from the Wikipedia category network. A comprehensive evaluation, based on the analysis of upward generalization paths, demonstrates that the taxonomy supports generalizations which are more than twice as accurate as the state of the art. The taxonomy is available at http://headstaxonomy.com.
The Role of Wikipedia in Text Analysis and Retrieval
Marius Paşca
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Tutorial Abstracts
This tutorial examines the characteristics, advantages and limitations of Wikipedia relative to other existing, human-curated resources of knowledge; derivative resources, created by converting semi-structured content in Wikipedia into structured data; the role of Wikipedia and its derivatives in text analysis; and the role of Wikipedia and its derivatives in enhancing information retrieval.
2015
Knowledge Acquisition for Web Search
Marius Pasca
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
The identification of textual items, or documents, that best match a user’s information need, as expressed in search queries, forms the core functionality of information retrieval systems. Well-known challenges are associated with understanding the intent behind user queries; and, more importantly, with matching inherently-ambiguous queries to documents that may employ lexically different phrases to convey the same meaning. The conversion of semi-structured content from Wikipedia and other resources into structured data produces knowledge potentially more suitable to database-style queries and, ideally, to use in information retrieval. In parallel, the availability of textual documents on the Web enables an aggressive push towards the automatic acquisition of various types of knowledge from text. Methods developed under the umbrella of open-domain information extraction acquire open-domain classes of instances and relations from Web text. The methods operate over unstructured or semi-structured text available within collections of Web documents, or over relatively more intriguing streams of anonymized search queries. Some of the methods import the automatically-extracted data into human-generated resources, or otherwise exploit existing human-generated resources. In both cases, the goal is to expand the coverage of the initial resources, thus providing information about more of the topics that people in general, and Web search users in particular, may be interested in.
Interpreting Compound Noun Phrases Using Web Search Queries
Marius Paşca
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2014
Queries as a Source of Lexicalized Commonsense Knowledge
Marius Paşca
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Acquisition of Noncontiguous Class Attributes from Web Search Queries
Marius Paşca
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
2013
Open-Domain Fine-Grained Class Extraction from Web Search Queries
Marius Paşca
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
2012
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Jun’ichi Tsujii | James Henderson | Marius Paşca
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Instance-Driven Attachment of Semantic Annotations over Conceptual Hierarchies
Janara Christensen | Marius Paşca
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
2011
Attribute Extraction from Synthetic Web Search Queries
Marius Paşca
Proceedings of 5th International Joint Conference on Natural Language Processing
Fine-Grained Class Label Markup of Search Queries
Joseph Reisinger | Marius Paşca
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Ranking Class Labels Using Query Sessions
Marius Paşca
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Web Search Queries as a Corpus
Marius Paşca
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
2010
Instance Sense Induction from Attribute Sets
Ricardo Martin-Brualla | Enrique Alfonseca | Marius Pasca | Keith Hall | Enrique Robledo-Arnuncio | Massimiliano Ciaramita
Coling 2010: Posters
The Role of Queries in Ranking Labeled Instances Extracted from Text
Marius Paşca
Coling 2010: Posters
Proceedings of the NAACL HLT 2010 Workshop on Semantic Search
Donghui Feng | Jamie Callan | Eduard Hovy | Marius Pasca
Proceedings of the NAACL HLT 2010 Workshop on Semantic Search
2009
Outclassing Wikipedia in Open-Domain Information Extraction: Weakly-Supervised Acquisition of Attributes over Conceptual Hierarchies
Marius Paşca
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches
Eneko Agirre | Enrique Alfonseca | Keith Hall | Jana Kravalova | Marius Paşca | Aitor Soroa
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Latent Variable Models of Concept-Attribute Attachment
Joseph Reisinger | Marius Paşca
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
2008
Weakly-Supervised Acquisition of Labeled Class Instances using Graph Random Walks
Partha Pratim Talukdar | Joseph Reisinger | Marius Paşca | Deepak Ravichandran | Rahul Bhagat | Fernando Pereira
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
Answering Definition Questions via Temporally-Anchored Text Snippets
Marius Paşca
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I
Low-Complexity Heuristics for Deriving Fine-Grained Classes of Named Entities from Web Textual Data
Marius Paşca
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We introduce a low-complexity method for acquiring fine-grained classes of named entities from the Web. The method exploits the large amounts of textual data available on the Web, while avoiding the use of any expensive text processing techniques or tools. The quality of the extracted classes is encouraging with respect to both the precision of the sets of named entities acquired within various classes, and the labels assigned to the sets of named entities.
Weakly-Supervised Acquisition of Open-Domain Classes and Class Attributes from Web Documents and Query Logs
Marius Paşca | Benjamin Van Durme
Proceedings of ACL-08: HLT
Mining Parenthetical Translations from the Web by Word Alignment
Dekang Lin | Shaojun Zhao | Benjamin Van Durme | Marius Paşca
Proceedings of ACL-08: HLT
2006
Using Encyclopedic Knowledge for Named Entity Disambiguation
Razvan Bunescu | Marius Paşca
11th Conference of the European Chapter of the Association for Computational Linguistics
Names and Similarities on the Web: Fact Extraction in the Fast Lane
Marius Paşca | Dekang Lin | Jeffrey Bigham | Andrei Lifchits | Alpa Jain
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics
2005
Aligning Needles in a Haystack: Paraphrase Acquisition Across the Web
Marius Paşca | Péter Dienes
Second International Joint Conference on Natural Language Processing: Full Papers
Book Review: New Directions in Question Answering, edited by Mark T. Maybury
Marius Paşca
Computational Linguistics, Volume 31, Number 3, September 2005
2002
Performance Issues and Error Analysis in an Open-Domain Question Answering System
Dan Moldovan | Marius Pasca | Sanda Harabagiu | Mihai Surdeanu
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
2001
The Role of Lexico-Semantic Feedback in Open-Domain Textual Question-Answering
Sanda Harabagiu | Dan Moldovan | Marius Paşca | Rada Mihalcea | Mihai Surdeanu | Răzvan Bunescu | Roxana Gîrju | Vasile Rus | Paul Morărescu
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics
Answer Mining from On-Line Documents
Marius Pasca | Sanda Harabagiu
Proceedings of the ACL 2001 Workshop on Open-Domain Question Answering
2000
Co-authors
- Sanda Harabagiu 5
- Dan Moldovan 3
- Joseph Reisinger 3
- Enrique Alfonseca 2
- Razvan Bunescu 2
- Benjamin Van Durme 2
- Roxana Girju 2
- Keith Hall 2
- Dekang Lin 2
- Rada Mihalcea 2
- Vasile Rus 2
- Mihai Surdeanu 2
- Eneko Agirre 1
- Rahul Bhagat 1
- Jeffrey P. Bigham 1
- Jamie Callan 1
- Janara Christensen 1
- Massimiliano Ciaramita 1
- Péter Dienes 1
- Donghui Feng 1
- Richard Goodrum 1
- Amit Gupta 1
- James Henderson 1
- Eduard Hovy 1
- Alpa Jain 1
- Mikhail Kozhevnikov 1
- Jana Kravalová 1
- Andrei Lifchits 1
- Steven J. Maiorano 1
- Ricardo Martin-Brualla 1
- Paul Morarescu 1
- Ellie Pavlick 1
- Fernando Pereira 1
- Francesco Piccinno 1
- Daniele Pighin 1
- Deepak Ravichandran 1
- Enrique Robledo-Arnuncio 1
- Aitor Soroa 1
- Partha Talukdar 1
- Jun’ichi Tsujii 1
- Shaojun Zhao 1