Zornitsa Kozareva


2023

pdf
Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models
Peter Hase | Mona Diab | Asli Celikyilmaz | Xian Li | Zornitsa Kozareva | Veselin Stoyanov | Mohit Bansal | Srinivasan Iyer
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Language models can memorize a considerable amount of factual information during pretraining that can be elicited through prompting or finetuning models on tasks like question answering. In this paper, we discuss approaches to measuring model factual beliefs, updating incorrect factual beliefs in models, and visualizing graphical relationships between factual beliefs. Our main contributions include: (1) new metrics for evaluating belief-updating methods focusing on the logical consistency of beliefs, (2) a training objective for Sequential, Local, and Generalizing updates (SLAG) that improves the performance of existing hypernetwork approaches, and (3) the introduction of the belief graph, a new form of visualization for language models that shows relationships between stored model beliefs. Our experiments suggest that models show only limited consistency between factual beliefs, but update methods can both fix incorrect model beliefs and greatly improve their consistency. Although off-the-shelf optimizers are surprisingly strong belief-updating baselines, our learned optimizers can outperform them in more difficult settings than have been considered in past work.

2022

pdf bib
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Valerio Basile | Zornitsa Kozareva | Sanja Stajner
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

pdf bib
Proceedings of The Third Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)
Angela Fan | Iryna Gurevych | Yufang Hou | Zornitsa Kozareva | Sasha Luccioni | Nafise Sadat Moosavi | Sujith Ravi | Gyuwan Kim | Roy Schwartz | Andreas Rücklé
Proceedings of The Third Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)

pdf bib
Findings of the Association for Computational Linguistics: EMNLP 2022
Yoav Goldberg | Zornitsa Kozareva | Yue Zhang
Findings of the Association for Computational Linguistics: EMNLP 2022

pdf
Improving In-Context Few-Shot Learning via Self-Supervised Training
Mingda Chen | Jingfei Du | Ramakanth Pasunuru | Todor Mihaylov | Srini Iyer | Veselin Stoyanov | Zornitsa Kozareva
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Self-supervised pretraining has made few-shot learning possible for many NLP tasks. But the pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage with the goal to teach the model to perform in-context few shot learning. We propose and evaluate four self-supervised objectives on two benchmarks. We find that the intermediate self-supervision stage produces models that outperform strong baselines. Ablation study shows that several factors affect the downstream performance, such as the amount of training data and the diversity of the self-supervised objectives. Human-annotated cross-task supervision and self-supervision are complementary. Qualitative analysis suggests that the self-supervised-trained models are better at following task requirements.

pdf bib
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Yoav Goldberg | Zornitsa Kozareva | Yue Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

pdf
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
Badr AlKhamissi | Faisal Ladhak | Srinivasan Iyer | Veselin Stoyanov | Zornitsa Kozareva | Xian Li | Pascale Fung | Lambert Mathias | Asli Celikyilmaz | Mona Diab
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Hate speech detection is complex; it relies on commonsense reasoning, knowledge of stereotypes, and an understanding of social nuance that differs from one culture to the next. It is also difficult to collect a large-scale hate speech annotated dataset. In this work, we frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its “constituent” parts. In addition, we see that infusing knowledge from reasoning datasets (e.g. ATOMIC2020) improves the performance even further. Moreover, we observe that the trained models generalize to out-of-distribution datasets, showing the superiority of task decomposition and knowledge infusion compared to previously used methods. Concretely, our method outperforms the baseline by 17.83% absolute gain in the 16-shot case.

pdf
Few-shot Learning with Multilingual Generative Language Models
Xi Victoria Lin | Todor Mihaylov | Mikel Artetxe | Tianlu Wang | Shuohui Chen | Daniel Simig | Myle Ott | Naman Goyal | Shruti Bhosale | Jingfei Du | Ramakanth Pasunuru | Sam Shleifer | Punit Singh Koura | Vishrav Chaudhary | Brian O’Horo | Jeff Wang | Luke Zettlemoyer | Zornitsa Kozareva | Mona Diab | Veselin Stoyanov | Xian Li
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting their cross-lingual generalization. In this work, we train multilingual generative language models on a corpus covering a diverse set of languages, and study their few- and zero-shot learning capabilities in a wide range of tasks. Our largest model with 7.5 billion parameters sets new state of the art in few-shot learning in more than 20 representative languages, outperforming GPT-3 of comparable size in multilingual commonsense reasoning (with +7.4% absolute accuracy improvement in 0-shot settings and +9.4% in 4-shot settings) and natural language inference (+5.4% in each of 0-shot and 4-shot settings). On the FLORES-101 machine translation benchmark, our model outperforms GPT-3 on 171 out of 182 directions with 32 training examples, while surpassing the official supervised baseline in 45 directions. We conduct an in-depth analysis of different multilingual prompting approaches, showing in particular that strong few-shot learning performance across languages can be achieved via cross-lingual transfer through both templates and demonstration examples.


Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe | Shruti Bhosale | Naman Goyal | Todor Mihaylov | Myle Ott | Sam Shleifer | Xi Victoria Lin | Jingfei Du | Srinivasan Iyer | Ramakanth Pasunuru | Giridharan Anantharaman | Xian Li | Shuohui Chen | Halil Akin | Mandeep Baines | Louis Martin | Xing Zhou | Punit Singh Koura | Brian O’Horo | Jeffrey Wang | Luke Zettlemoyer | Mona Diab | Zornitsa Kozareva | Veselin Stoyanov
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Mixture of Experts layers (MoEs) enable efficient scaling of language models through conditional computation. This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning. With the exception of fine-tuning, we find MoEs to be substantially more compute efficient. At more modest training budgets, MoEs can match the performance of dense models using ~4 times less compute. This gap narrows at scale, but our largest MoE model (1.1T parameters) consistently outperforms a compute-equivalent dense model (6.7B parameters). Overall, this performance gap varies greatly across tasks and domains, suggesting that MoE and dense models generalize differently in ways that are worthy of future study. We make our code and models publicly available for research use.

2021

pdf
SoDA: On-device Conversational Slot Extraction
Sujith Ravi | Zornitsa Kozareva
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

We propose a novel on-device neural sequence labeling model which uses embedding-free projections and character information to construct compact word representations to learn a sequence model using a combination of bidirectional LSTM with self-attention and CRF. Unlike typical dialog models that rely on huge, complex neural network architectures and large-scale pre-trained Transformers to achieve state-of-the-art results, our method achieves comparable results to BERT and even outperforms its smaller variant DistilBERT on conversational slot extraction tasks. Our method is faster than BERT models while achieving significant model size reduction–our model requires 135x and 81x fewer model parameters than BERT and DistilBERT, respectively. We conduct experiments on multiple conversational datasets and show significant improvements over existing methods including recent on-device models. Experimental results and ablation studies also show that our neural models preserve tiny memory footprint necessary to operate on smart devices, while still maintaining high performance.

pdf
ProFormer: Towards On-Device LSH Projection Based Transformers
Chinnadhurai Sankar | Sujith Ravi | Zornitsa Kozareva
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

At the heart of text based neural models lay word representations, which are powerful but occupy a lot of memory making it challenging to deploy to devices with memory constraints such as mobile phones, watches and IoT. To surmount these challenges, we introduce ProFormer – a projection based transformer architecture that is faster and lighter making it suitable to deploy to memory constraint devices and preserve user privacy. We use LSH projection layer to dynamically generate word representations on-the-fly without embedding lookup tables leading to significant memory footprint reduction from O(V.d) to O(T), where V is the vocabulary size, d is the embedding dimension size and T is the dimension of the LSH projection representation. We also propose a local projection attention (LPA) layer, which uses self-attention to transform the input sequence of N LSH word projections into a sequence of N/K representations reducing the computations quadratically by O(Kˆ2). We evaluate ProFormer on multiple text classification tasks and observed improvements over prior state-of-the-art on-device approaches for short text classification and comparable performance for long text classification tasks. ProFormer is also competitive with other popular but highly resource-intensive approaches like BERT and even outperforms small-sized BERT variants with significant resource savings – reduces the embedding memory footprint from 92.16 MB to 1.7 KB and requires 16x less computation overhead, which is very impressive making it the fastest and smallest on-device model.

pdf
On-Device Text Representations Robust To Misspellings via Projections
Chinnadhurai Sankar | Sujith Ravi | Zornitsa Kozareva
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Recently, there has been a strong interest in developing natural language applications that live on personal devices such as mobile phones, watches and IoT with the objective to preserve user privacy and have low memory. Advances in Locality-Sensitive Hashing (LSH)-based projection networks have demonstrated state-of-the-art performance in various classification tasks without explicit word (or word-piece) embedding lookup tables by computing on-the-fly text representations. In this paper, we show that the projection based neural classifiers are inherently robust to misspellings and perturbations of the input text. We empirically demonstrate that the LSH projection based classifiers are more robust to common misspellings compared to BiLSTMs (with both word-piece & word-only tokenization) and fine-tuned BERT based methods. When subject to misspelling attacks, LSH projection based classifiers had a small average accuracy drop of 2.94% across multiple classifications tasks, while the fine-tuned BERT model accuracy had a significant drop of 11.44%.

pdf bib
Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021)
Zornitsa Kozareva | Sujith Ravi | Andreas Vlachos | Priyanka Agrawal | André Martins
Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021)

2020

pdf bib
Proceedings of the Fourth Workshop on Structured Prediction for NLP
Priyanka Agrawal | Zornitsa Kozareva | Julia Kreutzer | Gerasimos Lampouras | André Martins | Sujith Ravi | Andreas Vlachos
Proceedings of the Fourth Workshop on Structured Prediction for NLP

2019

pdf bib
Proceedings of the Third Workshop on Structured Prediction for NLP
Andre Martins | Andreas Vlachos | Zornitsa Kozareva | Sujith Ravi | Gerasimos Lampouras | Vlad Niculae | Julia Kreutzer
Proceedings of the Third Workshop on Structured Prediction for NLP

pdf
ProSeqo: Projection Sequence Networks for On-Device Text Classification
Zornitsa Kozareva | Sujith Ravi
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose a novel on-device sequence model for text classification using recurrent projections. Our model ProSeqo uses dynamic recurrent projections without the need to store or look up any pre-trained embeddings. This results in fast and compact neural networks that can perform on-device inference for complex short and long text classification tasks. We conducted exhaustive evaluation on multiple text classification tasks. Results show that ProSeqo outperformed state-of-the-art neural and on-device approaches for short text classification tasks such as dialog act and intent prediction. To the best of our knowledge, ProSeqo is the first on-device long text classification neural model. It achieved comparable results to previous neural approaches for news article, answers and product categorization, while preserving small memory footprint and maintaining high accuracy.

pdf
PRADO: Projection Attention Networks for Document Classification On-Device
Prabhu Kaliamoorthi | Sujith Ravi | Zornitsa Kozareva
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recently, there has been a great interest in the development of small and accurate neural networks that run entirely on devices such as mobile phones, smart watches and IoT. This enables user privacy, consistent user experience and low latency. Although a wide range of applications have been targeted from wake word detection to short text classification, yet there are no on-device networks for long text classification. We propose a novel projection attention neural network PRADO that combines trainable projections with attention and convolutions. We evaluate our approach on multiple large document text classification tasks. Our results show the effectiveness of the trainable projection model in finding semantically similar phrases and reaching high performance while maintaining compact size. Using this approach, we train tiny neural networks just 200 Kilobytes in size that improve over prior CNN and LSTM models and achieve near state of the art performance on multiple long document classification tasks. We also apply our model for transfer learning, show its robustness and ability to further improve the performance in limited data scenarios.

pdf
Transferable Neural Projection Representations
Chinnadhurai Sankar | Sujith Ravi | Zornitsa Kozareva
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Neural word representations are at the core of many state-of-the-art natural language processing models. A widely used approach is to pre-train, store and look up word or character embedding matrices. While useful, such representations occupy huge memory making it hard to deploy on-device and often do not generalize to unknown words due to vocabulary pruning. In this paper, we propose a skip-gram based architecture coupled with Locality-Sensitive Hashing (LSH) projections to learn efficient dynamically computable representations. Our model does not need to store lookup tables as representations are computed on-the-fly and require low memory footprint. The representations can be trained in an unsupervised fashion and can be easily transferred to other NLP tasks. For qualitative evaluation, we analyze the nearest neighbors of the word representations and discover semantically similar words even with misspellings. For quantitative evaluation, we plug our transferable projections into a simple LSTM and run it on multiple NLP tasks and show how our transferable projections achieve better performance compared to prior work.

pdf
On-device Structured and Context Partitioned Projection Networks
Sujith Ravi | Zornitsa Kozareva
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

A challenging problem in on-device text classification is to build highly accurate neural models that can fit in small memory footprint and have low latency. To address this challenge, we propose an on-device neural network SGNN++ which dynamically learns compact projection vectors from raw text using structured and context-dependent partition projections. We show that this results in accelerated inference and performance improvements. We conduct extensive evaluation on multiple conversational tasks and languages such as English, Japanese, Spanish and French. Our SGNN++ model significantly outperforms all baselines, improves upon existing on-device neural models and even surpasses RNN, CNN and BiLSTM models on dialog act and intent prediction. Through a series of ablation studies we show the impact of the partitioned projections and structured information leading to 10% improvement. We study the impact of the model size on accuracy and introduce quatization-aware training for SGNN++ to further reduce the model size while preserving the same quality. Finally, we show fast inference on mobile phones.

2018

pdf
Self-Governing Neural Networks for On-Device Short Text Classification
Sujith Ravi | Zornitsa Kozareva
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Deep neural networks reach state-of-the-art performance for wide range of natural language processing, computer vision and speech applications. Yet, one of the biggest challenges is running these complex networks on devices such as mobile phones or smart watches with tiny memory footprint and low computational capacity. We propose on-device Self-Governing Neural Networks (SGNNs), which learn compact projection vectors with local sensitive hashing. The key advantage of SGNNs over existing work is that they surmount the need for pre-trained word embeddings and complex networks with huge parameters. We conduct extensive evaluation on dialog act classification and show significant improvement over state-of-the-art results. Our findings show that SGNNs are effective at capturing low-dimensional semantic text representations, while maintaining high accuracy.

pdf
Self-Governing Neural Networks for On-Device Short Text Classification
Sujith Ravi | Zornitsa Kozareva
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Deep neural networks reach state-of-the-art performance for wide range of natural language processing, computer vision and speech applications. Yet, one of the biggest challenges is running these complex networks on devices such as mobile phones or smart watches with tiny memory footprint and low computational capacity. We propose on-device Self-Governing Neural Networks (SGNNs), which learn compact projection vectors with local sensitive hashing. The key advantage of SGNNs over existing work is that they surmount the need for pre-trained word embeddings and complex networks with huge parameters. We conduct extensive evaluation on dialog act classification and show significant improvement over state-of-the-art results. Our findings show that SGNNs are effective at capturing low-dimensional semantic text representations, while maintaining high accuracy.

2016

pdf
Recognizing Salient Entities in Shopping Queries
Zornitsa Kozareva | Qi Li | Ke Zhai | Weiwei Guo
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf
Which Tumblr Post Should I Read Next?
Zornitsa Kozareva | Makoto Yamada
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

pdf bib
Multilingual Affect Polarity and Valence Prediction in Metaphors
Zornitsa Kozareva
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf
Everyone Likes Shopping! Multi-class Product Categorization for e-Commerce
Zornitsa Kozareva
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2013

pdf
SemEval-2013 Task 4: Free Paraphrases of Noun Compounds
Iris Hendrickx | Zornitsa Kozareva | Preslav Nakov | Diarmuid Ó Séaghdha | Stan Szpakowicz | Tony Veale
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

pdf
SemEval-2013 Task 2: Sentiment Analysis in Twitter
Preslav Nakov | Sara Rosenthal | Zornitsa Kozareva | Veselin Stoyanov | Alan Ritter | Theresa Wilson
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

pdf
Multilingual Affect Polarity and Valence Prediction in Metaphor-Rich Texts
Zornitsa Kozareva
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Proceedings of the First Workshop on Metaphor in NLP
Ekaterina Shutova | Beata Beigman Klebanov | Joel Tetreault | Zornitsa Kozareva
Proceedings of the First Workshop on Metaphor in NLP

pdf bib
Proceedings of TextGraphs-8 Graph-based Methods for Natural Language Processing
Zornitsa Kozareva | Irina Matveeva | Gabor Melli | Vivi Nastase
Proceedings of TextGraphs-8 Graph-based Methods for Natural Language Processing

2012

pdf
SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning
Andrew Gordon | Zornitsa Kozareva | Melissa Roemmele
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf
Learning Verbs on the Fly
Zornitsa Kozareva
Proceedings of COLING 2012: Posters

pdf
Cause-Effect Relation Learning
Zornitsa Kozareva
Workshop Proceedings of TextGraphs-7: Graph-based Methods for Natural Language Processing

2011

pdf
Combining Relational and Attributional Similarity for Semantic Relation Classification
Preslav Nakov | Zornitsa Kozareva
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

pdf
Class Label Enhancement via Related Instances
Zornitsa Kozareva | Konstantin Voevodski | Shanghua Teng
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Insights from Network Structure for Text Mining
Zornitsa Kozareva | Eduard Hovy
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics
Su Nam Kim | Zornitsa Kozareva | Preslav Nakov | Diarmuid Ó Séaghdha | Sebastian Padó | Stan Szpakowicz
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

pdf
Unsupervised Name Ambiguity Resolution Using A Generative Model
Zornitsa Kozareva | Sujith Ravi
Proceedings of the First workshop on Unsupervised Learning in NLP

pdf bib
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition
Preslav Nakov | Zornitsa Kozareva | Kuzman Ganchev | Jerry Hobbs
Proceedings of the RANLP 2011 Workshop on Information Extraction and Knowledge Acquisition

2010

pdf
SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations between Pairs of Nominals
Iris Hendrickx | Su Nam Kim | Zornitsa Kozareva | Preslav Nakov | Diarmuid Ó Séaghdha | Sebastian Padó | Marco Pennacchiotti | Lorenza Romano | Stan Szpakowicz
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf
A Semi-Supervised Method to Learn and Construct Taxonomies Using the Web
Zornitsa Kozareva | Eduard Hovy
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf
Not All Seeds Are Equal: Measuring the Quality of Text Mining Seeds
Zornitsa Kozareva | Eduard Hovy
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
Zornitsa Kozareva | Eduard Hovy
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

pdf
SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals
Iris Hendrickx | Su Nam Kim | Zornitsa Kozareva | Preslav Nakov | Diarmuid Ó Séaghdha | Sebastian Padó | Marco Pennacchiotti | Lorenza Romano | Stan Szpakowicz
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf
Toward Completeness in Concept Extraction and Classification
Eduard Hovy | Zornitsa Kozareva | Ellen Riloff
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

pdf
Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs
Zornitsa Kozareva | Ellen Riloff | Eduard Hovy
Proceedings of ACL-08: HLT

2007

pdf
A Language Independent Approach for Name Categorization and Discrimination
Zornitsa Kozareva | Sonia Vázquez | Andrés Montoyo
Proceedings of the Workshop on Balto-Slavonic Natural Language Processing

pdf
UA-ZBSA: A Headline Emotion Classification through Web Information
Zornitsa Kozareva | Borja Navarro | Sonia Vázquez | Andrés Montoyo
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf
UA-ZSA: Web Page Clustering on the basis of Name Disambiguation
Zornitsa Kozareva | Sonia Vazquez | Andres Montoyo
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

pdf
Improving Name Discrimination: A Language Salad Approach
Ted Pedersen | Anagha Kulkarni | Roxana Angheluta | Zornitsa Kozareva | Thamar Solorio
Proceedings of the Cross-Language Knowledge Induction Workshop

pdf
Bootstrapping Named Entity Recognition with Automatically Generated Gazetteer Lists
Zornitsa Kozareva
Student Research Workshop

2004

pdf
Cluster Analysis and Classification of Named Entities
Joaquim F. Ferreira da Silva | Zornitsa Kozareva | José Gabriel Pereira Lopes
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf
Extracting Named Entities. A Statistical Approach
Joaquim Silva | Zornitsa Kozareva | Veska Noncheva | Gabriel Lopes
Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

Named entities and more generally Multiword Lexical Units (MWUs) are important for various applications. However, language independent methods for automatically extracting MWUs do not provide us with clean data. So, in this paper we propose a method for selecting possible named entities from automatically extracted MWUs, and later, a statistics-based language independent unsupervised approach is applied to possible named entities in order to cluster them according to their type. Statistical features used by our clustering process are described and motivated. The Model-Based Clustering Analysis (MBCA) software enabled us to obtain different clusters for proposed named entities. The method was applied to Bulgarian and English. For some clusters, precision is very high; other clusters still need further refinement. Based on the obtained clusters, it is also possible to classify new possible named entities.
Search
Co-authors