Eric Nichols


2017

Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition
Leon Derczynski | Eric Nichols | Marieke van Erp | Nut Limsopatham
Proceedings of the 3rd Workshop on Noisy User-generated Text

This shared task focuses on identifying unusual, previously-unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks (like event clustering and summarization), but recall on them is a real problem in noisy text, even among annotators. This drop tends to be due to novel entities and surface forms. Take for example the tweet “so.. kktny in 30 mins?!”: even human experts find the entity ‘kktny’ hard to detect and resolve. The goal of this task is to provide a definition of emerging and rare entities and, based on that definition, datasets for detecting these entities. The task as described in this paper evaluated the ability of participating entries to detect and classify novel and emerging named entities in noisy text.

Lexical Acquisition through Implicit Confirmations over Multiple Dialogues
Kohei Ono | Ryu Takeda | Eric Nichols | Mikio Nakano | Kazunori Komatani
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

We address the problem of acquiring the ontological categories of unknown terms through implicit confirmation in dialogues. We develop an approach that makes implicit confirmation requests with an unknown term’s predicted category. Our approach does not degrade user experience with repetitive explicit confirmations, but the system has difficulty determining if information in the confirmation request can be correctly acquired. To overcome this challenge, we propose a method for determining whether the predicted category included in an implicit confirmation request is correct. Our method exploits multiple user responses to implicit confirmation requests containing the same ontological category. Experimental results revealed that the proposed method exhibited a higher precision rate for determining the correctly predicted categories than when only single user responses were considered.

2016

DeepNNNER: Applying BLSTM-CNNs and Extended Lexicons to Named Entity Recognition in Tweets
Fabrice Dugas | Eric Nichols
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)

In this paper, we describe the DeepNNNER entry to The 2nd Workshop on Noisy User-generated Text (WNUT) Shared Task #2: Named Entity Recognition in Twitter. Our shared task submission adopts the bidirectional LSTM-CNN model of Chiu and Nichols (2016), as it has been shown to perform well on both newswire and Web texts. It uses word embeddings trained on large-scale Web text collections together with text normalization to cope with the diversity in Web texts, and lexicons for target named entity classes constructed from publicly-available sources. Extended evaluation comparing the effectiveness of various word embeddings, text normalization, and lexicon settings shows that our system achieves a maximum F1-score of 47.24, performance surpassing that of the shared task’s second-ranked system.

Named Entity Recognition with Bidirectional LSTM-CNNs
Jason P.C. Chiu | Eric Nichols
Transactions of the Association for Computational Linguistics, Volume 4

Named entity recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance. In this paper, we present a novel neural network architecture that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering. We also propose a novel method of encoding partial lexicon matches in neural networks and compare it to existing approaches. Extensive evaluation shows that, given only tokenized text and publicly available word embeddings, our system is competitive on the CoNLL-2003 dataset and surpasses the previously reported state-of-the-art performance on the OntoNotes 5.0 dataset by 2.13 F1 points. By using two lexicons constructed from publicly available sources, we establish new state-of-the-art performance with an F1 score of 91.62 on CoNLL-2003 and 86.28 on OntoNotes, surpassing systems that employ heavy feature engineering, proprietary lexicons, and rich entity linking information.

2015

SpRL-CWW: Spatial Relation Classification with Independent Multi-class Models
Eric Nichols | Fadi Botros
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2012

A Latent Discriminative Model for Compositional Entailment Relation Recognition using Natural Logic
Yotaro Watanabe | Junta Mizuno | Eric Nichols | Naoaki Okazaki | Kentaro Inui
Proceedings of COLING 2012

2011

Recognizing Confinement in Web Texts
Megumi Ohki | Eric Nichols | Suguru Matsuyoshi | Koji Murakami | Junta Mizuno | Shouko Masuda | Kentaro Inui | Yuji Matsumoto
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

2010

Automatic Classification of Semantic Relations between Facts and Opinions
Koji Murakami | Eric Nichols | Junta Mizuno | Yotaro Watanabe | Hayato Goto | Megumi Ohki | Suguru Matsuyoshi | Kentaro Inui | Yuji Matsumoto
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010)

2009

Annotating Semantic Relations Combining Facts and Opinions
Koji Murakami | Shouko Masuda | Suguru Matsuyoshi | Eric Nichols | Kentaro Inui | Yuji Matsumoto
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

2008

Improving statistical machine translation by paraphrasing the training data.
Francis Bond | Eric Nichols | Darren Scott Appling | Michael Paul
Proceedings of the 5th International Workshop on Spoken Language Translation: Papers

Large amounts of training data are essential for training statistical machine translation systems. In this paper we show how training data can be expanded by paraphrasing one side. The new data is made by parsing then generating using a precise HPSG-based grammar, which gives sentences with the same meaning but minor variations in lexical choice and word order. In experiments with Japanese and English, we showed consistent gains on the Tanaka Corpus with less consistent improvement on the IWSLT 2005 evaluation data.

2007

Combining resources for open source machine translation
Eric Nichols | Francis Bond | Darren Scott Appling | Yuji Matsumoto
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

2006

Multilingual Ontology Acquisition from Multiple MRDs
Eric Nichols | Francis Bond | Takaaki Tanaka | Sanae Fujita | Dan Flickinger
Proceedings of the 2nd Workshop on Ontology Learning and Population: Bridging the Gap between Text and Knowledge

2005

Extracting Representative Arguments from Dictionaries for Resolving Zero Pronouns
Shigeko Nariyama | Eric Nichols | Francis Bond | Takaaki Tanaka | Hiromi Nakaiwa
Proceedings of Machine Translation Summit X: Papers

We propose a method to alleviate the problem of referential granularity for Japanese zero pronoun resolution. We use dictionary definition sentences to extract ‘representative’ arguments of predicative definition words; e.g. ‘arrest’ is likely to take police as its subject and criminal as its object. These representative arguments are far more informative than ‘person’, which is what other valency dictionaries provide. They are automatically extracted using both shallow parsing and deep parsing for greater quality and quantity. Initial results are highly promising, obtaining more specific information about selectional preferences. An architecture of zero pronoun resolution using these representative arguments is described.

2004

Acquiring an Ontology for a Fundamental Vocabulary
Francis Bond | Eric Nichols | Sanae Fujita | Takaaki Tanaka
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

The Hinoki Treebank. Working Toward Text Understanding
Francis Bond | Sanae Fujita | Chikara Hashimoto | Kaname Kasahara | Shigeko Nariyama | Eric Nichols | Akira Ohtani | Takaaki Tanaka | Shigeaki Amano
Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora