2016
Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
Yulia Tsvetkov | Manaal Faruqui | Wang Ling | Brian MacWhinney | Chris Dyer
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2014
Two Approaches to Metaphor Detection
Brian MacWhinney | Davida Fromm
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Methods for automatic detection and interpretation of metaphors have focused on analysis and utilization of the ways in which metaphors violate selectional preferences (Martin, 2006). Detection and interpretation processes that rely on this method can achieve wide coverage and may be able to detect some novel metaphors. However, they are prone to high false alarm rates, often arising from imprecision in parsing and in the supporting ontological and lexical resources. An alternative approach to metaphor detection emphasizes the fact that many metaphors become conventionalized collocations, while still preserving their active metaphorical status. Given a large enough corpus for a given language, it is possible to use tools like SketchEngine (Kilgarriff, Rychly, Smrz, & Tugwell, 2004) to locate these high-frequency metaphors for a given target domain. In this paper, we examine the application of these two approaches and discuss their relative strengths and weaknesses for metaphors in the target domain of economic inequality in English, Spanish, Farsi, and Russian.
Resources for the Detection of Conventionalized Metaphors in Four Languages
Lori Levin | Teruko Mitamura | Brian MacWhinney | Davida Fromm | Jaime Carbonell | Weston Feely | Robert Frederking | Anatole Gershman | Carlos Ramirez
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper describes a suite of tools for extracting conventionalized metaphors in English, Spanish, Farsi, and Russian. The method depends on three significant resources for each language: a corpus of conventionalized metaphors, a table of conventionalized conceptual metaphors (CCM table), and a set of extraction rules. Conventionalized metaphors are expressions such as “escape from poverty” and “burden of taxation”. For each metaphor, the CCM table contains the metaphorical source domain word (such as “escape”), the target domain word (such as “poverty”), and the grammatical construction in which they can be found. The extraction rules operate on the output of a dependency parser and identify the grammatical configurations (such as a verb with a prepositional phrase complement) that are likely to contain conventional metaphors. We present results on detection rates for conventional metaphors and an analysis of the similarities and differences of source domains for conventional metaphors in the four languages.
2012
A Morphologically Annotated Hebrew CHILDES Corpus
Aviad Albert | Brian MacWhinney | Bracha Nir | Shuly Wintner
Proceedings of the Workshop on Computational Models of Language Acquisition and Loss
Morphosyntactic Analysis of the CHILDES and TalkBank Corpora
Brian MacWhinney
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper describes the construction and usage of the MOR and GRASP programs for part-of-speech tagging and syntactic dependency analysis of the corpora in the CHILDES and TalkBank databases. We have written MOR grammars for 11 languages and GRASP analyses for three. For English data, the MOR tagger reaches 98% accuracy on adult corpora and 97% accuracy on child language corpora. The paper discusses the construction of MOR lexicons with an emphasis on compounds and special conversational forms. The shape of rules for controlling allomorphy and morpheme concatenation is discussed. The analysis of bilingual corpora is illustrated in the context of the Cantonese-English bilingual corpora. Methods for preparing data for MOR analysis and for developing MOR grammars are discussed. We believe that recent computational work using this system is leading to significant advances in child language acquisition theory and theories of grammar identification more generally.
2010
A Morphologically-Analyzed CHILDES Corpus of Hebrew
Bracha Nir | Brian MacWhinney | Shuly Wintner
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
We present a corpus of transcribed spoken Hebrew that forms an integral part of a comprehensive data system that has been developed to suit the specific needs and interests of child language researchers: CHILDES (Child Language Data Exchange System). We introduce a dedicated transcription scheme for the spoken Hebrew data that is aware both of the phonology and of the standard orthography of the language. We also introduce a morphological analyzer that was specifically developed for this corpus.
2007
Phon 1.2: A Computational Basis for Phonological Database Elaboration and Model Testing
Yvan Rose | Gregory Hedlund | Rod Byrne | Todd Wareham | Brian MacWhinney
Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition
High-accuracy Annotation and Parsing of CHILDES Transcripts
Kenji Sagae | Eric Davis | Alon Lavie | Brian MacWhinney | Shuly Wintner
Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition
2005
Item Based Constructions and the Logical Problem
Brian MacWhinney
Proceedings of the Workshop on Psychocomputational Models of Human Language Acquisition
Automatic Measurement of Syntactic Development in Child Language
Kenji Sagae | Alon Lavie | Brian MacWhinney
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)
2004
Collaborative Commentary: Opening Up Spoken Language Databases
Brian MacWhinney
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Talkbank: Building an Open Unified Multimodal Database of Communicative Interaction
Brian MacWhinney | Steven Bird | Christopher Cieri | Craig Martell
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Adding Syntactic Annotations to Transcripts of Parent-Child Dialogs
Kenji Sagae | Brian MacWhinney | Alon Lavie
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2001
Parsing the CHILDES Database: Methodology and Lessons Learned
Kenji Sagae | Alon Lavie | Brian MacWhinney
Proceedings of the Seventh International Workshop on Parsing Technologies