Other Workshops and Events (2000)




Proceedings of the 12th Nordic Conference of Computational Linguistics (NODALIDA 1999)
Torbjørn Nordgård

BusTUC–A natural language bus route adviser in Prolog
Tore Amble

Developing a grammar checker for Swedish
Antti Arppe

Detecting grammar errors with Lingsoft’s Swedish grammar checker
Juhani Birn

Pivot Alignment
Lars Borin

Granska–an efficient hybrid system for Swedish grammar checking
Rickard Domeij | Ola Knutsson | Johan Carlberger | Viggo Kann

Adapting an English Information Extraction System to Swedish
Kristofer Franzén

The shortcomings of a tagger
Kristin Hagen | Janne Bondi Johannessen | Anders Nøklestad

Merging Classifiers for Improved Information Retrieval
Anette Hulth | Lars Asker

Extracting Keywords from Digital Document Collections
Anna Jonsson

Ontologically Supported Semantic Matching
Atanas K. Kiryakov | Kiril Iv. Simov

Automatic Detection of Lexicalised Phrases in Swedish
Janne Lindberg

Towards a Finite-State Parser for Swedish
Beáta Megyesi | Sara Rydin

Semantic Clustering of Adjectives and Verbs Based on Syntactic Patterns
Costanza Navarretta

An HPSG Account of Danish Pre-nominals
Anne Neville

Tonem 1 eller 2 eller 1,5? (Toneme 1 or 2 or 1,5?) [In Norwegian]
Arild Noven | Per Arne Larsen | Bente Moxness | Kolbjørn Slethei

Syntactic Analysis and Error Correction for Danish in the SCARRIE Project
Patrizia Paggio

Designing a System for Swedish Spoken Document Retrieval
Botond Pakucs | Björn Gambäck

Statistics and Phonotactical Rules in Finding OCR Errors
Stina Nylander

An Information Retrieval System with Cooperative Behaviour
Paulo Quaresma | Irene Pimenta Rodrigues

An Evaluation of the Translation Corpus Aligner, with special reference to the language pair English-Portuguese
Diana Santos | Signe Oksefjell

Automatic proofreading for Norwegian: The challenges of lexical and grammatical variation
Koenraad de Smedt | Victoria Rosén

Word Alignment Step by Step
Jörg Tiedemann

On Using the Two-level Model as the Basis of Morphological Analysis and Synthesis of Estonian
Heli Uibo

LFG-DOT: Combining Constraint-Based and Empirical Methodologies for Robust MT
Andy Way









Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

Learning in Natural Language: Theory and Algorithmic Approaches
Dan Roth

Corpus-Based Grammar Specialization
Nicola Cancedda | Christer Samuelsson

Pronunciation by Analogy in Normal and Impaired Readers
R.I. Damper | Y. Marchand

The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation
Guy De Pauw | Walter Daelemans

Increasing our ‘Ignorance’ of Language: Identifying Language Structure in an Unknown ‘Signal’
John Elliot | Eric Atwell | Bill Whyte

A Comparison between Supervised Learning Algorithms for Word Sense Disambiguation
Gerard Escudero | Lluís Màrquez | German Rigau

Incorporating Position Information into a Maximum Entropy/Minimum Divergence Translation Model
George Foster

Memory-Based Learning for Article Generation
Guido Minnen | Francis Bond | Ann Copestake

Overfitting Avoidance for Stochastic Modeling of Attribute-Value Grammars
Tony Mullen | Miles Osborne

Learning Distributed Linguistic Classes
Stephan Raaijmakers

Modeling the Effect of Cross-Language Ambiguity on Human Syntax Acquisition
William Gregory Sakas

Knowledge-Free Induction of Morphology Using Latent Semantic Analysis
Patrick Schone | Daniel Jurafsky

Using Induced Rules as Complex Features in Memory-Based Language Learning
Antal van den Bosch

Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model
F. Amaya | J. M. Benedí

Experiments on Unsupervised Learning for Extracting Relevant Fragments from Spoken Dialog Corpus
Konstantin Biatov

Generating Synthetic Speech Prosody with Lazy Learning in Tree Structures
Laurent Blin | Laurent Miclet

Inducing Syntactic Categories by Context Distribution Clustering
Alexander Clark

ALLiS: a Symbolic Learning System for Natural Language Learning
Hervé Déjean

Combining Text and Heuristics for Cost-Sensitive Spam Filtering
José M. Gómez Hidalgo | Manuel Maña López | Enrique Puertas Sanz

Genetic Algorithms for Feature Relevance Assignment in Memory-Based Language Processing
Anne Kool | Walter Daelemans | Jakub Zavrel

Shallow Parsing by Inferencing with Classifiers
Vasin Punyakanok | Dan Roth

Minimal Commitment and Full Lexical Disambiguation: Balancing Rules and Hidden Markov Models
Patrick Ruch | Robert Baud | Pierrette Bouillon | Gilbert Robert

Learning IE Rules for a Set of Related Concepts
J. Turmo | H. Rodriguez

A Default First Order Family Weight Determination Procedure for WPDV Models
Hans van Halteren

A Comparison of PCFG Models
Jose Luis Verdú-Mas | Jorge Calera-Rubio | Rafael C. Carrasco

Introduction to the CoNLL-2000 Shared Task Chunking
Erik F. Tjong Kim Sang | Sabine Buchholz

Learning Syntactic Structures with XML
Hervé Déjean

A Context Sensitive Maximum Likelihood Approach to Chunking
Christer Johansson

Chunking with Maximum Entropy Models
Rob Koeling

Use of Support Vector Learning for Chunk Identification
Taku Kudoh | Yuji Matsumoto

Shallow Parsing as Part-of-Speech Tagging
Miles Osborne

Improving Chunking by Means of Lexical-Contextual Information in Statistical Language Models
Ferran Pla | Antonio Molina | Natividad Prieto

Text Chunking by System Combination
Erik F. Tjong Kim Sang

Chunking with WPDV Models
Hans van Halteren

Single-Classifier Memory-Based Phrase Chunking
Jorn Veenstra | Antal van den Bosch

Phrase Parsing with Rule Sequence Processors: an Application to the Shared CoNLL Task
Marc Vilain | David Day

Hybrid Text Chunking
GuoDong Zhou | Jian Su | TongGuan Tey

Extracting a Domain-Specific Ontology from a Corporate Intranet
Jörg-Uwe Kietz | Raphael Volz | Alexander Maedche

Learning from a Substructural Perspective
Pieter Adriaans | Erik de Haas

Incorporating Linguistics Constraints into Inductive Logic Programming
James Cussens | Stephen Pulman

Learning from Parsed Sentences with INTHELEX
F. Esposito | S. Ferilli | N. Fanizzi | G. Semeraro

Inductive Logic Programming for Corpus-Based Acquisition of Semantic Lexicons
Pascale Sébillot | Pierrette Bouillon | Cecile Fabre

The Acquisition of Word Order by a Computational Learning System
Aline Villavicencio

Recognition and Tagging of Compound Verb Groups in Czech
Eva Zácková | Luboš Popelínský | Miloš Nepil





1st SIGdial Workshop on Discourse and Dialogue

Japanese Dialogue Corpus of Multi-Level Annotation
Shu Nakazato

ADAM: An Architecture for XML-based Dialogue Annotation on Multiple Levels
Claudia Soria | Roldano Cattoni | Morena Danieli

The MATE Markup Framework
Laila Dybkjaer | Niels Ole Bernsen

Issues in the Transcription of English Conversational Grunts
Nigel Ward

Identifying Prosodic Indicators of Dialogue Structure: Some Methodological and Theoretical Considerations
Ilana Mushin | Lesley Stirling | Janet Fletcher | Roger Wales

From Elementary Discourse Units to Complex Ones
Holger Schauer

Abstract Anaphora Resolution in Danish
Costanza Navarretta

Using decision trees to select the grammatical relation of a noun phrase
Simon Corston-Oliver

A Common Theory of Information Fusion from Multiple Text Sources Step One: Cross-Document Structure
Dragomir Radev

Social Goals in Conversational Cooperation
Guido Boella | Rossana Damiano | Leonardo Lesmo

Dynamic User Level and Utility Measurement for Adaptive Dialog in a Help-Desk System
Preetam Maloor | Joyce Chai

Dialogue Management in the Agreement Negotiation Process: A Model that Involves Natural Reasoning
Mare Koit | Haldur Õim

Document Transformations and Information States
Staffan Larsson | Annie Zaenen

Dialogue and Domain Knowledge Management in Dialogue Systems
Annika Flycht-Eriksson | Arne Jönsson

Flexible Speech Act Based Dialogue Management
Eli Hagen | Fred Popowich

Dialogue Helpsystem based on Flexible Matching of User Query with Natural Language Knowledge Base
Sadao Kurohashi | Wataru Higasa

WIT: A Toolkit for Building Robust and Real-Time Spoken Dialogue Systems
Mikio Nakano | Noboru Miyazaki | Norihito Yasuda | Akira Sugiyama | Jun-ichi Hirasawa | Kohji Dohsaka | Kiyoaki Aikawa

Some Notes on the Complexity of Dialogues
Jan Alexandersson | Paul Heisterkamp




Second Chinese Language Processing Workshop

Two Statistical Parsing Models Applied to the Chinese Treebank
Daniel M. Bikel | David Chiang

Sense-Tagging Chinese Corpus
Hsin-Hsi Chen | Chi-Ching Lin

Knowledge Extraction for Identification of Chinese Organization Names
Keh-Jiann Chen | Chao-jan Chen

Using Co-occurrence Statistics as an Information Source for Partial Parsing of Chinese
Elliott Franco Drabek | Qiang Zhou

Sinica Treebank: Design Criteria, Annotation Guidelines, and On-line Interface
Chu-Ren Huang | Feng-Yi Chen | Keh-Jiann Chen | Zhao-ming Gao | Kuang-Yu Chen

Enhancement of a Chinese Discourse Marker Tagger with C4.5
Benjamin K. T’sou | Tom B.Y Lai | Samuel W.K. Chan | Weijun Gao | Xuegang Zhan

Statistically-Enhanced New Word Identification in a Rule-Based Chinese System
Andi Wu | Zixin Jiang

Comparing Lexicalized Treebank Grammars Extracted from Chinese, Korean, and English Corpora
Fei Xia | Chunghye Han | Martha Palmer | Aravind Joshi

The Research of Word Sense Disambiguation Method Based on Co-occurrence Frequency of Hownet
Erhong Yang | Guoqing Zhang | Yongkui Zhang

A Trainable Method for Extracting Chinese Entity Names and Their Relations
Yimin Zhang | Joe F. Zhou

Statistics Based Hybrid Approach to Chinese Base Phrase Identification
Tie-jun Zhao | Mu-yun Yang | Fang Liu | Jian-min Yao | Hao Yu

A Block-Based Robust Dependency Parser for Unrestricted Chinese Text
Ming Zhou

Annotating Information Structures in Chinese Texts Using HowNet
Kok Wee Gan | Ping Wai Wong

Machine Learning Methods for Chinese Web page Categorization
Ji He | Ah-Hwee Tan | Chew-Lim Tan

Semantic Annotation of Chinese Phrases Using Recursive Graph
Donghong Ji

Text Meaning Representation for Chinese
Wanying Jin

How Should a Large Corpus Be Built?-A Comparative Study of Closure in Annotated Newspaper Corpora from Two Chinese Sources, Towards Building a Larger Representative Corpus Merged from Representative Sublanguage Collections
John J. Kovarik

A Clustering Algorithm for Chinese Adjectives and Nouns
Yang Wen | Chunfa Yuan | Changning Huang

Extraction of Chinese Compound Words - An Experimental Study on a Very Large Corpus
Jian Zhang | Jianfeng Gao | Ming Zhou

An Algorithm for Situation Classification of Chinese Verbs
Xiaodan Zhu | Chunfa Yuan | K.F. Wong | Wenjie Li

Zero Anaphors in Chinese Discourse Processing
Chin-Chuan Cheng



2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

Pattern-Based Disambiguation for Natural Language Processing
Eric Brill

What’s Yours and What’s Mine: Determining Intellectual Attribution in Scientific Text
Simone Teufel | Marc Moens

Japanese Dependency Structure Analysis Based on Support Vector Machines
Taku Kudo | Yuji Matsumoto

Coaxing Confidences from an Old Friend: Probabilistic Classifications from Transformation Rule Lists
Radu Florian | John C. Henderson | Grace Ngai

Topic Analysis Using a Finite Mixture Model
Hang Li | Kenji Yamanishi

Sample Selection for Statistical Grammar Induction
Rebecca Hwa

A Uniform Method of Grammar Extraction and Its Applications
Fei Xia | Martha Palmer | Aravind Joshi

Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger
Kristina Toutanova | Christopher D. Manning

Error-driven HMM-based Chunk Tagger with Context-dependent Lexicon
GuoDong Zhou | Jian Su

Nonlocal Language Modeling based on Context Co-occurrence Vectors
Sadao Kurohashi | Manabu Ori

Detection of Language (Model) Errors
K.Y. Hung | R.W.P. Luk | D. Yeung | K.F.L. Chung | W. Shu

Cross-lingual Information Retrieval Using Hidden Markov Models
Jinxi Xu | Ralph Weischedel

Query Translation in Chinese-English Cross-Language Information Retrieval
Yibo Zhang | Le Sun | Lin Du | Yufang Sun

Word Alignment of English-Chinese Bilingual Corpus Based on Chunks
Le Sun | Youbing Jin | Lin Du | Yufang Sun

Empirical Term Weighting and Expansion Frequency
Kyoji Umemura | Kenneth W. Church

A Machine Learning Approach to Answering Questions for Reading Comprehension Tests
Hwee Tou Ng | Leong Hwee Teo | Jennifer Lai Pheng Kwan

Automated Construction of Database Interfaces: Integrating Statistical and Relational Learning for Semantic Parsing
Lappoon R. Tang | Raymond J. Mooney

Automatic WordNet Mapping Using Word Sense Disambiguation
Daniel M. Bikel

A Real-time Integration Of Concept-based Search and Summarization of Chinese Websites
Joe F. Zhou | Weiquan Liu

A Statistical Model for Parsing and Word-Sense Disambiguation
Daniel M. Bikel

Reducing Parsing Complexity by Intra-Sentence Segmentation based on Maximum Entropy Model
Sung Dong Kim | Byoung-Tak Zhang | Yung Taek Kim

An Empirical Study of the Domain Dependence of Supervised Word Sense Disambiguation Systems
Gerard Escudero | Lluís Màrquez | German Rigau

Combining Lexical and Formatting Cues for Named Entity Acquisition from the Web
Christian Jacquemin | Caroline Bush

A Query Tool for Syntactically Annotated Corpora
Laura Kallmeyer

Statistical Filtering and Subcategorization Frame Acquisition
Anna Korhonen | Genevieve Gorrell | Diana McCarthy

One Sense per Collocation and Genre/Topic Variations
David Martinez | Eneko Agirre

Using Semantically Motivated Estimates to Help Subcategorization Acquisition
Anna Korhonen

Author Index



INLG’2000 Proceedings of the First International Conference on Natural Language Generation
Michael Elhadad

Evaluation Metrics for Generation
Srinivas Bangalore | Owen Rambow | Steve Whittaker

A Task-based Framework to Evaluate Evaluative Arguments
Giuseppe Carenini

An empirical study of multilingual natural language generation: What Should a Text Planner Do?
Daniel Marcu | Lynn Carlson | Maki Watanabe

Document structure and multilingual authoring
Caroline Brun | Marc Dymetman | Veronika Lux

DTD-driven bilingual document generation
Arantza Casillas | Joseba Abaitua | Raquel Martínez

Towards the Generation of Rebuttals in a Bayesian Argumentation System
Nathalie Jitnah | Ingrid Zukerman | Richard McConachy | Sarah George

A strategy for generating evaluative arguments
Giuseppe Carenini | Johanna Moore

Using Argumentation Strategies in Automated Argument Generation
Ingrid Zukerman | Richard McConachy | Sarah George

An extended architecture for robust generation
Tilman Becker | Anne Kilger | Patrice Lopez | Peter Poller

Reinterpretation of an Existing NLG System in a Generic Generation Architecture
Lynne Cahill | Christy Doran | Roger Evans | Chris Mellish | Daniel Paiva | Mike Reape | Donia Scott | Neil Tipper

An integrated framework for text planning and pronominalisation
Rodger Kibble | Richard Power

Incremental Event Conceptualization and Natural Language Generation in Monitoring Environments
Markus Guhe | Christopher Habel | Heike Tappe

The hyperonym problem revisited: Conceptual and lexical hierarchies in language generation
Manfred Stede

Generating Referring Quantified Expressions
James Shaw | Kathleen McKeown

An Empirical Analysis of Constructing Non-restrictive NP Modifiers to Express Semantic Relations
Hua Cheng | Chris Mellish

On identifying sets
Matthew Stone

Content aggregation in natural language hypertext summarization of OLAP and Data Mining Discoveries
Jacques Robin | Eloi L. Favero

Optimising text quality in generation from relational databases
Michael O’Donnell | Alistair Knott | Jon Oberlander | Chris Mellish

Generating a controlled language
Laurence Danlos | Guy Lapalme | Veronika Lux

Multilingual Summary Generation in a Speech-To-Speech Translation System for Multilingual Dialogues
Jan Alexandersson | Peter Poller | Michael Kipp | Ralf Engel

Planning word-order dependent focus assignments
Cornelia Endriss | Ralf Klabunde

Enriching partially-specified representations for text realization using an attribute grammar
Songsak Channarukul | Susan W. McRoy | Syed S. Ali

Coordination and context-dependence in the generation of embodied conversation
Justine Cassell | Matthew Stone | Hao Yan

Generating Vague Descriptions
Kees van Deemter

Capturing the Interaction between Aggregation and Text Planning in Two Generation Systems
Hua Cheng | Chris Mellish

Can text structure be incompatible with rhetorical structure?
Nadjet Bouayad-Agha | Richard Power | Donia Scott

Robust, applied morphological generation
Guido Minnen | John Carroll | Darren Pearce

Integrating a Large-Scale, Reusable Lexicon with a Natural Language Generator
Hongyan Jing | Yael Dahan | Michael Elhadad | Kathy McKeown

Knowledge Acquisition for Natural Language Generation
Ehud Reiter | Roma Robertson | Liesl Osman

From Context to Sentence Form
Sabine Geldof

The CLEF semi-recursive generation algorithm
Rodrigo Reyes

Sentence Generation and Neural Networks
Kathrine Hammervold

Rhetorical Structure in Dialog
Amanda Stent

RSTTool 2.4 - A markup Tool for Rhetorical Structure Theory
Michael O’Donnell

Demonstration of ILEX 3.0
Michael O’Donnell | Alistair Knott | Jon Oberlander | Chris Mellish

A development Environment for an MTT-Based Sentence Generator
Bernd Bohnet | Andreas Langjahr | Leo Wanner

YAG: A Template-Based Generator for Real-Time Systems
Susan W. McRoy | Songsak Channarukul | Syed S. Ali

An Efficient Text Summarizer using Lexical Chains
H. Gregory Silber | Kathleen F. McCoy

Invited Talk: From lexica-aspectual components to syntax
Nomi Erteschik-Shir | T.R. Rapoport

Discussion Panel on Evaluation in Generation Research
Inderjeet Mani





Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content
Paul Buitelaar | Kôiti Hasida

Semantic Annotation of a Japanese Speech Corpus
John Fry | Francis Bond

Exploring Automatic Word Sense Disambiguation with Decision Lists and the Web
Eneko Agirre | David Martinez

Improving Natural Language Processing by Linguistic Document Annotation
Hideo Watanabe | Katashi Nagao | Michael McCord | Arendse Bernth

Building an Annotated Corpus in the Molecular-Biology Domain
Yuka Tateisi | Tomoko Ohta | Nigel Collier | Chikashi Nobata | Jun-ichi Tsujii

Semantic Annotation for Generation: Issues in Annotating a Corpus to Develop and Evaluate Discourse Entity Realization Algorithms
Massimo Poesio

An Environment for Extracting Resolution Rules of Zero Pronouns from Corpora
Hiromi Nakaiwa

Discourse Structure Analysis for News Video
Yasuhiko Watanabe | Yoshihiro Okada | Sadao Kurohashi | Eiichi Iwanari

Alignment of Sound Track with Text in a TV Drama
Seigo Tanimura | Hiroshi Nakagawa

Semantic Transcoding: Making the WWW More Understandable and Usable with External Annotations
Katashi Nagao | Shingo Hosoya | Yoshinari Shirai | Kevin Squire

From Manual to Semi-Automatic Semantic Annotation: About Ontology-Based Text Annotation Tools
Michael Erdmann | Alexander Maedche | Hans-Peter Schnurr | Steffen Staab





Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

The current status of FTAG
Anne Abeillé | Marie-Hélène Candito | Alexandra Kinyon

A redefinition of Embedded Push-Down Automata
Miguel A. Alonso | Éric Villemonte de la Clergerie | Manuel Vilares

Practical aspects in compiling tabular TAG parsers
Miguel A. Alonso | Djamé Seddah | Éric Villemonte de la Clergerie

Using TAGs, a Tree Model, and a Language Model for Generation
Srinivas Bangalore | Owen Rambow

Lexik: a maintenance tool for FTAG
Nicolas Barrier | Sébastien Barrier | Alexandra Kinyon

Adapting HPSG-to-TAG compilation to wide-coverage grammars
Tilman Becker | Patrice Lopez

Engineering a Wide-Coverage Lexicalized Grammar
John Carroll | Nicolas Nicolov | Olga Shaumyan | Martine Smets | David Weir

Some remarks on an extension of synchronous TAG
David Chiang | William Schuler | Mark Dras

Bidirectional parsing of TAG without heads
Víctor J. Díaz | Miguel A. Alonso | Vicente Carrillo

Punctuation in a Lexicalized Grammar
Christine Doran

A faster parsing algorithm for Lexicalized Tree-Adjoining Grammars
Jason Eisner | Giorgio Satta

Economy in TAG
Robert Frank

The Sino-Korean light verb construction and lexical argument structure
Chung-hye Han | Owen Rambow

Complexity of Linear Order Computation in Performance Grammar, TAG and HPSG
Karin Harbusch | Gerard Kempen

Relationship between strong and weak generative power of formal systems
Aravind K. Joshi

An alternative description of extractions in TAG
Sylvain Kahane | Marie-Hélène Candito | Yannick de Kercadio

How to solve some failures of LTAG
Sylvain Kahane

Scrambling in German and the non-locality of local TDGs
Laura Kallmeyer

Contextual Tree Adjoining Grammars
Martin Kappes

Even better than Supertags: Introducing Hypertags!
Alexandra Kinyon

Building a class-based verb lexicon using TAGs
Karin Kipper | Hoa Trang Dang | William Schuler | Martha Palmer

LTAG Workbench: A general framework for LTAG
Patrice Lopez

Derivational minimalism in two regular and logical steps
Jens Michaelis | Uwe Mönnich | Frank Morawietz

A logical approach to structure sharing in TAGs
Adi Palm

From intuitionistic proof nets to Interaction Grammars
Guy Perrier

A comparison of the XTAG and CLE Grammars for English
Manny Rayner | Beth Ann Hockey | Frankie James

Practical experiments in parsing using Tree Adjoining Grammars
Anoop Sarkar

Lexicalized grammar and the description of motion events
Matthew Stone | Tonia Bleam | Christine Doran | Martha Palmer

Extending Linear Indexed Grammars
Christian Wartena

A Corpus-based evaluation of syntactic locality in TAGs
Fei Xia | Tonia Bleam

Customizing the XTAG system for efficient grammar development for Korean
Juntae Yoon | Chung-hye Han | Nari Kim | Meesook Kim

Deriving polarity effects
Raffaella Bernardi

Un outil pour calculer des arbres de dépendance à partir d’arbres de dérivation (A tool for computing dependency trees from derivation trees) [In French]
Lionel Clément

Elementary trees for syntactic and statistical disambiguation
Rodolfo Delmonte | Luminita Chiran | Ciprian Bacalu

How problematic are clitics for S-TAG translation?
Mark Dras | Tonia Bleam

Reuse of plan-based knowledge sources in a uniform TAG-based generation system
Karin Harbusch | Jens Woch

CDL-TAGs: A grammar formalism for flexible and efficient syntactic generation
Anne Kilger | Peter Poller

Predicative LTAG grammars for Term Analysis
Patrice Lopez | David Roussel

Reliability in example-based parsing
Oliver Streiter

LFG-DOT: a probabilistic, constraint-based model for machine translation
Andy Way

Comparing and integrating Tree Adjoining Grammars
Fei Xia | Martha Palmer



Proceedings of the Sixth International Workshop on Parsing Technologies

Automatic Grammar Induction: Combining, Reducing and Doing Nothing
Eric Brill | John C. Henderson | Grace Ngai

This paper surveys three research directions in parsing. First, we look at methods for both automatically generating a set of diverse parsers and combining the outputs of different parsers into a single parse. Next, we will discuss a parsing method known as transformation-based parsing. This method, though less accurate than the best current corpus-derived parsers, is able to parse quite accurately while learning only a small set of easily understood rules, as opposed to the many-megabyte parameter files learned by other techniques. Finally, we review a recent study exploring how people and machines compare at the task of creating a program to automatically annotate noun phrases.

Guides and Oracles for Linear-Time Parsing
Martin Kay

If chart parsing is taken to include the process of reading out solutions one by one, then it has exponential complexity. The stratagem of separating read-out from chart construction can also be applied to other kinds of parser, in particular, to left-corner parsers that use early composition. When a limit is placed on the size of the stack in such a parser, it becomes context-free equivalent. However, it is not practical to profit directly from this observation because of the large state sets that are involved in otherwise ordinary situations. It may be possible to overcome these problems by means of a guide constructed from a weakened version of the initial grammar.
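
The gap between chart construction and read-out mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper: the toy grammar `S -> S S | 'a'` and the function names `count_parses` and `enumerate_parses` are assumptions chosen for illustration. Filling the table of span counts takes polynomial time, while listing every tree one by one grows with the Catalan numbers.

```python
from functools import lru_cache


def count_parses(n: int) -> int:
    """Count the parses of a string of n 'a' tokens under the
    hypothetical toy grammar S -> S S | 'a', using a memoised table
    indexed by span length (a stand-in for chart construction)."""

    @lru_cache(maxsize=None)
    def spans(length: int) -> int:
        if length == 1:
            return 1  # a single token is covered by S -> 'a'
        # Try every split point; each cell is computed once.
        return sum(spans(k) * spans(length - k) for k in range(1, length))

    return spans(n)


def enumerate_parses(n: int):
    """Read out every parse tree one by one (exponentially many)."""
    if n == 1:
        yield "a"
        return
    for k in range(1, n):
        for left in enumerate_parses(k):
            for right in enumerate_parses(n - k):
                yield (left, right)


if __name__ == "__main__":
    for n in (4, 6, 10):
        print(n, count_parses(n))
```

Counting touches each span length once, whereas `enumerate_parses` has to build every tree, which is why the two steps are worth keeping apart.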

Parsing Techniques for Lexicalized Context-Free Grammars
Giorgio Satta

A Bootstrapping Approach to Parser Development
Izaskun Aldezabal | Koldo Gojenola | Kepa Sarasola

This paper presents a robust parsing system for unrestricted Basque texts. It analyzes a sentence in two stages: a unification-based parser builds basic syntactic units such as NPs, PPs, and sentential complements, while a finite-state parser performs syntactic disambiguation and filtering of the results. The system has been applied to the acquisition of verbal subcategorization information, obtaining 66% recall and 87% precision in the determination of verb subcategorization instances. This information will later be incorporated into the parser in order to improve its performance.
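
The abstract reports recall and precision separately. As a reading aid, the two figures can be combined into a balanced F-measure; this combined value is derived here and is not reported in the abstract, and the helper name `f_measure` is an assumption for illustration.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score; larger beta weights recall more.
    """
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Figures quoted in the abstract: 87% precision, 66% recall.
    print(round(f_measure(0.87, 0.66), 3))
```

With these inputs the balanced score comes out at roughly 0.75, closer to the lower of the two figures, as expected for a harmonic mean.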

New Tabular Algorithms for Parsing
Miguel A. Alonso | Jorge Graña | Manuel Vilares | Eric de la Clergerie

We develop a set of new tabular parsing algorithms for Linear Indexed Grammars, including bottom-up algorithms and Earley-like algorithms with and without the valid prefix property, creating a continuum in which one algorithm can in turn be derived from another. The output of these algorithms is a shared forest in the form of a context-free grammar that encodes all possible derivations for a given input string.

Customizable Modular Lexicalized Parsing
R. Basili | M. T. Pazienza | F. M. Zanzotto

Different NLP applications have different efficiency constraints (i.e. quality of the results and throughput) that reflect on each core linguistic component. Syntactic processors are basic modules in some NLP applications. A customization that permits performance control of these components enables their reuse in different application scenarios. Throughput has commonly been improved using partial syntactic processors. On the other hand, specialized lexicons are generally employed to improve the quality of the syntactic material produced by specific parsing (sub)processes (e.g. verb argument detection or PP attachment disambiguation). Building upon the idea of grammar stratification, this paper presents a method to push modularity and lexical sensitivity in parsing, with a view to customizable syntactic analysers. A framework for modular parser design is proposed and its main properties are discussed. Parsers (i.e. different parsing module chains) are then presented and their performance is analyzed in application-driven scenarios.

Range Concatenation Grammars
Pierre Boullier

In this paper we present Range Concatenation Grammars, a syntactic formalism which possesses many attractive features, among which we underline here power and closure properties. For example, Range Concatenation Grammars are more powerful than Linear Context-Free Rewriting Systems, though this power is not reached to the detriment of efficiency, since their sentences can always be parsed in polynomial time. Range Concatenation Languages are closed under both intersection and complementation, and these closure properties may allow us to consider novel ways to describe some linguistic processes. We also present a parsing algorithm which is the basis of our current prototype implementation.

Automated Extraction of TAGs from the Penn Treebank
John Chen | K. Vijay-Shanker

The accuracy of statistical parsing models can be improved with the use of lexical information. Statistical parsing using Lexicalized tree adjoining grammar (LTAG), a kind of lexicalized grammar, has remained relatively unexplored. We believe that this is due in large part to the absence of large corpora accurately bracketed in terms of a perspicuous yet broad coverage LTAG. Our work attempts to alleviate this difficulty. We extract different LTAGs from the Penn Treebank. We show that certain strategies yield an improved extracted LTAG in terms of compactness, broad coverage, and supertagging accuracy. Furthermore, we perform a preliminary investigation in smoothing these grammars by means of an external linguistic resource, namely, the tree families of an XTAG grammar, a hand-built grammar of English.

From Cases to Rules and Vice Versa: Robust Practical Parsing With Analogy
Alex Chengyu Fang

This article describes the architecture of the Survey Parser and discusses two major components related to the analogy-based parsing of unrestricted English. Firstly, it discusses the automatic generation of a large declarative formal grammar from a corpus that has been syntactically analysed. Secondly, it describes analogy-based parsing that employs both the automatically learned rules and the database of cases to determine the syntactic structure of the input string. Statistics are presented to characterise the performance of the parsing system.

A Transformation-based Parsing Technique With Anytime Properties
Kilian Foth | Ingo Schröder | Wolfgang Menzel

A transformation-based approach to robust parsing is presented, which achieves a strictly monotonic improvement of its current best hypothesis by repeatedly applying local repair steps to a complex multi-level representation. The transformation process is guided by scores derived from weighted constraints. Besides being interruptible, the procedure exhibits a performance profile typical for anytime procedures and holds great promise for the implementation of time-adaptive behaviour.

pdf bib
SOUP: A Parser for Real-world Spontaneous Speech
Marsal Gavaldà

This paper describes the key features of SOUP, a stochastic, chart-based, top-down parser, especially engineered for real-time analysis of spoken language with very large, multi-domain semantic grammars. SOUP achieves flexibility by encoding context-free grammars, specified for example in the Java Speech Grammar Format, as probabilistic recursive transition networks, and robustness by allowing skipping of input words at any position and producing ranked interpretations that may consist of multiple parse trees. Moreover, SOUP is very efficient, which allows for practically instantaneous backend response.

pdf bib
A Recognizer for Minimalist Grammars
Henk Harkema

Minimalist Grammars are a rigorous formalization of the sort of grammars proposed in the linguistic framework of Chomsky’s Minimalist Program. One notable property of Minimalist Grammars is that they allow constituents to move during the derivation of a sentence, thus creating discontinuous constituents. In this paper we will present a bottom-up parsing method for Minimalist Grammars, prove its correctness, and discuss its complexity.

pdf bib
A Neural Network Parser that Handles Sparse Data
James Henderson

Previous work has demonstrated the viability of a particular neural network architecture, Simple Synchrony Networks, for syntactic parsing. Here we present additional results on the performance of this type of parser, including direct comparisons on the same dataset with a standard statistical parsing method, Probabilistic Context Free Grammars. We focus these experiments on demonstrating one of the main advantages of the SSN parser over the PCFG, handling sparse data. We use smaller datasets than are typically used with statistical methods, resulting in the PCFG finding parses for under half of the test sentences, while the SSN finds parses for all sentences. Even on the PCFG’s parsed half, the SSN performs better than the PCFG, as measured by recall and precision on both constituents and a dependency-like measure.
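
As an illustration of the evaluation measure mentioned in this abstract, the following minimal sketch computes labelled precision and recall over constituents. It is not taken from the paper: the `(label, start, end)` span encoding, the function name `precision_recall`, and the toy bracketings are all assumptions made for the example.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Labelled constituent precision and recall.

    Each constituent is a hypothetical (label, start, end) tuple. Counts are
    compared as multisets so a repeated bracket is never matched twice.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: number of predicted constituents also in gold.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented for illustration only).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5), ("NP", 4, 5)]
p, r = precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

Three of the five predicted constituents appear in the gold bracketing, and three of the four gold constituents are recovered, which is what the two ratios report.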

pdf bib
A Context-free Approximation of Head-driven Phrase Structure Grammar
Bernd Kiefer | Hans-Ulrich Krieger

We present a context-free approximation of unification-based grammars, such as HPSG or PATR-II. The theoretical underpinning is established through a least fixpoint construction over a certain monotonic function. In order to reach a finite fixpoint, the concrete implementation can be parameterized in several ways, either by specifying a finite iteration depth, by using different restrictors, or by making the symbols of the CFG more complex by adding annotations à la GPSG. We also present several methods that speed up the approximation process and help to limit the size of the resulting CF grammar.

pdf bib
Optimal Ambiguity Packing in Context-free Parsers with Interleaved Unification
Alon Lavie | Carolyn Penstein Rosé

Ambiguity packing is a well known technique for enhancing the efficiency of context-free parsers. However, in the case of unification-augmented context-free parsers where parsing is interleaved with feature unification, the propagation of feature structures imposes difficulties on the ability of the parser to effectively perform ambiguity packing. We demonstrate that a clever heuristic for prioritizing the execution order of grammar rules and parsing actions can achieve a high level of ambiguity packing that is provably optimal. We present empirical evaluations of the proposed technique, performed with both a Generalized LR parser and a chart parser, that demonstrate its effectiveness.

pdf bib
Extended Partial Parsing for Lexicalized Tree Grammars
Patrice Lopez

Existing parsing algorithms for Lexicalized Tree Grammars (LTG) formalisms (LTAG, TIG, DTG, ...) are adaptations of algorithms initially dedicated to Context Free Grammars (CFG). They do not really take into account the fact that we do not use context free rules but partial parsing trees that we try to combine. Moreover, lexicalization raises the important problem of multiplication of structures, a problem which does not exist in CFG. This paper presents parsing techniques for LTG taking into account these two fundamental features. Our approach focuses on robust and practical purposes. Our parsing algorithm results in more extended partial parsing when the global parsing fails and in an interesting average complexity compared with other bottom-up algorithms.

pdf bib
Improved Left-corner Chart Parsing for Large Context-free Grammars
Robert C. Moore

We develop an improved form of left-corner chart parsing for large context-free grammars, introducing improvements that result in significant speed-ups compared to previously known variants of left-corner parsing. We also compare our method to several other major parsing approaches, and find that our improved left-corner parsing method outperforms each of these across a range of grammars. Finally, we also describe a new technique for minimizing the extra information needed to efficiently recover parses from the data structures built in the course of parsing.

pdf bib
Measure for Measure: Parser Cross-fertilization - Towards Increased Component Comparability and Exchange
Stephan Oepen | Ulrich Callmeier

Over the past few years significant progress was accomplished in efficient processing with wide-coverage HPSG grammars. HPSG-based parsing systems are now available that can process medium-complexity sentences (of ten to twenty words, say) in average parse times equivalent to real (i.e. human reading) time. A large number of engineering improvements in current HPSG systems were achieved through collaboration of multiple research centers and mutual exchange of experience, encoding techniques, algorithms, and even pieces of software. This article presents an approach to grammar and system engineering, termed competence & performance profiling, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. Adapting the profiling metaphor familiar from software engineering to constraint-based grammars and parsers enables developers to maintain an accurate record of system evolution, identify grammar and system deficiencies quickly, and compare to earlier versions or between different systems. We discuss a number of exemplary problems that motivate the experimental approach, and apply the empirical methodology in a fairly detailed discussion of what was achieved during a development period of three years. Given the collaborative nature in setup, the empirical results we present involve research and achievements of a large group of people.

pdf bib
Computing the Most Probable Parse for a Discontinuous Phrase Structure Grammar
Oliver Plaehn

This paper presents a probabilistic extension of Discontinuous Phrase Structure Grammar (DPSG), a formalism designed to describe discontinuous constituency phenomena adequately and perspicuously by means of trees with crossing branches. We outline an implementation of an agenda-based chart parsing algorithm that is capable of computing the Most Probable Parse for a given input sentence for probabilistic versions of both DPSG and Context-Free Grammar. Experiments were conducted with both types of grammars extracted from the NEGRA corpus. In spite of the much greater complexity of DPSG parsing in terms of the number of (partial) analyses that can be constructed for an input sentence, accuracy results from both experiments are comparable. We also briefly hint at future lines of research aimed at more efficient ways of probabilistic parsing with discontinuous constituents.
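
To make the notion of a Most Probable Parse concrete for the context-free case only, here is a minimal sketch using Viterbi CKY over a grammar in Chomsky normal form. This is a swapped-in textbook technique, not the agenda-based chart parser the abstract describes, and it does not handle discontinuous constituents. The function name `most_probable_parse`, the rule encoding, and the toy grammar with its probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def most_probable_parse(words, lexical, binary, start="S"):
    """Viterbi CKY for a PCFG in Chomsky normal form.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    Returns (probability, tree) for the best parse, or None if there is none.
    """
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = (prob, tree) for span i..j
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w and p > best[(i, i + 1)].get(a, (0.0,))[0]:
                best[(i, i + 1)][a] = (p, (a, w))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for (a, b, c), p in binary.items():
                    if b in best[(i, k)] and c in best[(k, j)]:
                        pb, tb = best[(i, k)][b]
                        pc, tc = best[(k, j)][c]
                        prob = p * pb * pc
                        if prob > best[(i, j)].get(a, (0.0,))[0]:
                            best[(i, j)][a] = (prob, (a, tb, tc))
    return best[(0, n)].get(start)


# Toy grammar with invented probabilities (PP-attachment ambiguity).
lexical = {("NP", "she"): 0.1, ("NP", "stars"): 0.2, ("NP", "telescopes"): 0.2,
           ("V", "saw"): 1.0, ("P", "with"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.7,
          ("VP", "VP", "PP"): 0.3, ("NP", "NP", "PP"): 0.5,
          ("PP", "P", "NP"): 1.0}
prob, tree = most_probable_parse("she saw stars with telescopes".split(),
                                 lexical, binary)
print(prob, tree)
```

With these toy probabilities the parser keeps both attachments in the chart and returns the one with the higher product of rule probabilities, here the reading in which the prepositional phrase attaches to the noun phrase.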

pdf bib
An Efficient LR Parser Generator for Tree Adjoining Grammars
Carlos A. Prolo

The first published LR algorithm for Tree Adjoining Grammars (TAGs [Joshi and Schabes, 1996]) was due to Schabes and Vijay-Shanker [1990]. Nederhof [1998] showed that it was incorrect (after [Kinyon, 1997]), and proposed a new one. Experimenting with his new algorithm over the XTAG English Grammar [XTAG Research Group, 1998] he concluded that LR parsing was inadequate for use with reasonably sized grammars because the size of the generated table was unmanageable. Also the degree of conflicts is too high. In this paper we discuss issues involved with LR parsing for TAGs and propose a new version of the algorithm that, by maintaining the degree of prediction while deferring the “subtree reduction”, dramatically reduces both the average number of conflicts per state and the size of the parser.

pdf bib
Parsing Scrambling with Path Set: a Graded Grammaticality Approach
Siamak Rezaei

In this work we introduce the notion of path set for parsing free word order languages. The parsing system uses this notion to parse examples of sentences with scrambling. We show that by using path set, the performance constraints on scrambling such as Resource Limitation Principle (RLP) can be represented easily. Our work contrasts with models based on the notion of immediate dominance rule and binary precedence relations. In our work the precedence relations and word order constraints are defined locally for each clause. Our binary precedence relations are examples of fuzzy relations with weights attached to them. As a result, the word order principles in our approach can be violated and each violation contributes to a lowering of the overall acceptability and grammaticality. The work suggests a robust principle-based approach to parsing ambiguous sentences in verb final languages.
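
The idea of weighted precedence relations whose violations lower acceptability can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how weights are combined, so the multiplicative penalty, the function name `acceptability`, the constituent labels, and the weights are all hypothetical.

```python
def acceptability(clause, precedence):
    """Score one clause under weighted (fuzzy) precedence relations.

    clause: constituent labels in surface order
    precedence: {(a, b): weight in [0, 1]} meaning "a should precede b"
    Assumed combination rule: every violated relation multiplies the score
    by (1 - weight), so violations degrade the clause without rejecting it.
    """
    position = {label: i for i, label in enumerate(clause)}
    score = 1.0
    for (a, b), weight in precedence.items():
        if a in position and b in position and position[a] > position[b]:
            score *= 1.0 - weight
    return score


# Hypothetical relations for a verb-final clause.
precedence = {("SUBJ", "OBJ"): 0.3, ("OBJ", "VERB"): 0.9, ("SUBJ", "VERB"): 0.9}
canonical = acceptability(["SUBJ", "OBJ", "VERB"], precedence)
scrambled = acceptability(["OBJ", "SUBJ", "VERB"], precedence)
verb_first = acceptability(["VERB", "SUBJ", "OBJ"], precedence)
print(canonical, scrambled, verb_first)
```

Under these assumed weights, scrambling the object over the subject costs little, while moving the verb out of final position violates two heavily weighted relations and is scored far lower.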

pdf bib
On the Use of Grammar Based Language Models for Statistical Machine Translation
Hassan Sawaf | Kai Schütz | Hermann Ney

In this paper, we describe some concepts of language models beyond the usually used standard trigram and use such language models for statistical machine translation. In statistical machine translation the language model is the a-priori knowledge source of the system about the target language. One important requirement for the language model is the correct word order, given a certain choice of words, and to score the translations generated by the translation model $Pr(f_1^J \mid e_1^I)$, in view of the syntactic context. In addition to standard m-grams with long histories, we examine the use of Part-of-Speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the VERBMOBIL task, where translation is performed from German to English, with vocabulary sizes of 6500 and 4000 words, respectively.

pdf bib
Algebraic Construction of Parsing Schemata
Karl-Michael Schneider

We propose an algebraic method for the design of tabular parsing algorithms which uses parsing schemata [7]. The parsing strategy is expressed in a tree algebra. A parsing schema is derived from the tree algebra by means of algebraic operations such as homomorphic images, direct products, subalgebras and quotient algebras. The latter yields a tabular interpretation of the parsing strategy. The proposed method allows simpler and more elegant correctness proofs by using general theorems and is not limited to left-right parsing strategies, unlike current automaton-based approaches. Furthermore, it allows parsing schemata for linear indexed grammars (LIG) to be derived from parsing schemata for context-free grammars by means of a correctness preserving algebraic transformation. A new bottom-up head corner parsing schema for LIG is constructed to demonstrate the method.

pdf bib
A Spanish POS Tagger with Variable Memory
José Triviño | Rafael Morales-Bueno

An implementation of a Spanish POS tagger is described in this paper. This implementation combines three basic approaches: a single word tagger based on decision trees, a POS tagger based on variable memory Markov models, and a feature structures set of tags. Using decision trees for single word tagging allows the tagger to work without a lexicon that lists only possible tags. Moreover, it decreases the error rate because there are no unknown words. The feature structure set of tags is advantageous when the available training corpus is small and the tag set large, which can be the case with morphologically rich languages like Spanish. Finally, variable memory Markov models training is more efficient than traditional full-order Markov models and achieves better accuracy. In this implementation, 98.58% of tokens are correctly classified.

pdf bib
Parsing a Lattice with Multiple Grammars
Fuliang Weng | Helen Meng | Po Chui Luk

Efficiency, memory, ambiguity, robustness and scalability are the central issues in natural language parsing. Because of the complexity of natural language, different parsers may be suited only to certain subgrammars. In addition, grammar maintenance and updating may have adverse effects on tuned parsers. Motivated by these concerns, [25] proposed a grammar partitioning and top-down parser composition mechanism for loosely restricted Context-Free Grammars (CFGs). In this paper, we report on significant progress, i.e., (1) developing guidelines for the grammar partition through a set of heuristics, (2) devising new mix-strategy composition algorithms for any rule-based grammar partition in a lattice framework, and (3) initial but encouraging parsing results for Chinese and English queries from an Air Travel Information System (ATIS) corpus.

pdf bib
Modular Unification-based Parsers
Rémi Zajac | Jan Amtrup

We present an implementation of the notion of modularity and composition applied to unification based grammars. Monolithic unification grammars can be decomposed into sub-grammars with well defined interfaces. Sub-grammars are applied in a sequential manner at runtime, allowing incremental development and testing of large coverage grammars. The modular approach to grammar development leads us away from the traditional view of parsing a string of input symbols as the recognition of some start symbol, and towards a richer and more flexible view where inputs and outputs share the same structural properties.

pdf bib
Hypergraph Unification-based Parsing for Incremental Speech Processing
Jan Amtrup

pdf bib
Parsing Mildly Context-sensitive RMS
Tilman Becker | Dominik Heckmann

We introduce Recursive Matrix Systems (RMS) which encompass mildly context-sensitive formalisms and present efficient parsing algorithms for linear and context-free variants of RMS. The time complexities are $\mathcal{O}(n^{2h+1})$ and $\mathcal{O}(n^{3h})$ respectively, where h is the height of the matrix. It is possible to represent Tree Adjoining Grammars (TAG [1], MC-TAG [2], and R-TAG [3]) as RMS uniformly.
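
As a worked illustration, assuming the two bounds in this abstract are read as $n^{2h+1}$ for linear RMS and $n^{3h}$ for context-free RMS (the exponents are flattened in some renderings of the abstract), substituting small matrix heights gives:

```latex
\begin{align*}
h = 1:&\quad \mathcal{O}(n^{2\cdot 1+1}) = \mathcal{O}(n^{3}), &\mathcal{O}(n^{3\cdot 1}) &= \mathcal{O}(n^{3}) \\
h = 2:&\quad \mathcal{O}(n^{2\cdot 2+1}) = \mathcal{O}(n^{5}), &\mathcal{O}(n^{3\cdot 2}) &= \mathcal{O}(n^{6}) \\
h = 3:&\quad \mathcal{O}(n^{2\cdot 3+1}) = \mathcal{O}(n^{7}), &\mathcal{O}(n^{3\cdot 3}) &= \mathcal{O}(n^{9})
\end{align*}
```

Under this reading the two bounds coincide at height 1 and the context-free bound grows faster for every larger height.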

pdf bib
Property Grammars: a Solution for Parsing with Constraints
Philippe Blache

pdf bib
Grammar Organization for Cascade-based Parsing in Information Extraction
Fabio Ciravegna | Alberto Lavelli

pdf bib
A Bidirectional Bottom-up Parser for TAG
Víctor Díaz | Vicente Carrillo | Miguel Alonso

pdf bib
A Finite-state Parser with Dependency Structure Output
David Elworthy

We show how to augment a finite-state grammar with annotations which allow dependency structures to be extracted. There are some difficulties in determinising the grammar, which is an essential step for computational efficiency, but they can be overcome. The parser also allows syntactically ambiguous structures to be packed into a single representation.

pdf bib
Discriminant Reverse LR Parsing of Context-free Grammars
Jacques Farré

pdf bib
Direct Parsing of Schema-TAGs
Karin Harbusch | Jens Woch

pdf bib
Analysis of Equation Structure using Least Cost Parsing
R. Nigel Horspool | John Aycock

Mathematical equations in LaTeX are composed with tags that express formatting as opposed to structure. For conversion from LaTeX to other word-processing systems, the structure of each equation must be inferred. We show how a form of least cost parsing used with a very general and ambiguous grammar may be used to select an appropriate structure for a LaTeX equation. MathML provides another application for the same technology; it has two alternative tagging schemes - presentation tags to specify formatting and content tags to specify structure. While conversion from content tagging to presentation tagging is straightforward, the converse is not. Our implementation of least cost parsing is based on Earley’s algorithm.

pdf bib
Exploiting Parallelism in Unification-based Parsing
Marcel P. van Lohuizen

Because of the nature of the parsing problem, unification-based parsers are hard to parallelize. We present a parallelization technique designed to cope with these difficulties.

pdf bib
Partial Parsing with Grammatical Features
Natasa Manousopoulou | George Papakonstantinou | Panayotis Tsanakas

This paper describes a rule based method for partial parsing, particularly for noun phrase recognition, which has been used in the development of a noun phrase recognizer for Modern Greek. This technique is based on a cascade of finite state machines, adding to them a characteristic very crucial in the parsing of words with free word order: the simultaneous examination of part of speech and grammatical feature information, which are deemed equally important during the parsing procedure, in contrast with other methodologies.

pdf bib
Uniquely Parsable Accepting Grammar Systems
Carlos Martín-Vide | Victor Mitrana

pdf bib
Chart Parsing as Constraint Propagation
Frank Morawietz

pdf bib
Tree-structured Chart Parsing
Paul W. Placeway

We investigate a method of improving the memory efficiency of a chart parser. Specifically, we propose a technique to reduce the number of active arcs created in the process of parsing. We sketch the differences in the chart algorithm, and provide empirical results that demonstrate the effectiveness of this technique.

pdf bib
A Parsing Methodology for Error Detection
Davide Turcato | Devlan Nicholson | Trude Heift | Janine Toole | Stavroula Tsiplakou

pdf bib
Dependency Model using Posterior Context
Kiyotaka Uchimoto | Masaki Murata | Satoshi Sekine | Hitoshi Isahara

We describe a new model for dependency structure analysis. This model learns the relationship between two phrasal units called bunsetsus as three categories: ‘between’, ‘dependent’, and ‘beyond’, and estimates the dependency likelihood by considering not only the relationship between two bunsetsus but also the relationship between the left bunsetsu and all of the bunsetsus to its right. We implemented this model based on the maximum entropy model. When using the Kyoto University corpus, the dependency accuracy of our model was 88%, which is about 1% higher than that of the conventional model using exactly the same features.

pdf bib
The Editing Distance in Shared Forest
Manuel Vilares | David Cabrero | Francisco J. Ribadas

In an information system, indexing can be accomplished by creating a citation based on context-free parses, and matching becomes a natural mechanism to extract patterns. However, the language intended to represent the document can often only be approximately defined, and indices can become shared forests. Queries could also vary from indices, and an approximate matching strategy also becomes necessary. We present a proposal intended to prove the applicability of tabulation techniques in this context.