Masaru Tomita

Also published as: M. Tomita


1993

GLR* – An Efficient Noise-skipping Parsing Algorithm For Context Free Grammars
Alon Lavie | Masaru Tomita
Proceedings of the Third International Workshop on Parsing Technologies

This paper describes GLR*, a parser that can parse any input sentence by ignoring unrecognizable parts of the sentence. In case the standard parsing procedure fails to parse an input sentence, the parser nondeterministically skips some word(s) in the sentence, and returns the parse with fewest skipped words. Therefore, the parser will return some parse(s) with any input sentence, unless no part of the sentence can be recognized at all. The problem can be defined in the following way: Given a context-free grammar G and a sentence S, find and parse S' – the largest subset of words of S, such that S' ∈ L(G). The algorithm described in this paper is a modification of the Generalized LR (Tomita) parsing algorithm [Tomita, 1986]. The parser accommodates the skipping of words by allowing shift operations to be performed from inactive state nodes of the Graph Structured Stack. A heuristic similar to beam search makes the algorithm computationally tractable. There have been several other approaches to the problem of robust parsing, most of which are special purpose algorithms [Carbonell and Hayes, 1984], [Ward, 1991], and others. Because our approach is a modification to a standard context-free parsing algorithm, all the techniques and grammars developed for the standard parser can be applied as they are. Also, in case the input sentence is by itself grammatical, our parser behaves exactly as the standard GLR parser. The modified parser, GLR*, has been implemented and integrated with the latest version of the Generalized LR Parser/Compiler [Tomita et al., 1988], [Tomita, 1990]. We discuss an application of the GLR* parser to spontaneous speech understanding and present some preliminary tests on the utility of the GLR* parser in such settings.
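The problem statement in this abstract (find the largest subset of words S' of S such that S' ∈ L(G)) can be illustrated with a brute-force sketch. This is not the GLR* algorithm: it swaps in a plain CYK recognizer and exhaustive enumeration of word subsets, which is exponential in sentence length. The toy grammar, the lexicon, and the function names `recognizes` and `largest_parsable_subsequence` are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
BINARY = {            # rules of the form A -> B C, keyed by (B, C)
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {           # rules of the form A -> word
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "saw": {"V"},
}

def recognizes(words, start="S"):
    """CYK recognizer: True if `words` can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals deriving words[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][0] = set(LEXICON.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for b in table[i][split - 1]:
                    for c in table[i + split][span - split - 1]:
                        table[i][span - 1] |= BINARY.get((b, c), set())
    return start in table[0][n - 1]

def largest_parsable_subsequence(words):
    """Skip 0, 1, 2, ... words (order preserved); return the first accepted subsequence."""
    n = len(words)
    for skipped in range(n):
        for keep in combinations(range(n), n - skipped):
            candidate = [words[i] for i in keep]
            if recognizes(candidate):
                return candidate
    return None  # no part of the sentence could be recognized

print(largest_parsable_subsequence("the dog um saw the uh cat".split()))
# prints ['the', 'dog', 'saw', 'the', 'cat']
```

Trying subsets in order of increasing skip count guarantees that the first accepted candidate has the fewest skipped words, matching the preference stated in the abstract. According to the abstract, the actual parser avoids this enumeration by working on the Graph Structured Stack with a beam-like heuristic.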

Recent Advances in Janus: A Speech Translation System
M. Woszczyna | N. Coccaro | A. Eisele | A. Lavie | A. McNair | T. Polzin | I. Rogina | C. P. Rose | T. Sloboda | M. Tomita | J. Tsutsumi | N. Aoki-Waibel | A. Waibel | W. Ward
Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993

Evaluation of MT Systems by TOEFL
Masaru Tomita | Masako Shirai | Junya Tsutsumi | Miki Matsumura | Yuki
Proceedings of the Fifth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

1991

Proceedings of the Second International Workshop on Parsing Technologies (IWPT ’91)
Masaru Tomita | Martin Kay | Robert Berwick | Eva Hajicova | Aravind Joshi | Ronald Kaplan | Makoto Nagao | Yorick Wilks
Proceedings of the Second International Workshop on Parsing Technologies

February 13-15, 1991

Probabilistic LR Parsing for General Context-Free Grammars
See-Kiong Ng | Masaru Tomita
Proceedings of the Second International Workshop on Parsing Technologies

To combine the advantages of probabilistic grammars and generalized LR parsing, an algorithm for constructing a probabilistic LR parser from a probabilistic context-free grammar is needed. This paper discusses implementation issues in adapting Tomita’s generalized LR parser with graph-structured stack to perform probabilistic parsing. Wright and Wrigley (1989) have proposed a probabilistic LR-table construction algorithm for non-left-recursive context-free grammars. To account for left recursion, a method for computing item probabilities by generating systems of linear equations is presented. The notion of deferred probabilities is proposed as a means of dealing with similar item sets that have differing probability assignments.

1990

The Generalized LR Parser/Compiler V8-4: A Software Package for Practical NL Projects
Masaru Tomita
COLING 1990 Volume 1: Papers presented to the 13th International Conference on Computational Linguistics

1989

Proceedings of the First International Workshop on Parsing Technologies
Masaru Tomita
Proceedings of the First International Workshop on Parsing Technologies

Massively Parallel Parsing in 𝛷DmDialog: Integrated Architecture for Parsing Speech Inputs
Hiroaki Kitano | Teruko Mitamura | Masaru Tomita
Proceedings of the First International Workshop on Parsing Technologies

This paper describes the parsing scheme in the 𝛷DmDialog speech-to-speech dialog translation system, with special emphasis on the integration of speech and natural language processing. We propose an integrated architecture for parsing speech inputs based on a parallel marker-passing scheme, in which knowledge from the phonological level up to the discourse level participates dynamically. At the phonological level, we employ a stochastic model using a transition matrix and a confusion matrix, together with markers that carry a probability measure. At the higher levels of syntactic/semantic and discourse processing, we integrate a case-based and a constraint-based scheme in a consistent manner, so that a priori probabilities and constraints, which reflect linguistic and discourse factors, are provided to the phonological level of processing. A probability/cost-based scheme in our model enables ambiguity resolution at various levels using one uniform principle.

Parsing 2-Dimensional Language
Masaru Tomita
Proceedings of the First International Workshop on Parsing Technologies

2-Dimensional Context-Free Grammar (2D-CFG) for 2-dimensional input text is introduced and efficient parsing algorithms for 2D-CFG are presented. In 2D-CFG, a grammar rule’s right hand side symbols can be placed not only horizontally but also vertically. Terminal symbols in a 2-dimensional input text are combined to form a rectangular region, and regions are combined to form a larger region using a 2-dimensional phrase structure rule. The parsing algorithms presented in this paper are the 2D-Earley algorithm and 2D-LR algorithm, which are 2-dimensionally extended versions of Earley’s algorithm and the LR(0) algorithm, respectively.

1988

Towards speech translation systems
Masaru Tomita
Proceedings of the Second Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

The Universal Parser Compiler and its application to a speech translation system
Masaru Tomita | Marion Kee | Hiroaki Saito | Teruko Mitamura | Hideto Tomabechi
Proceedings of the Second Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

Graph-structured Stack and Natural Language Parsing
Masaru Tomita
26th Annual Meeting of the Association for Computational Linguistics

“Linguistic” Sentences and “Real” Sentences
Masaru Tomita
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

Parsing Noisy Sentences
Hiroaki Saito | Masaru Tomita
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

Application of the Direct Memory Access paradigm to natural language interfaces to knowledge-based systems
Hideto Tomabechi | Masaru Tomita
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

Combining Lexicon-Driven Parsing and Phrase-Structure-Based Parsing
Masaru Tomita
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

1987

CMU Project
Masaru Tomita | Jaime G. Carbonell
Proceedings of Machine Translation Summit I

An Efficient Augmented-Context-Free Parsing Algorithm
Masaru Tomita
Computational Linguistics, Formerly the American Journal of Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987

1986

Parsing Spoken Language: a Semantic Caseframe Approach
Philip J. Hayes | Alexander G. Hauptmann | Jaime G. Carbonell | Masaru Tomita
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

Another Stride Towards Knowledge-Based Machine Translation
Masaru Tomita | Jaime G. Carbonell
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

1985


New Approaches to Machine Translation
Jaime G. Carbonell | Masaru Tomita
Proceedings of the First Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages


Feasibility Study of Personal/Interactive Machine Translation Systems
Masaru Tomita
Proceedings of the First Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

1984

LR Parsers For Natural Languages
Masaru Tomita
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

Disambiguating Grammatically Ambiguous Sentences By Asking
Masaru Tomita
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics