Abstract
We describe and experimentally evaluate a method for parsing a tagged corpus without a grammar, modeling natural language as a context-free language. The method rests on three hypotheses. 1) Part-of-speech sequences on the right-hand side of a rewriting rule are less constrained in which parts of speech precede and follow them than non-constituent sequences are. 2) Part-of-speech sequences directly derived from the same non-terminal symbol have similar environments. 3) The most suitable set of rewriting rules yields the greatest reduction in corpus size. Based on these hypotheses, the system finds a set of constituent-like part-of-speech sequences and replaces them with a new symbol. Repeating this process yields a set of rewriting rules, that is, a grammar, and a bracketed corpus.
- Anthology ID:
- 1995.iwpt-1.22
- Volume:
- Proceedings of the Fourth International Workshop on Parsing Technologies
- Month:
- September 20-24
- Year:
- 1995
- Address:
- Prague and Karlovy Vary, Czech Republic
- Editors:
- Eva Hajicova, Bernard Lang, Robert Berwick, Harry Bunt, Bob Carpenter, Ken Church, Aravind Joshi, Ronald Kaplan, Martin Kay, Makoto Nagao, Anton Nijholt, Mark Steedman, Henry Thompson, Masaru Tomita, K. Vijay-Shanker, Yorick Wilks, Kent Wittenburg
- Venues:
- IWPT | WS
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 174–185
- URL:
- https://aclanthology.org/1995.iwpt-1.22
- Cite (ACL):
- Shinsuke Mori and Makoto Nagao. 1995. Parsing Without Grammar. In Proceedings of the Fourth International Workshop on Parsing Technologies, pages 174–185, Prague and Karlovy Vary, Czech Republic. Association for Computational Linguistics.
- Cite (Informal):
- Parsing Without Grammar (Mori & Nagao, IWPT-WS 1995)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/1995.iwpt-1.22.pdf