Giorgio Satta

2022

pdf abs
Tractable Parsing for CCGs of Bounded Degree
Lena Katharina Schiffer | Marco Kuhlmann | Giorgio Satta
Computational Linguistics, Volume 48, Issue 3 - September 2022

Unlike other mildly context-sensitive formalisms, Combinatory Categorial Grammar (CCG) cannot be parsed in polynomial time when the size of the grammar is taken into account. Refining this result, we show that the parsing complexity of CCG is exponential only in the maximum degree of composition. When that degree is fixed, parsing can be carried out in polynomial time. Our finding is interesting from a linguistic perspective because a bounded degree of composition has been suggested as a universal constraint on natural language grammar. Moreover, ours is the first complexity result for a version of CCG that includes substitution rules, which are used in practical grammars but have been ignored in theoretical work.

2019

pdf abs
Ordered Tree Decomposition for HRG Rule Extraction
Daniel Gildea | Giorgio Satta | Xiaochang Peng
Computational Linguistics, Volume 45, Issue 2 - June 2019

We present algorithms for extracting Hyperedge Replacement Grammar (HRG) rules from a graph along with a vertex order. Our algorithms are based on finding a tree decomposition of smallest width, relative to the vertex order, and then extracting one rule for each node in this structure. The assumption of a fixed order for the vertices of the input graph makes it possible to solve the problem in polynomial time, in contrast to the fact that the problem of finding optimal tree decompositions for a graph is NP-hard. We also present polynomial-time algorithms for parsing based on our HRGs, where the input is a vertex sequence and the output is a graph structure. The intended application of our algorithms is grammar extraction and parsing for semantic representation of natural language. We apply our algorithms to data annotated with Abstract Meaning Representations and report on the characteristics of the resulting grammars.

pdf abs
Bottom-Up Unranked Tree-to-Graph Transducers for Translation into Semantic Graphs
Johanna Björklund | Shay B. Cohen | Frank Drewes | Giorgio Satta
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing

We propose a formal model for translating unranked syntactic trees, such as dependency trees, into semantic graphs. These tree-to-graph transducers can serve as a formal basis of transition systems for semantic parsing which recently have been shown to perform very well, yet hitherto lack formalization. Our model features “extended” rules and an arc-factored normal form, comes with an efficient translation algorithm, and can be equipped with weights in a straightforward manner.

2018

pdf abs
Cache Transition Systems for Graph Parsing
Daniel Gildea | Giorgio Satta | Xiaochang Peng
Computational Linguistics, Volume 44, Issue 1 - April 2018

Motivated by the task of semantic parsing, we describe a transition system that generalizes standard transition-based dependency parsing techniques to generate a graph rather than a tree. Our system includes a cache with fixed size m, and we characterize the relationship between the parameter m and the class of graphs that can be produced through the graph-theoretic concept of tree decomposition. We find empirically that small cache sizes cover a high percentage of sentences in existing semantic corpora.

pdf abs
Weighted DAG Automata for Semantic Graphs
David Chiang | Frank Drewes | Daniel Gildea | Adam Lopez | Giorgio Satta
Computational Linguistics, Volume 44, Issue 1 - April 2018

Graphs have a variety of uses in natural language processing, particularly as representations of linguistic meaning. A deficit in this area of research is a formal framework for creating, combining, and using models involving graphs that parallels the frameworks of finite automata for strings and finite tree automata for trees. A possible starting point for such a framework is the formalism of directed acyclic graph (DAG) automata, defined by Kamimura and Slutzki and extended by Quernheim and Knight. In this article, we study the latter in depth, demonstrating several new results, including a practical recognition algorithm that can be used for inference and learning with models defined on DAG automata. We also propose an extension to graphs with unbounded node degree and show that our results carry over to the extended formalism.

pdf abs
On the Complexity of CCG Parsing
Marco Kuhlmann | Giorgio Satta | Peter Jonsson
Computational Linguistics, Volume 44, Issue 3 - September 2018

We study the parsing complexity of Combinatory Categorial Grammar (CCG) in the formalism of Vijay-Shanker and Weir (1994). As our main result, we prove that any parsing algorithm for this formalism will take in the worst case exponential time when the size of the grammar, and not only the length of the input sentence, is included in the analysis. This sets the formalism of Vijay-Shanker and Weir (1994) apart from weakly equivalent formalisms such as Tree Adjoining Grammar, for which parsing can be performed in time polynomial in the combined size of grammar and input sentence. Our results contribute to a refined understanding of the class of mildly context-sensitive grammars, and inform the search for new, mildly context-sensitive versions of CCG.

pdf abs
Sequence-to-sequence Models for Cache Transition Systems
Xiaochang Peng | Linfeng Song | Daniel Gildea | Giorgio Satta
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we present a sequence-to-sequence based approach for mapping natural language sentences to AMR semantic graphs. We transform the sequence to graph mapping problem to a word sequence to transition action sequence problem using a special transition system called a cache transition system. To address the sparsity issue of neural AMR parsing, we feed feature embeddings from the transition state to provide relevant local information for each decoder state. We present a monotonic hard attention model for the transition framework to handle the strictly left-to-right alignment between each transition state and the current buffer input focus. We evaluate our neural transition model on the AMR parsing task, and our parser outperforms other sequence-to-sequence approaches and achieves competitive results in comparison with the best-performing models.

2017

pdf abs
An Incremental Parser for Abstract Meaning Representation
Marco Damonte | Shay B. Cohen | Giorgio Satta
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Abstract Meaning Representation (AMR) is a semantic representation for natural language that embeds annotations related to traditional tasks such as named entity recognition, semantic role labeling, word sense disambiguation and co-reference resolution. We describe a transition-based parser for AMR that parses sentences left-to-right, in linear time. We further propose a test-suite that assesses specific subtasks that are helpful in comparing AMR parsers, and show that our parser is competitive with the state of the art on the LDC2015E86 dataset and that it outperforms state-of-the-art parsers for recovering named entities and handling polarity.

We develop parsing oracles for two transition-based dependency parsers, including the arc-standard parser, solving a problem that was left open in (Goldberg and Nivre, 2013). We experimentally show that using these oracles during training yields superior parsing accuracies on many languages.

pdf abs
A New Parsing Algorithm for Combinatory Categorial Grammar
Marco Kuhlmann | Giorgio Satta
Transactions of the Association for Computational Linguistics, Volume 2

We present a polynomial-time parsing algorithm for CCG, based on a new decomposition of derivations into small, shareable parts. Our algorithm has the same asymptotic complexity, O(n6), as a previous algorithm by Vijay-Shanker and Weir (1993), but is easier to understand, implement, and prove correct.

pdf
A Polynomial-Time Dynamic Oracle for Non-Projective Dependency Parsing
Carlos Gómez-Rodríguez | Francesco Sartorio | Giorgio Satta
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf abs
Efficient Parsing for Head-Split Dependency Trees
Giorgio Satta | Marco Kuhlmann
Transactions of the Association for Computational Linguistics, Volume 1

Head splitting techniques have been successfully exploited to improve the asymptotic runtime of parsing algorithms for projective dependency trees, under the arc-factored model. In this article we extend these techniques to a class of non-projective dependency trees, called well-nested dependency trees with block-degree at most 2, which has been previously investigated in the literature. We define a structural property that allows head splitting for these trees, and present two algorithms that improve over the runtime of existing algorithms at no significant loss in coverage.

pdf
A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy
Francesco Sartorio | Giorgio Satta | Joakim Nivre
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Approximate PCFG Parsing Using Tensor Decomposition
Shay B. Cohen | Giorgio Satta | Michael Collins
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

pdf
Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+11)
Giorgio Satta | Chung-Hye Han
Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+11)

pdf
Heuristic Cube Pruning in Linear Time
Andrea Gesmundo | Giorgio Satta | James Henderson
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf
Tree-Adjoining Grammars Are Not Closed Under Strong Lexicalization
Marco Kuhlmann | Giorgio Satta
Computational Linguistics, Volume 38, Issue 3 - September 2012

2011

pdf
Splittability of Bilexical Context-Free Grammars is Undecidable
Mark-Jan Nederhof | Giorgio Satta
Computational Linguistics, Volume 37, Issue 4 - December 2011

pdf
Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Pierluigi Crescenzi | Daniel Gildea | Andrea Marino | Gianluca Rossi | Giorgio Satta
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Prefix Probability for Probabilistic Synchronous Context-Free Grammars
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Dynamic Programming Algorithms for Transition-Based Dependency Parsers
Marco Kuhlmann | Carlos Gómez-Rodríguez | Giorgio Satta
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf
Computation of Infix Probabilities for Probabilistic Context-Free Grammars
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Exact Inference for Generative Probabilistic Non-Projective Dependency Parsing
Shay B. Cohen | Carlos Gómez-Rodríguez | Giorgio Satta
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Prefix Probabilities for Linear Context-Free Rewriting Systems
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 12th International Conference on Parsing Technologies

2010

pdf
Parsing and Translation Algorithms Based on Weighted Extended Tree Transducers
Andreas Maletti | Giorgio Satta
Proceedings of the 2010 Workshop on Applications of Tree Automata in Natural Language Processing

pdf
Efficient Parsing of Well-Nested Linear Context-Free Rewriting Systems
Carlos Gómez-Rodríguez | Marco Kuhlmann | Giorgio Satta
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf
Complexity, Parsing, and Factorization of Tree-Local Multi-Component Tree-Adjoining Grammar
Rebecca Nesson | Giorgio Satta | Stuart M. Shieber
Computational Linguistics, Volume 36, Issue 3 - September 2010

pdf
Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two
Benoît Sagot | Giorgio Satta
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf
The Importance of Rule Restrictions in CCG
Marco Kuhlmann | Alexander Koller | Giorgio Satta
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

pdf
Parsing Algorithms based on Tree Automata
Andreas Maletti | Giorgio Satta
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

pdf
Synchronous Rewriting in Treebanks
Laura Kallmeyer | Wolfgang Maier | Giorgio Satta
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

pdf
Treebank Grammar Techniques for Non-Projective Dependency Parsing
Marco Kuhlmann | Giorgio Satta
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf
Optimal Reduction of Rule Length in Linear Context-Free Rewriting Systems
Carlos Gómez-Rodríguez | Marco Kuhlmann | Giorgio Satta | David Weir
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf
An Optimal-Time Binarization Algorithm for Linear Context-Free Rewriting Systems with Fan-Out Two
Carlos Gómez-Rodríguez | Giorgio Satta
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf
A Polynomial-Time Parsing Algorithm for TT-MCTAG
Laura Kallmeyer | Giorgio Satta
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf
Optimal k-arization of Synchronous Tree-Adjoining Grammar
Rebecca Nesson | Giorgio Satta | Stuart M. Shieber
Proceedings of ACL-08: HLT

The EVALITA 2007 Parsing Task has been the first contest among parsing systems for Italian. It is the first attempt to compare the approaches and the results of the existing parsing systems specific for this language using a common treebank annotated using both a dependency and a constituency-based format. The development data set for this parsing competition was taken from the Turin University Treebank, which is annotated both in dependency and constituency format. The evaluation metrics were those standardly applied in CoNLL and PARSEVAL. The results of the parsing results are very promising and higher than the state-of-the-art for dependency parsing of Italian. An analysis of such results is provided, which takes into account other experiences in treebank-driven parsing for Italian and for other Romance languages (in particular, the CoNLL X & 2007 shared tasks for dependency parsing). It focuses on the characteristics of data sets, i.e. type of annotation and size, parsing paradigms and approaches applied also to languages other than Italian.

2007

pdf
On the Complexity of Non-Projective Data-Driven Dependency Parsing
Ryan McDonald | Giorgio Satta
Proceedings of the Tenth International Conference on Parsing Technologies

pdf
Guided Learning for Bidirectional Sequence Classification
Libin Shen | Giorgio Satta | Aravind Joshi
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

pdf
Factoring Synchronous Grammars by Sorting
Daniel Gildea | Giorgio Satta | Hao Zhang
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf
Cross-Entropy and Estimation of Probabilistic Context-Free Grammars
Anna Corazza | Giorgio Satta
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf
Estimation of Consistent Probabilistic Context-free Grammars
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

2005

pdf
Some Computational Complexity Results for Synchronous Context-Free Grammars
Giorgio Satta | Enoch Peserico
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf
Probabilistic Parsing Strategies
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf
An Alternative Method of Training Probabilistic LR Parsers
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf
Generalized Multitext Grammars
I. Dan Melamed | Giorgio Satta | Benjamin Wellington
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf
Kullback-Leibler Distance between Probabilistic Context-Free Grammars and Probabilistic Finite Automata
Mark-Jan Nederhof | Giorgio Satta
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf abs
Probabilistic Parsing as Intersection
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the Eighth International Conference on Parsing Technologies

We show that a well-known algorithm to compute the intersection of a context-fre language and a regular language can be extended to apply to a probabilistic context-free grammar and a probabilistic finite automaton, provided the two probabilistic models are combined through multiplication. The result is a probabilistic context-free grammar that contains joint information about the original grammar and automaton.

pdf abs
Partially Ordered Multiset Context-free Grammars and Free-word-order Parsing
Mark-Jan Nederhof | Giorgio Satta | Stuart Shieber
Proceedings of the Eighth International Conference on Parsing Technologies

We present a new formalism, partially ordered multiset context-free grammars (poms-CFG), along with an Earley-style parsing algorithm. The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity bounds for more expressive context-free formalisms intended to express free or multiple word-order, such as ID/LP grammars. We reduce ID/LP grammars to poms-grammars, thereby getting finer-grained bounds on the parsing complexity of ID/LP grammars. We argue that in practice, the width of attested ID/LP grammars is small, yielding effectively polynomial time complexity for ID/LP grammar parsing.

2002

pdf
Parsing non-recursive CFGs
Mark-Jan Nederhof | Giorgio Satta
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2000

pdf
Left-To-Right Parsing and Bilexical Context-Free Grammars
Mark-Jan Nederhof | Giorgio Satta
1st Meeting of the North American Chapter of the Association for Computational Linguistics

pdf
A faster parsing algorithm for Lexicalized Tree-Adjoining Grammars
Jason Eisner | Giorgio Satta
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

pdf
Parsing Techniques for Lexicalized Context-Free Grammars
Giorgio Satta
Proceedings of the Sixth International Workshop on Parsing Technologies

1999

pdf
Efficient Parsing for Bilexical Context-Free Grammars and Head Automaton Grammars
Jason Eisner | Giorgio Satta
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

1998

pdf
Prefix Probabilities from Stochastic Tree Adjoining Grammars
Mark-Jan Nederhof | Anoop Sarkar | Giorgio Satta
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf
Restrictions on Tree Adjoining Languages
Giorgio Satta | William Schuler
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf
Book Reviews: Parsing with Principles and Classes of Information
Giorgio Satta
Computational Linguistics, Volume 24, Number 1, March 1998 - Special Issue on Word Sense Disambiguation

pdf
Optimality Theory and the Generative Complexity of Constraint Violability
Robert Frank | Giorgio Satta
Computational Linguistics, Volume 24, Number 2, June 1998

pdf
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)
Anne Abeillé | Tilman Becker | Giorgio Satta | K. Vijay-Shanker
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

pdf
Prefix probabilities for linear indexed grammars
Mark-Jan Nederhof | Anoop Sarkar | Giorgio Satta
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

pdf
Prefix Probabilities from Stochastic Free Adjoining Grammars
Mark-Jan Nederhof | Anoop Sarkar | Giorgio Satta
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf
Restrictions on Tree Adjoining Languages
Giorgio Satta | William Schuler
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

1997

pdf
String Transformation Learning
Giorgio Satta | John C. Henderson
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

1996

pdf
Synchronous Models of Language
Owen Rambow | Giorgio Satta
34th Annual Meeting of the Association for Computational Linguistics

pdf
Efficient Tabular LR Parsing
Mark-Jan Nederhof | Giorgio Satta
34th Annual Meeting of the Association for Computational Linguistics

pdf
Efficient Transformation-Based Parsing
Giorgio Satta | Eric Brill
34th Annual Meeting of the Association for Computational Linguistics

1994

pdf
Tree-Adjoining Grammar Parsing and Boolean Matrix Multiplication
Giorgio Satta
Computational Linguistics, Volume 20, Number 2, June 1994

pdf
An Extended Theory of Head-Driven Parsing
Mark-Jan Nederhof | Giorgio Satta
32nd Annual Meeting of the Association for Computational Linguistics

1992

pdf
Recognition of Linear Context-Free Rewriting Systems
Giorgio Satta
30th Annual Meeting of the Association for Computational Linguistics

pdf
Book Reviews: Generalized LR Parsing
Giorgio Satta
Computational Linguistics, Volume 18, Number 3, September 1992, Special Issue on Inheritance: II

1991

pdf
Bidirectional Parsing of Lexicalized Tree Adjoining Grammars
Alberto Lavelli | Giorgio Satta
Fifth Conference of the European Chapter of the Association for Computational Linguistics

pdf abs
Stochastic Context-Free Grammars for Island-Driven Probabilistic Parsing
Anna Corazza | Renato De Mori | Roberto Gretter | Giorgio Satta
Proceedings of the Second International Workshop on Parsing Technologies

In automatic speech recognition the use of language models improves performance. Stochastic language models fit rather well the uncertainty created by the acoustic pattern matching. These models are used to score theories corresponding to partial interpretations of sentences. Algorithms have been developed to compute probabilities for theories that grow in a strictly left-to-right fashion. In this paper we consider new relations to compute probabilities of partial interpretations of sentences. We introduce theories containing a gap corresponding to an uninterpreted signal segment. Algorithms can be easily obtained from these relations. Computational complexity of these algorithms is also derived.