SUPPLEMENTARY MATERIAL FOR THE PAPER
"Improving Transition-Based Dependency Parsing with Buffer Transitions"
by
Daniel Fernández-González (Universidade de Vigo, Spain, danifg@uvigo.es) and
Carlos Gómez-Rodríguez (Universidade da Coruña, Spain, carlos.gomez@udc.es),
published in the proceedings of EMNLP 2012.

This archive contains an implementation of the parsers described in the paper, together with the feature
models used in the experiments and all the settings and parameters needed to reproduce them.

The arc-eager parser with the Left Buffer Arc transition (NE+LBA), the arc-eager parser with the Right Buffer Arc
transition (NE+RBA), the arc-eager parser with the Left Non-projective Buffer Arc transition (NE+LNBA) and the arc-eager
parser with the Right Non-projective Buffer Arc transition (NE+RNBA) were implemented using MaltParser.

MaltParser is a system for data-driven dependency parsing, which can be used to induce a parsing model from treebank
data and to parse new data using an induced model. MaltParser is developed by Johan Hall, Jens Nilsson and
Joakim Nivre at Växjö University and Uppsala University, Sweden. It includes the arc-eager parser (NE), implemented
by Joakim Nivre.

We include a modified version of MaltParser to which we have added the implementation of the algorithms with buffer transitions
described in the paper. The full source code is provided, and the code for the parsers with buffer transitions is located in the
package org.maltparser.parser.algorithm.moreTransition. The code in this package has been written by the authors of the paper 
and can be used and redistributed following the same license terms as for the rest of MaltParser 1.4.1 (malt-1.4.1/LICENSE).

To use the NE+LBA, NE+RBA, NE+LNBA or NE+RNBA parser, follow the steps below.
First, train a model using a training treebank, a feature model and a classifier. The
command is as follows:

	java -jar maltParser/dist/malt/malt.jar -c <model> -i <train_treebank> -a mt -m learn
		-F <feature_model> -l <classifier> -rr <root_handling> -lba true|false
		-rba true|false -lnba true|false -rnba true|false


where:
	<model> is the induced model.
	<train_treebank> is the treebank in CoNLL format used to train the model.
	<feature_model> is a feature model in XML format.
	<classifier> can be "liblinear" or "libsvm".
	<root_handling> specifies how dependents of the special root node are handled. The possible options are:
		* STRICT: Root dependents not attached during parsing (attached with default label afterwards), reduction of unattached
		  tokens not permissible.
		* RELAXED: Root dependents not attached during parsing (attached with default label afterwards), reduction of unattached
		  tokens permissible.
		* NORMAL: Root dependents attached by RightArc transition during parsing (unattached tokens attached with default label afterwards).
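
As an illustration of the training command, the following shell sketch checks the option values against the ones listed
above and prints the resulting command line. The function name build_learn_cmd and the file names (my_model, train.conll,
Features/AraNE+LBA.xml) are placeholders chosen for this example, not names required by MaltParser. The sketch passes only
the flag of the buffer transition to be enabled, which assumes the other three flags default to false.

```shell
#!/bin/sh
MALT_JAR="maltParser/dist/malt/malt.jar"   # jar path as written in the command above

# Print the training command, or fail if an option value is not one of those listed above.
# $1 model, $2 training treebank, $3 feature model, $4 classifier,
# $5 root handling, $6 buffer-transition flag to enable (-lba, -rba, -lnba or -rnba)
build_learn_cmd() {
    case "$4" in
        liblinear|libsvm) ;;
        *) echo "unknown classifier: $4" >&2; return 1 ;;
    esac
    case "$5" in
        strict|relaxed|normal|STRICT|RELAXED|NORMAL) ;;
        *) echo "unknown root handling: $5" >&2; return 1 ;;
    esac
    case "$6" in
        -lba|-rba|-lnba|-rnba) ;;
        *) echo "unknown buffer transition flag: $6" >&2; return 1 ;;
    esac
    printf '%s\n' "java -jar $MALT_JAR -c $1 -i $2 -a mt -m learn -F $3 -l $4 -rr $5 $6 true"
}

# Dry run with hypothetical file names: the command is only printed, not executed.
build_learn_cmd my_model train.conll Features/AraNE+LBA.xml libsvm normal -lba
```

Printing the command first makes it easy to check the flags before starting a training run, which can take a long time.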
 	


Then, use the trained model to parse a text. The command is as follows:
			
	java -jar maltParser/dist/malt/malt.jar -c <model> -i <test_treebank> -a mt -m parse
		-F <feature_model> -l <classifier> -o <output> -rr <root_handling> -lba true|false
		-rba true|false -lnba true|false -rnba true|false

This command takes a test treebank in CoNLL format and a trained model, and writes the parsed treebank, also in CoNLL format, to <output>.
See the MaltParser documentation for more information about flags and parameters.
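
A parsing run can be sketched in the same way. The function name build_parse_cmd and the file names (my_model, test.conll,
parsed.conll, Features/DanNE+RBA.xml) are hypothetical placeholders. The model name and the remaining settings should match
the ones used when the model was trained. As in the settings tables, only the flag of the enabled buffer transition is passed.

```shell
#!/bin/sh
MALT_JAR="maltParser/dist/malt/malt.jar"   # jar path as written in the command above

# Print the parsing command for a previously trained model.
# $1 model, $2 test treebank, $3 output file, $4 feature model,
# $5 classifier, $6 root handling, $7 buffer-transition flag (-lba, -rba, -lnba or -rnba)
build_parse_cmd() {
    printf '%s\n' "java -jar $MALT_JAR -c $1 -i $2 -a mt -m parse -F $4 -l $5 -o $3 -rr $6 $7 true"
}

# Dry run with hypothetical file names: the command is only printed, not executed.
build_parse_cmd my_model test.conll parsed.conll Features/DanNE+RBA.xml libsvm relaxed -rba
```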

=========================================================================================================================

The experiments reported in the EMNLP paper "Improving Transition-Based Dependency Parsing with Buffer Transitions"
were performed using version 1.4.1 of MaltParser (included in this archive with the new algorithms added).
The tables below show the exact settings used for the NE, NE+LBA, NE+RBA, NE+LNBA and NE+RNBA parsers, which
can be used to reproduce all the experiments. The specified settings must be used both in the learning phase (-m learn)
and in the parsing phase (-m parse). The XML files (included in the "Features" folder) specify the feature models used for
each combination of parser and language. Default settings were used for all MaltParser parameters that are not explicitly
mentioned here.

 ------------
| Parser = NE|
 --------------------------------------------------------------------------------------------
|Language   |	Flags							        |  Features  |
 ============================================================================================
|Arabic     | -a mt -rr normal -l libsvm 			                |  AraNE.xml |
|Chinese    | -a mt -rr relaxed -l liblinear 					|  ChiNE.xml |
|Czech      | -a mt -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000      |  CzeNE.xml |
|Danish     | -a mt -rr relaxed -l libsvm 					|  DanNE.xml |
|German	    | -a mt -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000 	|  GerNE.xml |
|Portuguese | -a mt -rr normal -l libsvm -d POSTAG -s Input[0] -T 1000 	        |  PorNE.xml |
|Swedish    | -a mt -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 1000  	|  SweNE.xml |
|Turkish    | -a mt -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 100 	|  TurNE.xml |
 --------------------------------------------------------------------------------------------

 ----------------
| Parser = NE+LBA|
 --------------------------------------------------------------------------------------------------------
|Language   |	Flags							                |  Features	 |
 ========================================================================================================
|Arabic     | -a mt -lba true -rr normal -l libsvm 			                |  AraNE+LBA.xml |
|Chinese    | -a mt -lba true -rr relaxed -l liblinear 					|  ChiNE+LBA.xml |
|Czech      | -a mt -lba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000    |  CzeNE+LBA.xml |
|Danish     | -a mt -lba true -rr relaxed -l libsvm 					|  DanNE+LBA.xml |
|German	    | -a mt -lba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000 	|  GerNE+LBA.xml |
|Portuguese | -a mt -lba true -rr normal -l libsvm -d POSTAG -s Input[0] -T 1000 	|  PorNE+LBA.xml |
|Swedish    | -a mt -lba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 1000  	|  SweNE+LBA.xml |
|Turkish    | -a mt -lba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 100 	|  TurNE+LBA.xml |
 --------------------------------------------------------------------------------------------------------


 ----------------
| Parser = NE+RBA|
 --------------------------------------------------------------------------------------------------------
|Language   |	Flags							                |  Features	 |
 ========================================================================================================
|Arabic     | -a mt -rba true -rr normal -l libsvm 			                |  AraNE+RBA.xml |
|Chinese    | -a mt -rba true -rr relaxed -l liblinear 					|  ChiNE+RBA.xml |
|Czech      | -a mt -rba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000    |  CzeNE+RBA.xml |
|Danish     | -a mt -rba true -rr relaxed -l libsvm 					|  DanNE+RBA.xml |
|German	    | -a mt -rba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000 	|  GerNE+RBA.xml |
|Portuguese | -a mt -rba true -rr normal -l libsvm -d POSTAG -s Input[0] -T 1000 	|  PorNE+RBA.xml |
|Swedish    | -a mt -rba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 1000  	|  SweNE+RBA.xml |
|Turkish    | -a mt -rba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 100 	|  TurNE+RBA.xml |
 --------------------------------------------------------------------------------------------------------


 -----------------
| Parser = NE+LNBA|
 ---------------------------------------------------------------------------------------------------------
|Language   |	Flags							                |  Features	  |
 =========================================================================================================
|Arabic     | -a mt -lnba true -rr normal -l libsvm 			                |  AraNE+LNBA.xml |
|Chinese    | -a mt -lnba true -rr relaxed -l liblinear 				|  ChiNE+LNBA.xml |
|Czech      | -a mt -lnba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000   |  CzeNE+LNBA.xml |
|Danish     | -a mt -lnba true -rr relaxed -l libsvm 					|  DanNE+LNBA.xml |
|German	    | -a mt -lnba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000 	|  GerNE+LNBA.xml |
|Portuguese | -a mt -lnba true -rr normal -l libsvm -d POSTAG -s Input[0] -T 1000 	|  PorNE+LNBA.xml |
|Swedish    | -a mt -lnba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 1000  	|  SweNE+LNBA.xml |
|Turkish    | -a mt -lnba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 100 	|  TurNE+LNBA.xml |
 ---------------------------------------------------------------------------------------------------------


 -----------------
| Parser = NE+RNBA|
 ---------------------------------------------------------------------------------------------------------
|Language   |	Flags							                |  Features	  |
 =========================================================================================================
|Arabic     | -a mt -rnba true -rr normal -l libsvm 			                |  AraNE+RNBA.xml |
|Chinese    | -a mt -rnba true -rr relaxed -l liblinear 				|  ChiNE+RNBA.xml |
|Czech      | -a mt -rnba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000   |  CzeNE+RNBA.xml |
|Danish     | -a mt -rnba true -rr relaxed -l libsvm 					|  DanNE+RNBA.xml |
|German	    | -a mt -rnba true -rr normal -l liblinear -d CPOSTAG -s Input[0] -T 1000 	|  GerNE+RNBA.xml |
|Portuguese | -a mt -rnba true -rr normal -l libsvm -d POSTAG -s Input[0] -T 1000 	|  PorNE+RNBA.xml |
|Swedish    | -a mt -rnba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 1000  	|  SweNE+RNBA.xml |
|Turkish    | -a mt -rnba true -rr relaxed -l libsvm -d CPOSTAG -s Input[0] -T 100 	|  TurNE+RNBA.xml |
 ---------------------------------------------------------------------------------------------------------
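
In the tables above, the flags for a given language are the same for all five parsers. The parser only determines which
buffer transition flag is added and which feature file is used. The following shell sketch encodes that lookup. The function
name settings_for is a placeholder for this example, and the path Features/<file> assumes that the XML files sit directly
inside the "Features" folder. Input[0] is printed in single quotes because an unquoted Input[0] is a glob pattern for the shell.

```shell
#!/bin/sh
# Print the settings for one row of the tables above.
# $1 = language (as spelled in the tables), $2 = parser (NE, NE+LBA, NE+RBA, NE+LNBA or NE+RNBA)
settings_for() {
    case "$1" in
        Arabic)     prefix=Ara; base="-rr normal -l libsvm" ;;
        Chinese)    prefix=Chi; base="-rr relaxed -l liblinear" ;;
        Czech)      prefix=Cze; base="-rr normal -l liblinear -d CPOSTAG -s 'Input[0]' -T 1000" ;;
        Danish)     prefix=Dan; base="-rr relaxed -l libsvm" ;;
        German)     prefix=Ger; base="-rr normal -l liblinear -d CPOSTAG -s 'Input[0]' -T 1000" ;;
        Portuguese) prefix=Por; base="-rr normal -l libsvm -d POSTAG -s 'Input[0]' -T 1000" ;;
        Swedish)    prefix=Swe; base="-rr relaxed -l libsvm -d CPOSTAG -s 'Input[0]' -T 1000" ;;
        Turkish)    prefix=Tur; base="-rr relaxed -l libsvm -d CPOSTAG -s 'Input[0]' -T 100" ;;
        *) echo "unknown language: $1" >&2; return 1 ;;
    esac
    case "$2" in
        NE)      trans="" ;;
        NE+LBA)  trans="-lba true " ;;
        NE+RBA)  trans="-rba true " ;;
        NE+LNBA) trans="-lnba true " ;;
        NE+RNBA) trans="-rnba true " ;;
        *) echo "unknown parser: $2" >&2; return 1 ;;
    esac
    # The feature file name is the language prefix followed by the parser name.
    printf '%s\n' "-a mt ${trans}${base} -F Features/${prefix}$2.xml"
}

# Example: settings for the Czech NE+LBA experiment.
settings_for Czech NE+LBA
```

To obtain a full command line, these settings still need -c <model>, -i <treebank> and -m learn for training, or -m parse
together with -o <output> for parsing.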


