
-----------------------------------------------------------------------------
                              TEDEVAL -- README
-----------------------------------------------------------------------------
                          UniPar Development Team
                            Uppsala University
                                 Sweden
-----------------------------------------------------------------------------
                               June 2011
-----------------------------------------------------------------------------


-------------
General Info
-------------

This tarball contains supplementary material for the EMNLP publication

	Evaluating Dependency Parsing:
	Robust and Heuristics-Free Cross-Annotation Evaluation.
By
	Reut Tsarfaty
	Joakim Nivre
	Evelina Andersson 
From 
	Uppsala University
	Sweden

--------------------
Content Description
--------------------

This tarball includes the input, output, parameter settings, and intermediate files from one execution of the evaluation pipeline depicted in Figure 1 of the paper, run on a subset of the PTB containing 10 trees.

1/ Input files for the evaluation procedure, in CoNLL format
2/ Conversions of all CoNLL files into functional trees (ftrees)
3/ ftree generalization files for cross-experiment evaluation
4/ TEDEVAL output files
5/ TEDEVAL log files
6/ Statistical significance test output
7/ Parameter settings for the LTH conversion
8/ The EMNLP paper

Together with the TEDEVAL software, this tarball provides the information necessary to replicate our results. This README file covers:

1/ Files description
2/ TEDEVAL usage
3/ Command line execution examples
4/ Download links
5/ Contact info

-----------------
Files Description
-----------------

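functional_gold.conll                                 gold dependency graphs, CoNLL format (evaluation input)
functional_parse.conll                                parse hypotheses, CoNLL format (evaluation input)
lexical_gold.conll                                    gold dependency graphs, CoNLL format (evaluation input)
lexical_parse.conll                                   parse hypotheses, CoNLL format (evaluation input)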

functional_gold.conll.ftree                           ftrees for the dependency graphs in functional_gold.conll
functional_parse.conll.ftree                          ftrees for the dependency graphs in functional_parse.conll
lexical_gold.conll.ftree                              ftrees for the dependency graphs in lexical_gold.conll
lexical_parse.conll.ftree                             ftrees for the dependency graphs in lexical_parse.conll

functional_output_lexical_output.gen                  the generalization of functional_gold.conll.ftree and lexical_gold.conll.ftree

functional_output1.ted                                TEDEVAL result, single-experiment scenario, computed from functional_parse.conll.ftree and functional_gold.conll.ftree
lexical_output1.ted                                   TEDEVAL result, single-experiment scenario, computed from lexical_parse.conll.ftree and lexical_gold.conll.ftree
functional_output_lexical_output_0_generalized.ted    TEDEVAL result, multiple-experiments scenario, computed from functional_parse.conll.ftree and functional_output_lexical_output.gen
functional_output_lexical_output_1_generalized.ted    TEDEVAL result, multiple-experiments scenario, computed from lexical_parse.conll.ftree and functional_output_lexical_output.gen

functional_output1.log                                log file from a TEDEVAL run in the single-experiment scenario
lexical_output1.log                                   log file from a TEDEVAL run in the single-experiment scenario
functional_output_lexical_output_0_generalized.log    log file from a TEDEVAL run in the multiple-experiments scenario
functional_output_lexical_output_1_generalized.log    log file from a TEDEVAL run in the multiple-experiments scenario

stat_sign_test_res.ted_stat                           result of the statistical significance test on functional_output_lexical_output_0_generalized.ted and functional_output_lexical_output_1_generalized.ted
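
Before replicating the results, it may help to confirm that the unpacked tarball is complete. The sketch below counts how many of the files listed above are absent from a directory. The helper name count_missing is hypothetical and not part of TEDEVAL; the file names are copied from the list above.

```shell
# File names copied from the Files Description list in this README.
EXPECTED_FILES="
functional_gold.conll
functional_parse.conll
lexical_gold.conll
lexical_parse.conll
functional_gold.conll.ftree
functional_parse.conll.ftree
lexical_gold.conll.ftree
lexical_parse.conll.ftree
functional_output_lexical_output.gen
functional_output1.ted
lexical_output1.ted
functional_output_lexical_output_0_generalized.ted
functional_output_lexical_output_1_generalized.ted
functional_output1.log
lexical_output1.log
functional_output_lexical_output_0_generalized.log
functional_output_lexical_output_1_generalized.log
stat_sign_test_res.ted_stat
"

# Hypothetical helper (not part of TEDEVAL): report each missing file on
# stderr and print the number of missing files on stdout.
count_missing() {
    dir=$1
    missing=0
    for f in $EXPECTED_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            missing=$((missing + 1))
        fi
    done
    echo "$missing"
}

# Check the current directory; pass another directory to check elsewhere.
echo "files missing: $(count_missing .)"
```

The list does not cover the LTH parameter settings or the paper, since this README does not give their file names.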


------
Usage
------

TEDEVAL-usage:
--------------

Usage:
  java -jar tedeval.jar -h for more help and options

help                      ( -h) : Show options
version                   ( -v) : Show version

Single experiment evaluation:
gold_file                  (-g) : path to gold-parses file (single experiment)
parsed_file                (-p) : path to parse-hypotheses (single experiment)
output_file                (-o) : path to result file (single experiment)

Pairwise Experiment evaluation:
first_gold_file           (-g1) : path to gold-parses file (the first of two experiments)
first_parsed_file         (-p1) : path to parse-hypotheses (the first of two experiments)
second_gold_file          (-g2) : path to gold-parses file (the second of two experiments)
second_parsed_file        (-p2) : path to parse-hypotheses (the second of two experiments)
first_result_file         (-o1) : path to result file (the first of two experiments)
second_result_file        (-o2) : path to result file (the second of two experiments)

file_format           (-format) : input file format, one of the two values below (default: bracketed)
conll                             the CoNLL-X format
bracketed                         labeled PTB-like bracketed format (empty elements not allowed, ignoring anything after the dash)

labeling_flag      (-unlabeled) : use the unlabeled measure instead of the default labeled one
avg_format_flag        (-micro) : use the micro average instead of the default macro average
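
As an illustration of the single-experiment options, the sketch below assembles a command line from the flags listed above and prints it without running it. The jar name follows the usage line; the output name functional_output_single is a hypothetical choice, and treating -unlabeled as a flag without a value is an assumption based on its description.

```shell
# Jar name taken from the usage line above; adjust if your copy is
# packaged differently (the example in this tarball uses unipar.jar).
JAR=tedeval.jar

# Gold and parse ftree files shipped in this tarball.
GOLD=functional_gold.conll.ftree
PARSE=functional_parse.conll.ftree

# Hypothetical output name, chosen so no shipped file is overwritten.
OUT=functional_output_single

# Optional flags; -unlabeled is assumed to take no value.
EXTRA="-unlabeled"

CMD="java -jar $JAR -g $GOLD -p $PARSE -o $OUT $EXTRA"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

No -format option is given because the execution example in this tarball passes .ftree files without one, which suggests they are read with the default bracketed format.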


stat-sign-usage:
----------------

Usage:
  java -jar statsigntest.jar -h for more help and options

help                         (-h) : Show options
version                      (-v) : Show version

file 1                       (-i) : path to the first TEDEVAL output file (-i is given twice, once per file)
file 2                       (-i) : path to the second TEDEVAL output file
output_file                  (-o) : path to result file
iterations                   (-n) : number of iterations (default: 10 000)
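
The options above can be combined as in the following sketch, which prints a significance-test command with an explicit iteration count. The jar name follows the usage line, the input files are the generalized results shipped in this tarball, and writing the default as 10000 without a space is an assumption about how the number is parsed.

```shell
# Jar name taken from the usage line above.
JAR=statsigntest.jar

# The two generalized TEDEVAL results shipped in this tarball.
FIRST=functional_output_lexical_output_0_generalized.ted
SECOND=functional_output_lexical_output_1_generalized.ted

OUT=stat_sign_test_res
ITERATIONS=10000   # the documented default, written without the space (assumption)

# -i appears twice, once for each input file.
CMD="java -jar $JAR -i $FIRST -i $SECOND -o $OUT -n $ITERATIONS"

# Dry run: print the command instead of executing it.
echo "$CMD"
```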

------------------
Example execution
------------------

TEDEVAL execution:

> java -Xmx100m -cp unipar.jar treedistance.TreeDistance -p1 functional_parse.conll.ftree -p2 lexical_parse.conll.ftree -g1 functional_gold.conll.ftree -g2 lexical_gold.conll.ftree -o1 functional_output -o2 lexical_output


stat-sign-execution:

> java -Xmx100m -cp unipar.jar statsigntest.StatSignTest -i functional_output_lexical_output_0_generalized.ted -i functional_output_lexical_output_1_generalized.ted -o stat_sign_test_res
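
The output names used in the two commands above appear to determine the names of the generated files. The sketch below reconstructs those names from the -o1, -o2 and -o values. The pattern is inferred from the file names shipped in this tarball, not from documented TEDEVAL behavior, so treat it as an assumption.

```shell
# Values passed to -o1 and -o2 in the TEDEVAL example above.
O1=functional_output
O2=lexical_output

# Inferred pattern: generalization file and the two generalized results.
GEN_FILE="${O1}_${O2}.gen"
GEN_FIRST="${O1}_${O2}_0_generalized.ted"
GEN_SECOND="${O1}_${O2}_1_generalized.ted"

# Value passed to -o in the significance test example above.
STAT_OUT=stat_sign_test_res
STAT_FILE="${STAT_OUT}.ted_stat"

# Print the file names the two commands are expected to produce.
printf '%s\n' "$GEN_FILE" "$GEN_FIRST" "$GEN_SECOND" "$STAT_FILE"
```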


------
Links
------ 

TEDEVAL: 

	http://stp.lingfil.uu.se/~tsarfaty/unipar/

Other:

TED: http://web.science.mq.edu.au/~swan/howtos/treedistance/
LTH: http://nlp.cs.lth.se/software/treebank_converter/
MALT: http://maltparser.org/


--------
Contact
--------

Reut Tsarfaty (reut.tsarfaty@gmail.com)
Joakim Nivre (joakim.nivre@lingfil.uu.se)
Evelina Andersson (evelina.andersson@lingfil.uu.se)

The Department of Linguistics and Philology, Uppsala University, Sweden 
Visiting address: Engelska parken, Thunbergsv. 3 H 
Postal address: Box 635, 751 26 UPPSALA 
