This package contains the downloads associated with

Susan Howlett and Mark Dras (2011)
"Clause Restructuring For SMT Not Absolutely Helpful"
in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Package last updated: 22 April 2011

This code is provided for your use free of charge and without warranty. Please cite the above publication if you use this code in your research. Please address questions and comments to Suzy Howlett (suzy@showlett.id.au).

==========

Package Contents
----------------

1. README
2. HD10.README - A copy of the README file from the Howlett and Dras (2010) code bundle
3. Code:
      analysis/
         ACL11oracles
         oracle.py
      parsing/
         distributed_parser
         parse_german
         parse_german_lc
      preprocessing/
         Collins_baseline
         Collins_baseline_lc
      reordering/
         Collins_rules.py
         Collins_rules_test.py
4. Notes:
      notes/
         ger_tiger_10pc.txt
         ger_tiger_25pc.txt
         ger_tiger_half.txt
         ger_tiger_half_lc.txt
         ger_tiger_lc.txt
         moses-diff.txt
         oracles-output.txt
5. Experiment Management System configuration files:
      configs/
         [40 configuration files: see below]

==========

README Contents
---------------
1. System setup
   1.1  Moses
        1.1.1  Modifications for our cluster
   1.2  Howlett and Dras (2010) system
   1.3  Berkeley parser
2. Data
   2.1  Replicating Collins et al. (2005)
   2.2  Other experiments
3. Changes from Howlett and Dras (2010)
   3.1  Modifications to scripts
   3.2  Additional scripts
   3.3  Additional parsing models
   3.4  Experiment configuration files

==========

1. SYSTEM SETUP

1.1  Moses

We use revision 3799 of the Moses subversion repository, with the SRILM toolkit. For installation instructions, see the Moses website: http://www.statmt.org/moses/.


1.1.1  Modifications for our cluster

The experiments in this paper were run across a cluster using TORQUE for job scheduling. The TORQUE qsub command differs from the one the Moses scripts use; the accompanying file notes/moses-diff.txt contains the diff between our baseline and the repository revision. These changes should not affect the performance of the system.

Our cluster setup also affects the configuration files for the Moses Experiment Management System, specifically the "qsub-settings" variables. These settings instruct the cluster to assign 8 or 16 CPUs to each job. The jobs did not actually need that many CPUs; the setting was a workaround to prevent too many jobs from being assigned to a single cluster node. These changes should likewise not affect the results of the systems.
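For reference, a qsub-settings line of the following shape reserves 8 CPUs on a single node under TORQUE. This is an illustrative fragment only; the exact flags in the bundled configuration files may differ.

```
# Illustrative only: a TORQUE resource request pinning a job to
# 8 CPUs on one node. Not copied from the bundled config files.
qsub-settings = "-l nodes=1:ppn=8"
```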


1.2  Howlett and Dras (2010) system

Much of the code is reused from our earlier work, Howlett and Dras (2010), cited in this paper. The scripts are free-standing and no installation is required. This code bundle includes a copy of the README file from the earlier paper's code package, which describes the usage of each script.

From the earlier paper's code package, we use the scripts analysis/oracle.py, parsing/distributed_parser, parsing/parse_german, preprocessing/Collins_baseline, and reordering/Collins_rules.py. We have made small changes to some of the scripts, and created some additional scripts based on them. This package contains our new scripts and modified versions, along with copies of the original scripts that we used unchanged. We outline our changes in Section 3.

We use the Howlett and Dras (2010) parsing model, ger_tiger.gr. In addition, we create five new parsing models by the same method. These are ger_tiger_lc.gr (lowercased), ger_tiger_half.gr (using 50% parsing data), ger_tiger_half_lc.gr (using 50% parsing data, lowercased), ger_tiger_25pc.gr (using 25% parsing data) and ger_tiger_10pc.gr (using 10% parsing data).


1.3  Berkeley parser

We use BerkeleyParser.jar from revision 14 of the Berkeley parser subversion repository, as in our earlier work.


==========

2. DATA

2.1  Replicating Collins et al. (2005)

A copy of the data used in Collins et al. (2005) was provided by Michael Collins. This data came already tokenised and lowercased.


2.2  Other experiments

All remaining experiments use data from the 2009 and 2010 Workshops on Statistical Machine Translation, available from http://www.statmt.org/wmt09/translation-task.html and http://www.statmt.org/wmt10/translation-task.html. From the 2009 Workshop, we use the parallel corpus training data (training-parallel.tar) and additional development sets (additional-dev.tgz). From the 2010 Workshop we use the development sets (dev.tgz).


==========

3. CHANGES FROM HOWLETT AND DRAS (2010)

3.1  Modifications to scripts

parsing/distributed_parser
parsing/parse_german
We modified these scripts so that the parsing model and the number of parsing batches can be specified as command-line arguments.

reordering/Collins_rules.py
We changed the label() and function() methods to convert their return values to uppercase. This enables us to use the same script to reorder the output of our lowercased parsing models.
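The effect of this change can be sketched as follows. This is an illustrative example only, not the actual Collins_rules.py code; the Node class and tag format here are hypothetical stand-ins.

```python
# Illustrative sketch only -- not the actual Collins_rules.py code.
# The change converts the return values of label() and function() to
# uppercase, so that trees from a lowercased parsing model (node tags
# like "np-sb") still match rule patterns written against "NP"/"SB".
class Node:
    """A hypothetical parse-tree node with a TIGER-style tag."""

    def __init__(self, tag):
        self.tag = tag  # e.g. "np-sb" from a lowercased model

    def label(self):
        # Syntactic category; .upper() is the modification described above.
        return self.tag.split("-")[0].upper()

    def function(self):
        # Grammatical function, if present; also uppercased.
        parts = self.tag.split("-", 1)
        return parts[1].upper() if len(parts) > 1 else ""

node = Node("np-sb")
print(node.label(), node.function())  # NP SB
```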


3.2  Additional scripts

parsing/parse_german_lc
This script is similar to parsing/parse_german except it processes the Collins et al. (2005) data, which is plain text, tokenised and lowercased. The primary differences are that tokenisation and unwrapping from SGML format are not required, and *lrb* and *rrb* are used instead of *LRB* and *RRB* to match the lowercased parsing model.
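The bracket substitution above amounts to the following. This is a hedged sketch under the assumption that brackets are rewritten as literal token replacements; the function name is illustrative and does not appear in the actual script.

```python
# Illustrative sketch of the bracket substitution described above.
# The lowercased parsing model was trained on text containing *lrb*
# and *rrb*, so input brackets must be mapped to that vocabulary;
# the uppercase variants match the original (cased) model.
def escape_brackets(line, lowercased_model=True):
    lrb, rrb = ("*lrb*", "*rrb*") if lowercased_model else ("*LRB*", "*RRB*")
    return line.replace("(", lrb).replace(")", rrb)

print(escape_brackets("( hallo ) welt"))    # *lrb* hallo *rrb* welt
print(escape_brackets("( hallo )", False))  # *LRB* hallo *RRB*
```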

preprocessing/Collins_baseline_lc
Likewise, this script is similar to preprocessing/Collins_baseline except that it processes the Collins et al. (2005) data. The primary differences are that the TRAIN/TUNE/TEST distinction is no longer needed, plus the same changes made in creating parsing/parse_german_lc.

analysis/ACL11oracles
This script calls the analysis/oracle.py script repeatedly to produce the outputs of all of the oracle experiments we run in the paper.

notes/oracles-output.txt
For reference, this file contains the output from our run of the analysis/ACL11oracles script.


3.3  Additional parsing models

notes/ger_tiger_lc.txt
notes/ger_tiger_half.txt
notes/ger_tiger_half_lc.txt
notes/ger_tiger_10pc.txt
notes/ger_tiger_25pc.txt
These five files describe the training method and performance figures for the five additional parsing models we created. The grammars themselves are not included, as they can be quite large (up to 7.5MB each). The grammars may be downloaded from http://www.showlett.id.au/.


3.4  Experiment configuration files

We use the Moses Experiment Management System (EMS) to run the experiments in the paper. The 40 files in the configs directory are the configuration files used.
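Each configuration file follows the standard EMS format. A minimal fragment of that shape is shown below; the variable names follow EMS conventions, but the values are placeholders, not those used in the bundled files.

```
# Illustrative EMS configuration fragment; placeholder values only.
[GENERAL]
working-dir = /path/to/experiment
input-extension = de
output-extension = en
```

An experiment is then typically launched with "experiment.perl -config <file> -exec" from the Moses scripts/ems directory.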

configs/config.baseline-collins
Replicating the baseline system of Collins et al. (2005). Evaluation is with the multi-bleu script only.

configs/config.reordered-collins
configs/config.reordered-collins-half
Replicating the reordered system of Collins et al. (2005), using the full, lowercased parsing model and the 50% data, lowercased parsing model, respectively. These systems can only be run after the corresponding baseline, as they reuse its language model and recasing model. Evaluation is with the multi-bleu script only.

configs/config.baseline-wmt09*
The various configurations of the baseline system tried in the paper. Each configuration file runs the evaluation on both Europarl and news test sets (test2008, newstest2009) using both NIST BLEU and multi-bleu scripts.

configs/config.reordered-wmt09*
The reordered systems corresponding to each of the baseline systems above. They similarly run the evaluation on both test2008 and newstest2009 with both NIST BLEU and multi-bleu, and can only be run after the corresponding baselines, as they reuse their language models and recasing models.

configs/config.reordered-wmt09*-half
configs/config.reordered-wmt09*-25pc
configs/config.reordered-wmt09*-10pc
As for configs/config.reordered-wmt09*, except these experiments use the 50%, 25% or 10% data parsing model instead of the full parsing model.

configs/config.oracles
This configuration file evaluates all of the oracle outputs (generated by analysis/ACL11oracles) with the multi-bleu script only.

