This is the code accompanying the EMNLP submission titled:
"Is Supervised Syntactic Parsing Beneficial for Language Understanding?"

The following three scripts are the entry points to the code:

1. *starter_data.py*: preprocessing and serialization for all datasets used in our experiments

2. *starter.py*: the script executing training/evaluation for all transformer-based models

3. *topology.py*: the script that generates representations with different transformers and measures the topological similarity between them (with linear CKA)
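
For readers unfamiliar with linear CKA (centered kernel alignment), the following is a minimal NumPy sketch of the standard formulation. It is not taken from *topology.py*; the function name `linear_cka` and the random example matrices are assumptions made purely for illustration.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two representation matrices.

    x has shape (n_examples, dim_x) and y has shape (n_examples, dim_y);
    row i of both matrices must correspond to the same input example.
    """
    # Center every feature (column) across the examples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Illustrative data only: random "representations" for 100 examples.
rng = np.random.default_rng(0)
reps_a = rng.normal(size=(100, 16))

# An orthogonal rotation plus isotropic scaling of reps_a.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
reps_b = 2.0 * (reps_a @ rotation)

# Unrelated random representations with a different dimensionality.
reps_c = rng.normal(size=(100, 32))

same_score = linear_cka(reps_a, reps_a)
rotated_score = linear_cka(reps_a, reps_b)
unrelated_score = linear_cka(reps_a, reps_c)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling, so `same_score` and `rotated_score` are both 1 up to floating-point error, while `unrelated_score` falls somewhere between 0 and 1.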

Each of these three scripts has a corresponding section in *config.py* that specifies the details of execution (which datasets, which transformers, etc.). We recommend commenting out the configuration sections that are not relevant to the current run. For example, if you are running *starter.py*, comment out the "TOPOLOGY ANALYSIS" and "DATA PREPROCESSING" sections in *config.py* and configure only the parameters in the "MODELING, TRAINING, OPTIMIZATION, and EVALUATION" section.
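
As a rough illustration of that workflow, a *config.py* prepared for running *starter.py* might look like the sketch below. Only the three section names come from this README; every variable name and value is a hypothetical placeholder, so consult the actual *config.py* for the real parameters.

```python
# Hypothetical sketch of config.py when running starter.py.
# All variable names and values are illustrative placeholders,
# not the repository's actual settings.

# ---------------- DATA PREPROCESSING ----------------
# Commented out: only needed when running starter_data.py.
# datasets_to_preprocess = ["example_dataset"]
# serialization_dir = "path/to/serialized/data"

# ---- MODELING, TRAINING, OPTIMIZATION, and EVALUATION ----
# Active: this is the section starter.py would read.
transformer_name = "example-transformer"
train_dataset = "example_dataset"
learning_rate = 1e-5
num_epochs = 3

# ---------------- TOPOLOGY ANALYSIS ----------------
# Commented out: only needed when running topology.py.
# topology_models = ["example-transformer-a", "example-transformer-b"]
```

Keeping the unused sections commented out means that only the settings relevant to the script being run are defined.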