This folder contains three sub-folders, one for each of the prediction
models used to infer the learning curve from monolingual corpora of the
source and target languages:
    1. Baseline - constant mean model 
    2. Ridge - a linear regression model with L2 regularization
    3. Lasso - a linear regression model with L1 regularization

Each sub-folder contains two files: 
    a. cumul-predictions.txt
    b. evaluation.txt

The format of the 'cumul-predictions.txt' file is as follows:
CONF_NAME   GOLD_10K	PRED_10K    GOLD_10K-PRED_10K	GOLD_75K    PRED_75K	GOLD_75K-PRED_75K   GOLD_500K	PRED_500K   GOLD_500K-PRED_500K

where CONF_NAME is the string identifying the configuration and the test set whose learning curve is being predicted,
    GOLD_*  is the value of the gold learning curve at the given sample size,
    PRED_*  is the value given by the prediction model trained for that anchor size,
and GOLD_*-PRED_* is the prediction error (gold value minus predicted value).
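
The column layout above can be read with a few lines of Python. This is a
minimal sketch, not part of the released data: it assumes the fields are
separated by whitespace (tabs or spaces), that CONF_NAME contains no
whitespace, and that the file has no header row. The function name
parse_line and the sample values are made up for illustration.

```python
from typing import Any, Dict

# Anchor sizes, in the order the column groups appear on each line.
ANCHORS = ["10K", "75K", "500K"]


def parse_line(line: str) -> Dict[str, Any]:
    """Split one line into CONF_NAME plus (gold, pred, error) per anchor size."""
    fields = line.split()  # assumption: whitespace-separated fields
    expected = 1 + 3 * len(ANCHORS)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    record: Dict[str, Any] = {"conf_name": fields[0]}
    values = [float(x) for x in fields[1:]]
    for i, anchor in enumerate(ANCHORS):
        gold, pred, error = values[3 * i: 3 * i + 3]
        record[anchor] = {"gold": gold, "pred": pred, "error": error}
    return record


# Made-up sample line, for illustration only (not taken from the data).
sample = "conf-A 0.10 0.12 -0.02 0.20 0.18 0.02 0.30 0.30 0.00"
record = parse_line(sample)
print(record["conf_name"], record["75K"]["error"])
```

If the real files turn out to use a different delimiter or to include a
header line, only the split and the field-count check need to change.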

The evaluation metric is the average root mean squared error (RMSE; see
Eq. 4 of Section 4). The evaluation results for each prediction model can be
found in the evaluation.txt file of its sub-folder.
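
As a rough illustration of the metric, the sketch below computes a plain
root mean squared error over a list of GOLD_*-PRED_* errors, for example
all errors in one column of cumul-predictions.txt. It is a generic RMSE and
may not match the exact averaging defined in Eq. 4; the function name rmse
and the sample errors are made up for illustration.

```python
import math
from typing import Sequence


def rmse(errors: Sequence[float]) -> float:
    """Root mean squared error of a non-empty sequence of signed errors."""
    if not errors:
        raise ValueError("errors must not be empty")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Made-up errors for illustration; the sign does not affect the result.
example_errors = [3.0, -4.0]
print(rmse(example_errors))
```

Averaging such per-column values over the three anchor sizes is one possible
reading of "average RMSE"; the definition in Eq. 4 is the authoritative one.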
