Model Details
• Developed by researchers at the University of Waterloo, 2020
• Based on pre-trained contextual embeddings

Intended Use
• Intended for identifying highly accurate substitute candidates

Metrics
• All-ranking task: We use best and best-mode, which evaluate the quality of the model’s best prediction, and oot (out-of-ten) and oot-mode, which evaluate how well the top 10 predictions cover the gold substitute candidate list. We also report Precision@1 to allow a complete comparison with the previous state-of-the-art model.
• Candidate ranking task: We measure the performance of the model with the GAP score, a variant of MAP (Mean Average Precision).
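
The two simplest metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used for the reported results. It assumes that GAP refers to the commonly used Generalized Average Precision formulation, in which each gold substitute is weighted by how many annotators proposed it, and that all gold substitutes appear in the candidate list. The function names and the example data are hypothetical.

```python
from typing import Dict, List


def precision_at_1(predictions: List[str], gold: Dict[str, int]) -> float:
    """Return 1.0 if the top-ranked prediction is a gold substitute, else 0.0."""
    return 1.0 if predictions and predictions[0] in gold else 0.0


def _weighted_precision_sum(weights: List[int]) -> float:
    """Sum the running average of weights at every rank holding a positive weight."""
    total = 0.0
    running = 0.0
    for rank, weight in enumerate(weights, start=1):
        running += weight
        if weight > 0:
            total += running / rank
    return total


def gap_score(ranked_candidates: List[str], gold: Dict[str, int]) -> float:
    """Generalized Average Precision for one target-word occurrence (assumed definition)."""
    # Gold weight of each candidate in the model's order (0 if not a gold substitute).
    model_weights = [gold.get(candidate, 0) for candidate in ranked_candidates]
    # The ideal ranking lists the gold substitutes by descending weight.
    ideal_weights = sorted(gold.values(), reverse=True)
    denominator = _weighted_precision_sum(ideal_weights)
    if denominator == 0:
        return 0.0
    return _weighted_precision_sum(model_weights) / denominator


# Hypothetical gold annotations: substitute -> number of annotators who proposed it.
gold = {"bright": 3, "clever": 2, "smart": 1}
ranking = ["clever", "dull", "bright", "smart"]
print(precision_at_1(ranking, gold))
print(gap_score(ranking, gold))
```

A ranking that lists the gold substitutes in descending order of annotator weight scores a GAP of 1.0. Corpus-level scores are usually obtained by averaging the per-instance values.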

Training Data
• LS07 and CoInCo datasets, training split.

Evaluation Data
• LS07 and CoInCo datasets, test split.

Ethical Considerations
• Our model can be used as a data augmentation tool to generate artificial training data (simple paraphrases) for tasks where a lack of sufficient training data may hurt model performance.
• However, over-relying on any lexical substitution tool carries risks. In particular, no matter how effective a lexical substitution model is, it can unintentionally change the meaning of the original text, leading to erroneous conclusions.
