
	CogALex-V Shared Task on the Corpus-Based Identification of Semantic Relations
	Organizers: Emmanuele Chersoni (The Hong Kong Polytechnic University)
              Luca Iacoponi (Amazon)
              Rong Xiang (The Hong Kong Polytechnic University)


TASK

See the instructions at: https://sites.google.com/view/cogalex-2020/home/shared-task#h.p_yU2jiFJtgja4

SCRIPTS

baseline_method_rnd.py:
  - Baseline model that guesses the relation at random (uses Chinese as the example language)
  - Writes its predictions to "pred_chinese_data_random.txt"
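
  A minimal sketch of what a random-guess baseline could look like, assuming
  the input and output formats described under FILES AND FORMATS. The function
  name random_baseline and the seed parameter are illustrative assumptions and
  are not taken from baseline_method_rnd.py, which may work differently.

```python
import random

# Relation labels listed for the training data.
RELATIONS = ["SYN", "ANT", "HYP", "RANDOM"]


def random_baseline(pair_lines, seed=0):
    """Assign a uniformly random relation to each TAB-delimited word pair.

    `pair_lines` is an iterable of "word1<TAB>word2" strings.
    Returns a list of "word1<TAB>word2<TAB>relation" strings.
    The function name and the seed parameter are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    predictions = []
    for line in pair_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word1, word2 = line.split("\t")[:2]
        predictions.append(f"{word1}\t{word2}\t{rng.choice(RELATIONS)}")
    return predictions


if __name__ == "__main__":
    # Made-up English pairs, used only to show the output shape.
    demo = ["big\tlarge", "hot\tcold", "dog\tanimal"]
    for row in random_baseline(demo):
        print(row)
```

  Such a baseline mainly gives a floor against which real systems can be
  compared.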

baseline_method_svm.py:
  - Baseline model that trains an SVM on word embedding vectors (uses Chinese as the example language)
  - Writes its predictions to "pred_chinese_data_svm.txt"
  - Pretrained embedding vectors are available at: https://fasttext.cc/docs/en/pretrained-vectors.html
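
  One way to turn a word pair into SVM input is sketched below. It assumes the
  fastText text format (.vec): a header line "<vocab_size> <dim>" followed by
  one word and its vector per line. Concatenating the two word vectors is an
  assumption made for illustration; baseline_method_svm.py may build its
  features differently. The names parse_vec and pair_features are hypothetical.

```python
def parse_vec(lines):
    """Parse fastText text-format (.vec) lines into ({word: vector}, dim).

    The first line is a header "<vocab_size> <dim>"; each following line
    is a word followed by its space-separated vector components.
    """
    it = iter(lines)
    header = next(it).split()
    dim = int(header[1])
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors, dim


def pair_features(word1, word2, vectors, dim):
    """Concatenate the two word vectors (an assumed feature scheme).

    Words missing from the embedding table are mapped to zero vectors.
    """
    zero = [0.0] * dim
    return vectors.get(word1, zero) + vectors.get(word2, zero)
```

  The resulting feature lists, paired with the relation labels from the
  training file, could then be passed to an SVM implementation such as
  scikit-learn's sklearn.svm.SVC.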

evaluation.py:
  - Official evaluation script, which can be used for subtask 1.
  - To learn how to use it, type: python evaluation.py --help
  - Example: python evaluation.py ./validgold_chinese_data.txt ./pred_chinese_data_random.txt
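
  For a quick sanity check of a prediction file, per-relation precision,
  recall and F1 can be computed as sketched below. This is not the official
  scorer and makes no claim about which relations or averaging evaluation.py
  uses; the official script remains authoritative. The function name
  per_relation_scores is hypothetical, and both inputs are assumed to list
  the same word pairs in the same order.

```python
from collections import Counter


def per_relation_scores(gold_lines, pred_lines):
    """Return {relation: (precision, recall, f1)} for two aligned inputs.

    Both inputs are iterables of "word1<TAB>word2<TAB>relation" strings.
    """
    tp, pred_count, gold_count = Counter(), Counter(), Counter()
    for gold_line, pred_line in zip(gold_lines, pred_lines):
        gold = gold_line.rstrip("\n").split("\t")[2]
        pred = pred_line.rstrip("\n").split("\t")[2]
        gold_count[gold] += 1
        pred_count[pred] += 1
        if gold == pred:
            tp[gold] += 1
    scores = {}
    for rel in sorted(set(gold_count) | set(pred_count)):
        p = tp[rel] / pred_count[rel] if pred_count[rel] else 0.0
        r = tp[rel] / gold_count[rel] if gold_count[rel] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[rel] = (p, r, f1)
    return scores
```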


FILES AND FORMATS

example/:
  example/validgold_chinese_data.txt:
    - gold standard answers for the Chinese validation set
  example/pred_chinese_data_random.txt:
    - baseline predictions (random guesses)
    - also serves as an example of the expected system output
  example/pred_chinese_data_svm.txt:
    - baseline predictions (word embedding plus SVM)

train_XXXXXX_data.txt:
  - TAB-delimited text files containing the training set; XXXXXX stands for the language name (e.g. chinese)
  - columns: word1, word2, relation (SYN, ANT, HYP, RANDOM)

valid_XXXXXX_data.txt:
  - TAB-delimited text files containing the validation set
  - columns: word1, word2

validgold_XXXXXX_data.txt:
  - TAB-delimited text files with the gold standard answers for the validation set
  - columns: word1, word2, relation
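
  The three file types above share one layout, so a single reader can cover
  them. The sketch below is only an illustration: the function name read_pairs
  is hypothetical, and UTF-8 encoding in the usage note is an assumption that
  this README does not state.

```python
import csv
import io


def read_pairs(handle):
    """Yield (word1, word2, relation) tuples from a TAB-delimited file object.

    `relation` is None for rows with only two columns, as in the
    validation files without gold answers.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        relation = row[2] if len(row) > 2 else None
        yield row[0], row[1], relation


if __name__ == "__main__":
    # In-memory stand-in for a data file, with made-up English pairs.
    sample = io.StringIO("big\tlarge\tSYN\nhot\tcold\n")
    for triple in read_pairs(sample):
        print(triple)
```

  With a real file, the handle would come from something like
  open(path, encoding="utf-8", newline=""), where path points to one of the
  train, valid or validgold files.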


REFERENCES

• Glavaš and Vulić, 2018. Discriminating between lexico-semantic relations with the specialization tensor model. Proceedings of NAACL.
• McRae et al., 2012. Semantic and associative relations: Examining a tenuous dichotomy. The Adolescent Brain: Learning, Reasoning and Decision-Making, edited by Reyna et al., pp. 39-66, APA.
• Murphy, 2003. Semantic relations and the lexicon: Antonymy, synonymy and other paradigms. Cambridge University Press.
• Ruder et al., 2019. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, vol. 65, pp. 569-631.
• Santus, 2016. Making sense: From word distribution to meaning. PhD Thesis, The Hong Kong Polytechnic University.
• Santus et al., 2016. The CogALex-V shared task on the corpus-based identification of semantic relations. Proceedings of the COLING Workshop on the Cognitive Aspects of the Lexicon.
• Schulte im Walde, 2020. Distinguishing between paradigmatic semantic relations across word classes: Human ratings and distributional similarity. Journal of Language Modelling, vol. 8, no. 1, pp. 53-101.
• Yu et al., 2020. Hypernymy detection for low-resource languages via meta learning. Proceedings of ACL.
