@inproceedings{cong-2022-psycholinguistic,
    title = "Psycholinguistic Diagnosis of Language Models' Commonsense Reasoning",
    author = "Cong, Yan",
    editor = "Bosselut, Antoine  and
      Li, Xiang  and
      Lin, Bill Yuchen  and
      Shwartz, Vered  and
      Majumder, Bodhisattwa Prasad  and
      Lal, Yash Kumar  and
      Rudinger, Rachel  and
      Ren, Xiang  and
      Tandon, Niket  and
      Zouhar, Vil{\'e}m",
    booktitle = "Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2022.csrr-1.3/",
    doi = "10.18653/v1/2022.csrr-1.3",
    pages = "17--22",
    abstract = "Neural language models have attracted a lot of attention in the past few years. More and more researchers are getting intrigued by how language models encode commonsense, specifically what kind of commonsense they understand, and why they do. This paper analyzed neural language models' understanding of commonsense pragmatics (i.e., implied meanings) through human behavioral and neurophysiological data. These psycholinguistic tests are designed to draw conclusions based on predictive responses in context, making them very well suited to test word-prediction models such as BERT in natural settings. They can provide the appropriate prompts and tasks to answer questions about linguistic mechanisms underlying predictive responses. This paper adopted psycholinguistic datasets to probe language models' commonsense reasoning. Findings suggest that GPT-3{'}s performance was mostly at chance in the psycholinguistic tasks. We also showed that DistilBERT had some understanding of the (implied) intent that{'}s shared among most people. Such intent is implicitly reflected in the usage of conversational implicatures and presuppositions. Whether or not fine-tuning improved its performance to human-level depends on the type of commonsense reasoning."
}