@inproceedings{zhao-etal-2024-large-language,
    title = "Large Language Models are In-context Teachers for Knowledge Reasoning",
    author = "Zhao, Jiachen  and
      Yao, Zonghai  and
      Yang, Zhichao  and
      Yu, Hong",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.findings-emnlp.961/",
    doi = "10.18653/v1/2024.findings-emnlp.961",
    pages = "16470--16486",
    abstract = "In this work, we study in-context teaching (ICT), where a teacher provides in-context example rationales to teach a student to reason over unseen cases. Human teachers are usually required to craft in-context demonstrations, which are costly and have high variance. We ask whether a large language model (LLM) can serve as a more effective in-context teacher for itself or other LLMs, compared to humans. Inspired by the Encoding Specificity Hypothesis from human episodic memory, we hypothesize that in-context exemplars crafted by the teacher should match the training data of the student. This hypothesis motivates us to propose Self-Explain, where an LLM{'}s self-elicited explanations are used as in-context demonstrations for prompting it, as they are generalized from the model{'}s training examples. Self-Explain is shown to significantly outperform using human-crafted exemplars and other baselines. Furthermore, we reveal that for ICT, rationales from different teacher LLMs or human experts that more resemble the student LLM{'}s self-explanations are better in-context demonstrations. This supports our encoding specificity hypothesis. We then propose Teach-Back that aligns a teacher LLM with the student to enhance the ICT performance. For example, Teach-Back enables a 7B model to teach the much larger GPT-3.5 in context, surpassing human teachers by around 5{\%} in test accuracy on medical question answering."
}