@inproceedings{hardy-2025-measuring,
    title = "Measuring Teaching with {LLM}s",
    author = "Hardy, Michael",
    editor = "Wilson, Joshua  and
      Ormerod, Christopher  and
      Beiting Parrish, Magdalen",
    booktitle = "Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers",
    month = oct,
    year = "2025",
    address = "Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States",
    publisher = "National Council on Measurement in Education (NCME)",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.aimecon-main.40/",
    pages = "367--384",
    ISBN = "979-8-218-84228-4",
    abstract = "This paper introduces custom Large Language Models using sentence-level embeddings to measure teaching quality. The models achieve human-level performance in analyzing classroom transcripts, outperforming average human rater correlation. Aggregate model scores align with student learning outcomes, establishing a powerful new methodology for scalable teacher feedback. Important limitations discussed."
}

Markdown (Informal)
[Measuring Teaching with LLMs](https://preview.aclanthology.org/ingest-emnlp/2025.aimecon-main.40/) (Hardy, AIME-Con 2025)
ACL
- Michael Hardy. 2025. Measuring Teaching with LLMs. In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers, pages 367–384, Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States. National Council on Measurement in Education (NCME).