@inproceedings{kuhnle-copestake-2018-deep,
    title = "Deep learning evaluation using deep linguistic processing",
    author = "Kuhnle, Alexander  and
      Copestake, Ann",
    editor = "Bisk, Yonatan  and
      Levy, Omer  and
      Yatskar, Mark",
    booktitle = "Proceedings of the Workshop on Generalization in the Age of Deep Learning",
    month = jun,
    year = "2018",
    address = "New Orleans, Louisiana",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/iwcs-25-ingestion/W18-1003/",
    doi = "10.18653/v1/W18-1003",
    pages = "17--23",
    abstract = "We discuss problems with the standard approaches to evaluation for tasks like visual question answering, and argue that artificial data can be used to address these as a complement to current practice. We demonstrate that with the help of existing `deep' linguistic processing technology we are able to create challenging abstract datasets, which enable us to investigate the language understanding abilities of multimodal deep learning models in detail, as compared to a single performance value on a static and monolithic dataset."
}