@inproceedings{zeng-etal-2024-multimodal,
    title = "Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal {LLM}s",
    author = "Zeng, Fengzhu  and
      Li, Wenqian  and
      Gao, Wei  and
      Pang, Yan",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.findings-emnlp.613/",
    doi = "10.18653/v1/2024.findings-emnlp.613",
    pages = "10467--10484",
    abstract = "Detecting multimodal misinformation, especially in the form of image-text pairs, is crucial. Obtaining large-scale, high-quality real-world fact-checking datasets for training detectors is costly, leading researchers to use synthetic datasets generated by AI technologies. However, the generalizability of detectors trained on synthetic data to real-world scenarios remains unclear due to the distribution gap. To address this, we propose learning from synthetic data for detecting real-world multimodal misinformation through two model-agnostic data selection methods that match synthetic and real-world data distributions. Experiments show that our method enhances the performance of a small MLLM (13B) on real-world fact-checking datasets, enabling it to even surpass GPT-4V."
}
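The abstract describes model-agnostic data selection that matches the synthetic data distribution to the real-world data distribution before training. The paper's actual two selection methods are not reproduced here; the following is a minimal Python sketch of one generic way such matching is often done, assuming precomputed embeddings for both datasets and a nearest-neighbor similarity criterion. The function and parameter names (select_matching_synthetic, keep_ratio) are illustrative, not from the paper.

import numpy as np

def select_matching_synthetic(synth_emb, real_emb, keep_ratio=0.5):
    """Keep the synthetic samples whose embeddings lie closest to the
    real-data distribution (nearest-neighbor cosine similarity).
    This is a generic illustration, not the paper's method."""
    # Normalize rows so dot products become cosine similarities.
    s = synth_emb / np.linalg.norm(synth_emb, axis=1, keepdims=True)
    r = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)
    # For each synthetic sample, similarity to its nearest real sample.
    scores = (s @ r.T).max(axis=1)
    k = int(len(scores) * keep_ratio)
    # Indices of the k synthetic samples that best match the real data.
    return np.argsort(-scores)[:k]

# Toy usage with random stand-in "embeddings".
rng = np.random.default_rng(0)
synth = rng.normal(size=(1000, 64))
real = rng.normal(size=(200, 64))
kept = select_matching_synthetic(synth, real, keep_ratio=0.3)
print(f"kept {len(kept)} of {len(synth)} synthetic samples")

The selected subset would then be used to fine-tune the detector, the idea being that filtering out synthetic samples far from the real distribution narrows the distribution gap the abstract identifies.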