@inproceedings{danilova-soderfeldt-2025-classifying,
    title = "Classifying Textual Genre in Historical Magazines (1875-1990)",
    author = {Danilova, Vera  and
      S{\"o}derfeldt, Ylva},
    editor = "Kazantseva, Anna  and
      Szpakowicz, Stan  and
      Degaetano-Ortlieb, Stefania  and
      Bizzoni, Yuri  and
      Pagel, Janis",
    booktitle = "Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)",
    month = may,
    year = "2025",
    address = "Albuquerque, New Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2025.latechclfl-1.15/",
    doi = "10.18653/v1/2025.latechclfl-1.15",
    pages = "160--171",
    ISBN = "979-8-89176-241-1",
    abstract = "Historical magazines are a valuable resource for understanding the past, offering insights into everyday life, culture, and evolving social attitudes. They often feature diverse layouts and genres. Short stories, guides, announcements, and promotions can all appear side by side on the same page. Without grouping these documents by genre, term counts and topic models may lead to incorrect interpretations. This study takes a step towards addressing this issue by focusing on genre classification within a digitized collection of European medical magazines in Swedish and German. We explore 2 scenarios: 1) leveraging the available web genre datasets for zero-shot genre prediction, 2) semi-supervised learning over the few-shot setup. This paper offers the first experimental insights in this direction. We find that 1) with a custom genre scheme tailored to historical dataset characteristics it is possible to effectively utilize categories from web genre datasets for cross-domain and cross-lingual zero-shot prediction, 2) semi-supervised training gives considerable advantages over few-shot for all models, particularly for the historical multilingual BERT."
}