@inproceedings{shmidman-etal-2024-msbert,
    title = "{M}s{BERT}: A New Model for the Reconstruction of Lacunae in {H}ebrew Manuscripts",
    author = "Shmidman, Avi  and
      Shmidman, Ometz  and
      Gershuni, Hillel  and
      Koppel, Moshe",
    editor = "Pavlopoulos, John  and
      Sommerschield, Thea  and
      Assael, Yannis  and
      Gordin, Shai  and
      Cho, Kyunghyun  and
      Passarotti, Marco  and
      Sprugnoli, Rachele  and
      Liu, Yudong  and
      Li, Bin  and
      Anderson, Adam",
    booktitle = "Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)",
    month = aug,
    year = "2024",
    address = "Hybrid in Bangkok, Thailand and online",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.ml4al-1.2/",
    doi = "10.18653/v1/2024.ml4al-1.2",
    pages = "13--18",
    abstract = "Hebrew manuscripts preserve thousands of textual transmissions of post-Biblical Hebrew texts from the first millennium. In many cases, the text in the manuscripts is not fully decipherable, whether due to deterioration, perforation, burns, or otherwise. Existing BERT models for Hebrew struggle to fill these gaps, due to the many orthographical deviations found in Hebrew manuscripts. We have pretrained a new dedicated BERT model, dubbed MsBERT (short for: Manuscript BERT), designed from the ground up to handle Hebrew manuscript text. MsBERT substantially outperforms all existing Hebrew BERT models regarding the prediction of missing words in fragmentary Hebrew manuscript transcriptions in multiple genres, as well as regarding the task of differentiating between quoted passages and exegetical elaborations. We provide MsBERT for free download and unrestricted use, and we also provide an interactive and user-friendly website to allow manuscripts scholars to leverage the power of MsBERT in their scholarly work of reconstructing fragmentary Hebrew manuscripts."
}