@inproceedings{ljubesic-erjavec-2016-corpus,
    title = "Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of {S}lovene",
    author = "Ljube{\v{s}}i{\'c}, Nikola  and
      Erjavec, Toma{\v{z}}",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Declerck, Thierry  and
      Goggi, Sara  and
      Grobelnik, Marko  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Mazo, Helene  and
      Moreno, Asuncion  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Tenth International Conference on Language Resources and Evaluation ({LREC}'16)",
    month = may,
    year = "2016",
    address = "Portoro{\v{z}}, Slovenia",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/ingest-emnlp/L16-1242/",
    pages = "1527--1531",
    abstract = "In this paper we present a tagger developed for inflectionally rich languages for which both a training corpus and a lexicon are available. We do not constrain the tagger by the lexicon entries, allowing both for lexicon incompleteness and noisiness. By using the lexicon indirectly through features we allow for known and unknown words to be tagged in the same manner. We test our tagger on Slovene data, obtaining a 25{\%} error reduction of the best previous results both on known and unknown words. Given that Slovene is, in comparison to some other Slavic languages, a well-resourced language, we perform experiments on the impact of token (corpus) vs. type (lexicon) supervision, obtaining useful insights in how to balance the effort of extending resources to yield better tagging results."
}