@inproceedings{erjavec-etal-2010-jos,
    title = "The {JOS} Linguistically Tagged Corpus of {S}lovene",
    author = "Erjavec, Toma{\v{z}}  and
      Fi{\v{s}}er, Darja  and
      Krek, Simon  and
      Ledinek, Nina",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Odijk, Jan  and
      Piperidis, Stelios  and
      Rosner, Mike  and
      Tapias, Daniel",
    booktitle = "Proceedings of the Seventh International Conference on Language Resources and Evaluation ({LREC}'10)",
    month = may,
    year = "2010",
    address = "Valletta, Malta",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/ingest-emnlp/L10-1087/",
    abstract = "The JOS language resources are meant to facilitate developments of HLT and corpus linguistics for the Slovene language and consist of the morphosyntactic specifications, defining the Slovene morphosyntactic features and tagset; two annotated corpora (jos100k and jos1M); and two web services (a concordancer and text annotation tool). The paper introduces these components, and concentrates on jos100k, a 100,000 word sampled balanced monolingual Slovene corpus, manually annotated for three levels of linguistic description. On the morphosyntactic level, each word is annotated with its morphosyntactic description and lemma; on the syntactic level the sentences are annotated with dependency links; on the semantic level, all the occurrences of 100 top nouns in the corpus are annotated with their wordnet synset from the Slovene semantic lexicon sloWNet. The JOS corpora and specifications have a standardised encoding (Text Encoding Initiative Guidelines TEI P5) and are available for research from \url{http://nl.ijs.si/jos/} under the Creative Commons licence."
}