@inproceedings{tsvetkov-wintner-2010-automatic,
    title = "Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content",
    author = "Tsvetkov, Yulia  and
      Wintner, Shuly",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Odijk, Jan  and
      Piperidis, Stelios  and
      Rosner, Mike  and
      Tapias, Daniel",
    booktitle = "Proceedings of the Seventh International Conference on Language Resources and Evaluation ({LREC}'10)",
    month = may,
    year = "2010",
    address = "Valletta, Malta",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/iwcs-25-ingestion/L10-1019/",
    abstract = "Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of constantly growing parallel corpora. We propose a simple and effective dictionary-based algorithm to extract parallel document pairs from a large collection of articles retrieved from the Internet, potentially containing manually translated texts. This algorithm was implemented and tested on Hebrew-English parallel texts. With properly selected thresholds, precision of 100{\%} can be obtained."
}