@inproceedings{xia-etal-2012-cltc,
    title = "{CLTC}: A {C}hinese-{E}nglish Cross-lingual Topic Corpus",
    author = "Xia, Yunqing  and
      Tang, Guoyu  and
      Jin, Peng  and
      Yang, Xia",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Declerck, Thierry  and
      Do{\u{g}}an, Mehmet U{\u{g}}ur  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Moreno, Asuncion  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'12)",
    month = may,
    year = "2012",
    address = "Istanbul, Turkey",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/ingest-emnlp/L12-1197/",
    pages = "532--537",
    abstract = "Cross-lingual topic detection within text is a feasible solution to resolving the language barrier in accessing the information. This paper presents a Chinese-English cross-lingual topic corpus (CLTC), in which 90,000 Chinese articles and 90,000 English articles are organized within 150 topics. Compared with TDT corpora, CLTC has three advantages. First, CLTC is bigger in size. This makes it possible to evaluate the large-scale cross-lingual text clustering methods. Second, articles are evenly distributed within the topics. Thus it can be used to produce test datasets for different purposes. Third, CLTC can be used as a cross-lingual comparable corpus to develop methods for cross-lingual information access. A preliminary evaluation with CLTC corpus indicates that the corpus is effective in evaluating cross-lingual topic detection methods."
}