- Anthology ID:
- W14-0401
- Volume:
- Proceedings of the 9th Web as Corpus Workshop (WaC-9)
- Month:
- April
- Year:
- 2014
- Address:
- Gothenburg, Sweden
- Venue:
- WAC
- SIG:
- SIGWAC
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1–8
- URL:
- https://aclanthology.org/W14-0401
- DOI:
- 10.3115/v1/W14-0401
- Cite (ACL):
- Adrien Barbaresi. 2014. Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources. In Proceedings of the 9th Web as Corpus Workshop (WaC-9), pages 1–8, Gothenburg, Sweden. Association for Computational Linguistics.
- Cite (Informal):
- Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources (Barbaresi, WAC 2014)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/W14-0401.pdf
- Code:
- adbar/flux-toolchain