Abstract
We build a tool to assist in content creation by mining the web for information relevant to a given topic. This tool imitates the process of essay writing by humans: searching for topics on the web, selecting content fragments from the documents found, and then compiling these fragments to obtain a coherent text. The process of writing starts with the automated building of a table of contents by obtaining the list of key entities for the given topic, extracted from web resources such as Wikipedia. Once a table of contents is formed, each item serves as a seed for web mining. The tool builds a full-featured structured Word document with a table of contents, section structure, images and captions, and web references for all mined text fragments. Two linguistic technologies are employed. For relevance verification, we use tree similarity computed between the parse trees of a seed and a candidate text fragment. For text coherence, we use a measure of agreement between a given paragraph and the one that follows it, obtained by tree kernel learning of their discourse trees. The tool is available at http://animatronica.io/submit.html.
- Anthology ID:
- C16-2042
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editor:
- Hideo Watanabe
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 198–202
- URL:
- https://aclanthology.org/C16-2042
- Cite (ACL):
- Boris Galitsky. 2016. A Tool for Efficient Content Compilation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pages 198–202, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- A Tool for Efficient Content Compilation (Galitsky, COLING 2016)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/C16-2042.pdf