PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Daniel Ferrés, Horacio Saggion, Francesco Ronzano, Àlex Bravo
- Anthology ID: L18-1298
- Volume: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
- Month: May
- Year: 2018
- Address: Miyazaki, Japan
- Venue: LREC
- Publisher: European Language Resources Association (ELRA)
- URL: https://aclanthology.org/L18-1298
- Cite (ACL): Daniel Ferrés, Horacio Saggion, Francesco Ronzano, and Àlex Bravo. 2018. PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
- Cite (Informal): PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles (Ferrés et al., LREC 2018)
- PDF: https://preview.aclanthology.org/ingestion-script-update/L18-1298.pdf