PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Daniel Ferrés, Horacio Saggion, Francesco Ronzano, Àlex Bravo
- Anthology ID:
- L18-1298
- Volume:
- Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
- Month:
- May
- Year:
- 2018
- Address:
- Miyazaki, Japan
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- https://aclanthology.org/L18-1298
- Cite (ACL):
- Daniel Ferrés, Horacio Saggion, Francesco Ronzano, and Àlex Bravo. 2018. PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
- Cite (Informal):
- PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles (Ferrés et al., LREC 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/L18-1298.pdf