Adam Dimeski
2022
Automatic Extraction of Structured Mineral Drillhole Results from Unstructured Mining Company Reports
Adam Dimeski | Afshin Rahimi
Proceedings of the Eighth Workshop on Noisy User-generated Text (W-NUT 2022)
Aggregated mining exploration results can help companies and governments optimise and police mining permits and operations, a necessity for the transition to a renewable energy future. However, these results are buried in unstructured text. We present a novel dataset built from 23 Australian mining company reports and frame the extraction of structured drillhole information as a sequence labelling task. Our two benchmark models, based on Bi-LSTM-CRF and BERT, achieve F1 scores of 77% and 87%, respectively, on this task. Our dataset and benchmarks are accessible online.
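To illustrate what framing drillhole extraction as sequence labelling can look like, the following is a minimal sketch that decodes BIO-style token tags into labelled spans. The label names (HOLE_ID, INTERVAL, GRADE, MINERAL, DEPTH_FROM), the example sentence, and the function `decode_bio` are hypothetical and chosen only for illustration; they are not taken from the dataset or the benchmark models described above.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from parallel token and BIO tag lists.

    An I- tag that does not continue the currently open span is treated
    like O, i.e. it closes the open span and starts nothing.
    """
    spans: List[Tuple[str, str]] = []
    label = None          # label of the span currently being built
    words: List[str] = []  # tokens of the span currently being built

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; flush the previous one if present.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continue the open span.
            words.append(token)
        else:
            # O tag or inconsistent I- tag: close any open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []

    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Hypothetical sentence and tags (assumed label set, not the paper's).
tokens = ["Hole", "DH001", "intersected", "12", "m", "at",
          "3.4", "g/t", "Au", "from", "85", "m"]
tags = ["O", "B-HOLE_ID", "O", "B-INTERVAL", "I-INTERVAL", "O",
        "B-GRADE", "I-GRADE", "B-MINERAL", "O",
        "B-DEPTH_FROM", "I-DEPTH_FROM"]

record = decode_bio(tokens, tags)
print(record)
```

In such a setup, a tagger (for example a Bi-LSTM-CRF or a BERT token classifier) would predict the `tags` list, and a decoding step like this one would turn the predictions into structured fields.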