Geohub: A versioned repository for geodata

Georg Semmler, Heinrich Jasper and Helmut Schaeben (2019)
in: 2019 Ring Meeting, ASGA

Abstract

Three-dimensional geomodels have come a long way in recent years. Having started as a tool to visualize geology, they have become a foundation for decision-making and a basis for numerical simulations. This drastically increases the requirements for accuracy, validity and reproducibility of those models, so that others can reproduce the conclusions based on them. Measuring and evaluating the accuracy and validity of a geomodel is only a partially solved problem, but the quality and accuracy of the raw data used certainly have a large influence. As a consequence, each geomodel should reference all data that were involved in its creation. This is a classical domain for databases. Existing database systems for three-dimensional geodata, such as Geoscience in Space and Time (GST), PostGIS or Rasdaman, only store a subset of geometric (but spatially referenced) information. In particular, it is necessary to store all kinds of data that are involved in creating a specific model to obtain a reproducible representation of a geomodel. To supplement the geometric representation of the final geomodel, additional information such as images (geological profiles), (geological maps), geophysical datasets, borehole data etc. is included. To address these challenges, we propose a database system that is able to store, in addition to the final geomodel, all related raw data and how they are processed to reach the final model. Obviously, no system can support all conceivable variants of geodata. To cover as many different data types as possible, we assume only two things about the data: (1) it needs to be georeferenced, and (2) it needs to be file-based. On this foundation we are building a (user-)extensible database system that is able to handle many different data formats. The system implements basic functionality such as storing versioned files, and we provide extension points to handle some data formats.
To reproduce the computation of the final model, we design the system such that the process of building the model can be described in a machine-readable way (at least as far as possible). As with the data storage, this part will only provide basic functionality (like performing some abstract operation). Users are responsible for providing this abstract, machine-readable operation description. In the long term, those descriptions will be used to automate at least some of the steps of constructing the geomodel.

BibTeX Reference

@INPROCEEDINGS{SemmlerRM2019,
    author = { Semmler, Georg and Jasper, Heinrich and Schaeben, Helmut },
     title = { Geohub: A versioned repository for geodata },
 booktitle = { 2019 Ring Meeting },
      year = { 2019 },
 publisher = { ASGA },
  abstract = { Three-dimensional geomodels have come a long way in recent years. Having started as a tool to visualize geology, they have become a foundation for decision-making and a basis for numerical simulations. This drastically increases the requirements for accuracy, validity and reproducibility of those models, so that others can reproduce the conclusions based on them. Measuring and evaluating the accuracy and validity of a geomodel is only a partially solved problem, but the quality and accuracy of the raw data used certainly have a large influence. As a consequence, each geomodel should reference all data that were involved in its creation. This is a classical domain for databases. Existing database systems for three-dimensional geodata, such as Geoscience in Space and Time (GST), PostGIS or Rasdaman, only store a subset of geometric (but spatially referenced) information. In particular, it is necessary to store all kinds of data that are involved in creating a specific model to obtain a reproducible representation of a geomodel. To supplement the geometric representation of the final geomodel, additional information such as images (geological profiles), (geological maps), geophysical datasets, borehole data etc. is included. To address these challenges, we propose a database system that is able to store, in addition to the final geomodel, all related raw data and how they are processed to reach the final model. Obviously, no system can support all conceivable variants of geodata. To cover as many different data types as possible, we assume only two things about the data: (1) it needs to be georeferenced, and (2) it needs to be file-based. On this foundation we are building a (user-)extensible database system that is able to handle many different data formats. The system implements basic functionality such as storing versioned files, and we provide extension points to handle some data formats.
To reproduce the computation of the final model, we design the system such that the process of building the model can be described in a machine-readable way (at least as far as possible). As with the data storage, this part will only provide basic functionality (like performing some abstract operation). Users are responsible for providing this abstract, machine-readable operation description. In the long term, those descriptions will be used to automate at least some of the steps of constructing the geomodel. }
}