The velocity model generation code needed to be converted from R to Python because:

  • The R code used too much RAM.
  • The R code was slow.
  • Reducing RAM usage took too much effort, and freeing RAM was slow.
  • Looping over smaller subsets was too slow and difficult to work with.
  • Everything was global, which wasted RAM and made the code messy.
  • Libraries duplicated the input data in RAM and did not release it.
  • Recombining a 3GB dataframe after splitting it used 16GB of RAM.
  • Taking the first 20 million elements from a dataframe took a very long time, and RAM usage blew up.
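
For contrast with the last two points, here is a minimal sketch of how NumPy handles the same kind of operation. The array, its size and the function name are hypothetical stand-ins, not taken from the project code. Basic slicing returns a view onto the existing buffer rather than a copy, and processing in fixed-size chunks keeps temporaries small.

```python
import numpy as np

# Hypothetical stand-in for one column of a large table (size is illustrative).
values = np.arange(1_000_000, dtype=np.float64)

# Basic slicing returns a view: no element data is copied.
head = values[:200_000]
print(np.shares_memory(values, head))


def chunked_sum(arr, chunk=100_000):
    """Sum a 1-D array in fixed-size chunks; each chunk is a view, not a copy."""
    total = 0.0
    for start in range(0, arr.size, chunk):
        total += float(arr[start:start + chunk].sum())
    return total


print(chunked_sum(values))
```

Fancy indexing (for example with an index array or boolean mask) does copy, so the view behaviour applies to plain slices only.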

Outcomes

  • Most, if not all, of the code runs in Python, mainly using NumPy and GDAL.
  • Code runs in minutes instead of days.
  • Can run on a machine with 32GB RAM (16GB should be enough too).
  • Can run with larger input datasets.
  • Can create finer resolutions than 100m.
  • Results are easily viewed, interrogated and shared as a QGIS project.
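
As a rough check on the resolution and RAM points above, raster size grows with the inverse square of the cell size. The sketch below is a back-of-envelope estimate only. The 1,000 km by 1,000 km extent and the 4-byte (float32) cell type are assumptions for illustration, not the project's actual grid, and `raster_bytes` is a hypothetical helper.

```python
def raster_bytes(extent_x_m, extent_y_m, cell_size_m, bytes_per_cell=4):
    """Approximate in-memory size of one single-band raster, in bytes."""
    cols = -(-extent_x_m // cell_size_m)  # ceiling division
    rows = -(-extent_y_m // cell_size_m)
    return cols * rows * bytes_per_cell


# Assumed extent: 1,000 km x 1,000 km (illustrative only).
for cell in (100, 50, 10):
    gib = raster_bytes(1_000_000, 1_000_000, cell) / 2**30
    print(f"{cell:>4} m cells: {gib:.2f} GiB per band")
```

Going from 100 m to 10 m cells multiplies the cell count by 100, so the finest resolutions may need tiled or windowed processing rather than whole-raster arrays.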

Python algorithms or processes have now been found or investigated for every processing element except MVN.

Steps

  • s65 Terrain model interpolation given values (core completed).
  • s65/s66 Geology model polygon interpolation (an algorithm was found that is tens of thousands of times faster).
  • s66 Rest of geology model datasets, e.g. store slope as TIFF.
  • s66 Rest of geology model processing rules.
  • s66 Synchronise outputs between geology and terrain (run with same parameters).
  • s66 Combination of geology/terrain.
  • s66 Store all outputs.
  • s66 Argument version for running complete workflow.
  • s66 Create QGIS project including outputs as layers, plus OpenStreetMap.
  • s66 Convert VSPR to Python (loading and pre-processing measured sites).
  • s66 Posterior model modification to Python using already available clustering code.
  • s66 Run at 50m, 10m resolution.
  • s66 Compare results.
  • s67 Look into converting MVN code to Python (find main algorithms/processes required).
  • s67 Convert if algorithms/processes found and reasonable in Python.
  • s67 Implement MVN in Python.
  • s67 Point based calculation as well as raster based.
  • s67 Investigate sources of differences (Vs30 Data Reproducibility/Verification).
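
One of the steps above stores slope as a TIFF. Here is a minimal sketch of the slope calculation itself with NumPy, assuming a square-cell elevation grid held as a 2-D array. The function name and the synthetic grid are hypothetical, and this finite-difference slope is not necessarily the method the project uses. Writing the result to a GeoTIFF (for example through GDAL) is left out.

```python
import numpy as np


def slope_degrees(dem, cell_size):
    """Slope in degrees from a 2-D elevation array with square cells."""
    # np.gradient returns the derivative along axis 0 (rows, y) and then
    # along axis 1 (columns, x), using central differences in the interior.
    dz_dy, dz_dx = np.gradient(dem.astype(np.float64), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


# Synthetic terrain: a plane rising 1 m per 1 m in x and flat in y.
x = np.arange(5, dtype=np.float64)
dem = np.tile(x, (4, 1))
slope = slope_degrees(dem, cell_size=1.0)
print(slope.shape, slope.min(), slope.max())
```

The same function applies unchanged at 100 m, 50 m or 10 m resolution, because the cell size is passed in as the sample spacing.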