The velocity model generation code needed to be converted from R to Python because:
- R code was using too much RAM.
- R code was slow.
- Reducing RAM usage took too much effort, and freeing RAM was slow.
- Looping over smaller subsets was too slow and difficult to work with.
- Everything was global, which wasted RAM and made the code messy.
- Libraries duplicated the input data in RAM and did not release it.
- Recombining a 3GB dataframe after splitting it used 16GB of RAM.
- Taking the first 20 million elements from a dataframe took a very long time and RAM usage blew up.
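As an illustration of why the subsetting problems above tend not to arise with NumPy, here is a minimal sketch. It assumes the data is held in a plain NumPy array; the array name, sizes, and chunk size are hypothetical and kept small so the sketch runs anywhere. Basic slicing in NumPy returns a view onto the same memory rather than a copy, so taking the first N elements or looping over chunks does not duplicate the data.

```python
import numpy as np

# Hypothetical stand-in for a large column of model values.
values = np.arange(1_000_000, dtype=np.float32)

# Basic slicing returns a view onto the same buffer: no data is copied.
head = values[:200_000]
shares = np.shares_memory(values, head)

# Process the array in fixed-size chunks; each chunk is also a view.
chunk_size = 250_000  # assumed value, tune to the available RAM
total = 0.0
for start in range(0, values.size, chunk_size):
    chunk = values[start:start + chunk_size]
    # Accumulate in float64 to avoid float32 rounding in the sum.
    total += float(chunk.sum(dtype=np.float64))
```

Note that fancy indexing (indexing with an integer or boolean array) does copy, so this only applies to plain start:stop slices.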
Outcomes
- Most, if not all, code runs in Python, mostly using NumPy and GDAL.
- Code runs in minutes instead of days.
- Can run on a machine with 32GB RAM (16GB should be enough too).
- Can run with larger input datasets.
- Can create finer resolutions than 100m.
- Results are easily viewed and interrogated in a QGIS project, and easily shared.
Currently, Python algorithms/processes have been found or investigated for all processing elements except MVN.
Steps
- Terrain model interpolation given values (core completed).
- Geology model polygon interpolation (found an algorithm tens of thousands of times faster).
- Rest of geology model datasets, e.g. store slope as TIFF.
- Rest of geology model processing rules.
- Synchronise outputs between geology and terrain (run with same parameters).
- Combination of geology/terrain.
- Store all outputs.
- Command-line argument version for running the complete workflow.
- Create a QGIS project including the outputs as layers, plus OpenStreetMap.
- Convert VSPR to Python (loading and pre-processing measured sites).
- Look into converting MVN code to Python (find the main algorithms/processes required).
- Convert if the algorithms/processes are found and are reasonable in Python.
- Implement MVN in Python.
- Run at 50m, 10m resolution.
- Compare results.
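Two of the steps above (running geology and terrain with the same parameters, and an argument-driven run of the whole workflow) could be sketched roughly as follows. This is not the project's actual interface: the `GridSpec` class, the flag names, and the default extent are all hypothetical. The idea is that one grid definition is built from the command-line arguments and passed to both models, so their outputs line up cell for cell at any resolution.

```python
import argparse
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical shared grid definition used by both models."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    resolution: float  # cell size in metres

    @property
    def shape(self):
        """Number of (rows, columns) needed to cover the extent."""
        nx = math.ceil((self.xmax - self.xmin) / self.resolution)
        ny = math.ceil((self.ymax - self.ymin) / self.resolution)
        return ny, nx


def parse_args(argv=None):
    """Parse workflow arguments; all flags and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Run the complete workflow (sketch).")
    parser.add_argument("--xmin", type=float, default=0.0)
    parser.add_argument("--ymin", type=float, default=0.0)
    parser.add_argument("--xmax", type=float, default=10000.0)
    parser.add_argument("--ymax", type=float, default=5000.0)
    parser.add_argument("--resolution", type=float, default=100.0,
                        help="cell size in metres")
    return parser.parse_args(argv)


def grid_from_args(args):
    """Build the single grid that geology and terrain would both use."""
    return GridSpec(args.xmin, args.ymin, args.xmax, args.ymax, args.resolution)


# Example: the same extent at 50m resolution.
grid = grid_from_args(parse_args(["--resolution", "50"]))
print(grid.shape)
```

Keeping the grid in one immutable object means a 50m or 10m run only changes one argument, and neither model can drift from the other's extent or cell size.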