Background
- Create a model to estimate the wall clock time for an LF, HF or BB run
Tasks:
- Collect, combine and transform metadata (done)
- Investigate and test a model -> Neural network (done)
- Streamline training and usage of NN (done)
- Add support for the number of cores parameter (mostly done, requires real data to test)
- Change submit scripts (submit_emod3d.py, ...) to use the pre-trained model
- Change submit scripts to python3
- submit_emod3d (done, currently testing that it still works)
- Created python3 virtual environment for maui and mahuika
- Create a script to estimate the full run time for a folder of srf/vms
- Train the actual model once maui data is available, i.e. after the current cybershake run
- (Uncertainty?)
- (Visualisation?)
Documentation
- This documentation also exists as Readme.md in slurm_gm_workflow/estimation/
Notes
- All of the estimation code is written in python3; the only exception is the write_jsons.py script used for metadata collection
Usage
Estimation is done using the functions inside estimate_WC.py, which load the pre-trained neural network and then run the estimation.
...
def convert_to_wct(core_hours):
    # Converts an estimated core-hour value to a wall clock time (body elided).
    pass

def get_wct(core_hours, overestimate_factor=0.1):
    # Returns the wall clock time, padded by the overestimate factor (body elided).
    pass
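For orientation, a minimal sketch of the conversion these helpers perform. The core count, the padding arithmetic and the HH:MM:SS formatting are illustrative assumptions, not the actual implementation:

# Sketch only: turning an estimated core-hour value into a padded wall clock
# time, assuming the run uses a known, fixed number of cores.
core_hours = 4.2           # hypothetical neural network estimate
n_cores = 160              # hypothetical core count for the run
overestimate_factor = 0.1  # pad the estimate to avoid hitting the time limit

wct_hours = (core_hours / n_cores) * (1 + overestimate_factor)

# Format as HH:MM:SS, e.g. for a slurm --time value
total_seconds = int(wct_hours * 3600)
print("{:02d}:{:02d}:{:02d}".format(
    total_seconds // 3600, (total_seconds % 3600) // 60, total_seconds % 60))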
Creating a pre-trained model
Building a pre-trained model consists of two main steps: collecting and formatting the data, then training the neural network.
Collecting the metadata
1) To create the metadata files for a given Run, use the write_jsons.py script (with the -sj flag), which will create a folder named "jsons" in each fault simulation folder. This jsons folder then contains an "all_sims.json" file, which holds all the metadata for the fault's realisation runs. E.g.
...
3) Steps 1 and 2 can be repeated for as many run folders as wanted. I would suggest putting all the resulting .csv files into a single directory for easier loading when training the neural network model (see the sketch below).
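If all the .csv files are in one directory, loading them back for training can be as simple as the following sketch (the directory name "./metadata_csvs" is hypothetical):

import glob
import os

import pandas as pd

# Load and combine all metadata dataframes saved in the previous step.
# "./metadata_csvs" is a hypothetical directory containing the saved .csv files.
csv_files = glob.glob(os.path.join("./metadata_csvs", "*.csv"))
df = pd.concat((pd.read_csv(f) for f in csv_files), ignore_index=True)
print("Loaded {} rows from {} files".format(len(df), len(csv_files)))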
Training the model
The different models (LF, HF, BB) are trained using the train_model.py script, which takes a config file and the input data (either a directory or individual input files). The input data is the dataframes saved in the previous step.
...
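For orientation only, here is a minimal sketch of the kind of model training that happens in this step. The feature/target column names and the sklearn network used here are illustrative assumptions, not the actual train_model.py configuration; in the real workflow these choices come from the config file passed to the script.

import glob

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Hypothetical feature/target columns; the real ones come from the collected metadata.
feature_cols = ["nx", "ny", "nz", "nt", "n_cores"]
target_col = "core_hours"

# Load the combined dataframes saved during metadata collection.
df = pd.concat(
    (pd.read_csv(f) for f in glob.glob("./metadata_csvs/*.csv")),
    ignore_index=True)

X_train, X_test, y_train, y_test = train_test_split(
    df[feature_cols], df[target_col], test_size=0.2)

# A small fully connected network as a stand-in for the actual estimator.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000)
model.fit(X_train, y_train)
print("Test R^2: {:.3f}".format(model.score(X_test, y_test)))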