1. Prepare Data:
To run the install script, the models must be placed in the following folder structure:
Cybershake
└── version
    ├── Data
    │   ├── Sources
    │   └── VMs
    └── Runs
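The layout above can be created ahead of time with a few lines of Python. This is only a minimal sketch: the function name `create_cybershake_layout` and the example version name are illustrative and not part of the workflow repository.

```python
from pathlib import Path


def create_cybershake_layout(root, version):
    """Create <root>/<version>/Data/{Sources,VMs} and <root>/<version>/Runs.

    Hypothetical helper for illustration; it only makes empty folders.
    """
    version_dir = Path(root) / version
    for sub in ("Data/Sources", "Data/VMs", "Runs"):
        # parents=True builds intermediate folders, exist_ok=True makes reruns safe
        (version_dir / sub).mkdir(parents=True, exist_ok=True)
    return version_dir
```

For example, `create_cybershake_layout("Cybershake", "v18p5")` would create the tree under the current directory; the source and VM files still have to be copied into Data/Sources and Data/VMs by hand.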
2. Prepare config:
You can use the default one or create your own.
The default one looks like this:
cat $gmsim/workflow/scripts/cybershake/cybershake_config.json
{
    "global_root" : "/nesi/transit/nesi00213",
    "stat_file_path" : "/nesi/transit/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll",
    "v_1d_mod" : "/nesi/transit/nesi00213/VelocityModel/Mod-1D/Cant1D_v2-midQ_leer.1d",
    "dt" : 0.02,
    "hf_dt" : 0.005
}
If you wish to use different parameters, please make a copy of this file and edit it.
cp $gmsim/workflow/scripts/cybershake/cybershake_config.json /my/version/of/runs/config.json
Meaning/usage of parameters:
"global_root" : usually the location of the project, where our binaries (EMOD3D, tools) are kept. Modifying this is NOT recommended.
"stat_file_path" : the absolute path to the station file (the vs30 file is assumed to have the same basename).
"v_1d_mod" : the absolute path to a 1D velocity model file; this is used by install_bb and to run HF.
"dt" : the time step (dt) used for EMOD3D.
Optional:
If you want to specify a particular HF vs30 reference file, add:
"hf_stat_vs_ref" : "/nesi/transit/nesi00213/StationInfo/cantstations_v1pt2.hfvs30ref"
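Instead of editing the copied file by hand, the overrides can be applied programmatically. The sketch below is an assumption-laden illustration: `make_custom_config` is a hypothetical helper, not a function from the workflow, and it accepts only the keys shown in the default config plus the optional "hf_stat_vs_ref".

```python
import json
from pathlib import Path

# Optional keys mentioned in this guide that are absent from the default config
OPTIONAL_KEYS = {"hf_stat_vs_ref"}


def make_custom_config(default_path, out_path, overrides):
    """Copy the default config, apply overrides, and write it to out_path.

    Hypothetical helper: rejects keys that are neither in the default
    config nor in OPTIONAL_KEYS, to catch typos such as "hf-dt".
    """
    config = json.loads(Path(default_path).read_text())
    unknown = set(overrides) - set(config) - OPTIONAL_KEYS
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=4) + "\n")
    return config
```

A call such as `make_custom_config(default, "/my/version/of/runs/config.json", {"dt": 0.01})` would leave "global_root" untouched, in line with the recommendation above.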
3. Install
After the files are in place, run the install script.
THREE arguments are needed: the 1st is the root folder, which contains the Data and Runs folders; the 2nd is the config file created in step 2; the 3rd is a file that contains a list of VMs.
$gmsim/workflow/scripts/cybershake/install_cybershake.sh $gmsim/RunFolder/cybershake/v18p5 $gmsim/workflow/scripts/cybershake/cybershake_config.json $gmsim/RunFolder/cybershake/v18p5/list_all
Keep in mind that the 3rd argument must be a file that contains a list of VMs,
something like this:
Opotiki02
Opotiki03
OpouaweUruti
Orakeikorako
Orakonui
Oruakukuru
Oruawharo
Otakiri
Otaraia
OtokoTotoF7
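If the list should cover every VM that is present, it can be generated rather than typed. The following is a rough sketch resting on two assumptions that the text above does not confirm: that each VM is a subfolder of Data/VMs, and that the list file holds one VM name per line. The helper name `write_vm_list` is hypothetical.

```python
from pathlib import Path


def write_vm_list(sim_root, list_name="list_all"):
    """Write the names of all subfolders of <sim_root>/Data/VMs to a list file.

    Assumption: one VM per subfolder, one name per line in the list file.
    """
    vm_dir = Path(sim_root) / "Data" / "VMs"
    names = sorted(p.name for p in vm_dir.iterdir() if p.is_dir())
    list_path = Path(sim_root) / list_name
    list_path.write_text("".join(name + "\n" for name in names))
    return names
```

Check the generated file against the expected format before passing it to the install script.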
4. Create a screen socket
Running scripts in a screen socket avoids the need to keep the terminal open constantly (you can disconnect from Kupe and the script will keep running on it).
screen -S your_preferred_name_for_socket
To detach from a socket, press Ctrl+A, then D.
To terminate a socket, press Ctrl+D.
To show all sockets created earlier, use -list:
screen -list
There is a screen on:
        289787.cybershake_v18p6 (Detached)
1 Socket in /var/run/uscreens/S-ykh22.
To resume a specific socket, use -r:
screen -r 289787.cybershake_v18p6
or
screen -r 289787
5. Run the simulation in auto
Run the auto-submission script with a period/interval.
The script takes THREE arguments: the 1st is the path to the sim_root folder (the same one you passed to the install script), the 2nd is the interval between loops in seconds, and the 3rd is the config file used for the install in step 3.
Important: run this script on your local machine.
clone the git repository:
git clone git@github.com:ucgmsim/slurm_gm_workflow.git ~/slurm_gm_workflow
then run the remote daemon script:
~/slurm_gm_workflow/scripts/cybershake/run_queue_and_auto_submit_remote.sh $gmsim/RunFolder/cybershake/v18p5 60 $gmsim/workflow/scripts/cybershake/cybershake_config.json
60 means the script runs every minute (60 seconds); adjust this as needed.
Note: this script keeps running in a loop until it is killed with Ctrl-C, or until the screen socket is terminated (if you followed step 4).
If you are running the script in a 'screen' socket, press Ctrl+A, then D to detach it, so you can continue with the next step in the same terminal (without worrying about disconnecting).
6. Monitor Simulation Status
Monitor the status of each simulation by running the query script.
python $gmsim/workflow/scripts/management/query_mgmt_db.py $gmsim/RunFolder/cybershake/v18p5
It should show you something like this:
   run_name  |    process     |  status   | job-id  |    last_modified
__________________________________________________________________________________________
2012p075555 | merge_ts       | in-queue  | 2198889 | 2018-05-29 04:34:39
2012p075555 | winbin_aio     | created   | None    | 2018-05-29 04:34:39
2012p075555 | BB             | created   | None    | 2018-05-29 04:34:39
2012p075555 | IM_calculation | created   | None    | 2018-05-29 04:34:39
2012p075555 | HF             | completed | 2198881 | 2018-05-29 21:29:21
2012p075555 | EMOD3D         | failed    | 2198858 | 2018-05-29 04:43:40
2012p713691 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p713691 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p713691 | BB             | created   | None    | 2018-05-29 04:34:40
2012p713691 | IM_calculation | created   | None    | 2018-05-29 04:34:40
2012p713691 | HF             | completed | 2198882 | 2018-05-29 21:29:21
2012p713691 | EMOD3D         | failed    | 2198860 | 2018-05-29 04:44:49
2012p764736 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p764736 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p764736 | HF             | created   | None    | 2018-05-29 04:34:40
2012p764736 | BB             | created   | None    | 2018-05-29 04:34:40
2012p764736 | IM_calculation | created   | None    | 2018-05-29 04:34:40
2012p764736 | EMOD3D         | failed    | 2198862 | 2018-05-29 04:44:49
2012p781523 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p781523 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p781523 | BB             | created   | None    | 2018-05-29 04:34:40
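With many runs the table gets long, so a quick summary per status can help. The sketch below parses the text printed by the query script, assuming one row per line with five `|`-separated columns as in the sample output. `parse_status_table` and `count_by_status` are hypothetical helpers that work on captured text; they do not read the management database.

```python
from collections import Counter


def parse_status_table(text):
    """Turn the printed status table into a list of dicts (one per row)."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip the header, the underscore separator, and anything malformed
        if len(fields) != 5 or fields[0] == "run_name":
            continue
        run_name, process, status, job_id, last_modified = fields
        rows.append({
            "run_name": run_name,
            "process": process,
            "status": status,
            "job_id": None if job_id == "None" else job_id,
            "last_modified": last_modified,
        })
    return rows


def count_by_status(rows):
    """Count rows per status, e.g. how many processes have failed."""
    return Counter(row["status"] for row in rows)
```

The text could be captured by redirecting the query script's output to a file and reading that file back in.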
Use -e to show only the failed runs (with their errors):
python $gmsim/workflow/scripts/management/query_mgmt_db.py /nesi/nobackup/nesi00213/test_auto_submit -e
Run_name: 2012p075555
Process: EMOD3D
Status: failed
Job-ID: 2198858
Last_Modified: 2018-05-29 04:43:40
Error: Task removed from squeue without completion

Run_name: 2012p713691
Process: EMOD3D
Status: failed
Job-ID: 2198860
Last_Modified: 2018-05-29 04:44:49
Error: Task removed from squeue without completion
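The failed-run report can likewise be turned into records, for example to collect the job IDs that need resubmitting. This is a minimal sketch that assumes the six field labels shown in the sample output (Run_name, Process, Status, Job-ID, Last_Modified, Error). It splits on those labels, so it does not depend on where the line breaks fall. `parse_failed_runs` is a hypothetical helper.

```python
import re

# Field labels as they appear in the sample -e output above
FIELDS = ("Run_name", "Process", "Status", "Job-ID", "Last_Modified", "Error")
_LABEL = re.compile(r"\b(" + "|".join(re.escape(f) for f in FIELDS) + r"):\s*")


def parse_failed_runs(text):
    """Split the -e report into one dict per failed process."""
    # re.split with a capturing group yields [preamble, label, value, label, value, ...]
    parts = _LABEL.split(text)
    records, current = [], {}
    for label, value in zip(parts[1::2], parts[2::2]):
        if label == "Run_name" and current:
            # A new Run_name label starts the next record
            records.append(current)
            current = {}
        current[label] = value.strip()
    if current:
        records.append(current)
    return records
```

A list of failed job IDs is then `[r["Job-ID"] for r in parse_failed_runs(report)]`, where `report` holds the captured output.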