1. To run the install script, the models must be placed in a specific folder structure:
Cybershake
├── Data
│   ├── Sources
│   └── VMs
└── Runs
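As a minimal sketch, the layout above could be created and checked with two small shell functions. The function names are hypothetical and are not part of the workflow scripts; the root path is whatever folder you intend to pass to the install script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the gmsim workflow).

# Create the expected skeleton under the given root folder.
make_cybershake_layout() {
    local root="$1"
    mkdir -p "$root/Data/Sources" "$root/Data/VMs" "$root/Runs"
}

# Return 0 if every expected folder exists; otherwise report what is
# missing on stderr and return 1.
check_cybershake_layout() {
    local root="$1" missing=0 sub
    for sub in Data/Sources Data/VMs Runs; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing: $root/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_cybershake_layout on the root folder before the install script gives an early warning if one of the folders was not copied across.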
2. After the files are in place, run the install script.
It takes two arguments: the 1st is the root folder, which contains the Data and Runs folders; the 2nd is a file that contains a list of VMs.
$gmsim/workflow/scripts/cybershake/install_cybershake.sh $gmsim/RunFolder/cybershake/v18p5 $gmsim/RunFolder/cybershake/v18p5/list_all
Keep in mind that the 2nd argument must be a file that contains a list of VMs,
something like this:
Opotiki02
Opotiki03
OpouaweUruti
Orakeikorako
Orakonui
Oruakukuru
Oruawharo
Otakiri
Otaraia
OtokoTotoF7
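If the list has to be built by hand, a sketch like the following might help. It rests on two assumptions that this page does not confirm: that each VM is a sub-folder of Data/VMs named after the VM, and that the list file holds one name per line. The function name write_vm_list is hypothetical.

```shell
#!/usr/bin/env bash
# ASSUMPTION: every VM is a sub-folder of <root>/Data/VMs, named after the VM.
# ASSUMPTION: the list file contains one VM name per line.
write_vm_list() {
    local root="$1" list_file="$2" d
    for d in "$root"/Data/VMs/*/; do
        [ -d "$d" ] || continue   # glob matched nothing: no VM folders
        basename "$d"             # strips the path and the trailing slash
    done | LC_ALL=C sort > "$list_file"
}
```

Check the generated file before passing it to the install script, since stray folders under Data/VMs would end up in the list as well.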
3. Run the auto-submission script with a polling interval.
*Please refer to the bottom of this page for how to use screen sockets.
The script takes two arguments: the 1st is the path to the sim_root folder (the same path you passed to the install script); the 2nd is the interval between loops, in seconds.
$gmsim/workflow/scripts/cybershake/run_queue_and_auto_submit.sh $gmsim/RunFolder/cybershake/v18p5 60
60 means the script waits 1 minute between loops. Please adjust this as needed.
Note: this script will keep running in a loop until it is killed with Ctrl-C.
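To make the meaning of the interval concrete, here is a generic illustration of a loop that runs a command, sleeps for the interval, and repeats. It is not the code of run_queue_and_auto_submit.sh; the function name run_every and its MAX_LOOPS argument are hypothetical and exist only so the sketch can be tried safely.

```shell
#!/usr/bin/env bash
# Illustration only: NOT the implementation of run_queue_and_auto_submit.sh.
# Usage: run_every INTERVAL MAX_LOOPS CMD [ARGS...]
#   INTERVAL   seconds to sleep after each pass
#   MAX_LOOPS  number of passes; 0 means loop until interrupted (Ctrl-C)
run_every() {
    local interval="$1" max_loops="$2" count=0
    shift 2
    while [ "$max_loops" -eq 0 ] || [ "$count" -lt "$max_loops" ]; do
        "$@"                      # one pass of the work
        count=$((count + 1))
        sleep "$interval"         # wait before the next pass
    done
}

# Example: print a message three times, one second apart.
# run_every 1 3 echo "checking queue"
```

In a loop of this shape the time between the start of two passes is the interval plus however long one pass takes, so a very short interval mostly adds load without making submissions much faster.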
4. Monitor the status of each simulation by running the query script.
python $gmsim/workflow/scripts/management/query_mgmt_db.py $gmsim/RunFolder/cybershake/v18p5
It should show you something like this:
run_name    | process        | status    | job-id  | last_modified
________________________________________________________________________
2012p075555 | merge_ts       | in-queue  | 2198889 | 2018-05-29 04:34:39
2012p075555 | winbin_aio     | created   | None    | 2018-05-29 04:34:39
2012p075555 | BB             | created   | None    | 2018-05-29 04:34:39
2012p075555 | IM_calculation | created   | None    | 2018-05-29 04:34:39
2012p075555 | HF             | completed | 2198881 | 2018-05-29 21:29:21
2012p075555 | EMOD3D         | failed    | 2198858 | 2018-05-29 04:43:40
2012p713691 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p713691 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p713691 | BB             | created   | None    | 2018-05-29 04:34:40
2012p713691 | IM_calculation | created   | None    | 2018-05-29 04:34:40
2012p713691 | HF             | completed | 2198882 | 2018-05-29 21:29:21
2012p713691 | EMOD3D         | failed    | 2198860 | 2018-05-29 04:44:49
2012p764736 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p764736 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p764736 | HF             | created   | None    | 2018-05-29 04:34:40
2012p764736 | BB             | created   | None    | 2018-05-29 04:34:40
2012p764736 | IM_calculation | created   | None    | 2018-05-29 04:34:40
2012p764736 | EMOD3D         | failed    | 2198862 | 2018-05-29 04:44:49
2012p781523 | merge_ts       | created   | None    | 2018-05-29 04:34:40
2012p781523 | winbin_aio     | created   | None    | 2018-05-29 04:34:40
2012p781523 | BB             | created   | None    | 2018-05-29 04:34:40
Use -e to show only the failed runs (with their errors):
python $gmsim/workflow/scripts/management/query_mgmt_db.py /nesi/nobackup/nesi00213/test_auto_submit -e

Run_name: 2012p075555
Process: EMOD3D
Status: failed
Job-ID: 2198858
Last_Modified: 2018-05-29 04:43:40
Error: Task removed from squeue without completion

Run_name: 2012p713691
Process: EMOD3D
Status: failed
Job-ID: 2198860
Last_Modified: 2018-05-29 04:44:49
Error: Task removed from squeue without completion
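For a quick overview of a large run, the table printed by the query script (without -e) can be summarised by status. The sketch below assumes the output is pipe-separated with one row per line and five columns, as in the sample above; the function name count_by_status is hypothetical and is not provided by the workflow.

```shell
#!/usr/bin/env bash
# ASSUMPTION: input is the pipe-separated table from the query script,
# one row per line, with status as the 3rd of 5 columns.
# Prints "<status> <count>" for each status, sorted by status name.
count_by_status() {
    awk -F'|' '
        NF == 5 {                                  # skips the underscore rule
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", status)    # trim surrounding blanks
            if (status != "status") counts[status]++   # skip the header row
        }
        END { for (s in counts) print s, counts[s] }' | LC_ALL=C sort
}

# Example:
# python $gmsim/workflow/scripts/management/query_mgmt_db.py $gmsim/RunFolder/cybershake/v18p5 | count_by_status
```

This only reads the printed table, so it does not touch the management database itself.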
Create a screen socket
*Running the scripts in a screen socket avoids the need to keep a terminal open constantly: you can disconnect from Kupe and the script keeps running there.
screen -S your_preferred_name_for_socket
To detach from a socket, press Ctrl+A, then D.
To terminate a socket, press Ctrl+D at its shell prompt.
To list all sockets created earlier, use -list:
screen -list
There is a screen on:
        289787.cybershake_v18p6 (Detached)
1 Socket in /var/run/uscreens/S-ykh22.
To resume a specific socket, use -r:
screen -r 289787.cybershake_v18p6
or
screen -r 289787
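The steps above can also be combined so that a long-running command starts directly inside a detached socket. The sketch below uses screen's -d -m options (start detached), which this page does not otherwise mention; the function name start_in_screen and the DRY_RUN switch are hypothetical additions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start CMD in a new, detached screen socket called NAME.
# Set DRY_RUN=1 to print the screen command instead of running it.
start_in_screen() {
    local name="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "screen -dmS $name $*"
    else
        screen -dmS "$name" "$@"   # -d -m: create the session detached
    fi
}

# Example (socket name chosen for illustration):
# start_in_screen cybershake_v18p5 \
#     "$gmsim/workflow/scripts/cybershake/run_queue_and_auto_submit.sh" \
#     "$gmsim/RunFolder/cybershake/v18p5" 60
# screen -r cybershake_v18p5      # reattach later to check on it
```

A socket started this way closes on its own when the command exits, so for the auto-submission loop it stays alive until the script is stopped.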