This document describes the functionality and outputs of the Slurm job management database.
Creation of database
The database is created automatically as part of install.sh. To create one manually, run:
python create_mgmt_db.py <path_to_run_folder> [list of realisations]
e.g.
python create_mgmt_db.py ~/Documents/scratch/test_18p5/ test123 test_realiastion1
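The query output later in this document shows a freshly created realisation with one entry per process, each with the status created. A minimal sketch of generating those initial rows is given below. The helper name initial_entries and the tuple layout are hypothetical and are not taken from create_mgmt_db.py, which may store more fields.

```python
from itertools import product

# Process names as listed in the update_mgmt_db.py usage text.
PROCESSES = ["EMOD3D", "post_EMOD3D", "HF", "BB", "IM_calculation"]


def initial_entries(realisations):
    """Return one (run_name, process, status) tuple per realisation and process.

    Hypothetical helper: the real script may record additional columns.
    """
    return [(run, proc, "created") for run, proc in product(realisations, PROCESSES)]


entries = initial_entries(["test123", "test_realiastion1"])
for run_name, process, status in entries:
    print(f"{run_name} | {process} | {status}")
```

Two realisations and five processes give ten entries in total.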
Updating entries in database
Pass the same run folder path that was used to create the db, not the path to the db file itself.
A status can only move forward, in the order created, in-queue, running, completed; it cannot be moved back to an earlier state. If a step fails, set its status to failed and insert a new entry for that step.
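The forward-only rule can be sketched as a small check. This is an illustration under stated assumptions, not the logic of update_mgmt_db.py: the function name can_advance is hypothetical, and it assumes that skipping ahead is allowed, that any unfinished entry may move to failed, and that completed and failed are final.

```python
# Status order as listed in the update_mgmt_db.py usage text.
ORDER = ["created", "in-queue", "running", "completed"]


def can_advance(current, new):
    """Return True if moving from `current` to `new` is a forward step.

    Assumptions (not confirmed by the source): skipping ahead is allowed,
    any unfinished entry may move to failed, and completed/failed are final.
    """
    if current in ("completed", "failed"):
        return False
    if new == "failed":
        return True
    return ORDER.index(new) > ORDER.index(current)


print(can_advance("created", "in-queue"))
print(can_advance("running", "created"))
print(can_advance("running", "failed"))
```

Under these assumptions a failed entry is never revived, which is why a retry needs a new entry.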
usage: update_mgmt_db.py [-h] [-r RUN_NAME] [-j JOB] [-e ERROR]
run_folder {EMOD3D,post_EMOD3D,HF,BB,IM_calculation}
{created,in-queue,running,completed,failed}
positional arguments:
run_folder folder to the collection of runs on Kupe
{EMOD3D,post_EMOD3D,HF,BB,IM_calculation}
{created,in-queue,running,completed,failed}
optional arguments:
-h, --help show this help message and exit
-r RUN_NAME, --run_name RUN_NAME
name of run to be updated
-j JOB, --job JOB – Job number on supercomputer
-e ERROR, --error ERROR – text notes about why the run failed
e.g.
python update_mgmt_db.py ~/Documents/scratch/test_18p5/ HF in-queue --job 3 --run_name test123
python update_mgmt_db.py ~/Documents/scratch/test_18p5/ HF running --job 3
python update_mgmt_db.py ~/Documents/scratch/test_18p5/ HF failed --job 3 --error 'Hit wall clock limit 5000'
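The usage text above has the shape of Python argparse output, so the command-line interface can be approximated as follows. This is a sketch, not the script's actual source: the destination names process and status are assumptions (the usage text shows only the choices), and the job number is kept as a string because the usage text does not state its type.

```python
import argparse

PROCESSES = ["EMOD3D", "post_EMOD3D", "HF", "BB", "IM_calculation"]
STATES = ["created", "in-queue", "running", "completed", "failed"]


def build_parser():
    """Build a parser that mirrors the documented usage text."""
    parser = argparse.ArgumentParser(prog="update_mgmt_db.py")
    parser.add_argument("run_folder", help="folder to the collection of runs on Kupe")
    # 'process' and 'status' are assumed names for the two choice arguments.
    parser.add_argument("process", choices=PROCESSES)
    parser.add_argument("status", choices=STATES)
    parser.add_argument("-r", "--run_name", help="name of run to be updated")
    parser.add_argument("-j", "--job", help="Job number on supercomputer")
    parser.add_argument("-e", "--error", help="text notes about why the run failed")
    return parser


args = build_parser().parse_args(
    ["~/Documents/scratch/test_18p5/", "HF", "failed",
     "--job", "3", "--error", "Hit wall clock limit 5000"]
)
print(args.process, args.status, args.job, args.error)
```

By default argparse accepts any unambiguous prefix of a long option, so under this sketch --j resolves to --job.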
Querying status of database
e.g.
slurm_gm_workflow/scripts/management$ python query_mgmt_db.py ~/Documents/scratch/test_18p5/
run_name | process | status | last_modified
_______________________________________________________________________________
test123 | BB | in-queue | 2018-05-16 03:53:55
test123 | IM_calculation | in-queue | 2018-05-16 03:53:55
test123 | post_EMOD3D | running | 2018-05-16 04:30:01
test123 | EMOD3D | completed | 2018-05-16 03:58:15
test123 | HF | failed | 2018-05-16 22:56:41
test_realiastion1 | EMOD3D | created | 2018-05-16 03:34:26
test_realiastion1 | post_EMOD3D | created | 2018-05-16 03:34:26
test_realiastion1 | HF | created | 2018-05-16 03:34:26
test_realiastion1 | BB | created | 2018-05-16 03:34:26
test_realiastion1 | IM_calculation | created | 2018-05-16 03:34:26
slurm_gm_workflow/scripts/management$ python query_mgmt_db.py ~/Documents/scratch/test_18p5/ test123
run_name | process | status | last_modified
_______________________________________________________________________________
test123 | BB | in-queue | 2018-05-16 03:53:55
test123 | IM_calculation | in-queue | 2018-05-16 03:53:55
test123 | post_EMOD3D | running | 2018-05-16 04:30:01
test123 | EMOD3D | completed | 2018-05-16 03:58:15
test123 | HF | failed | 2018-05-16 22:56:41
slurm_gm_workflow/scripts/management$ python query_mgmt_db.py ~/Documents/scratch/test_18p5/ --error
Run_name: test123
Process: EMOD3D
Status: completed
Last_Modified: 2018-05-16 03:58:15
Error: Demo error
Run_name: test123
Process: HF
Status: failed
Last_Modified: 2018-05-16 22:56:41
Error: hit wall clock limit 5000
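The three query modes shown above (all runs, a single run, and entries with error notes) can be illustrated with an in-memory table. This sketch assumes an SQLite backing store and a single table named state with the columns seen in the output. Both are assumptions: this document does not describe the real storage layout, and query_mgmt_db.py may work differently. The rows are copied from the example output above.

```python
import sqlite3

# Hypothetical schema: the real database layout is not described in this document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE state (run_name TEXT, process TEXT, status TEXT, "
    "last_modified TEXT, error TEXT)"
)
conn.executemany(
    "INSERT INTO state VALUES (?, ?, ?, ?, ?)",
    [
        ("test123", "EMOD3D", "completed", "2018-05-16 03:58:15", "Demo error"),
        ("test123", "HF", "failed", "2018-05-16 22:56:41", "hit wall clock limit 5000"),
        ("test_realiastion1", "HF", "created", "2018-05-16 03:34:26", None),
    ],
)


def query(conn, run_name=None, errors_only=False):
    """Return rows, optionally limited to one run or to entries with an error note."""
    sql = "SELECT run_name, process, status, last_modified, error FROM state"
    clauses, params = [], []
    if run_name is not None:
        clauses.append("run_name = ?")
        params.append(run_name)
    if errors_only:
        clauses.append("error IS NOT NULL")
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return conn.execute(sql + " ORDER BY rowid", params).fetchall()


for row in query(conn, errors_only=True):
    print(" | ".join(str(value) for value in row))
```

Calling query(conn, run_name="test123") restricts the result to one run, which corresponds to passing the run name on the command line.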
Inserting new tasks into database
Inserts a new entry with the status created for the given run_name and process.
python insert_mgmt_db.py ~/Documents/scratch/test_18p5/ run_name {EMOD3D,post_EMOD3D,HF,BB,IM_calculation}
e.g.
python insert_mgmt_db.py ~/Documents/scratch/test_18p5/ test123 EMOD3D
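The effect of an insert after a failure can be pictured with a plain list standing in for the database. The function insert_task and the dictionary layout are hypothetical and only illustrate that the failed entry is kept and a new created entry is added alongside it.

```python
from datetime import datetime


def insert_task(entries, run_name, process):
    """Append a new entry with status created and return it.

    Hypothetical in-memory stand-in for insert_mgmt_db.py.
    """
    entry = {
        "run_name": run_name,
        "process": process,
        "status": "created",
        "last_modified": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }
    entries.append(entry)
    return entry


# Existing failed HF entry, taken from the query output earlier in this document.
entries = [
    {"run_name": "test123", "process": "HF", "status": "failed",
     "last_modified": "2018-05-16 22:56:41"},
]
insert_task(entries, "test123", "HF")
for entry in entries:
    print(entry["run_name"], entry["process"], entry["status"])
```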