Create files that each contain a list of VM models to run (one VM currently maps to 1 to N SRFs).
cd /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/Vms
ls | split -l 10 - list_vm
This should produce list files like the following (as shown by a long listing):
-rw-rw---- 1 ykh22 nesi-users 94 Oct 2 03:20 list_vma
-rw-rw---- 1 ykh22 nesi-users 91 Oct 2 03:20 list_vmb
-rw-rw---- 1 ykh22 nesi-users 84 Oct 2 03:20 list_vmc
-rw-rw---- 1 ykh22 nesi-users 108 Oct 2 03:20 list_vmd
-rw-rw---- 1 ykh22 nesi-users 82 Oct 2 03:20 list_vme
-rw-rw---- 1 ykh22 nesi-users 62 Oct 2 03:20 list_vmf
-rw-rw---- 1 ykh22 nesi-users 105 Oct 2 03:20 list_vmg
-rw-rw---- 1 ykh22 nesi-users 96 Oct 2 03:20 list_vmh
-rw-rw---- 1 ykh22 nesi-users 51 Oct 2 03:20 list_vmi
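The list-splitting step can be sketched as a small shell function. This is only an illustration: `make_vm_lists` and its arguments are hypothetical names, not part of the workflow scripts. It passes `-a 1` to `split` so that the files get one-letter suffixes (list_vma, list_vmb, ...) as in the listing above; on systems where `split` defaults to two-letter suffixes the plain command would produce list_vmaa, list_vmab, and so on.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): write the entries of a
# VM directory into list files of a fixed number of names each.
make_vm_lists() {
    vm_dir=$1          # directory holding one sub-folder per VM
    out_dir=$2         # where the list_vm* files are written
    per_list=${3:-10}  # names per list file, 10 as in the text
    # -a 1 requests one-letter suffixes, which limits the result to 26 lists.
    ls "$vm_dir" | split -l "$per_list" -a 1 - "$out_dir/list_vm"
}
```

For example, `make_vm_lists Data/VMs Data 10` would write the lists next to the VM folder rather than inside it, assuming that layout suits your run.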
Run install_cybershake.sh with the path to the list of VM models and the path to install to.
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/install_cybershake.sh $gmsim/RunFolder/Cybershake/v17p9/Data/list_vma /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9 |
This should create the simulation folders for every VM named in the given list_vm* file.
Albury AlpineF2K AlpineK2T Ashley AwatNEVer AwatNEVerCl AwatereNE AwatereSW Barefell Brothers
!!!!SIM_DIR:/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury
Generation of model params has been skipped. Re-directing related params to files under /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/VMs/Albury
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury : 750
****************************************************************************************************
****************************************************************************************************
Producing statcords and FD_STATLIST. It may take a minute or two
/nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
From: /nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
To: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/fd_rt01-h0.400.statcords
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/fd_rt01-h0.400.ll
Done
!!!!SIM_DIR:/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K
Generation of model params has been skipped. Re-directing related params to files under /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/VMs/AlpineF2K
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K : 750
****************************************************************************************************
****************************************************************************************************
Producing statcords and FD_STATLIST. It may take a minute or two
/nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
From: /nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
To: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K/fd_rt01-h0.400.statcords
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K/fd_rt01-h0.400.ll
Done
... ... ...
!!!!SIM_DIR:/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers
Generation of model params has been skipped. Re-directing related params to files under /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/VMs/Brothers
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers : 750
****************************************************************************************************
****************************************************************************************************
Producing statcords and FD_STATLIST. It may take a minute or two
/nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
From: /nesi/projects/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll
To: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/fd_rt01-h0.400.statcords
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/fd_rt01-h0.400.ll
Done
Run submit_cybershake_emod3d.sh (this submits EMOD3D for all the simulations, using the maximum estimated wall clock time, WCT).
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/submit_cybershake_emod3d.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma |
IMPORTANT: make sure the list_vm file is the same one that was used with install_cybershake.sh (or that every VM in it has been installed properly by some other means).
submitting EMOD3D for: Albury AlpineF2K AlpineK2T Ashley AwatNEVer AwatNEVerCl AwatereNE AwatereSW Barefell Brothers
==============================
submitting for Albury
==============================
nx=286 ny=272 nz=105 sim_duration=55 num_procs=512
Maximum: 0:06:07.273212
Average: 0:00:51.445334
Minimum: 0:00:00
Loadleveler script run_emod3d_Albury_HYP01-01_S1244.ll written
Submitting run_emod3d_Albury_HYP01-01_S1244.ll
Loadleveler script run_emod3d_Albury_HYP01-01_S1254.ll written
Submitting run_emod3d_Albury_HYP01-01_S1254.ll
Loadleveler script run_emod3d_Albury_HYP01-01_S1264.ll written
Submitting run_emod3d_Albury_HYP01-01_S1264.ll
.. .. ..
==============================
submitting for Brothers
==============================
nx=372 ny=356 nz=113 sim_duration=69 num_procs=512
Maximum: 0:14:04.156170
Average: 0:01:58.244117
Minimum: 0:00:00
Loadleveler script run_emod3d_Brothers_HYP01-02_S1244.ll written
Submitting run_emod3d_Brothers_HYP01-02_S1244.ll
Loadleveler script run_emod3d_Brothers_HYP01-02_S1254.ll written
Submitting run_emod3d_Brothers_HYP01-02_S1254.ll
Loadleveler script run_emod3d_Brothers_HYP01-02_S1264.ll written
Submitting run_emod3d_Brothers_HYP01-02_S1264.ll
Loadleveler script run_emod3d_Brothers_HYP02-02_S1274.ll written
Submitting run_emod3d_Brothers_HYP02-02_S1274.ll
Loadleveler script run_emod3d_Brothers_HYP02-02_S1284.ll written
Submitting run_emod3d_Brothers_HYP02-02_S1284.ll
Loadleveler script run_emod3d_Brothers_HYP02-02_S1294.ll written
Submitting run_emod3d_Brothers_HYP02-02_S1294.ll
Run test_cybershake_emod3d.sh to determine which simulations have finished their EMOD3D jobs.
The script takes 2 arguments: 1. the path to the Runs folder, 2. the list of VMs (so that it skips runs that are not in the list).
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/test_cybershake_emod3d.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma 2>&1 | tee /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_emod3d_vma.log |
This prints the test results on the screen and also writes them to a log file, here "/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_emod3d_vma.log".
(The "2>&1 | tee <file>" part of the command merges stderr into stdout and uses 'tee' to send the combined output to both the screen and the file.)
Note: change the log file name and location to suit your own requirements.
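The logging pattern used by the test commands can be sketched as a small wrapper. `run_and_log` is a hypothetical name introduced here for illustration; it is not one of the workflow scripts.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, show its output on screen and
# keep a copy of both stdout and stderr in a log file.
run_and_log() {
    log_file=$1
    shift
    # 2>&1 merges stderr into stdout before the pipe; tee then copies the
    # merged stream to the terminal and to the log file.
    "$@" 2>&1 | tee "$log_file"
}
```

A call would look like `run_and_log test_emod3d_vma.log <test script> <Runs folder> <list_vm file>`. Note that the exit status of the pipeline is that of `tee`, not of the wrapped command, so the log itself is the place to look for failures.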
running test for Albury
Albury_HYP01-01_S1244: EMOD3D completed
Albury_HYP01-01_S1254: EMOD3D completed
Albury_HYP01-01_S1264: EMOD3D completed
====================
Albury finished
====================
After all EMOD3D jobs have finished, run submit_cybershake_post_emod.sh.
IMPORTANT: this submits post_emod3d for every VM listed in list_vm, so if not all EMOD3D jobs have finished it is better to submit post_emod3d for each simulation individually.
5.1 If only some of the runs have finished and you prefer to submit post_emod3d for specific runs only, cd to the specific simulation folder, execute ./submit_post_emod3d.sh and select auto submit.
Run test_cybershake_post_emod3d.sh to check which simulations have finished.
The script takes 2 arguments: 1. the path to the Runs folder, 2. the list of VMs (so that it skips runs that are not in the list).
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/test_cybershake_post_emod3d.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma 2>&1 | tee /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_post_emod3d_vma.log |
This prints the test results on the screen and also writes them to a log file, here "/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_post_emod3d_vma.log".
(The "2>&1 | tee <file>" part of the command merges stderr into stdout and uses 'tee' to send the combined output to both the screen and the file.)
Note: change the log file name and location to suit your own requirements.
running test for Ashley
Ashley_HYP01-03_S1244: post_emod3d finished
Ashley_HYP01-03_S1254: post_emod3d finished
Ashley_HYP02-03_S1264: post_emod3d finished
Ashley_HYP02-03_S1274: post_emod3d finished
Ashley_HYP03-03_S1284: post_emod3d finished
Ashley_HYP03-03_S1294: post_emod3d finished
====================
Ashley finished
====================
post_emod3d has built-in resume functionality, so if a job fails to finish you can resubmit it and it will continue from where it stopped.
To maximise efficiency, it is better to set the WCT to a suitable length than to check and resubmit multiple times.
Make sure the submitted job has actually finished by using llq.
(The new ll script appends the rup_model name to the job name, so a specific command can test whether a particular job is still in the LoadLeveler queue.)
To show the job names of all jobs belonging to user 'ykh22':
llq -l -u ykh22 | grep 'Job Name:' |
Pipe the output through grep to determine whether a job has completed.
Say we are looking for AlpineF2K_HYP06-21_S1404:
llq -l -u ykh22 | grep 'Job Name: postprocess' | grep 'AlpineF2K_HYP06-21_S1404' |
The output is empty if the job is not in the queue; otherwise the job name is shown on screen:
Job Name: postprocess_AlpineF2K_HYP06-21_S1404 |
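The queue check above can be wrapped so that it returns a status instead of text. This is a sketch only: `job_in_queue` is a hypothetical helper, and it reads the `llq -l -u <user>` output on stdin so that it makes no assumptions about llq beyond the 'Job Name:' lines shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if a job whose name starts with
# the given prefix and contains the given run name appears on stdin.
job_in_queue() {
    prefix=$1     # e.g. postprocess, run_hf_mpi, run_bb_mpi
    run_name=$2   # e.g. AlpineF2K_HYP06-21_S1404
    # -F treats the run name as a fixed string, -q suppresses output.
    grep "Job Name: $prefix" | grep -q -F "$run_name"
}

# Usage on the cluster (user name as in the text):
#   if llq -l -u ykh22 | job_in_queue postprocess AlpineF2K_HYP06-21_S1404; then
#       echo "still in queue"
#   else
#       echo "not in queue"
#   fi
```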
Check the number of completed Vel files using `ls` and `wc`:
ls LF/AlpineF2K_HYP10-10_S1514/Vel/ | wc
6657 6657 98076
Then compare it with the number of stations within the domain:
cat fd_rt01-h0.400.ll | wc
8550 25650 271714
In this example there are 8550 stations and only 2219 of them have finished (6657 files / 3).
Since 8550 / 2219 is about 3.9, it is reasonably safe to assume that a WCT of 4 to 4.5 times the current value will let the job finish on the next submission.
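The estimate above is simple arithmetic and can be sketched as a function. `wct_factor` is a hypothetical helper; the default of 3 files per station is taken from the division by 3 in the example and is otherwise an assumption.

```shell
#!/bin/sh
# Hypothetical helper: ratio of total stations to finished stations,
# i.e. roughly how many times the used WCT a full run would need.
wct_factor() {
    files_done=$1              # line count from `ls .../Vel/ | wc`
    stations_total=$2          # line count of the station .ll file
    files_per_station=${3:-3}  # assumption taken from the example
    awk -v f="$files_done" -v s="$stations_total" -v k="$files_per_station" '
        BEGIN {
            finished = f / k
            if (finished <= 0) { print "inf"; exit }
            printf "%.2f\n", s / finished
        }'
}

wct_factor 6657 8550   # prints 3.85
```

Rounding the result up with some margin gives the factor of 4 to 4.5 suggested in the text.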
Change the WCT in "the templates" (so that all jobs submitted afterwards use the new WCT):
# @ wall_clock_limit = 0:20:00 |
to
# @ wall_clock_limit = 1:30:00 |
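If the template is a plain text file containing a single wall_clock_limit directive, the edit can be scripted. This is a sketch under that assumption: `set_wall_clock_limit` is a hypothetical helper and the template path is whatever file your installation actually uses.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the wall_clock_limit line of a
# LoadLeveler template in place (via a temporary file).
set_wall_clock_limit() {
    template=$1    # path to the template file (assumed, not from the text)
    new_limit=$2   # e.g. 1:30:00
    tmp_file="$template.tmp.$$"
    # Keep the "# @ wall_clock_limit = " prefix and replace only the value.
    sed "s/^\(# @ wall_clock_limit *= *\).*/\1$new_limit/" "$template" > "$tmp_file" \
        && mv "$tmp_file" "$template"
}
```

For example, `set_wall_clock_limit my_template.ll 1:30:00` changes 0:20:00 to 1:30:00 and leaves every other line untouched.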
re-submit job for all srf in that simulation
echo "1" | ./submit_post_emod3d.sh |
Run install_bb_cybershake.sh to set up the parameters (Mod-1D) for the HF and BB runs.
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/install_bb_cybershake.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma /nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/install_bb_cybershake_selection.txt |
The script takes 3 arguments: 1. the path to the Runs folder, 2. the list of VMs (the same list used in the previous steps), 3. the input for install_bb.sh (this can be changed if a different Mod-1D is chosen).
installing BB for: Albury AlpineF2K AlpineK2T Ashley AwatNEVer AwatNEVerCl AwatereNE AwatereSW Barefell Brothers
==============================
installing BB for Albury
==============================
devel
Info: Old version of params.py supporting singular kappa and sdrop
****************************************************************************************************
EMOD3D HF/BB Preparationi Ver.devel
****************************************************************************************************
====================================================================================================
Do you want site-specific computation? (To use a universal 1D profile, Select 'No')
====================================================================================================
1. Yes
2. No
Enter the number you wish to select (1-2):
====================================================================================================
Select one of 1D Velocity models (from /nesi/projects/nesi00213/VelocityModel/Mod-1D)
====================================================================================================
1. /nesi/projects/nesi00213/VelocityModel/Mod-1D/Cant1D_v1-midQ.1d
2. /nesi/projects/nesi00213/VelocityModel/Mod-1D/Cant1D_v1.1d
3. /nesi/projects/nesi00213/VelocityModel/Mod-1D/Cant1D_v2-midQ.1d
4. /nesi/projects/nesi00213/VelocityModel/Mod-1D/Cant1D_v2-midQ_leer.1d
5. /nesi/projects/nesi00213/VelocityModel/Mod-1D/banks.1d
6. /nesi/projects/nesi00213/VelocityModel/Mod-1D/foothills.1d
7. /nesi/projects/nesi00213/VelocityModel/Mod-1D/foothills_v2.1d
8. /nesi/projects/nesi00213/VelocityModel/Mod-1D/plains.1d
Enter the number you wish to select (1-8):
/nesi/projects/nesi00213/VelocityModel/Mod-1D/Cant1D_v2-midQ_leer.1d
Info: You have specified multiple SRF files. A single hf_kappa(=0.045) and hf_sdrop(=50) specified in params.py will be used for all SRF files.
If you need to specific hf_kappa and hf_sdrop value for each SRF, add hf_kappa_list and hf_sdrop_list to params_base.py
====================================================================================================
- Vel. Model 1D: Cant1D_v2-midQ_leer
- hf_sim_bin: hb_high_v5.4.5_np2mm+
- hf_rvfac: 0.8
- hf_sdrop: 50
- hf_kappa: 0.045
- srf file: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/Sources/Albury/Srf/Albury_HYP01-01_S1244.srf
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/LF/Albury_HYP01-01_S1244/params_uncertain.py
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Albury_HYP01-01_S1244/params_bb_uncertain.py
[Errno 17] File exists
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/BB/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
====================================================================================================
- Vel. Model 1D: Cant1D_v2-midQ_leer
- hf_sim_bin: hb_high_v5.4.5_np2mm+
- hf_rvfac: 0.8
- hf_sdrop: 50
- hf_kappa: 0.045
- srf file: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/Sources/Albury/Srf/Albury_HYP01-01_S1254.srf
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/LF/Albury_HYP01-01_S1254/params_uncertain.py
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Albury_HYP01-01_S1254/params_bb_uncertain.py
[Errno 17] File exists
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/BB/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
====================================================================================================
- Vel. Model 1D: Cant1D_v2-midQ_leer
- hf_sim_bin: hb_high_v5.4.5_np2mm+
- hf_rvfac: 0.8
- hf_sdrop: 50
- hf_kappa: 0.045
- srf file: /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/Sources/Albury/Srf/Albury_HYP01-01_S1264.srf
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/LF/Albury_HYP01-01_S1264/params_uncertain.py
/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Albury_HYP01-01_S1264/params_bb_uncertain.py
[Errno 17] File exists
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
Permission /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Albury/BB/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045 : 750
... ... ... ... ...
run submit_cybershake_hf.sh
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/submit_cybershake_hf.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma |
==============================
submitting for Brothers
==============================
MPI
Note: rand_reset is not defined in params_base_bb.py. We assume rand_reset=True
['/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP01-02_S1244', '/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP01-02_S1254', '/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP01-02_S1264', '/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP02-02_S1274', '/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP02-02_S1284', '/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/Brothers/HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/Brothers_HYP02-02_S1294']
Also submit the job for you?
1. Yes
2. No
Enter the number you wish to select (1-2):
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1244
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1244_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1244_20171003_040524.ll
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1254
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1254_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1254_20171003_040524.ll
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1264
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1264_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP01-02_S1264_20171003_040524.ll
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1274
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1274_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1274_20171003_040524.ll
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1284
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1284_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1284_20171003_040524.ll
Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1294
Loadleveler script run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1294_20171003_040524.ll written
Submitting run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__Brothers_HYP02-02_S1294_20171003_040524.ll
==============================
Run test_cybershake_hf.sh to check which HF jobs have finished.
The script takes 2 arguments: 1. the path to the Runs folder, 2. the list of VMs (so that it skips runs that are not in the list).
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/test_cybershake_hf.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma 2>&1 | tee /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_hf_vma.log |
This prints the test results on the screen and also writes them to a log file, here "/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_hf_vma.log".
(The "2>&1 | tee <file>" part of the command merges stderr into stdout and uses 'tee' to send the combined output to both the screen and the file.)
Note: change the log file name and location to suit your own requirements.
running test for Albury
Albury_HYP01-01_S1244: HF finished
Albury_HYP01-01_S1254: HF finished
Albury_HYP01-01_S1264: HF finished
====================
Albury finished
====================
Make sure the submitted job has actually finished by looking at llq.
Say we are looking for AlpineF2K_HYP06-21_S1404:
llq -l -u ykh22 | grep 'Job Name: run_hf_mpi' | grep 'AlpineF2K_HYP06-21_S1404' |
The output is empty if the job is not in the queue; otherwise the job name is shown on screen:
Job Name: run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__AlpineF2K_HYP06-21_S1404 |
Check the number of completed Acc files using `ls` and `wc`:
ls HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/AlpineF2K_HYP10-10_S1514/Acc/ | wc
6657 6657 98076
Then compare it with the number of stations within the domain:
cat fd_rt01-h0.400.ll | wc
8550 25650 271714
In this example there are 8550 stations and only 2219 of them have finished (6657 files / 3).
Since 8550 / 2219 is about 3.9, it is reasonably safe to assume that a WCT of 4 to 4.5 times the current value will let the job finish on the next submission.
Change the WCT in "the templates" (so that all jobs submitted afterwards use the new WCT):
# @ wall_clock_limit = 1:00:00 |
to
# @ wall_clock_limit = 4:30:00 |
re-submit job for all srf in that simulation
echo "1" | ./submit_hf.sh |
IMPORTANT: before running the batch BB submission, make sure all LF and HF jobs for every run in the list_vm are done.
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/submit_cybershake_bb.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma |
1.1 If LF and HF have finished only for a specific run and you prefer to run BB for that run only, cd to the simulation folder and run ./submit_bb.sh.
Run test_cybershake_bb.sh to test which runs have finished.
The script takes 2 arguments: 1. the path to the Runs folder, 2. the list of VMs (so that it skips runs that are not in the list).
/nesi/projects/nesi00213/RunFolder/Cybershake/workflow/devel/cybershake/test_cybershake_bb.sh /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Data/list_vma 2>&1 | tee /nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_bb_vma.log |
This prints the test results on the screen and also writes them to a log file, here "/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/test_bb_vma.log".
(The "2>&1 | tee <file>" part of the command merges stderr into stdout and uses 'tee' to send the combined output to both the screen and the file.)
Note: change the log file name and location to suit your own requirements.
make sure the job submitted has already finished by looking at llq.
llq -l -u ykh22 | grep 'Job Name: run_bb_mpi' | grep 'AlpineF2K_HYP06-21_S1404' |
Job Name: run_bb_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__AlpineF2K_HYP06-21_S1404 |
Check the number of completed Vel files using `ls` and `wc`:
ls HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/AlpineF2K_HYP10-10_S1514/Vel/ | wc
6657 6657 98076
Then compare it with the number of stations within the domain:
cat fd_rt01-h0.400.ll | wc
8550 25650 271714
In this example there are 8550 stations and only 2219 of them have finished (6657 files / 3).
Since 8550 / 2219 is about 3.9, it is reasonably safe to assume that a WCT of 4 to 4.5 times the current value will let the job finish on the next submission.
Change the WCT in "the templates" (so that all jobs submitted afterwards use the new WCT):
# @ wall_clock_limit = 1:00:00
to
# @ wall_clock_limit = 4:30:00
re-submit job for all srf in that simulation
echo "1" | ./submit_bb.sh |
If you wish to view all the jobs you have submitted:
llq -u username -f %jn %id %st |
This shows all jobs submitted by "username" (with the job name, job id, and job status).
The script below can be used to download files in parallel using rsync.
!!! The folder tree must first be created using:
rsync -av -f"+ */" -f"- *" $source_dir $des_dir
!!! The command below must be modified for your own paths. It uses 'find' to return a list of folders and passes each one to the download script using 'xargs -0 -n1 -P$threadnumber'.
find LF -type d -print0 | xargs -0 -n1 -P12 -I% ~/gm_sim_workflow/devel/cybershake/download_rsyn.sh ykh22@fitzroy.nesi.org.nz:/nesi/projects/nesi00213/RunFolder/Cybershake/v17p9/Runs/AlpineF2K/% /nesi/projects/nesi00213/RunFolder/Cybershake/17p9/backup/AlpineF2K/LF |
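The find and xargs pattern used above can be tried out locally before pointing it at the real download script. The sketch below is an illustration only: `for_each_dir_parallel` is a hypothetical helper, and it relies on the `-0` and `-P` options of xargs, which GNU and BSD xargs provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per directory under a root,
# with up to N commands running at the same time.
for_each_dir_parallel() {
    root=$1   # top folder to walk, e.g. LF
    jobs=$2   # number of parallel workers, e.g. 12
    shift 2
    # -print0 / -0 keep directory names with spaces intact;
    # -I% substitutes each directory for the trailing % argument.
    find "$root" -type d -print0 | xargs -0 -P "$jobs" -I% "$@" %
}

# Dry run: print the directories that would each be handed to a worker.
#   for_each_dir_parallel LF 12 echo
```

Replacing `echo` with the download script and its remote and local path arguments gives the same shape as the command above.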