...

  1. Make sure the submitted job has already finished by using llq.
    (The new ll script appends the rup_model name to the job name, so a targeted command can test whether a specific job is still in the LoadLeveler queue.)

    To show the job names of all jobs belonging to user 'ykh22':
    Code Block
    llq -l -u ykh22 | grep 'Job Name:'

    Pipe it to grep to determine whether a job has completed.
    Let's say we are looking for AlpineF2K_HYP06-21_S1404:

    Code Block
    llq -l -u ykh22 | grep 'Job Name: postprocess' | grep 'AlpineF2K_HYP06-21_S1404'

    The output will be empty if the job is not in the queue; otherwise it will show on screen:

    Code Block
    Job Name: postprocess_AlpineF2K_HYP06-21_S1404
  2. Check the number of completed Vel files by using `ls` and `wc`

    Code Block
    ls LF/AlpineK2TAlpineF2K_HYP10-10_S1514/Vel/ | wc
          6657    6657   98076  

    Then compare it with the station count within the domain

    Code Block
    cat fd_rt01-h0.400.ll | wc
        8550   25650  271714

    In this example, the domain has 8550 stations and only 2219 of them have finished (6657 files / 3 per station).
    Since 8550 / 2219 is roughly 3.9, it's safe to assume that if we give the job more than 4 to 4.5 times the current wall-clock time (WCT), it should finish on the next submission.

  3. Change the WCT in "the templates" (so that all jobs submitted afterwards will use the new WCT).

    Code Block
    original post_emod3d_mpi.ll.template
     # @ wall_clock_limit     = 0:20:00

    to

    Code Block
    # @ wall_clock_limit     = 1:30:00
  4. Re-submit the jobs for all srf in that simulation

    Code Block
    echo "1" | ./submit_post_emod3d.sh
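
The estimate in step 2 can be sketched as a small shell helper. This is only an illustration, not part of the workflow scripts: the function names `finished_stations` and `wct_factor` are hypothetical, and the sketch assumes three output files per finished station (as in the 6657 / 3 example) and a roughly constant processing rate.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the workflow scripts).
# Assumption: each finished station produces 3 files, as in the example above.

# finished_stations FILE_COUNT
# Prints the number of finished stations.
finished_stations() {
    echo $(( $1 / 3 ))
}

# wct_factor FILE_COUNT STATION_COUNT
# Prints STATION_COUNT / finished stations to two decimals: the minimum
# factor by which the WCT would need to grow, assuming a constant rate.
wct_factor() {
    local finished
    finished=$(finished_stations "$1")
    if [ "$finished" -le 0 ]; then
        echo "no finished stations, cannot estimate" >&2
        return 1
    fi
    awk -v total="$2" -v fin="$finished" 'BEGIN { printf "%.2f\n", total / fin }'
}

# Numbers from the example: 6657 files, 8550 stations.
wct_factor 6657 8550   # prints 3.85
```

In practice the two arguments could come from `ls <Vel directory> | wc -l` and `wc -l < fd_rt01-h0.400.ll`. Rounding the result up with some margin gives the 4 to 4.5 times used above.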

...

  1. Make sure the submitted job has already finished by looking at llq.
    Let's say we are looking for AlpineF2K_HYP06-21_S1404:

    Code Block
     llq -l -u ykh22 | grep 'Job Name: run_hf_mpi' | grep 'AlpineF2K_HYP06-21_S1404'

    The output will be empty if the job is not in the queue; otherwise it will show on screen:

    Code Block
    Job Name: run_hf_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__AlpineF2K_HYP06-21_S1404
  2. Check the number of completed Acc files by using `ls` and `wc`

    Code Block
    ls HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/AlpineK2TAlpineF2K_HYP10-10_S1514/Acc/ | wc
          6657    6657   98076

    Then compare it with the station count within the domain

    Code Block
    cat fd_rt01-h0.400.ll | wc
        8550   25650  271714

    In this example, the domain has 8550 stations and only 2219 of them have finished (6657 files / 3 per station).

    Since 8550 / 2219 is roughly 3.9, it's safe to assume that if we give the job more than 4 to 4.5 times the current WCT, it should finish on the next submission.

  3. Change the WCT in "the templates" (so that all jobs submitted afterwards will use the new WCT).

    Code Block
    # @ wall_clock_limit     = 1:00:00 

    to

    Code Block
    # @ wall_clock_limit     = 4:30:00
  4. Re-submit the jobs for all srf in that simulation

    Code Block
    echo "1" | ./submit_hf.sh
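
The queue check in step 1 can be wrapped in a small function so that it is usable in a script condition. This is a sketch only: `job_in_queue` is a hypothetical name, and it assumes the `Job Name:` line format shown in the llq output above. It reads the llq output from standard input, so it does not call llq itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow scripts).

# job_in_queue JOB_PREFIX RUP_NAME
# Reads `llq -l` output on stdin. Succeeds (exit 0) if a "Job Name:" line
# starts with JOB_PREFIX and also contains RUP_NAME, i.e. the job is still queued.
# grep -F treats both strings literally, so the "+" in job names is not a regex.
job_in_queue() {
    grep -F "Job Name: $1" | grep -qF -- "$2"
}

# Intended use on the cluster (assumes llq is available):
#   if llq -l -u ykh22 | job_in_queue run_hf_mpi AlpineF2K_HYP06-21_S1404; then
#       echo "still in queue"
#   else
#       echo "not in queue"
#   fi
```

The same function would work for the other stages by passing `postprocess` or `run_bb_mpi` as the first argument.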

...

  1. Make sure the submitted job has already finished by looking at llq.

    Code Block
    llq -l -u ykh22 | grep 'Job Name: run_bb_mpi' | grep 'AlpineF2K_HYP06-21_S1404'

    The output will be empty if the job is not in the queue; otherwise it will show on screen:

    Code Block
    Job Name: run_bb_mpi_Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045__AlpineF2K_HYP06-21_S1404
  2. Check the number of completed Vel files by using `ls` and `wc`

    Code Block
    ls HF/Cant1D_v2-midQ_leer_hfnp2mm+_rvf0p8_sd50_k0p045/AlpineK2TAlpineF2K_HYP10-10_S1514/Vel/ | wc
          6657    6657   98076

    Then compare it with the station count within the domain

    Code Block
    cat fd_rt01-h0.400.ll | wc
        8550   25650  271714

    In this example, the domain has 8550 stations and only 2219 of them have finished (6657 files / 3 per station).

    Since 8550 / 2219 is roughly 3.9, it's safe to assume that if we give the job more than 4 to 4.5 times the current WCT, it should finish on the next submission.

  3. Change the WCT in "the templates" (so that all jobs submitted afterwards will use the new WCT).

    Code Block
    # @ wall_clock_limit     = 1:00:00

    to

    Code Block
    # @ wall_clock_limit     = 4:30:00
  4. Re-submit the jobs for all srf in that simulation

    Code Block
    echo "1" | ./submit_bb.sh
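
The template edit in step 3 could also be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: `set_wct` is a hypothetical function, and the template is assumed to contain a single `# @ wall_clock_limit = ...` line in the form shown above. The template file name must be supplied by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow scripts).

# set_wct TEMPLATE_FILE NEW_LIMIT
# Rewrites the value of the "# @ wall_clock_limit" line in TEMPLATE_FILE,
# keeping the original indentation and spacing before the "=" sign.
set_wct() {
    local file=$1 limit=$2 tmp
    tmp=$(mktemp) || return 1
    if sed -E "s/^([[:space:]]*# @ wall_clock_limit[[:space:]]*)=.*/\1= ${limit}/" \
            "$file" > "$tmp"; then
        # Write back through the original file so its permissions are kept.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Intended use (the template path is whatever the stage uses):
#   set_wct <template file> 4:30:00
```

Editing by hand is just as valid; a function like this mainly helps when several templates need the same change.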

...