1. Prepare Data:

To run the install script, the models must be placed in the following folder structure:

Cybershake
└── version
	├── Data
	│	├── Sources
	│	└── VMs
	└── Runs
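
Before running the install script it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the workflow itself; the function name missing_dirs and the constant REQUIRED_DIRS are hypothetical, and the folder names are taken from the tree above.

```python
from pathlib import Path

# Sub-folders expected under the version root, as shown in the tree above.
REQUIRED_DIRS = ("Data/Sources", "Data/VMs", "Runs")


def missing_dirs(version_root):
    """Return the required sub-folders that do not exist under version_root."""
    root = Path(version_root)
    return [rel for rel in REQUIRED_DIRS if not (root / rel).is_dir()]
```

Calling missing_dirs on the version folder (for example the v18p5 folder used in step 3) should return an empty list when the structure is complete.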

2. Prepare config:

You can use the default one or create your own.
The default one looks like this:

cat $gmsim/workflow/cybershake/cybershake_config.json

{
    "global_root" : "/nesi/transit/nesi00213" ,
    "stat_file_path" : "/nesi/transit/nesi00213/StationInfo/non_uniform_whole_nz_with_real_stations-hh400_17062017.ll",
    "v_1d_mod"  :   "/nesi/transit/nesi00213/VelocityModel/Mod-1D/Cant1D_v2-midQ_leer.1d",
    "dt"    : 0.02
}

If you wish to use different parameters, please make a copy of this file and edit it.

cp $gmsim/workflow/cybershake/cybershake_config.json /my/version/of/runs/config.json
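
The copy-and-edit step can also be scripted. The sketch below is only an illustration: customise_config and EXPECTED_KEYS are hypothetical names, and the key check assumes the four keys shown in the default file are the only ones the install script reads, which this page does not confirm.

```python
import json

# Keys present in the default cybershake_config.json shown above.
EXPECTED_KEYS = {"global_root", "stat_file_path", "v_1d_mod", "dt"}


def customise_config(default_path, new_path, **overrides):
    """Copy the default config to new_path, applying keyword overrides."""
    with open(default_path) as f:
        config = json.load(f)
    # Reject misspelt keys before anything is written (assumption: only
    # the keys of the default file are meaningful).
    unknown = set(overrides) - EXPECTED_KEYS
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config.update(overrides)
    with open(new_path, "w") as f:
        json.dump(config, f, indent=4)
    return config
```

For example, customise_config(default_path, "/my/version/of/runs/config.json", dt=0.005) would write a copy with a different dt; the value 0.005 is purely illustrative.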

Meaning and usage of the parameters:

"global_root" : usually the location of the project, where the binaries (EMOD3D, tools) are kept. Modifying this is NOT recommended.

"stat_file_path" : the absolute path to the station file (the vs30 file is assumed to have the same basename).

"v_1d_mod" : the absolute path to a 1D velocity model file; this is used by install_bb and when running HF.

"dt" : the time step (dt) used for EMOD3D.


3. Install

After the files are in place, run the install script.

THREE arguments are needed: the 1st is the root folder, which contains the Data and Runs folders; the 2nd is the config file created in step 2; the 3rd is a file that contains a list of VMs.

$gmsim/workflow/scripts/cybershake/install_cybershake.sh $gmsim/RunFolder/cybershake/v18p5 $gmsim/workflow/scripts/cybershake/cybershake_config.json $gmsim/RunFolder/cybershake/v18p5/list_all

Keep in mind that the 3rd argument must be a file that contains a list of VMs, one per line,

something like this:

Opotiki02
Opotiki03
OpouaweUruti
Orakeikorako
Orakonui
Oruakukuru
Oruawharo
Otakiri
Otaraia
OtokoTotoF7
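
If every VM lives in its own sub-folder of Data/VMs, such a list file could be generated rather than typed by hand. This is a sketch under that assumption, which this page does not state; write_vm_list is a hypothetical helper and not part of the workflow scripts.

```python
from pathlib import Path


def write_vm_list(vms_dir, list_file):
    """Write the name of each sub-folder of vms_dir to list_file, one per line."""
    # Assumption: each VM is a directory directly under vms_dir.
    names = sorted(p.name for p in Path(vms_dir).iterdir() if p.is_dir())
    Path(list_file).write_text("".join(name + "\n" for name in names))
    return names
```

The resulting file can then be passed as the 3rd argument of the install script, after checking that it lists only the VMs you intend to install.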

4. Create a screen socket

Running scripts in a screen socket avoids the need to keep the terminal open constantly (which means you can disconnect from Kupe while the script keeps running on it).

screen -S your_prefered_name_for_socket

To detach from a socket, press Ctrl+A, then D.

To terminate a socket, press Ctrl+D inside it.

To show all sockets created before, use -list

screen -list
There is a screen on:
    289787.cybershake_v18p6    (Detached)
1 Socket in /var/run/uscreens/S-ykh22.

To resume a specific socket, use -r

screen -r 289787.cybershake_v18p6
or
screen -r 289787

 

5. Run the simulation in auto

Run the auto-submission script with a period/interval.

The script takes THREE arguments: the 1st is the path to the sim_root folder (the same one you passed to the install script), the 2nd is the interval between loops in seconds, and the 3rd is the config file used for the install in step 3.

$gmsim/workflow/scripts/cybershake/run_queue_and_auto_submit.sh $gmsim/RunFolder/cybershake/v18p5 60 $gmsim/workflow/scripts/cybershake/cybershake_config.json

60 means the script runs every minute (60 seconds). Please adjust this accordingly.

Note: this script will keep running in a loop until it is killed with Ctrl-C, or until the screen socket is terminated (if you followed step 4).

If you are running the script in a 'screen' socket, press Ctrl+A, then D to detach it, so you can continue with the next step in the same terminal (without worrying about disconnecting).

6. Monitor Simulation Status

Monitor the status of each simulation by running the query script.

python $gmsim/workflow/scripts/management/query_mgmt_db.py $gmsim/RunFolder/cybershake/v18p5

It should show something like this:

                 run_name |         process |     status |   job-id |        last_modified
__________________________________________________________________________________________
              2012p075555 |        merge_ts |   in-queue |  2198889 |  2018-05-29 04:34:39
              2012p075555 |      winbin_aio |    created |     None |  2018-05-29 04:34:39
              2012p075555 |              BB |    created |     None |  2018-05-29 04:34:39
              2012p075555 |  IM_calculation |    created |     None |  2018-05-29 04:34:39
              2012p075555 |              HF |  completed |  2198881 |  2018-05-29 21:29:21
              2012p075555 |          EMOD3D |     failed |  2198858 |  2018-05-29 04:43:40
              2012p713691 |        merge_ts |    created |     None |  2018-05-29 04:34:40
              2012p713691 |      winbin_aio |    created |     None |  2018-05-29 04:34:40
              2012p713691 |              BB |    created |     None |  2018-05-29 04:34:40
              2012p713691 |  IM_calculation |    created |     None |  2018-05-29 04:34:40
              2012p713691 |              HF |  completed |  2198882 |  2018-05-29 21:29:21
              2012p713691 |          EMOD3D |     failed |  2198860 |  2018-05-29 04:44:49
              2012p764736 |        merge_ts |    created |     None |  2018-05-29 04:34:40
              2012p764736 |      winbin_aio |    created |     None |  2018-05-29 04:34:40
              2012p764736 |              HF |    created |     None |  2018-05-29 04:34:40
              2012p764736 |              BB |    created |     None |  2018-05-29 04:34:40
              2012p764736 |  IM_calculation |    created |     None |  2018-05-29 04:34:40
              2012p764736 |          EMOD3D |     failed |  2198862 |  2018-05-29 04:44:49
              2012p781523 |        merge_ts |    created |     None |  2018-05-29 04:34:40
              2012p781523 |      winbin_aio |    created |     None |  2018-05-29 04:34:40
              2012p781523 |              BB |    created |     None |  2018-05-29 04:34:40
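
When the table grows long, it can be easier to post-process the printed text than to read it. The sketch below parses output in the pipe-separated layout shown above; parse_status_table and COLUMNS are hypothetical names, and the parser relies only on the printed format, not on any API of query_mgmt_db.py.

```python
# Column names as they appear in the header row of the table above.
COLUMNS = ("run_name", "process", "status", "job-id", "last_modified")


def parse_status_table(text):
    """Turn the printed status table into a list of dicts, one per row."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip blank lines, the underscore separator and the header row.
        if len(fields) != len(COLUMNS) or fields[0] == "run_name":
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows
```

A list of failed processes can then be obtained with a filter such as [r for r in rows if r["status"] == "failed"].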


Use -e to show only the failed runs (with their errors):

python $gmsim/workflow/scripts/management/query_mgmt_db.py /nesi/nobackup/nesi00213/test_auto_submit -e

 Run_name: 2012p075555
 Process: EMOD3D
 Status: failed
 Job-ID: 2198858
 Last_Modified: 2018-05-29 04:43:40
 Error: Task removed from squeue without completion 

 Run_name: 2012p713691
 Process: EMOD3D
 Status: failed
 Job-ID: 2198860
 Last_Modified: 2018-05-29 04:44:49
 Error: Task removed from squeue without completion 