In order to improve debugging capabilities and declutter the screen when running auto_submit on a cybershake run, we wish to implement logging.

Logging levels

The default available levels of logging are as follows:

  • Critical - messages relating to fatal issues
  • Error - messages relating to serious, but not fatal issues
  • Warning - possible issues
  • Info - status reports for user information
  • Debug - every significant action will have a debug message, these will not be printed to stdout by default though

Message format

The default logging message will be of the format

"%(asctime)s - %(module)s.%(funcName)s - %(message)s"

The logging message for tasks related to a realisation/process type pairing use the following format:

"%(levelname)8s -- %(asctime)s - %(module)s.%(funcName)s - {realisation}.{process_type} - %(message)s"

Where realisation and process_type are finalised at logger instantiation.


The following substitutions made by the python logger:

asctimeThe time the message is logged. In the format YYYY-MM-DD HH:MM:SS,sss
moduleThe module the message is logged from
filenameThe file the message is logged from

funcName

The function the message is logged from
messageThe message being logged

Log file location

By default log files for auto_submit.py and queue_monitor.py will be placed in the cybershake root directory, with the name [calling script]_log_[timestamp].txt

This can be overridden by using the argument --log_file [file_location] when starting auto_submit or queue_monitor. This may be given as an absolute path, or relative to the cybershake root directory

Any existing files will be appended to and not deleted.

The end to end tests  place these files in the test directory. --no_clean_up should be passed to it to preserve these files.

Usage

To add logging to a script in slurm_gm_workflow, shared_workflow.workflow_logger should be imported.

The function get_logger(name) should have a unique string passed to it. It will return a logger that will print messages with level greater than debug to stdout. It will also ignore any messages with a level beginning with NOPRINT, as defined in workflow_logger.

If file logging is desired then the logger should be passed to add_general_file_handler(logger, file_path).

get_basic_logger() will return a logger that prints messages with level greater than debug to stdout. It is intended to be used as a default if a function is called without passing a logger.

Examples of message sources

More concrete examples of what will be logged are as follows:

LevelTypes of message
CriticalInvalid n_runs argument, duplicate db update file name, Unable to load workflow config
ErrorManagement db unknown status code found, HF/BB version invalid
WarningMetadata storage passed invalid proc_type
Infoestimation model loading, tasks that are being skipped
DebugTask submission, Dependency checking


  • Dependency checking, task checking, job numbers etc

  • Allow levels of log messages (i.e. DEBUG, INFO, WARNING, ERROR)

  • Include timestamp and source (i.e. which module/function/class)

Example log output

Example output from a log file is available below. It was taken from the main auto submit thread of the run_cybershake automation wrapper.

   DEBUG -- 2019-06-17 17:04:16,855 - MgmtDB._check_dependancy_met - Considering task ProcessType.clean_up for realisation AlpineF2K_HYP01-01_S1244. Completed tasks as follows: []
   DEBUG -- 2019-06-17 17:04:16,855 - MgmtDB._check_dependancy_met - (11, 'AlpineF2K_HYP01-01_S1244') has remaining deps: [6]
    INFO -- 2019-06-17 17:04:16,856 - auto_submit.run_main_submit_loop - Number of runnable tasks: 2006
    INFO -- 2019-06-17 17:04:16,857 - auto_submit.run_main_submit_loop - Tasks to run this iteration: Hossack_HYP240-501_S3634-EMOD3D, Hossack_HYP240-501_S3634-HF, Hossack_HYP235-501_S3584-EMOD3D, Hossack_HYP235-501_S3584-HF, Hossack_HYP54-501_S1774-EMOD3D, Hossack_HYP54-501_S1774-HF, Hossack_HYP106-501_S2294-EMOD3D, Hossack_HYP106-501_S2294-HF, Hossack_HYP275-501_S3984-EMOD3D, Hossack_HYP275-501_S3984-HF, Hossack_HYP118-501_S2414-EMOD3D, Hossack_HYP118-501_S2414-HF
   DEBUG -- 2019-06-17 17:04:16,858 - main auto submit - auto_submit.submit_task - Hossack_HYP240-501_S3634.EMOD3D - Submit EMOD3D arguments: Namespace(account='nesi00213', auto=True, machine='maui', ncore=160, rel_dir='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP240-501_S3634', retries=0, srf='Hossack_HYP240-501_S3634', write_directory='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP240-501_S3634')
   DEBUG -- 2019-06-17 17:04:16,873 - main auto submit - submit_emod3d.main - Hossack_HYP240-501_S3634.EMOD3D - params.srf_file /nesi/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Data/Sources/Hossack/Srf/Hossack_HYP240-501_S3634.srf
   DEBUG -- 2019-06-17 17:04:16,873 - main auto submit - submit_emod3d.main - Hossack_HYP240-501_S3634.EMOD3D - not set_params_only
   DEBUG -- 2019-06-17 17:04:16,875 - main auto submit - submit_emod3d.main - Hossack_HYP240-501_S3634.EMOD3D - retries has been set, but no check-pointing files exist. not scaling wct
   DEBUG -- 2019-06-17 17:04:16,892 - main auto submit - set_runparams.create_run_params - Hossack_HYP240-501_S3634.EMOD3D - emod3d_version not found as a key in e3d file. Ignoring variable. Value is 3.0.4.
   DEBUG -- 2019-06-17 17:04:16,982 - main auto submit - auto_submit.submit_task - Hossack_HYP240-501_S3634.HF - Submit HF arguments: Namespace(account='nesi00213', auto=True, debug=False, machine='maui', ncore=80, rel_dir='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP240-501_S3634', retries=0, seed=0, site_specific=None, srf='Hossack_HYP240-501_S3634', version='run_hf_mpi', write_directory='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP240-501_S3634')
   DEBUG -- 2019-06-17 17:04:16,992 - main auto submit - submit_hf.main - Hossack_HYP240-501_S3634.HF - version: run_hf_mpi
   DEBUG -- 2019-06-17 17:04:16,994 - main auto submit - model.predict - Hossack_HYP240-501_S3634.HF - The entry is out of bounds. The SVR models will be used for estimation.
   DEBUG -- 2019-06-17 17:04:17,010 - main auto submit - shared.submit_sl_script - Hossack_HYP240-501_S3634.HF - Submitting /scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP240-501_S3634/run_hf_mpi_Hossack_HYP240-501_S3634_20190617_170417.sl on machine maui
   DEBUG -- 2019-06-17 17:04:17,077 - main auto submit - auto_submit.submit_task - Hossack_HYP235-501_S3584.EMOD3D - Submit EMOD3D arguments: Namespace(account='nesi00213', auto=True, machine='maui', ncore=160, rel_dir='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP235-501_S3584', retries=0, srf='Hossack_HYP235-501_S3584', write_directory='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP235-501_S3584')
   DEBUG -- 2019-06-17 17:04:17,088 - main auto submit - submit_emod3d.main - Hossack_HYP235-501_S3584.EMOD3D - params.srf_file /nesi/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Data/Sources/Hossack/Srf/Hossack_HYP235-501_S3584.srf
   DEBUG -- 2019-06-17 17:04:17,088 - main auto submit - submit_emod3d.main - Hossack_HYP235-501_S3584.EMOD3D - not set_params_only
   DEBUG -- 2019-06-17 17:04:17,089 - main auto submit - submit_emod3d.main - Hossack_HYP235-501_S3584.EMOD3D - retries has been set, but no check-pointing files exist. not scaling wct
   DEBUG -- 2019-06-17 17:04:17,105 - main auto submit - set_runparams.create_run_params - Hossack_HYP235-501_S3584.EMOD3D - emod3d_version not found as a key in e3d file. Ignoring variable. Value is 3.0.4.
   DEBUG -- 2019-06-17 17:04:17,233 - main auto submit - auto_submit.submit_task - Hossack_HYP235-501_S3584.HF - Submit HF arguments: Namespace(account='nesi00213', auto=True, debug=False, machine='maui', ncore=80, rel_dir='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP235-501_S3584', retries=0, seed=0, site_specific=None, srf='Hossack_HYP235-501_S3584', version='run_hf_mpi', write_directory='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP235-501_S3584')
   DEBUG -- 2019-06-17 17:04:17,243 - main auto submit - submit_hf.main - Hossack_HYP235-501_S3584.HF - version: run_hf_mpi
   DEBUG -- 2019-06-17 17:04:17,245 - main auto submit - model.predict - Hossack_HYP235-501_S3584.HF - The entry is out of bounds. The SVR models will be used for estimation.
   DEBUG -- 2019-06-17 17:04:17,260 - main auto submit - shared.submit_sl_script - Hossack_HYP235-501_S3584.HF - Submitting /scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP235-501_S3584/run_hf_mpi_Hossack_HYP235-501_S3584_20190617_170417.sl on machine maui
   DEBUG -- 2019-06-17 17:04:17,323 - main auto submit - auto_submit.submit_task - Hossack_HYP54-501_S1774.EMOD3D - Submit EMOD3D arguments: Namespace(account='nesi00213', auto=True, machine='maui', ncore=160, rel_dir='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP54-501_S1774', retries=0, srf='Hossack_HYP54-501_S1774', write_directory='/scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Runs/Hossack/Hossack_HYP54-501_S1774')
   DEBUG -- 2019-06-17 17:04:17,334 - main auto submit - submit_emod3d.main - Hossack_HYP54-501_S1774.EMOD3D - params.srf_file /nesi/nobackup/nesi00213/RunFolder/jpa198/test/cybershake_test/version/Data/Sources/Hossack/Srf/Hossack_HYP54-501_S1774.srf
   DEBUG -- 2019-06-17 17:04:17,334 - main auto submit - submit_emod3d.main - Hossack_HYP54-501_S1774.EMOD3D - not set_params_only
   DEBUG -- 2019-06-17 17:04:17,335 - main auto submit - submit_emod3d.main - Hossack_HYP54-501_S1774.EMOD3D - retries has been set, but no check-pointing files exist. not scaling wct


  • No labels