Empirical Validation Checks

Explains how to produce empirical data and check it against simulation data using cybershake_investigation.
Ensure that cybershake_investigation is installed (see Installation).

Generate Empirical Data

There are two ways of generating empirical data using cybershake_investigation.
Which one applies depends on the simulation data structure you have available: either a current simulation structure that has just completed the workflow,
or data extracted from an archived version of cybershake.
Run the matching script below to generate the empirical data. Both scripts are found in cybershake_investigation/scripts.

gen_empirical

Generates empirical data for a current simulation structure.
Requires IM_calculation to be complete for all realisations before running.

python cybershake_investigation/scripts/gen_empirical.py
    <path to the cybershake_root folder>
    <path to the list file>
    <path to the nhm file>
    <path to the meta_config weight file> (Found in Empirical util, gmm_weights_22p5_meta.yam)
    <path to the model_config weight file> (Found in Empirical util, model_config.yaml)
    [--output_dir <Default to cybershake_root / Data / Empirical>]
    [--component <Default rotd50>]
    [--n_procs <To use multiprocessing, default 1>]
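
The usage above can also be assembled from Python, for example when launching the script from a wrapper. The following is a minimal sketch only: the helper name build_gen_empirical_cmd and every path in the example call are hypothetical placeholders, while the positional order and the optional flags are taken from the usage listing above.

```python
def build_gen_empirical_cmd(cybershake_root, list_file, nhm_file,
                            meta_config, model_config,
                            output_dir=None, component=None, n_procs=None):
    """Build the argument list for gen_empirical.py (hypothetical helper)."""
    # Positional arguments, in the order given in the usage listing.
    cmd = [
        "python",
        "cybershake_investigation/scripts/gen_empirical.py",
        str(cybershake_root),
        str(list_file),
        str(nhm_file),
        str(meta_config),
        str(model_config),
    ]
    # Optional flags are only added when a value is supplied,
    # so the script's own defaults apply otherwise.
    optional = (
        ("--output_dir", output_dir),
        ("--component", component),
        ("--n_procs", n_procs),
    )
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Placeholder paths, for illustration only.
cmd = build_gen_empirical_cmd(
    "/path/to/cybershake_root",
    "/path/to/list_file",
    "/path/to/nhm_file",
    "/path/to/meta_config_weights",
    "/path/to/model_config.yaml",
    n_procs=4,
)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if any of the paths contain spaces.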

gen_empirical_archive

Generates empirical data from an archived version of cybershake.
Requires only the Source (realisation csvs) and IM data from the archived version.

python cybershake_investigation/scripts/gen_empirical_archive.py
    <path to the archive directory>
    <path to the ll file>
    <path to the vs30 file>
    <path to the z file>
    <path to the nhm file>
    <path to the meta_config weight file> (Found in Empirical util, gmm_weights_22p5_meta.yam)
    <path to the model_config weight file> (Found in Empirical util, model_config.yaml)
    <path to the output directory for empirical data>
    [--ss_db <path to the site source db commonly used with GMHazard, reduces site source distance calculations needed>]
    [--component <Default rotd50>]
    [--n_procs <To use multiprocessing, default 1>]

Note: currently on NeSI, the site_source_db and the site files can be found in the directory [/nesi/nobackup/nesi00213/baes/CS_investigation]

Compare Data

Once the empirical data has been generated, run the comparison between the simulation data and the empirical data using the following script.

compare_csvs

Compares IMs from empirical models to cybershake results. Produces a csv for each realisation in which one or more stations exceed a given threshold value of difference.

python cybershake_investigation/scripts/compare_csvs.py
    <path to the root directory of simulation data> (Can be either a simulation or archive structure)
    <path to the empirical directory for comparison>
    <path to the output directory for log ratio files that fail>
    <path to the threshold file for sites> (Found in cybershake_investigation/configs)
    [--component <Default rotd50>]
    [--from_archive <Specify this flag if the root directory is from an archive directory>]
    [--save_ratio_csvs <Specify this flag if you want to save failed log ratio csvs to the output directory>]
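
To illustrate the kind of check this comparison performs, the sketch below flags stations whose log ratio exceeds a threshold. It is not the code in compare_csvs.py: the ratio definition (natural log of simulated over empirical), the single scalar threshold, the function names and the station values are all assumptions made for the example, and the real threshold file format is not described on this page.

```python
import math


def log_ratios(sim, emp):
    """Return ln(sim / emp) for every station present in both inputs.

    Assumed definition for this sketch; stations with non-positive
    values are skipped because the log ratio is undefined for them.
    """
    common = sim.keys() & emp.keys()
    return {
        station: math.log(sim[station] / emp[station])
        for station in common
        if sim[station] > 0 and emp[station] > 0
    }


def failing_stations(ratios, threshold):
    """Keep only stations whose absolute log ratio exceeds the threshold."""
    return {s: r for s, r in ratios.items() if abs(r) > threshold}


# Made-up IM values for three hypothetical stations.
sim_pga = {"STAT_A": 0.20, "STAT_B": 0.05, "STAT_C": 0.90}
emp_pga = {"STAT_A": 0.18, "STAT_B": 0.06, "STAT_C": 0.10}

failed = failing_stations(log_ratios(sim_pga, emp_pga), threshold=1.0)
print(sorted(failed))  # ['STAT_C']
```

Working in log space treats over- and under-prediction symmetrically: a factor of two above and a factor of two below the empirical value give log ratios of equal magnitude.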

Note: when the component is rotd50 and rotd50 is not found in the simulation dataset, geom is used as a backup (this applies to older simulation data).
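
The fallback described in the note can be pictured as follows. This is a sketch under assumptions, not the script's implementation: the function name resolve_component and the idea of passing in a set of available component names are hypothetical, since this page does not show how compare_csvs.py detects the components in a dataset.

```python
def resolve_component(requested, available, fallback="geom"):
    """Pick the component to compare, falling back from rotd50 to geom.

    `available` is the set of component names present in the simulation
    dataset (how that set is obtained is outside this sketch).
    """
    if requested in available:
        return requested
    # Only rotd50 has a documented backup; other components do not.
    if requested == "rotd50" and fallback in available:
        return fallback
    raise KeyError(f"component {requested!r} not found in simulation data")


# Older dataset without rotd50: geom is used instead.
print(resolve_component("rotd50", {"000", "090", "geom"}))    # geom
# Newer dataset: rotd50 is used directly.
print(resolve_component("rotd50", {"000", "090", "rotd50"}))  # rotd50
```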
