Structure of Codebase 

(From the SW Codebase 2019 page: qcore, visualisation and GMSimViz are not listed.

This page is somewhat outdated, but contains some useful background.)

Repository Specific Improvements

Repo | Improvement | Note

slurm_gm_workflow

https://github.com/ucgmsim/slurm_gm_workflow

 

Outstanding issues

  • DB Issue: Fix the lock issue caused by excessive access
  • Remove/update legacy code & parameters and accommodate new environment
  • Deprecate cybershake.json
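
One common way to soften lock contention is to retry writes with a backoff. The following is only a minimal sketch, assuming the management DB is an SQLite file; the function name `execute_with_retry` and its parameters are hypothetical and not taken from slurm_gm_workflow.

```python
import sqlite3
import time


def execute_with_retry(db_path, sql, params=(), retries=5, base_delay=0.1):
    """Run one write statement, retrying with backoff while the DB is locked.

    Hypothetical helper: returns the number of attempts that were needed.
    """
    for attempt in range(1, retries + 1):
        # timeout: how long SQLite itself waits for a lock before raising
        conn = sqlite3.connect(db_path, timeout=5.0)
        try:
            with conn:  # commits on success, rolls back on exception
                conn.execute(sql, params)
            return attempt
        except sqlite3.OperationalError as exc:
            # Only retry lock contention; re-raise other errors or the last try
            if "locked" not in str(exc) or attempt == retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
        finally:
            conn.close()
```

Keeping each connection short-lived, as here, also reduces the time a lock is held. Whether that is enough for the access pattern in question would need to be measured.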

Improvements

  • Separate repos: workflow automation vs calculation
  • Better logging
  • Automated verification/testing
  • Integrate model (srf/vm) into the workflow (with an option to stop before simulation)
  • Estimation performance optimization
  • Automated Visualisation
  • Error handling
  • Realisation name change: AlpineF2K_HYP01-47_S1244 to AlpineF2K_REL01
  • Site-specific binary workflow
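
The realisation rename above could be sketched as a small conversion helper. This assumes the legacy pattern is `<fault>_HYP<index>-<count>_S<seed>` and that the HYP index carries over as the REL number, which matches the single example given but is not confirmed; `to_rel_name` is a hypothetical name.

```python
import re

# Assumed legacy pattern: <fault>_HYP<index>-<count>_S<seed>
_LEGACY_NAME = re.compile(
    r"^(?P<fault>.+)_HYP(?P<index>\d+)-(?P<count>\d+)_S(?P<seed>\d+)$"
)


def to_rel_name(legacy_name):
    """Convert a legacy realisation name to the proposed REL form."""
    match = _LEGACY_NAME.match(legacy_name)
    if match is None:
        raise ValueError("not a legacy realisation name: " + legacy_name)
    # Assumption: the hypocentre index becomes the realisation number
    return "{}_REL{}".format(match.group("fault"), match.group("index"))


print(to_rel_name("AlpineF2K_HYP01-47_S1244"))  # AlpineF2K_REL01
```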

qcore


https://github.com/ucgmsim/qcore

Remove unneeded code/functions

More coherent structure with related functions kept in the same file

Consistent comment style using docstrings and API docs

Expand automated unit test coverage (less than 10%)

(lat.lon).csv → grid.xml currently not used. Plan for PAGER?


Pre-processing
https://github.com/ucgmsim/Pre-processing

 

Better estimation for model generation

Repo restructure : GMSim_model, NonUniformGrid and archive unused legacy code

Incorporate model generation into management DB (See slurm_gm_workflow)

Automated testing for model generation

NonUniformGrid code has minor issues (but low priority, run yearly)


seisfinder (ver.1/ver.2)

https://github.com/ucgmsim/seisfinder2

Regression tests (after scientific validation)

GM selection

Login and user management

Missing ver.1 features:

  • validation document (using gm_publish)
  • custom name
  • PGV map (upon the selection of an event)
  • All im .csv files into one .csv



Visualisation

https://github.com/ucgmsim/visualization

Clean up

Python 3

Refactor plot_stations.py


empirical_engine

https://github.com/ucgmsim/Empirical_Engine

Integrate into hazard workflow (replacing OpenPSHA, no new functionality, but can streamline empdb creation)


ground failure

https://github.com/ucgmsim/GroundFailure

Clean up

validation

https://github.com/ucgmsim/validation

Mixed effect regression workflow to be version-controlled

Add automation

Improve the code quality


GMSimViz

https://github.com/ucgmsim/GMSimViz

 

Specifying regions of interest

gm_publish

https://github.com/ucgmsim/gm_publish

Decide if seisfinder2 needs this

IM_calculation

https://github.com/ucgmsim/IM_calculation

Include just .000 and .090 for geom only (33% speed up)

Calculate RTVZ and RX

Replace Cython spectra with better Python code
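
As an illustration of the ".000 and .090 for geom" item, here is a minimal sketch that assumes "geom" means the geometric mean of the two horizontal components, so the vertical component need not be processed when only geom is requested (one of three components skipped). The function name is hypothetical and not taken from IM_calculation.

```python
import math


def geometric_mean_im(im_000, im_090):
    """Geometric mean of an intensity measure from the two horizontal components."""
    if im_000 < 0 or im_090 < 0:
        raise ValueError("intensity measures must be non-negative")
    return math.sqrt(im_000 * im_090)


# Only the .000 and .090 values are needed; no vertical component is read.
print(geometric_mean_im(4.0, 9.0))  # 6.0
```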


Velocity_Model

https://github.com/ucgmsim/Velocity-Model

-

EMOD3D

https://github.com/ucgmsim/EMOD3D

-


Common Improvements

  • Template for README : Amalgamate README, Codebase wiki page and repo maturity page, and put everything in README.
  • Python 3 and coding & comment style
  • Automated testing and Continuous integration

Stable Release

A repository that satisfies the following criteria will have an official stable release.

  • All the planned functionalities have been developed and tested
  • Codebase has been cleaned up
  • Good comments & documentation (README)
  • Automated testing coverage over 80%

Ideally, the package can be installed via the "pip" command:

pip install qcore

The repositories for which we should aim to produce stable releases are (ordered by impact/risk analysis):

  • IM_calculation
  • Pre-processing
  • Qcore
  • Slurm_gm_workflow 


Moving Forward: Mid April - End of June

  1. Stable release of IM_calc and Pre-processing repos : Initially small team of 2~3 piloting, polishing up the process
    1. IM_Calc: 
      1. Clean up
      2. Finish automated testing
      3. Comment style
      4. README.md 
      5. Release
    2. Pre-processing: 
      1. Better estimation
      2. Split the repo : Model_gen, Grid_gen
      3. Clean up
      4. Automated testing
      5. Comment style
      6. README.md
      7. Release

  2. Slurm_gm_workflow & Qcore restructuring
    1. Slurm_gm_workflow: 
      1. Restructure and split the repo; make all .yaml compliant
      2. Remove/update legacy code
      3. Integrate Pre-processing into automated workflow
      4. Logging
      5. Automated verification
      6. Estimation performance
      7. Visualisation
      8. Error handling
      9. Realisation name change

    2. Qcore
      1. Restructure
      2. Clean up
      3. Comments in Numpy style
      4. Increase automated unit test coverage
      5. API doc
      6. README


  3. SeisFinder2: Feature completion (3-month time frame)
    1. Integration of GM selection
    2. Verification and regression testing etc.
    3. Lite version(?)
  4. README template : Example  https://gist.github.com/PurpleBooth/109311bb0361f32d87a2  : TODO, Changelog etc.
  5. Standard for comments and API docs: NumPy vs Google
  6. Cybershake
    1. Include subduction
    2. Check the performance of new VM and run 1 cycle of Cybershake
    3. HF changes (inc. path duration)
    4. Deaggregate to determine the more relevant faults, then rerun Cybershake with fewer faults at higher resolution.










3 Comments

  1. Some random thoughts from my part here. Feel free to remove if clutter.

    • How do you version the whole system? ie. you have n repos, which won't have the same version, as they may be improved at different speed.
    • Who will be in charge of the VM code base now? Someone needs to take over that probably. 
    • The gm slurm workflow seems to be huge with respect to other repos. In general, is there a way to reduce the code by improving the architecture?
    • (probably research based) How do you improve HF to be more effective for the cases where the actual fortran code is too slow.
    • How do you version the whole system? ie. you have n repos, which won't have the same version, as they may be improved at different speed. → Each repo will have its own version number once it reaches a "stable" state and we roll out a release. Version numbers will follow the YYYYMMDD-X format.
    • Who will be in charge of the VM code base now? Someone needs to take over that probably. → Jonney 
    • The gm slurm workflow seems to be huge with respect to other repos. In general, is there a way to reduce the code by improving the architecture? → We are considering separation of workflow automation+estimation from actual calculation and make separate repos. 
    • (probably research based) How do you improve HF to be more effective for the cases where the actual fortran code is too slow. → ??? Your suggestions welcome 
  2. With regard to versioning the whole system, I think that specifying dependent versions of repositories in the setup.py should allow you to "time-travel" effectively, as you know that this version is compatible with that version at this point in time.


    slurm workflow seems to have a lot of moving parts. It's been a huge focus for us in development this year, alongside our efforts on automation.
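
For the YYYYMMDD-X version format mentioned in the comments, a minimal sketch of how such versions could be parsed and compared. It assumes X is a positive integer release counter within a day, which the page does not spell out; `parse_version` is a hypothetical helper.

```python
import re
from datetime import datetime

_VERSION = re.compile(r"^(\d{8})-(\d+)$")


def parse_version(text):
    """Parse 'YYYYMMDD-X' into a (date, counter) tuple that sorts correctly."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError("expected YYYYMMDD-X, got: " + text)
    # strptime also rejects impossible dates such as 20191301
    release_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    return (release_date, int(match.group(2)))


# Tuples compare by date first, then by counter
print(parse_version("20190501-1") > parse_version("20190415-10"))  # True
```

Comparing parsed tuples rather than raw strings avoids "20190415-10" sorting before "20190415-2".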