Before SRF generation can be integrated into the automated workflow, a number of steps must be completed.

Known steps required

Median task inclusion

As a number of tasks in the automated workflow depend on steps that are common to all realisations of an event (e.g. VM generation), median realisation tasks should be added to the workflow database.

This will allow responsibility for common tasks to be moved from the first realisation to the median.

This will require adding an extra set of tasks to the database for the median case.

Median task dependencies

Realisation tasks should be able to specify that the required parent task is to be checked from the median realisation instead of the current one.

This will allow for common tasks to be handled by the median case, and realisations to continue once they're done.

The implementation is yet to be decided. The current plan is to add a REL/MEDIAN enum to qcore/constants to indicate which realisation to check.
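As a minimal sketch of the enum idea, assuming names that are not yet part of qcore: `DependencyTarget`, `parent_is_complete`, the task names, and the convention that the median run is named after the event are all hypothetical and only illustrate how a requirement could say where its parent task is checked.

```python
from enum import Enum
from typing import Set, Tuple


class DependencyTarget(Enum):
    """Hypothetical enum: which realisation a parent task is checked on."""

    REL = "rel"
    MEDIAN = "median"


def parent_is_complete(
    completed: Set[Tuple[str, str]],
    event: str,
    realisation: str,
    parent_task: str,
    target: DependencyTarget,
) -> bool:
    """Check a parent task on either the median or the current realisation.

    `completed` holds (run_name, task_name) pairs. The median run is assumed
    to be named after the event itself, which is an illustrative convention.
    """
    run_name = event if target is DependencyTarget.MEDIAN else realisation
    return (run_name, parent_task) in completed


# Median VM generation is done; the realisation has no VM task of its own.
completed = {("EventA", "VM_GEN")}
median_ok = parent_is_complete(
    completed, "EventA", "EventA_REL01", "VM_GEN", DependencyTarget.MEDIAN
)
rel_ok = parent_is_complete(
    completed, "EventA", "EventA_REL01", "VM_GEN", DependencyTarget.REL
)
```

Here the same parent task resolves differently depending on the target, which is the behaviour the dependency checker would need.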

Modify realisation file generation scripts

These scripts should take an output argument: a file for a single realisation, or a directory for multiple realisations. This will allow individual realisations to generate their own realisation files instead of relying on the median to generate them all (although having the median generate them all would also be acceptable).
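The output argument could look something like the sketch below. The argument names, the `_RELnn` naming pattern and the `.csv` extension are assumptions made for illustration and are not taken from the existing scripts.

```python
import argparse
from pathlib import Path
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate realisation files")
    parser.add_argument("event_name")
    parser.add_argument("--n-realisations", type=int, default=1)
    parser.add_argument(
        "--output",
        type=Path,
        required=True,
        help="File path for a single realisation, directory for several",
    )
    return parser


def output_paths(event_name: str, n_realisations: int, output: Path) -> List[Path]:
    """Resolve where each realisation file should be written."""
    if n_realisations == 1:
        # Single realisation: the output argument is the file itself.
        return [output]
    # Multiple realisations: the output argument is a directory.
    return [
        output / f"{event_name}_REL{i:02d}.csv"
        for i in range(1, n_realisations + 1)
    ]


args = build_parser().parse_args(["EventA", "--n-realisations", "2", "--output", "out"])
paths = output_paths(args.event_name, args.n_realisations, args.output)
```

With this shape, a single realisation can call the script with a file path, while the median can still pass a directory to generate every file in one go.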

Add HPC scripts to run rel file generation and srf generation

Simple wrapper scripts to generate realisation and srf files should be added to the workflow, to actually run rel/srf generation.

Trim database tasks

Some tasks, such as VM rel/file gen, are not intended to be run for realisations. These should not be added to the database except for the median case.
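One possible way to express this filter when populating the database is sketched below. The task names and the `MEDIAN_ONLY_TASKS` set are placeholders, not the workflow's actual process types.

```python
from typing import List, Tuple

# Hypothetical task names, for illustration only.
ALL_TASKS = ("VM_GEN", "REL_FILE_GEN", "SRF_GEN", "INSTALL_REALISATION")
MEDIAN_ONLY_TASKS = frozenset({"VM_GEN", "REL_FILE_GEN"})


def tasks_to_insert(run_name: str, is_median: bool) -> List[Tuple[str, str]]:
    """Return the (run_name, task) rows to add to the database for one run.

    Median-only tasks are skipped for ordinary realisations.
    """
    return [
        (run_name, task)
        for task in ALL_TASKS
        if is_median or task not in MEDIAN_ONLY_TASKS
    ]


median_rows = tasks_to_insert("EventA", is_median=True)
rel_rows = tasks_to_insert("EventA_REL01", is_median=False)
```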

Update run file keywords

Currently the run specification file only allows a few keywords (NONE, ALL, ONCE). ONCE should be updated to run only the median, and should potentially be renamed MEDIAN. The keyword REL_ONLY should also be added for cases where the median should not be run.
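The proposed keyword set could be interpreted roughly as follows. This is a sketch that assumes ONCE has been renamed MEDIAN; `RunKeyword` and `should_run` are hypothetical names rather than existing workflow code.

```python
from enum import Enum


class RunKeyword(Enum):
    """Proposed run specification keywords (assumes ONCE is renamed MEDIAN)."""

    NONE = "NONE"
    ALL = "ALL"
    MEDIAN = "MEDIAN"
    REL_ONLY = "REL_ONLY"


def should_run(keyword: RunKeyword, is_median: bool) -> bool:
    """Decide whether a task runs for a given realisation."""
    if keyword is RunKeyword.ALL:
        return True
    if keyword is RunKeyword.MEDIAN:
        return is_median
    if keyword is RunKeyword.REL_ONLY:
        return not is_median
    return False  # RunKeyword.NONE
```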

Decisions to be made

Realisation installation requirements

As realisation installation may depend on VM perturbation file generation (amongst others), a dummy step may be needed as a substitute requirement. There may be an alternative way of handling this, though.

Handling realisation/median dependency switch

Some sort of flag or other knowledge needs to be added to the automated workflow to inform the dependency checker whether to look at the median or realisation tasks when checking dependencies for a given step.

The two main steps with median dependencies are realisation installation (relies on median VM generation) and VM perturbation file generation (relies on median VM_params generation).

There are currently two candidate methods of handling this. The first is to make ProcessType requirements specify whether each requirement comes from the median or the realisation. The second is to hard code these steps into the dependency checker.

The first method would have a lot of wasted code, as 90% of tasks are realisation only, while the second method would require things to be hard coded, away from where requirements are set.
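To make the trade-off concrete, the sketch below expresses both options for the two median dependencies named above. All names are hypothetical. It also shows one way the first option might limit boilerplate: a bare task name defaults to the realisation, so only the exceptions need tagging.

```python
from typing import Dict, List, Tuple, Union

# A requirement is either a bare task name (realisation by default)
# or a (task name, source) pair. Task names here are placeholders.
Requirement = Union[str, Tuple[str, str]]

# Option 1: the source is declared next to the requirement itself.
REQUIREMENTS: Dict[str, List[Requirement]] = {
    "INSTALL_REALISATION": [("VM_GEN", "median")],
    "VM_PERT": [("VM_PARAMS", "median")],
    "SRF_GEN": ["REL_FILE_GEN"],
}


def normalise(requirement: Requirement) -> Tuple[str, str]:
    """Expand a bare task name into a (task, 'rel') pair."""
    if isinstance(requirement, str):
        return requirement, "rel"
    return requirement


# Option 2: the exceptions are hard coded inside the dependency checker.
MEDIAN_PARENTS = {
    ("INSTALL_REALISATION", "VM_GEN"),
    ("VM_PERT", "VM_PARAMS"),
}


def source_hard_coded(task: str, parent: str) -> str:
    return "median" if (task, parent) in MEDIAN_PARENTS else "rel"


# Both options should agree on where each parent is checked.
agree = all(
    normalise(req)[1] == source_hard_coded(task, normalise(req)[0])
    for task, reqs in REQUIREMENTS.items()
    for req in reqs
)
```

Both options encode the same information; the difference is whether it lives beside the requirement definitions or inside the checker.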

Future work

There is potential for further work after this, such as median tasks that rely on all realisations having completed first (for example, various plotting or IM/other data aggregation scripts).
