Some of the information here is outdated. For the latest information, see Dashboard update.

Runs using [dash]. The app is located in workflow/dashboard/run_dashboard_app.py, and data collection is in run_data_collection.py.

Currently shows:

  • Number of free nodes on maui (only current data in db, no history)
  • Queue on maui (only current data in db, no history)
  • Daily core hour usage (daily history)
  • Total core hour usage (daily history)

To Do:

  • Add free inode count status (script available)
  • Core hour usage tracking per user (script available)
  • Collect old data and add to db
  • Inodes daily? 
  • Make queue table interactive (e.g. filtering by account/user)
  • Improve data collection error handling. Currently data collection stops as soon as any error occurs
    while collecting data from the HPC. It needs to be more robust and quit only when there is an
    issue logging in (to prevent locking of the user account).
  • Some styling....
  • Whatever else we want to log
  • Some sort of alert when data collection fails
  • Use sreport instead of nn_corehours_usage
  • Make sure physical core hours are logged, not hyperthreaded...
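
The error handling item above could look roughly like the sketch below. It is not the current code: `LoginError`, `collect_all`, and the idea of passing collectors as a dict of callables are all assumptions made for illustration. Ordinary failures are logged and skipped so that the remaining metrics are still collected, while a login failure stops the run straight away to avoid locking the user account.

```python
class LoginError(Exception):
    """Hypothetical error raised when logging in to the HPC fails."""


def collect_all(collectors, log=print):
    """Run every collector; skip ordinary failures, stop on a login failure.

    `collectors` maps a metric name to a zero-argument callable (assumed
    interface). Returns (results, failed), where `failed` lists the names
    of the metrics that could not be collected.
    """
    results, failed = {}, []
    for name, collect in collectors.items():
        try:
            results[name] = collect()
        except LoginError:
            # Do not retry: repeated failed logins could lock the account.
            log(f"login failed while collecting {name}; stopping")
            raise
        except Exception as exc:
            # Any other error only affects this metric; carry on with the rest.
            log(f"collection of {name} failed: {exc}")
            failed.append(name)
    return results, failed
```

The returned `failed` list could also feed the alert mentioned in the list above, for example by sending a message whenever it is non-empty.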


Prototype, currently just running on my local machine: http://132.181.63.86:8050/

(Data from the weekend is missing as data-collection failed for some reason)



------------------------------ OLD --------------------------------------------------------------------------

Done:

  • Added DashboardDB class
  • Updated dashboard script to populate DB
    • Currently only supports daily core hour usage and squeue


ToDo:

  • DashboardDB
    • Use sreport to avoid parsing?
    • Add support for more metrics
      • Daily usage per user
      • Daily INode status
      • Current INode status (i.e. "live"(ish))
      • Node capacity?
    • Add support for cybershake progress 
  • Create Dashboard Prototype
    • simple static plots with matplotlib (i.e. just images)
  • Create interactive Dashboard plots with some javascript plotting library
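
On the sreport question above: if sreport is run with parsable, pipe-delimited output, the parsing becomes a simple split per line. The sketch below only parses text and does not call sreport itself. The column order (cluster, account, login, proper name, used) is an assumption and should be checked against the real output on the cluster; `parse_sreport` is a hypothetical helper name.

```python
def parse_sreport(text):
    """Parse pipe-delimited usage rows into (account, login, hours) tuples.

    Assumes each row starts with: cluster|account|login|proper name|used
    (assumed column order, not verified against the real sreport output).
    """
    rows = []
    for line in text.strip().splitlines():
        fields = line.split("|")
        if len(fields) < 5:
            # Skip blank or malformed lines instead of failing the whole run.
            continue
        _cluster, account, login, _name, used = fields[:5]
        rows.append((account, login, float(used)))
    return rows


# Made-up sample input, for illustration only.
sample = "maui|proj01|alice|Alice A|80|0\nmaui|proj01|bob|Bob B|40|0"
usage = parse_sreport(sample)
```

Here `usage` holds one tuple per user, which could be written to the db as daily usage per user.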
