As part of expanding the test suite for the automated workflow, we wanted to add tests that would stress test the database.

While developing one of these tests, we found an issue with database access.

After a certain number of writes, the database could no longer be written to, and writes failed with the error:

"SQLite unable to open database file"

When the auto_submit wrapper, or the queue_monitor/auto_submit pair, was restarted, it was able to perform one write before the issue recurred.

If the file was opened from a Python interpreter, it could still be read and modified.
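A read/write check of this kind can be run from an interpreter with the standard library's sqlite3 module. The following is a minimal sketch, not the commands that were actually used: the function name `check_db_access` and the scratch table `_access_check` are hypothetical, and the write is rolled back so the file is left unchanged.

```python
import sqlite3


def check_db_access(db_path):
    """Return (can_read, can_write) for the SQLite file at db_path."""
    can_read = can_write = False
    try:
        # isolation_level=None disables Python's implicit transaction
        # handling so that BEGIN/ROLLBACK can be issued explicitly.
        conn = sqlite3.connect(db_path, timeout=5, isolation_level=None)
    except sqlite3.Error:
        return can_read, can_write
    try:
        # Reading the schema table forces SQLite to read the file.
        conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        can_read = True
        # BEGIN IMMEDIATE takes the write lock straight away.
        conn.execute("BEGIN IMMEDIATE")
        # Hypothetical scratch table, assumed not to exist in the database.
        conn.execute("CREATE TABLE _access_check (id INTEGER)")
        conn.execute("INSERT INTO _access_check VALUES (1)")
        # Undo the scratch table and row so the database is not altered.
        conn.execute("ROLLBACK")
        can_write = True
    except sqlite3.Error:
        pass
    finally:
        conn.close()
    return can_read, can_write
```

Calling `check_db_access("path/to/database.db")` (the path is a placeholder) while the workflow is failing shows whether a separate process can still read and write the file.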

It was found that moving the file made it usable again; however, it would eventually stop working in the same way that a new database file would. In one attempt the file was copied instead, and the copy behaved in the same way as the original when the simulation run was resumed, crashing after the first write.

This issue was investigated through multiple avenues, including attempting to replicate it using the master branch of the automated workflow, and creating a test that would emulate a heavy workload without submitting tasks to Slurm.

Neither of these tests was able to replicate the issue.
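For reference, a test of the second kind, which emulates a heavy write workload without Slurm, could be sketched as below. This is an illustration under stated assumptions rather than the test that was written: the function name `stress_writes`, the `stress_log` table and its schema are hypothetical, and opening a fresh connection for every write is an assumption about how the workflow accesses the database.

```python
import sqlite3


def stress_writes(db_path, n_writes=1000):
    """Insert n_writes rows, one connection per write.

    Returns (number_of_successful_writes, first_error_message_or_None).
    """
    # Hypothetical table; the real workflow schema is not reproduced here.
    setup = sqlite3.connect(db_path, timeout=5)
    try:
        setup.execute(
            "CREATE TABLE IF NOT EXISTS stress_log "
            "(id INTEGER PRIMARY KEY, payload TEXT)"
        )
        setup.commit()
    finally:
        setup.close()

    n_ok = 0
    for i in range(n_writes):
        try:
            # Assumption: each write opens and closes its own connection.
            conn = sqlite3.connect(db_path, timeout=5)
            try:
                # "with conn" commits on success and rolls back on error.
                with conn:
                    conn.execute(
                        "INSERT INTO stress_log (payload) VALUES (?)",
                        (f"task-{i}",),
                    )
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            # Stop at the first failure and report how far the run got.
            return n_ok, str(exc)
        n_ok += 1
    return n_ok, None
```

A run that reproduced the problem would return a count below `n_writes` together with the error text; a clean run returns `(n_writes, None)`.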

As this error only occurred in the newly developed branch, it was decided that development of the branch would be discontinued and the issue documented in case it recurred.
