George Datseris, May 29, 2019
Hello everyone,
I’d like to present a new software that was recently released in open beta, called DrWatson, the perfect sidekick to your scientific inquiries.
DrWatson is what I’ll call a “scientific project assistant”. It is software package created to help people “deal” with their simulations, simulation parameters, where are files saved, experimental data, scripts, existing simulations, project source code and in general their scientific projects. Although it is written in Julia, I believe its principles, practices and how it can assist you in managing a project can be applied to any other language.
You can find the full documentation here:
https://juliadynamics.github.io/DrWatson.jl/dev/
In this blogpost I will summarize what this software can do for you and how can it help you by reducing the stress of managing a scientific project. Please be aware that DrWatson is not a database management system.
DrWarson is a scientific project assistant software package. Its core functionality is separated into 4, almost completely independent parts:
are the principles that we based our design decisions.
struct
or other data types.Let me now show two real life examples of using DrWatson.
In simulations it is often the case that we used a named container (like a dictionary) as something to hold our parameters in. For example
c = (T = 100, dt = 0.1, N = 25, mode = “bi”)
then calling the function savename
that DrWatson offers results in
savename(c)
"N=25_T=100_dt=0.1_mode=bi"
while you can also add prefix/suffix
savename(datadir(), c, “dat”)
"path/to/data/N=25_T=100_dt=0.1_mode=bi.dat"
What is important is that the function savename
works for any named container, which includes dictionaries, named tuples or any Julia composite structure (existing or to-be-created in the future).
In addition, it sensibly excludes non-fitting fields (e.g. vector-valued fields) but can be customized to full extent for your custom type.
Let’s say that you have run some simulations with parameters a = 1 or 2, b = “x” or “y” and c = 0.3. You save these simulations in the directory “data”. for the purpose of this example you only save the parameter values but no other further data (like actual results)
You can then call the function
df1 = collect_results("data/")
4×3 DataFrame
│ Row │ a │ b │ c │
│ │ Int64⍰ │ String⍰ │ Float64⍰ │
├─────┼────────┼─────────┼──────────┤
│ 1 │ 1 │ x │ 0.3 │
│ 2 │ 1 │ y │ 0.3 │
│ 3 │ 2 │ x │ 0.3 │
│ 4 │ 2 │ y │ 0.3 │
This is already quite convenient. But where collect_results
really shine is the following. You now run a new simulation with a=4, b=“z”, a new parameter d=0.5 or 0.6 and the parameter c does not exist anymore! You again save these simulations in “data”. Then you call collect_results
again:
df2 = collect_results("data/")
6×4 DataFrame
│ Row │ a │ b │ c │ d │
│ │ Int64⍰ │ String⍰ │ Float64⍰ │ Float64⍰ │
├─────┼────────┼─────────┼──────────┼──────────┤
│ 1 │ 1 │ x │ 0.3 │ missing │
│ 2 │ 1 │ y │ 0.3 │ missing │
│ 3 │ 2 │ x │ 0.3 │ missing │
│ 4 │ 2 │ y │ 0.3 │ missing │
│ 5 │ 4 │ z │ missing │ 0.5 │
│ 6 │ 4 │ z │ missing │ 0.6 │
As you can see the collection scheme has no problem with combining simulations with different parameters. This is especially useful for a scientific project, since it is almost impossible to know in advance all simulations you would have to run by the end of the project!
I hope that DrWatson can help you and I hope you will be interested to discuss during the deRSE2019 meeting! See you all there! If you are eager to contact us about DrWatson before deRSE2019, please use the GitHub repository:
https://github.com/JuliaDynamics/DrWatson.jl
best regards, George Datseris