Alexandra K. Diem

Personal Website.

Managing simulation runs using pandas

At work I run the same simulation code many times, each time with slightly different parameters. Simulation data can take up quite a bit of space, so I often want to store it somewhere other than my laptop. I try my best to use sensible folder names, but six or so months later I usually have to look into several folders to figure out which one contains the data I am looking for. The Python library pandas provides a much more elegant solution to this problem: I only need to store the metadata in a text file, which I can easily keep on my laptop and which points me to the folder in question.

The solution is based on storing the simulation parameters in a .cfg file. These look something like this:

[Simulation]
id = simulation0
solver = direct
debug = 0

[Parameter]
N = 2
TOL = 1e-7
rho = 1000 * kg/m**3
K = 1e-7 * m**2/Pa/s
phi = 0.1
beta = 1
qi = 0
qo = 0
tf = 0.5 * s
dt = 0.1 * s
theta = 0.5

Using pandas, we can create scripts that automatically filter this type of metadata for certain parameter values, so that we can quickly figure out which name we gave to our data folder. This means that it no longer matters what we call our data folders, and we can automate the naming process, for example by using random numbers (or by simply counting from 0).
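As a rough sketch of what automated naming by counting from 0 could look like: the helper names next_simulation_id and write_config below are made up for illustration, and I am assuming that each run's .cfg file lives in a single folder and is named after its id.

```python
import os
from configparser import ConfigParser


def next_simulation_id(data_dir):
    """Return the first unused id of the form simulation<N>, counting from 0."""
    n = 0
    while os.path.exists(os.path.join(data_dir, f"simulation{n}.cfg")):
        n += 1
    return f"simulation{n}"


def write_config(data_dir, parameters, solver="direct", debug=0):
    """Write a .cfg file for a new run and return its path (hypothetical helper)."""
    os.makedirs(data_dir, exist_ok=True)
    sim_id = next_simulation_id(data_dir)

    config = ConfigParser()
    config.optionxform = str  # keep parameter names such as TOL case-sensitive
    config["Simulation"] = {"id": sim_id, "solver": solver, "debug": str(debug)}
    # ConfigParser only accepts strings as values, so convert everything.
    config["Parameter"] = {key: str(value) for key, value in parameters.items()}

    path = os.path.join(data_dir, sim_id + ".cfg")
    with open(path, "w") as f:
        config.write(f)
    return path
```

Calling write_config("./data", {"N": 2, "TOL": "1e-7"}) twice in a row would then produce simulation0.cfg and simulation1.cfg without me having to think of a name.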

We need to import the following libraries into Python:

import pandas
import glob
from configparser import ConfigParser

glob.glob returns a list of all paths matching a pattern,

files = glob.glob("./data/*.cfg")
files

such that the output looks similar to this:

['./data/simulation1.cfg',
 './data/simulation3.cfg',
 './data/simulation0.cfg',
 './data/simulation2.cfg']
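Note that glob makes no promise about the order of the paths, which is why the list above is not sorted. If a reproducible order matters, wrapping the call in sorted() is one option. The small sketch below uses a temporary folder with empty placeholder files, purely for illustration:

```python
import glob
import os
import tempfile

# Create a few empty placeholder files in a temporary folder (illustration only).
data_dir = tempfile.mkdtemp()
for name in ["simulation1.cfg", "simulation3.cfg", "simulation0.cfg", "simulation2.cfg"]:
    open(os.path.join(data_dir, name), "w").close()

# sorted() gives a reproducible, alphabetical order regardless of the file system.
files = sorted(glob.glob(os.path.join(data_dir, "*.cfg")))
names = [os.path.basename(f) for f in files]
print(names)
```

Keep in mind that the order is alphabetical, so simulation10.cfg would come before simulation2.cfg.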

We initialise a ConfigParser to read the .cfg files and tell it which section(s) we are interested in:

config = ConfigParser()
config.optionxform = str
sections = ['Simulation', 'Parameter']
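The optionxform line deserves a short explanation: by default, ConfigParser converts all option names to lower case, which would turn TOL into tol and K into k. Setting optionxform to str keeps the names exactly as written. A minimal comparison, using a made-up two-line [Parameter] section:

```python
from configparser import ConfigParser

text = """
[Parameter]
TOL = 1e-7
K = 1e-7 * m**2/Pa/s
"""

# Default behaviour: option names are lower-cased.
default = ConfigParser()
default.read_string(text)
default_keys = [key for key, _ in default.items("Parameter")]

# With optionxform = str the names are kept as written in the file.
preserving = ConfigParser()
preserving.optionxform = str
preserving.read_string(text)
preserved_keys = [key for key, _ in preserving.items("Parameter")]

print(default_keys)    # ['tol', 'k']
print(preserved_keys)  # ['TOL', 'K']
```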

Create a dictionary of dictionaries holding all simulation parameters from the files. The file name without its directory serves as the key for each parameter dictionary d in data.

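data = {}
for file in files:
    # Use a fresh parser per file so that values from a previously
    # read file cannot leak into this one
    config = ConfigParser()
    config.optionxform = str  # keep parameter names case-sensitive
    config.read(file)
    d = {}
    for section in sections:
        # Merge the (key, value) pairs of every section into one dictionary
        d.update(config.items(section))
    # Strip the directory, keeping e.g. "simulation0.cfg"
    fname = file.split("/")[-1]
    data[fname] = d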

Create a pandas table from the dictionary data:

tab = pandas.DataFrame.from_dict(data, orient="index")
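To see what orient="index" does, here is a tiny sketch with a hand-written dictionary (the values are only for illustration). The outer keys become the row labels, the inner keys become the columns, and a parameter that is missing from one file shows up as NaN rather than raising an error:

```python
import pandas

# Hand-written stand-in for the dictionary built from the .cfg files.
data = {
    "simulation0.cfg": {"beta": "1", "qi": "0"},
    "simulation2.cfg": {"beta": "0.1"},  # illustration: no qi entry in this file
}

tab = pandas.DataFrame.from_dict(data, orient="index")
print(tab)
```

Without orient="index" the table would be transposed, with one column per file and one row per parameter.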

Now we can look at the values of a parameter for each file in our table:

tab.TOL
simulation0.cfg    1e-7
simulation1.cfg    1e-7
simulation2.cfg    1e-7
simulation3.cfg    1e-7
Name: TOL, dtype: object

tab.beta
simulation0.cfg      1
simulation1.cfg      1
simulation2.cfg    0.1
simulation3.cfg      1
Name: beta, dtype: object

tab.qi
simulation0.cfg      0
simulation1.cfg      0
simulation2.cfg      0
simulation3.cfg    0.1
Name: qi, dtype: object

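We can also filter by parameter values. Since ConfigParser reads every value as a string (hence dtype: object above), we compare against strings: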

tab[tab.beta == '0.1']
                 N  TOL   rho             K                 phi  beta  qi  qo  tf       dt       theta
simulation2.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1  0.1   0   0   0.5 * s  0.1 * s  0.5

tab[tab.qi == '0']
                 N  TOL   rho             K                 phi  beta  qi  qo  tf       dt       theta
simulation0.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1  1     0   0   0.5 * s  0.1 * s  0.5
simulation1.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.2  1     0   0   0.5 * s  0.1 * s  0.5
simulation2.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1  0.1   0   0   0.5 * s  0.1 * s  0.5
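One limitation of comparing strings is that '0.1' does not match '0.10' or '1e-1', and range queries such as beta < 1 are not possible. A possible workaround, sketched below, is to convert the purely numeric columns with pandas.to_numeric first. The table here is rebuilt by hand from the beta and qi values shown above, and columns that carry units (such as rho or tf) are left out on purpose because they would not convert as they are:

```python
import pandas

# beta and qi values as shown in the output above, still stored as strings.
tab = pandas.DataFrame.from_dict(
    {
        "simulation0.cfg": {"beta": "1", "qi": "0"},
        "simulation1.cfg": {"beta": "1", "qi": "0"},
        "simulation2.cfg": {"beta": "0.1", "qi": "0"},
        "simulation3.cfg": {"beta": "1", "qi": "0.1"},
    },
    orient="index",
)

# Convert the unit-free columns to numbers on a copy of the table.
numeric = tab.copy()
for column in ["beta", "qi"]:
    numeric[column] = pandas.to_numeric(numeric[column])

# Numeric comparisons and combined conditions now work as expected.
matches = numeric[(numeric.beta < 1) & (numeric.qi == 0)]
print(list(matches.index))  # ['simulation2.cfg']
```

The row labels of the filtered table are the .cfg file names, which is exactly the information needed to find the matching data folder.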