Alexandra K. Diem

Personal Website.
"Hiking is an excellent opportunity to scout out backcountry skiing routes while we wait for winter to come back."
During the Besseggen hike we got to check out the first sections of the Jotunheimen Høgruta #hauteroute from afar and I can only imagine what an awesome exciting winter adventure this will make!
#hiking #utpåtur #friluftsliv #turjenter #fjelltur #soveute #fjellsport #allemannsretten #besseggen #jotunheimen #nowaynorway Big boulder. Benoit for scale.
#hiking #utpåtur #friluftsliv #turjenter #fjelltur #soveute #fjellsport #allemannsretten #besseggen #jotunheimen #nowaynorway Jotunheimen needs no filter.
#hiking #utpåtur #friluftsliv #turjenter #fjelltur #soveute #fjellsport #allemannsretten #besseggen #jotunheimen #nowaynorway "Only about 30 switchbacks left!"
📷 @pvlvs
#mitziandfriends #radrace #tourdefriends #tourdefriends3 #tdf #storyoftheday #oneobsession #photooftheday #worldbybike #outsideisfree #fromwhereiride #cycling #womencycling #ladieseditioncc #landevei #utno #vegtur

Managing simulation runs using pandas

At work I run a lot of simulations of the same code, but using slightly different parameters. Sometimes, simulation data can take up quite a bit of space, so often I want to store those data somewhere other than my laptop. I try my best to use sensible folder names, but often, six or so months later, I have to look into several folders to figure out which one contains the data I am looking for. The Python library pandas provides a much more elegant solution to this problem, that only requires me to store meta data in a text file, which I can easily keep on my laptop, that will point me to the folder in question.

The solution is based on the storing simulation parameters in a .cfg file. These look something like this:

id = simulation0
solver = direct
debug = 0

N = 2
TOL = 1e-7
rho = 1000 * kg/m**3
K = 1e-7 * m**2/Pa/s
phi = 0.1
beta = 1
qi = 0
qo = 0
tf = 0.5 * s
dt = 0.1 * s
theta = 0.5

Using pandas, we can create scripts that automatically filter this type of meta data for certain parameter values, so that we can quickly figure out, which name we gave to our data folder. This means that now it doesn’t matter anymore what we call our data folders and we can automate the naming process by for example using random numbers (or just simply count from 0).

We need to import the following libraries into Python:

import pandas
import glob
from configparser import ConfigParser

Glob returns a list of all paths fitting a pattern,

files = glob.glob("./data/*.cfg")

such that the output looks similar to this:


We initialise a ConfigParser to read the .cfg files and tell it which section(s) we are interested in:

config = ConfigParser()
config.optionxform = str
sections = ['Simulation', 'Parameter']

Create a dictionary of dictionaries holding all simulation parameters from the files. The truncated file name serves as the key for each parameter dictionary d in data.

data = {}
for file in files:
    d = {}
    for section in sections:
        options = config.items(section)
        for key, value in options:
            d[key] = value
    fname = file.split("/")[-1]
    data[fname] = d

Create a pandas table from the dictionary data

tab = pandas.DataFrame.from_dict(data, orient="index")

Now we can look at the values of a parameter for each file in our table:

simulation0.cfg    1e-7
simulation1.cfg    1e-7
simulation2.cfg    1e-7
simulation3.cfg    1e-7
Name: TOL, dtype: object
simulation0.cfg      1
simulation1.cfg      1
simulation2.cfg    0.1
simulation3.cfg      1
Name: beta, dtype: object
simulation0.cfg      0
simulation1.cfg      0
simulation2.cfg      0
simulation3.cfg    0.1
Name: qi, dtype: object

We can also filter by parameter values

tab[tab.beta == '0.1']
N TOL rho K phi beta qi qo tf dt theta
simulation2.cfg 2 1e-7 1000 * kg/m**3 1e-7 * m**2/Pa/s 0.1 0.1 0 0 0.5 * s 0.1 * s 0.5
tab[tab.qi == '0']
N TOL rho K phi beta qi qo tf dt theta
simulation0.cfg 2 1e-7 1000 * kg/m**3 1e-7 * m**2/Pa/s 0.1 1 0 0 0.5 * s 0.1 * s 0.5
simulation1.cfg 2 1e-7 1000 * kg/m**3 1e-7 * m**2/Pa/s 0.2 1 0 0 0.5 * s 0.1 * s 0.5
simulation2.cfg 2 1e-7 1000 * kg/m**3 1e-7 * m**2/Pa/s 0.1 0.1 0 0 0.5 * s 0.1 * s 0.5