Alexandra K. Diem

Personal Website.

Managing simulation runs using pandas

At work I run many simulations of the same code with slightly different parameters. Simulation data can take up quite a bit of space, so I often want to store it somewhere other than my laptop. I try my best to use sensible folder names, but six or so months later I often have to look into several folders to figure out which one contains the data I am looking for. The Python library pandas provides a much more elegant solution to this problem: I only need to store metadata in a text file, which I can easily keep on my laptop and which points me to the folder in question.

The solution is based on storing the simulation parameters in a .cfg file. These files look something like this:

[Simulation]
id = simulation0
solver = direct
debug = 0

[Parameter]
N = 2
TOL = 1e-7
rho = 1000 * kg/m**3
K = 1e-7 * m**2/Pa/s
phi = 0.1
beta = 1
qi = 0
qo = 0
tf = 0.5 * s
dt = 0.1 * s
theta = 0.5

Using pandas, we can write scripts that automatically filter this type of metadata for certain parameter values, so that we can quickly find out which name we gave to our data folder. It then no longer matters what we call our data folders, and we can automate the naming process, for example by using random numbers (or simply counting from 0).
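The counting variant of the automated naming could look roughly like the sketch below. The function names next_simulation_id and write_config are made up for this illustration, and the sketch assumes that every run writes its .cfg file into one shared directory and that no file is ever deleted (otherwise counting the existing files would hand out a name twice).

```python
import glob
import os
from configparser import ConfigParser


def next_simulation_id(data_dir):
    """Return the next free name of the form simulationN, counting from 0."""
    # Hypothetical helper: the count of existing files is the next number
    existing = glob.glob(os.path.join(data_dir, "simulation*.cfg"))
    return "simulation{}".format(len(existing))


def write_config(data_dir, simulation, parameters):
    """Write one .cfg file with a [Simulation] and a [Parameter] section."""
    os.makedirs(data_dir, exist_ok=True)
    sim_id = next_simulation_id(data_dir)
    config = ConfigParser()
    config.optionxform = str  # keep the case of keys such as TOL and K
    # The automatically chosen name is stored as the id of the run
    config["Simulation"] = dict(simulation, id=sim_id)
    config["Parameter"] = parameters
    path = os.path.join(data_dir, sim_id + ".cfg")
    with open(path, "w") as f:
        config.write(f)
    return path
```

Calling write_config("./data", {"solver": "direct", "debug": "0"}, {"TOL": "1e-7", "beta": "1"}) on an empty directory would create ./data/simulation0.cfg, and the next call would create simulation1.cfg.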

We need to import the following libraries into Python:

import pandas
import glob
from configparser import ConfigParser

glob.glob returns a list of all paths matching a pattern,

files = glob.glob("./data/*.cfg")

such that files holds one path per .cfg file in ./data, for example ./data/simulation0.cfg.

We initialise a ConfigParser to read the .cfg files and tell it which section(s) we are interested in:

config = ConfigParser()
config.optionxform = str
sections = ['Simulation', 'Parameter']
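The line config.optionxform = str deserves a short explanation: by default, ConfigParser converts all option names to lower case, so TOL would turn into tol. Setting optionxform to str keeps the names as written. A small sketch, using an inline string instead of a file:

```python
from configparser import ConfigParser

# A two-line stand-in for the [Parameter] section of a .cfg file
text = "[Parameter]\nTOL = 1e-7\nK = 1e-7 * m**2/Pa/s\n"

# Default behaviour: option names are lower-cased
default = ConfigParser()
default.read_string(text)
default_keys = [key for key, _ in default.items("Parameter")]
print(default_keys)  # ['tol', 'k']

# With optionxform = str the names keep their case
preserved = ConfigParser()
preserved.optionxform = str
preserved.read_string(text)
preserved_keys = [key for key, _ in preserved.items("Parameter")]
print(preserved_keys)  # ['TOL', 'K']
```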

Next we create a dictionary of dictionaries holding all simulation parameters from the files. The file name, stripped of its directory, serves as the key for each parameter dictionary d in data.

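data = {}
for file in files:
    # Load this file's contents into the parser before querying it
    config.read(file)
    d = {}
    for section in sections:
        # items() returns the (key, value) pairs of one section
        options = config.items(section)
        for key, value in options:
            d[key] = value
    # Strip the directory so that only the file name is used as key
    fname = file.split("/")[-1]
    data[fname] = d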

Create a pandas table (a DataFrame) from the dictionary data:

tab = pandas.DataFrame.from_dict(data, orient="index")
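The steps so far can be bundled into a single function. The version below is only a sketch: the name load_parameter_table is made up here, and it deviates from the snippets above in two small ways, namely it creates a fresh ConfigParser for every file (so that no option can carry over from a previously read file) and it skips sections that a file does not contain.

```python
import glob
import os
from configparser import ConfigParser

import pandas


def load_parameter_table(pattern, sections=("Simulation", "Parameter")):
    """Collect the options of all matching .cfg files into one DataFrame."""
    data = {}
    # glob returns paths in arbitrary order, so sort them for a stable table
    for path in sorted(glob.glob(pattern)):
        config = ConfigParser()
        config.optionxform = str  # keep option names case-sensitive
        config.read(path)
        d = {}
        for section in sections:
            if config.has_section(section):
                d.update(config.items(section))
        # Only the file name, without its directory, is used as row label
        data[os.path.basename(path)] = d
    return pandas.DataFrame.from_dict(data, orient="index")
```

With this in place, tab = load_parameter_table("./data/*.cfg") gives one row per file and one column per option.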

Now we can look at the values of a parameter for each file in our table:

tab.TOL
simulation0.cfg    1e-7
simulation1.cfg    1e-7
simulation2.cfg    1e-7
simulation3.cfg    1e-7
Name: TOL, dtype: object
tab.beta
simulation0.cfg      1
simulation1.cfg      1
simulation2.cfg    0.1
simulation3.cfg      1
Name: beta, dtype: object
tab.qi
simulation0.cfg      0
simulation1.cfg      0
simulation2.cfg      0
simulation3.cfg    0.1
Name: qi, dtype: object

We can also filter by parameter values. Since all values are read as strings, the value we compare against has to be a string as well:

tab[tab.beta == '0.1']
                 N   TOL             rho                 K  phi  beta  qi  qo       tf       dt  theta
simulation2.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1   0.1   0   0  0.5 * s  0.1 * s    0.5
tab[tab.qi == '0']
                 N   TOL             rho                 K  phi  beta  qi  qo       tf       dt  theta
simulation0.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1     1   0   0  0.5 * s  0.1 * s    0.5
simulation1.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.2     1   0   0  0.5 * s  0.1 * s    0.5
simulation2.cfg  2  1e-7  1000 * kg/m**3  1e-7 * m**2/Pa/s  0.1   0.1   0   0  0.5 * s  0.1 * s    0.5
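Comparing strings works for exact matches, but not for questions like "which runs used beta smaller than 1?". One possible extension, sketched below, is to convert the unitless columns to numbers with pandas.to_numeric and to combine several conditions with &. The small table is rebuilt by hand from the TOL, beta and qi values shown above so that the snippet runs on its own; columns that carry units, such as rho or dt, would have to stay strings because to_numeric cannot parse them. The last line assumes that each data folder carries the same name as its .cfg file without the extension.

```python
import os

import pandas

# TOL, beta and qi as strings, the way ConfigParser returns them
tab = pandas.DataFrame.from_dict(
    {
        "simulation0.cfg": {"TOL": "1e-7", "beta": "1", "qi": "0"},
        "simulation1.cfg": {"TOL": "1e-7", "beta": "1", "qi": "0"},
        "simulation2.cfg": {"TOL": "1e-7", "beta": "0.1", "qi": "0"},
        "simulation3.cfg": {"TOL": "1e-7", "beta": "1", "qi": "0.1"},
    },
    orient="index",
)

# Convert the unitless columns so that numeric comparisons become possible
numeric = tab.copy()
for column in ["TOL", "beta", "qi"]:
    numeric[column] = pandas.to_numeric(numeric[column])

# Combine two conditions: beta below 1 and no inflow qi
matches = numeric[(numeric.beta < 1) & (numeric.qi == 0)]

# Assumption: the data folder is named like the .cfg file minus its extension
folders = [os.path.splitext(name)[0] for name in matches.index]
print(folders)  # ['simulation2']
```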