diff --git a/README.Rmd b/README.Rmd
deleted file mode 100644
index 1629914..0000000
--- a/README.Rmd
+++ /dev/null
@@ -1,504 +0,0 @@
----
-title: "Log data from the Multi-Touch Table at the HAUM"
-output: github_document
----
-
-```{r, include = FALSE}
-devtools::load_all("../../../../software/mtt")
-```
-
-The Multi Touch Table at the Herzog-Anton-Ulrich-Museum (HAUM) in
-Braunschweig gives visitors of the Museum the opportunity to interact with
-about 70 artworks and 3 virtual cards containing information about the
-museum and its layout. The table was installed at the museum in October
-2016 and since November 2016 log files from interactions of visitors of the
-museum have been collected. These log files are in an unstructured format
-and cannot be easily analyzed. The purpose of the following document is to
-describe how the data haven been transformed and which decisions have been
-made along the way.
-
-The implementation of the steps described here can be found at:
-https://gitea.iwm-tuebingen.de/R/mtt.
-
-# Data structure
-
-The log files contain lines that indicate the beginning and end of possible
-activities that can be performed when interacting with the artworks on the
-table. The layout of the table looks like pictures have been tossed on a
-large table. Every artwork is visible at the start configuration. People
-can move the pictures on the table, they can be scaled and rotated.
-Additionally, the virtual picture cards can be flipped in order to find
-more information of the artwork on the "back" of the card. One has to press
-a little `i` for more information in one of the bottom corners of the card.
-On the back of the card two to six information cards can be found with a
-teaser text about a certain topic. These topic cards can be opened and a
-hypertext with detailed information opens. Within these hypertexts certain
-technical terms can be clicked for lay people to get more information. This
-also opens up a pop-up. The events encoded in the raw log files therefore
-have the following structure.
-
-```
-"Start Application"     --> Start Application
-"Show Application"
-"Transform start"       --> Move
-"Transform stop"
-"Show Info"             --> Flip Card
-"Show Front"
-"Artwork/OpenCard"      --> Open Topic
-"Artwork/CloseCard"
-"ShowPopup"             --> Open Popup
-"HidePopup"
-```
-
-The right side shows what events can be extracted from these raw lines. The
-"Start Application" is not an event in the original sense since it only
-indicates if the table was started or maybe reset itself. This is not an
-interaction with the table and therefore not interesting in itself. All
-"Start Application" and "Show Application" are therefore excluded from the
-data when further processed and are only in the raw log files.
-
-# Parsing the raw log files
-
-The first step is to parse the raw log files that are stored by the
-application as text files in a rather unstructured format to a format that
-can be read by common statistics software packages. The data are therefore
-transferred to a spread sheet format. The following section describes what
-problems were encountered while doing this.
-
-## Corrupt lines
-
-When reading the files containing the raw logs into R, a warning appears
-that says
-
-```
-Warning messages:
-  incomplete final line found on '2016/2016_11_18-11_31_0.log'
-  incomplete final line found on '2016/2016_11_18-11_38_30.log'
-  incomplete final line found on '2016/2016_11_18-11_40_36.log'
-  ...
-```
-
-When you open these files, it looks like the last line contains some binary
-content. It is unclear why and how this happens. So when reading the data,
-these lines were removed. A warning will be given that indicates how many
-files have been affected.
-
-## Extracted variables from raw log files
-
-The following variables (columns in the data frame) are extracted from the
-raw log file:
-
-* `fileId`: Containing the zero-left-padded file name of the raw log file
-  the data line has been extracted from
-
-* `folder`: The folder names in which the raw log files haven been
-  organized in. For the HAUM data set, the data are sorted by year (folders
-  2016, 2017, 2018, 2019, 2020, 2021, 2022, and 2023).
-
-* `date`: Extracted timestamp from the raw log file in the format
-  `yyyy-mm-dd hh:mm:ss`.
-
-* `timeMs`: Containing a timestamp in Milliseconds that restarts with
-  every new raw log files.
-
-* `event`: Start and stop event tags. See above for possible values.
-
-* `item`: Identifier of the different items. This is a three-digit
-  (left-padded) number. The numbers of the items correspond to the
-  folder names in `/ContentEyevisit/eyevisit_cards_light/` and were
-  orginally taken from the museums catalogue.
-
-* `popup`: Name of the pop-up opened. This is only interesting for
-  "openPopup" events.
-
-* `topic`: The number of the topic card that has been opened at the back of
-  the item card. See below for a more detailed description what these
-  numbers mean.
-
-* `x`: Value of x-coordinate in pixel on the 4K-Display ($3840 \times 2160$).
-
-* `y`: Value of y-coordinate in pixel.
-
-* `scale`: Number in 128 bit that indicates how much the card has been
-  scaled.
-
-* `rotation`: Degree of rotation from start configuration.
-
-<!-- TODO: Nach welchem Zeitintervall resettet sich der Tisch wieder in die
-  Ausgangskonfiguration? -> PM needs to look it up -->
-
-## Variables after "closing of events"
-
-The raw log data consist of start and stop events for each event type.
-After preprocessing four event types are extracted: `move`, `flipCard`,
-`openTopic`, and `openPopup`. Except for the `move` events, which can occur
-at any time when interacting with an item card on the table, the events
-have a hierarchical order: An item card first needs to be flipped
-(`flipCard`), then the topic cards on the back of the card can be opened
-(`openTopic`), and finally pop-ups on these topic cards can be opened
-(`openPopup`). This implies that the event `openPopup` can only be present
-for a certain item, if the card has already been flipped (i.e., an event
-`flipCard` for the same item has already occured).
-
-After preprocessing, the data frame is now in a wide format with columns
-for the start and the stop of each event and contains the following
-variables:
-
-* `fileId.start` / `fileId.stop`: See above.
-
-* `date.start` / `date.stop`: See above.
-
-* `folder`: Containing the folder name (see above).
-
-* `case`: A numerical variable indicating cases in the data. A "case"
-  indicates an interaction interval and could be defined in different ways.
-  Right now a new case begins, when no event occurred when no new path
-  started for 20 seconds or longer.
-
-* `path`: A path is defined as one interaction with one item. A path
-  can either start with a `flipCard` event or when an item has been
-  touched for the first time within this case. A path ends with the
-  item card being flipped close again or with the last movement of the
-  card within this case. One case can contain several paths with the same
-  item when the item is flipped open and flipped close again several
-  times within a short time.
-
-* `glossar`: An indicator variable with values 0/1 that tracks if a pop-up
-  has been opened from the glossar folder. These pop-ups can be assigned to
-  the wrong item since it is not possible to do this algorithmically.
-  It is possible that two items are flipped open that could both link to
-  the same pop-up from a glossar. The indicator variable is left as a
-  variable, so that these pop-ups can be easily deleted from the data.
-  Right now, glossar entries can be ignored completely by setting an
-  argument and this is done by default. Using the pop-ups from the glossar
-  will need a lot more love, before it behaves satisfactorily.
-
-* `event`: Indicating the event. Can take tha values `move`, `flipCard`,
-  `openTopic`, and `openPopup`.
-
-* `item`: Identifier of the different artworks and information cards. This
-  is a three-digit (left-padded) number. See above.
-
-* `timeMs.start` / `timeMs.stop`: See above.
-
-* `duration`: Calculated by $timeMs.stop - timeMs.start$ in Milliseconds.
-  Needs to be adjusted for events spanning more than one log file by a
-  factor of $60,000 \times \text{number of logfiles}$. See below for details.
-
-* `topic`: See above.
-
-* `popup`: See above.
-
-* `x.start` / `x.stop`: See above.
-
-* `y.start` / `y.stop`: See above.
-
-* `distance`: Euclidean distande calculated from $(x.start, y.start)$ and
-  $(x.stop, y.stop)$.
-
-* `scale.start` / `scale.stop`: See above.
-
-* `scaleSize`: Relative scaling of item card, calculated by
-  $\frac{scale.stop}{scale.start}$.
-
-* `rotation.start` / `rotation.stop`: See above.
-
-* `rotationDegree`: Difference of rotation from $rotation.stop$ to
-  $rotation.start$.
-
-## How unclosed events are handled
-
-Events do not necessarily need to be completed. A person can, e.g., leave
-the table and not flip the item card close again. For `flipCard`,
-`openTopic`, and `openPopup` the data frame contains `NA` when the event
-does not complete. For `move` events it happens quite often that a start
-event follows a start event and a stop event follows a stop event.
-Technically a move event cannot *not* be finished and the number of events
-without a start or stop indicate that the time resolution was not
-sufficient to catch all these events accurately. Double start and stop
-`move` events have therefore been deleted from the data set.
-
-## Additional meta data
-
-For the HAUM data, I added meta data on state holidays and school
-vacations. 
-
-This led to the following additional variables:
-
-* `holiday`
-
-* `vacations`
-
-# Problems and how I handled them
-
-This lists some problems with the log data that required decisions. These
-decisions influence the outcome and maybe even the data quality. Hence, I
-tried to document how I handled these problems and explain the decisions I
-made.
-
-## Weird behavior of `timeMs` and neg. `duration` values
-
-`timeMs` resets itself every time a new log file starts. This means that
-the durations of events spanning more than one log file must be adjusted.
-Instead of just calculating $timeMs.stop - timeMs.start$, `timeMs.start`
-must be subtracted from the maximum duration of the log file where the
-event started ($600,000 ms$) and the `timeMs.stop` must be added. If the
-event spans more than two log files, a multiple of $600,000$ must be taken,
-e.g. for three log files it must be: $2 \times 600,000 - timeMs.start +
-timeMs.stop$ and so on.
-
-```{r timems, echo = FALSE, results = FALSE, fig.show = TRUE}
-# Read data
-datraw <- read.table("code/results/raw_logfiles_2024-02-21_16-07-33.csv", sep = ";",
-                     header = TRUE)
-
-plot(timeMs ~ as.factor(fileId), datraw[1:5000,], xlab = "fileId")
-```
-
-The boxplot shows that we have a continuous range of values within one log
-file but that `timeMs` does not increase over log files. I kept
-`timeMs.start` and `timeMs.stop` and also `fileId.start` and `fileId.stop`
-in the data frame, so it is clear when events span more than one log file.
-
-<!--
-Infos from the programmer:
-
-"Bin außerdem gerade den Code von damals durchgegangen. Das Logging läuft
-so: Mit Start der Anwendung wird alle 10 Minuten ein neues Logfile
-erstellt. Die Startzeit, von der aus die Duration berechnet wird, wird
-jeweils neu gesetzt. Duration ist also nicht "Dauer seit Start der
-Anwendung" sondern "Dauer seit Restart des Loggers". Deine Vermutung ist
-also richtig - es sollte keine Durations >10 Minuten geben. Der erste
-Eintrag eines Logfiles kann alles zwischen 0 und 10 Minuten sein (je
-nachdem, ob der Tisch zum Zeitpunkt des neuen Logging-Intervalls in
-Benutzung war). Wenn ein Case also über 2+ Logs verteilt ist, musst du auf
-die Duration jeweils 10 Minuten pro Logfile nach dem ersten addieren, damit
-es passt."
--->
-
-## Left padding of file IDs
-
-The file names of the raw log files are automatically generated and contain
-a timestamp. This timestamp is not well formed. First, it contains an
-incorrect month. The months go from 0 to 11 which means, that the file name
-`2016_11_15-12_12_57.log` was collected on December 15, 2016 at 12:12 pm.
-Another problem is that the file names are not zero left padded, e.g.,
-`2016_11_15-12_2_57.log`. This file was collected on December 15, 2016 at
-12:02 pm and therefore before the file above. But most sorting algorithms,
-will sort these files in the order shown below. In order to preprocess the
-data and close events that belong together, the data need to be sorted by
-events and artworks repeatedly. In order to get them back in the correct
-time order, it is necessary to order them based on three variables:
-`fileId.start`, `date.start` and `timeMs.start`. The file IDs therefore
-need to sort in the correct order (again see below for example). I zero
-left padded the log file names within the data frame using it as an
-identifier. These "file names" do not correspond exactly to the original
-raw log file names. This needs to be kept in mind when doing any kind of
-matching etc.
-
-```
-## what it looked like before left padding
-# 1422  ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:56  599671 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26874254
-# 1423 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     621 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26523465
-# 1424 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     677  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997736   13.26239605
-# 1425 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     774 Transform start     076 076.xml   NA 2092.25 2008.00 0.2999345   13.26239605
-# 1426 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     850  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997107   13.26223362
-# 1427  ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:57  599916  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997771   13.26523465
-
-## what it looks like now
-# 1422 2016_11_15-12_02_57.log 2016-12-15 12:12:56  599671 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26874254
-# 1423 2016_11_15-12_02_57.log 2016-12-15 12:12:57  599916  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997771   13.26523465
-# 1424 2016_11_15-12_12_57.log 2016-12-15 12:12:57     621 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26523465
-# 1425 2016_11_15-12_12_57.log 2016-12-15 12:12:57     677  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997736   13.26239605
-# 1426 2016_11_15-12_12_57.log 2016-12-15 12:12:57     774 Transform start     076 076.xml   NA 2092.25 2008.00 0.2999345   13.26239605
-# 1427 2016_11_15-12_12_57.log 2016-12-15 12:12:57     850  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997107   13.26223362
-```
-
-## Timestamps repeat
-
-The timestamps in the `date` variable record year, month, day, hour,
-minute and seconds. Since one second is not a very short time interval for
-a move on a touch display, this is not fine grained enough to bring events
-into the correct order, meaning there are events from the same log file
-having the same timestamp and even events from different log files having
-the same timestamp. The log files get written about every 10 minutes
-(which can easily be seen when looking at the file names of the raw log
-files). So in order to get events in the correct order, it is necessary to
-first order by file ID, within file ID then sort by timestamp `date` and
-then within these more coarse grained timestamps sort be `timeMs`. But as
-explained above, `timeMs` can only be sorted within one file ID, since they
-do not increase consistently over log files, but have a new setoff for each
-raw log file.
-
-## x,y-coordinates outside of display range
-
-The display of the Multi-Touch-Table is a 4K-display with 3840 x 2160
-pixels. When you plot the start and stop coordinates, the display is
-clearly distinguishable. However, a lot of points are outside of the
-display range. This can happen, when the art objects are scaled and then
-moved to the very edge of the table. Then it will record pixels outside of
-the table. These are actually valid data points and I will leave them as
-is.
-
-```{r xycoord}
-datlogs <- read.table("code/results/event_logfiles_2024-02-21_16-07-33.csv", sep = ";",
-                      header = TRUE)
-
-par(mfrow = c(1, 2))
-plot(y.start ~ x.start, datlogs)
-abline(v = c(0, 3840), h = c(0, 2160), col = "blue", lwd = 2)
-plot(y.stop ~ x.stop, datlogs)
-abline(v = c(0, 3840), h = c(0, 2160), col = "blue", lwd = 2)
-
-aggregate(cbind(x.start, x.stop, y.start, y.stop) ~ 1, datlogs, mean)
-```
-
-## Pop-ups from glossar cannot be assigned to a specific item
-
-All the information, pictures and texts for the topics and pop-ups are
-stored in `/data/haum/ContentEyevisit/eyevisit_cards_light/<item_number>`.
-Among other things, each folder contains XML-files with the information
-about any technical terms that can be opened from the hypertexts on the
-topic cards. Often these information are item dependent and then the
-corresponding XML-file is in the folder for this item. Sometimes, however,
-more general terms can be opened. In order to avoid multiple files
-containing the same information, these were stored in a folder called
-`glossar` and get accessed from there. The raw log files only contain the
-path to this glossar entry and did not record from which item it was
-accessed. I tried to assign these glossar entries to the correct items. The
-(very heuristic) approach was this:
-
-1. Create a lookup table with all XML-file names (possible pop-ups) from
-   the glossar folder and what items possibly call them. This was stored
-   as an `RData` object for easier handling but should maybe be stored in a
-   more interoperable format.
-
-2. I went through all possible pop-ups in this lookup table and stored the
-   items that are associated with it.
-
-3. I created a sub data frame without move events (since they can never be
-   associated with a pop-up) and went through every line and looked up if
-   an item and a topic card had been opened. If this was the case and a
-   glossar entry came up before the item was closed again, I assigned
-   this item to the glossar entry.
-
-This is heuristic since it is possible that several topic cards from
-different items are opened simultaneously and the glossar pop-up could
-be opened from either one (it could even be more than two, of course). In
-these cases the item that was opened closest to the glossar pop-up has
-been assigned, but this can never be completely error free.
-
-And this heuristic only assigns a little more than half of the glossar
-entries. Since my heuristic only looks for the last item that has been
-opened and if this item is a possible candidate it misses all glossar
-pop-ups where another item has been opened in between. This is still an
-open TODO to write a more elaborate algorithm.
-
-All glossar pop-ups that do not get matched with an item are removed
-from the data set with a warning if the argument `glossar = TRUE` is set.
-Otherwise the glossar entries will be ignored completely.
-
-## Assign a `case` variable based on "time heuristic"
-
-One thing needed in order to work with the data set and use it for machine
-learning algorithms like process mining, is a variable that tries to
-identify a case. A case variable will structure the data frame in a way
-that navigation behavior can actually be investigated. However, we do not
-know if several people are standing around the table interacting with it or
-just one very active person. The simplest way to define a case variable is
-to just use a time limit between events. This means that when the table has
-not been interacted with for, e.g., 20 seconds than it is assumed that a
-person moved on and a new person started interacting with the table. This
-is the easiest heuristic and implemented at the moment. Process mining
-shows that this simple approach works in a way that the correct process
-gets extracted by the algorithm.
-
-In order to investigate user behavior on a more fine grained level, it will
-be necessary to come up with a more elaborate approach. A better, still
-simple approach, could be to use this kind of time limit and additionally
-look at the distance between items interacted with within one time window.
-When items are far apart it seems plausible that more than one person
-interacted with them. Very short time lapses between events on different
-items could also be an indicator that more than one person is interacting
-with the table.
-
-## Assign a `path` variable
-
-The `path` variable is supposed to show one interaction trace with one
-artwork. Meaning it starts when an artwork is touched or flipped and stops
-when it is closed again. It is easy to assign a path from flipping a card
-over opening (maybe several) topics and pop-ups for this artwork card until
-closing this card again. But one would like to assign the same path to
-move events surrounding this interaction. Again, this is not possible in an
-algorithmic way but only heuristically.
-
-Again, I used a time cutoff for this. First, if a `move` event occurs, it
-is checked, if the same item has been flipped less than 20 seconds
-beforehand. If yes, the same path indicator is assigned to this `move`. If
-not, temporarily a new "move indicator" is assigned. Then, a "backward
-pass" is applied, where it is checked if the same item is opened less than
-20 seconds _after_ the event occurs. If yes, that path indicator is
-assigned. For all the remaining moves, a new path number is assigned. This
-corresponds to items being moved without being flipped.
-
-## A `move` event does not record any change
-
-Most of the events in the log files are move events. Additionally, many of
-these move events are recorded but they do not indicate any change, meaning
-the only difference is the timestamp. All other variables indicating moves
-like `x.start` and `x.stop`, `rotation.start` and `rotation.stop` etc. do
-not show _any_ change. They represent about 2/3 of all move events. These
-events are probably short touches of the table without an actual
-interaction. They were therefore removed from the data set.
-
-## Card indices go from 0 to 7 (instead of 0 to 5 as expected)
-
-In the beginning I thought that the number for topics was the index of
-where the card was presented on the back of the item. But this is not
-correct. It is the number of the topic. There are eight topics in total:
-
-```
-Indices for topics:
-0   artist
-1   thema
-2   komposition
-3   leben des kunstwerks
-4   details
-5   licht und farbe
-6   extra info
-7   technik
-```
-On the back of items, there can be between 2 to 6 topic cards. Several of
-these topic cards can be about the same topic, e.g., there can be two topic
-cards assigned to the topic `thema`. It is impossible to find out if the
-same topic card was opened several times or if different topic cards with
-the same topic were opened from the same item. See example below for item
-"001".
-
-```{r topics, echo = FALSE}
-items <- sprintf("%03d", unique(datlogs$item))
-topics <- extract_topics(items, xmlfiles = paste0(items, ".xml"),
-                         xmlpath = "data/haum/ContentEyevisit/eyevisit_cards_light/")
-head(topics)
-```
-
-## New artworks "504" and "505" starting October 2022
-
-When I read in the complete data frame for the first time, all of the
-sudden there were 72 instead of 70 items. It seems like these two
-artworks appear on October 21, 2022.
-
-```{r newitems}
-summary(as.Date(datraw[datraw$item %in% c("504", "505"), "date"]))
-```
-
-The artworks seem to be have updated in general after October 21, 2022. The
-following table shows which items were presented in which years.
-
-```{r years}
-xtabs(~ item + lubridate::year(date.start), datlogs)
-```
-
-It shows that the artworks haven been updated after the Corona pandemic. I
-think, the table was also moved to a different location at that point.
-
diff --git a/README.md b/README.md
index adbbd88..6d6c5cc 100644
--- a/README.md
+++ b/README.md
@@ -1,580 +1,66 @@
-Log data from the Multi-Touch Table at the HAUM
-================
+# Accompanying Analysis Code for the Master Thesis "XXX"
 
-The Multi Touch Table at the Herzog-Anton-Ulrich-Museum (HAUM) in
-Braunschweig gives visitors of the Museum the opportunity to interact
-with about 70 artworks and 3 virtual cards containing information about
-the museum and its layout. The table was installed at the museum in
-October 2016 and since November 2016 log files from interactions of
-visitors of the museum have been collected. These log files are in an
-unstructured format and cannot be easily analyzed. The purpose of the
-following document is to describe how the data haven been transformed
-and which decisions have been made along the way.
+The multi-touch table at the Herzog-Anton-Ulrich-Museum (HAUM) in
+Braunschweig gives museum visitors the opportunity to interact with
+about 70 artworks and 3 virtual cards containing information about the
+museum and its layout. The table was installed at the museum in October
+2016, and log files of visitor interactions have been collected since
+November 2016. The master's thesis for which this repository was
+created analyzed data collected between December 14, 2016 and July 5, 2023.
+In total, the data set consists of 39,767 log files containing 6,700,176
+events.
 
-The implementation of the steps described here can be found at:
+The following gives a short overview of the analyses conducted. All
+analysis scripts can be found in the `/code/` folder.
+
+## Preprocessing and Descriptives
+
+The first script `01_preprocessing.R` preprocesses the raw log files by
+first parsing them so they are readable by standard statistics software
+like R or Python and then converting them to event logs. A small R package
+doing the preprocessing and more information can be found at
 <https://gitea.iwm-tuebingen.de/R/mtt>.
 
-# Data structure
+The second script `02_descriptives.R` calculates some descriptive
+statistics and creates plots to get an overall feeling for the data set.
+
+## Conformance Checking
+
+A normative Petri net to test the data quality after the preprocessing is
+created in `03_create-petrinet.py` and the actual data quality check is
+done in `04_conformance-checking.py`. Both scripts are written in Python
+using the pm4py library. For more information and the full documentation go
+to <https://pm4py.fit.fraunhofer.de/>.
+
+The next script `05_check-traces.R` (written in R again) checks the corrupt
+trace found during conformance checking and exports the cleaned data sets
+used for the following analyses.
+
+## Clustering of Items
+
+To answer the first research question in the thesis "Do interaction
+patterns look different for different artworks? (Control-flow perspective)"
+process mining was applied to all paths separately for each item on the
+multi-touch table. Fitness, precision, generalizability, simplicity,
+soundness, number of connecting arcs, number of transitions, number of
+places, number of different variants, and the most frequent variant were
+obtained and saved to a CSV file (Python script `06_infos-items.py`). This
+information was then read into R in the next script
+(`07_item-clustering.R`) and used (together with other features) for
+hierarchical clustering.
+
+## Clustering of Cases
+
+For the second research question, "What kind of patterns exist and are there
+typical user behaviors? (Case perspective)", six indicator variables for
+five proposed user navigation types were calculated in
+`08_case-characteristics.R` and then used for hierarchical clustering and
+recursive partitioning to extract the different navigation types in script
+`09_user-navigation.R`. A validation of the results for data from 2018 was
+done in `10_validation.R`. The different variants of the cases, for both the
+complete data set and the data used for investigating the navigation types
+(all log files from 2019), were investigated in `11_investigate-variants.R`,
+and the clusters found for the navigation types were further investigated
+with process mining techniques in R (`12_dfgs-case-clusters.R`) and Python
+(`13_pm-case-clusters.py`).
 
-The log files contain lines that indicate the beginning and end of
-possible activities that can be performed when interacting with the
-artworks on the table. The layout of the table looks like pictures have
-been tossed on a large table. Every artwork is visible at the start
-configuration. People can move the pictures on the table, they can be
-scaled and rotated. Additionally, the virtual picture cards can be
-flipped in order to find more information of the artwork on the “back”
-of the card. One has to press a little `i` for more information in one
-of the bottom corners of the card. On the back of the card two to six
-information cards can be found with a teaser text about a certain topic.
-These topic cards can be opened and a hypertext with detailed
-information opens. Within these hypertexts certain technical terms can
-be clicked for lay people to get more information. This also opens up a
-pop-up. The events encoded in the raw log files therefore have the
-following structure.
-
-    "Start Application"     --> Start Application
-    "Show Application"
-    "Transform start"       --> Move
-    "Transform stop"
-    "Show Info"             --> Flip Card
-    "Show Front"
-    "Artwork/OpenCard"      --> Open Topic
-    "Artwork/CloseCard"
-    "ShowPopup"             --> Open Popup
-    "HidePopup"
-
-The right side shows what events can be extracted from these raw lines.
-The “Start Application” is not an event in the original sense since it
-only indicates if the table was started or maybe reset itself. This is
-not an interaction with the table and therefore not interesting in
-itself. All “Start Application” and “Show Application” are therefore
-excluded from the data when further processed and are only in the raw
-log files.
-
-# Parsing the raw log files
-
-The first step is to parse the raw log files that are stored by the
-application as text files in a rather unstructured format to a format
-that can be read by common statistics software packages. The data are
-therefore transferred to a spread sheet format. The following section
-describes what problems were encountered while doing this.
-
-## Corrupt lines
-
-When reading the files containing the raw logs into R, a warning appears
-that says
-
-    Warning messages:
-      incomplete final line found on '2016/2016_11_18-11_31_0.log'
-      incomplete final line found on '2016/2016_11_18-11_38_30.log'
-      incomplete final line found on '2016/2016_11_18-11_40_36.log'
-      ...
-
-When you open these files, it looks like the last line contains some
-binary content. It is unclear why and how this happens. So when reading
-the data, these lines were removed. A warning will be given that
-indicates how many files have been affected.
-
-## Extracted variables from raw log files
-
-The following variables (columns in the data frame) are extracted from
-the raw log file:
-
-- `fileId`: Containing the zero-left-padded file name of the raw log
-  file the data line has been extracted from
-
-- `folder`: The folder names in which the raw log files haven been
-  organized in. For the HAUM data set, the data are sorted by year
-  (folders 2016, 2017, 2018, 2019, 2020, 2021, 2022, and 2023).
-
-- `date`: Extracted timestamp from the raw log file in the format
-  `yyyy-mm-dd hh:mm:ss`.
-
-- `timeMs`: Containing a timestamp in milliseconds that restarts with
-  every new raw log file.
-
-- `event`: Start and stop event tags. See above for possible values.
-
-- `item`: Identifier of the different items. This is a three-digit
-  (left-padded) number. The numbers of the items correspond to the
-  folder names in `/ContentEyevisit/eyevisit_cards_light/` and were
-  originally taken from the museum's catalogue.
-
-- `popup`: Name of the pop-up opened. This is only interesting for
-  “openPopup” events.
-
-- `topic`: The number of the topic card that has been opened at the back
-  of the item card. See below for a more detailed description of what
-  these numbers mean.
-
-- `x`: Value of x-coordinate in pixels on the 4K-Display
-  ($3840 \times 2160$).
-
-- `y`: Value of y-coordinate in pixels.
-
-- `scale`: Number in 128 bit that indicates how much the card has been
-  scaled.
-
-- `rotation`: Degree of rotation from start configuration.
-
-<!-- TODO: After what time interval does the table reset itself to the
-  start configuration? -> PM needs to look it up -->
-
-## Variables after “closing of events”
-
-The raw log data consist of start and stop events for each event type.
-After preprocessing four event types are extracted: `move`, `flipCard`,
-`openTopic`, and `openPopup`. Except for the `move` events, which can
-occur at any time when interacting with an item card on the table, the
-events have a hierarchical order: An item card first needs to be flipped
-(`flipCard`), then the topic cards on the back of the card can be opened
-(`openTopic`), and finally pop-ups on these topic cards can be opened
-(`openPopup`). This implies that the event `openPopup` can only be
-present for a certain item, if the card has already been flipped (i.e.,
-an event `flipCard` for the same item has already occurred).
-
-After preprocessing, the data frame is now in a wide format with columns
-for the start and the stop of each event and contains the following
-variables:
-
-- `fileId.start` / `fileId.stop`: See above.
-
-- `date.start` / `date.stop`: See above.
-
-- `folder`: Containing the folder name (see above).
-
-- `case`: A numerical variable indicating cases in the data. A “case”
-  indicates an interaction interval and could be defined in different
-  ways. Right now, a new case begins when no event has occurred for 20
-  seconds or longer.
-
-- `path`: A path is defined as one interaction with one item. A path can
-  either start with a `flipCard` event or when an item has been touched
-  for the first time within this case. A path ends with the item card
-  being flipped closed again or with the last movement of the card within
-  this case. One case can contain several paths with the same item when
-  the item is flipped open and flipped closed again several times within
-  a short time.
-
-- `glossar`: An indicator variable with values 0/1 that tracks if a
-  pop-up has been opened from the glossar folder. These pop-ups can be
-  assigned to the wrong item since it is not possible to do this
-  algorithmically. It is possible that two items are flipped open that
-  could both link to the same pop-up from a glossar. The indicator
-  variable is left as a variable, so that these pop-ups can be easily
-  deleted from the data. Right now, glossar entries can be ignored
-  completely by setting an argument and this is done by default. Using
-  the pop-ups from the glossar will need a lot more love, before it
-  behaves satisfactorily.
-
-- `event`: Indicating the event. Can take the values `move`, `flipCard`,
-  `openTopic`, and `openPopup`.
-
-- `item`: Identifier of the different artworks and information cards.
-  This is a three-digit (left-padded) number. See above.
-
-- `timeMs.start` / `timeMs.stop`: See above.
-
-- `duration`: Calculated by $timeMs.stop - timeMs.start$ in
-  milliseconds. For events spanning more than one log file, $600,000$
-  must be added for every additional log file. See below for details.
-
-- `topic`: See above.
-
-- `popup`: See above.
-
-- `x.start` / `x.stop`: See above.
-
-- `y.start` / `y.stop`: See above.
-
-- `distance`: Euclidean distance calculated from $(x.start, y.start)$
-  and $(x.stop, y.stop)$.
-
-- `scale.start` / `scale.stop`: See above.
-
-- `scaleSize`: Relative scaling of item card, calculated by
-  $\frac{scale.stop}{scale.start}$.
-
-- `rotation.start` / `rotation.stop`: See above.
-
-- `rotationDegree`: Difference of rotation from $rotation.stop$ to
-  $rotation.start$.
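-
-The derived variables can be computed directly from the start and stop
-columns. The following is a minimal sketch on an invented two-row data
-frame, not the code from the `mtt` package; the sign convention for
-`rotationDegree` (stop minus start) is an assumption.
-
-``` r
-# toy data with the column names described above (values are invented)
-dat <- data.frame(
-  x.start = c(0, 100), x.stop = c(3, 100),
-  y.start = c(0, 200), y.stop = c(4, 200),
-  scale.start = c(0.3, 0.3), scale.stop = c(0.6, 0.3),
-  rotation.start = c(10, 13), rotation.stop = c(25, 13)
-)
-
-# Euclidean distance between start and stop position
-dat$distance <- sqrt((dat$x.stop - dat$x.start)^2 +
-                     (dat$y.stop - dat$y.start)^2)
-# relative scaling of the item card
-dat$scaleSize <- dat$scale.stop / dat$scale.start
-# assumed sign convention: stop minus start
-dat$rotationDegree <- dat$rotation.stop - dat$rotation.start
-```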
-
-## How unclosed events are handled
-
-Events do not necessarily need to be completed. A person can, e.g.,
-leave the table and not flip the item card closed again. For `flipCard`,
-`openTopic`, and `openPopup` the data frame contains `NA` when the event
-does not complete. For `move` events it happens quite often that a start
-event follows a start event and a stop event follows a stop event.
-Technically a move event cannot *not* be finished and the number of
-events without a start or stop indicates that the time resolution was not
-sufficient to catch all these events accurately. Double start and stop
-`move` events have therefore been deleted from the data set.
-
-## Additional meta data
-
-For the HAUM data, I added meta data on state holidays and school
-vacations.
-
-This led to the following additional variables:
-
-- `holiday`
-
-- `vacations`
-
-# Problems and how I handled them
-
-This lists some problems with the log data that required decisions.
-These decisions influence the outcome and maybe even the data quality.
-Hence, I tried to document how I handled these problems and explain the
-decisions I made.
-
-## Weird behavior of `timeMs` and neg. `duration` values
-
-`timeMs` resets itself every time a new log file starts. This means that
-the durations of events spanning more than one log file must be
-adjusted. Instead of just calculating $timeMs.stop - timeMs.start$,
-`timeMs.start` must be subtracted from the maximum duration of the log
-file where the event started ($600,000$ ms) and the `timeMs.stop` must
-be added. If the event spans more than two log files, a multiple of
-$600,000$ must be taken, e.g. for three log files it must be:
-$2 \times 600,000 - timeMs.start + timeMs.stop$ and so on.
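-
-As a minimal sketch of this adjustment (the function `adjust_duration()`
-and its argument `n_files` are hypothetical names used for illustration
-and are not taken from the `mtt` package):
-
-``` r
-# n_files: number of log files the event spans (1 = start and stop are
-#          in the same file)
-# max_ms:  maximum duration of one log file in milliseconds
-adjust_duration <- function(timeMs.start, timeMs.stop, n_files,
-                            max_ms = 600000) {
-  (n_files - 1) * max_ms - timeMs.start + timeMs.stop
-}
-
-adjust_duration(100, 500, n_files = 1)     # same file: plain difference
-adjust_duration(599671, 621, n_files = 2)  # stop lies in the next file
-```
-
-For `n_files = 1` the expression reduces to $timeMs.stop - timeMs.start$,
-so the same function covers events within a single log file.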
-
-![](README_files/figure-gfm/timems-1.png)<!-- -->
-
-The boxplot shows that we have a continuous range of values within one
-log file but that `timeMs` does not increase over log files. I kept
-`timeMs.start` and `timeMs.stop` and also `fileId.start` and
-`fileId.stop` in the data frame, so it is clear when events span more
-than one log file.
-
-<!--
-Info from the programmer:
-
-"I also just went through the code from back then. The logging works like
-this: Once the application starts, a new log file is created every 10
-minutes. The start time from which the duration is calculated is reset
-each time. So duration is not "time since start of the application" but
-"time since restart of the logger". Your assumption is therefore correct:
-there should be no durations >10 minutes. The first entry of a log file
-can be anything between 0 and 10 minutes (depending on whether the table
-was in use at the time of the new logging interval). So if a case is
-spread over 2+ logs, you have to add 10 minutes to the duration for each
-log file after the first one so that it fits."
--->
-
-## Left padding of file IDs
-
-The file names of the raw log files are automatically generated and
-contain a timestamp. This timestamp is not well formed. First, it
-contains an incorrect month. The months go from 0 to 11, which means
-that the file `2016_11_15-12_12_57.log` was collected on December
-15, 2016 at 12:12 pm. Another problem is that the file names are not
-zero left padded, e.g., `2016_11_15-12_2_57.log`. This file was
-collected on December 15, 2016 at 12:02 pm and therefore before the file
-above. But most sorting algorithms will sort these files in the order
-shown below. In order to preprocess the data and close events that
-belong together, the data need to be sorted by events and artworks
-repeatedly. In order to get them back in the correct time order, it is
-necessary to order them based on three variables: `fileId.start`,
-`date.start` and `timeMs.start`. The file IDs therefore need to sort in
-the correct order (again see below for example). I zero left padded the
-log file names within the data frame using it as an identifier. These
-“file names” do not correspond exactly to the original raw log file
-names. This needs to be kept in mind when doing any kind of matching
-etc.
-
-    ## what it looked like before left padding
-    # 1422  ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:56  599671 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26874254
-    # 1423 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     621 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26523465
-    # 1424 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     677  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997736   13.26239605
-    # 1425 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     774 Transform start     076 076.xml   NA 2092.25 2008.00 0.2999345   13.26239605
-    # 1426 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57     850  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997107   13.26223362
-    # 1427  ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:57  599916  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997771   13.26523465
-
-    ## what it looks like now
-    # 1422 2016_11_15-12_02_57.log 2016-12-15 12:12:56  599671 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26874254
-    # 1423 2016_11_15-12_02_57.log 2016-12-15 12:12:57  599916  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997771   13.26523465
-    # 1424 2016_11_15-12_12_57.log 2016-12-15 12:12:57     621 Transform start     076 076.xml   NA 2092.25 2008.00 0.3000000   13.26523465
-    # 1425 2016_11_15-12_12_57.log 2016-12-15 12:12:57     677  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997736   13.26239605
-    # 1426 2016_11_15-12_12_57.log 2016-12-15 12:12:57     774 Transform start     076 076.xml   NA 2092.25 2008.00 0.2999345   13.26239605
-    # 1427 2016_11_15-12_12_57.log 2016-12-15 12:12:57     850  Transform stop     076 076.xml   NA 2092.25 2008.00 0.2997107   13.26223362
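-
-The left padding itself can be sketched as follows. The helper
-`pad_file_id()` is a hypothetical name, not the function used in the
-`mtt` package, and it assumes that every file name contains exactly six
-numeric fields. The zero-based month is deliberately left unchanged.
-
-``` r
-# hypothetical helper: rebuild the file name with zero-padded fields
-pad_file_id <- function(path) {
-  file <- basename(path)  # drop the folder part of the path
-  # extract the numeric fields (year, month, day, hour, minute, second)
-  fields <- regmatches(file, gregexpr("[0-9]+", file))
-  vapply(fields, function(f) {
-    f <- sprintf("%02d", as.integer(f))
-    paste0(f[1], "_", f[2], "_", f[3], "-",
-           f[4], "_", f[5], "_", f[6], ".log")
-  }, character(1))
-}
-
-ids <- pad_file_id(c(
-  "../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log",
-  "../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log"
-))
-sort(ids)  # the 12:02 file now sorts before the 12:12 file
-```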
-
-## Timestamps repeat
-
-The timestamps in the `date` variable record year, month, day, hour,
-minute and seconds. Since one second is not a very short time interval
-for a move on a touch display, this is not fine grained enough to bring
-events into the correct order, meaning there are events from the same
-log file having the same timestamp and even events from different log
-files having the same timestamp. The log files get written about every
-10 minutes (which can easily be seen when looking at the file names of
-the raw log files). So in order to get events in the correct order, it
-is necessary to first order by file ID, within file ID then sort by
-timestamp `date` and then within these more coarse grained timestamps
-sort by `timeMs`. But as explained above, `timeMs` can only be sorted
-within one file ID, since the values do not increase consistently over
-log files, but restart with a new offset for each raw log file.
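-
-A minimal sketch of this three-level ordering, using four rows with the
-file IDs, timestamps and `timeMs` values from the left padding example
-above (the data frame itself is constructed only for illustration):
-
-``` r
-dat <- data.frame(
-  fileId = c("2016_11_15-12_12_57.log", "2016_11_15-12_02_57.log",
-             "2016_11_15-12_12_57.log", "2016_11_15-12_02_57.log"),
-  date   = c("2016-12-15 12:12:57", "2016-12-15 12:12:57",
-             "2016-12-15 12:12:57", "2016-12-15 12:12:56"),
-  timeMs = c(677, 599916, 621, 599671)
-)
-
-# order by file ID first, then by timestamp, then by timeMs within file
-dat <- dat[order(dat$fileId, dat$date, dat$timeMs), ]
-```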
-
-## x,y-coordinates outside of display range
-
-The display of the Multi-Touch-Table is a 4K-display with 3840 x 2160
-pixels. When you plot the start and stop coordinates, the display is
-clearly distinguishable. However, a lot of points are outside of the
-display range. This can happen when the art objects are scaled and then
-moved to the very edge of the table. Then it will record pixels outside
-of the table. These are actually valid data points and I will leave them
-as is.
-
-``` r
-datlogs <- read.table("code/results/event_logfiles_2024-02-21_16-07-33.csv", sep = ";",
-                      header = TRUE)
-
-par(mfrow = c(1, 2))
-plot(y.start ~ x.start, datlogs)
-abline(v = c(0, 3840), h = c(0, 2160), col = "blue", lwd = 2)
-plot(y.stop ~ x.stop, datlogs)
-abline(v = c(0, 3840), h = c(0, 2160), col = "blue", lwd = 2)
-```
-
-![](README_files/figure-gfm/xycoord-1.png)<!-- -->
-
-``` r
-aggregate(cbind(x.start, x.stop, y.start, y.stop) ~ 1, datlogs, mean)
-```
-
-    ##    x.start   x.stop  y.start   y.stop
-    ## 1 1978.202 1975.876 1137.481 1133.494
-
-## Pop-ups from glossar cannot be assigned to a specific item
-
-All the information, pictures and texts for the topics and pop-ups are
-stored in
-`/data/haum/ContentEyevisit/eyevisit_cards_light/<item_number>`. Among
-other things, each folder contains XML-files with the information about
-any technical terms that can be opened from the hypertexts on the topic
-cards. Often this information is item dependent and then the
-corresponding XML-file is in the folder for this item. Sometimes,
-however, more general terms can be opened. In order to avoid multiple
-files containing the same information, these were stored in a folder
-called `glossar` and get accessed from there. The raw log files only
-contain the path to this glossar entry and did not record from which
-item it was accessed. I tried to assign these glossar entries to the
-correct items. The (very heuristic) approach was this:
-
-1.  Create a lookup table with all XML-file names (possible pop-ups)
-    from the glossar folder and what items possibly call them. This was
-    stored as an `RData` object for easier handling but should maybe be
-    stored in a more interoperable format.
-
-2.  I went through all possible pop-ups in this lookup table and stored
-    the items that are associated with it.
-
-3.  I created a sub data frame without move events (since they can never
-    be associated with a pop-up) and went through every line and looked
-    up if an item and a topic card had been opened. If this was the case
-    and a glossar entry came up before the item was closed again, I
-    assigned this item to the glossar entry.
-
-This is heuristic since it is possible that several topic cards from
-different items are opened simultaneously and the glossar pop-up could
-be opened from either one (it could even be more than two, of course).
-In these cases the item that was opened closest to the glossar pop-up
-has been assigned, but this can never be completely error free.
-
-This heuristic also only assigns a little more than half of the glossar
-entries. Since it only checks whether the last item that has been opened
-is a possible candidate, it misses all glossar pop-ups where another item
-has been opened in between. Writing a more elaborate algorithm is still
-an open TODO.
-
-All glossar pop-ups that do not get matched with an item are removed
-from the data set with a warning if the argument `glossar = TRUE` is
-set. Otherwise the glossar entries will be ignored completely.
-
-## Assign a `case` variable based on “time heuristic”
-
-One thing needed in order to work with the data set and use it for
-machine learning algorithms like process mining, is a variable that
-tries to identify a case. A case variable will structure the data frame
-in a way that navigation behavior can actually be investigated. However,
-we do not know if several people are standing around the table
-interacting with it or just one very active person. The simplest way to
-define a case variable is to just use a time limit between events. This
-means that when the table has not been interacted with for, e.g., 20
-seconds, then it is assumed that a person moved on and a new person
-started interacting with the table. This is the easiest heuristic and
-implemented at the moment. Process mining shows that this simple
-approach works in a way that the correct process gets extracted by the
-algorithm.
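-
-The time heuristic can be sketched in a few lines. This is a simplified
-illustration, not the implementation in the `mtt` package: the function
-name `assign_case()` is hypothetical, the input is assumed to be sorted
-already, and only the start timestamps are compared.
-
-``` r
-# hypothetical helper: start a new case after a gap of `cutoff` seconds
-assign_case <- function(date, cutoff = 20) {
-  t <- as.numeric(as.POSIXct(date, tz = "UTC"))  # seconds since epoch
-  gap <- c(0, diff(t))                           # gap to previous event
-  cumsum(gap >= cutoff) + 1                      # increment at each gap
-}
-
-dates <- c("2016-12-15 12:12:56", "2016-12-15 12:12:57",
-           "2016-12-15 12:13:20", "2016-12-15 12:13:25")
-case <- assign_case(dates)
-```
-
-In this invented example, the gap of 23 seconds before the third event
-starts a second case.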
-
-In order to investigate user behavior on a more fine grained level, it
-will be necessary to come up with a more elaborate approach. A better,
-still simple approach, could be to use this kind of time limit and
-additionally look at the distance between items interacted with within
-one time window. When items are far apart it seems plausible that more
-than one person interacted with them. Very short time lapses between
-events on different items could also be an indicator that more than one
-person is interacting with the table.
-
-## Assign a `path` variable
-
-The `path` variable is supposed to show one interaction trace with one
-artwork. Meaning it starts when an artwork is touched or flipped and
-stops when it is closed again. It is easy to assign a path from flipping
-a card over opening (maybe several) topics and pop-ups for this artwork
-card until closing this card again. But one would like to assign the
-same path to move events surrounding this interaction. Again, this is
-not possible in an algorithmic way but only heuristically.
-
-Again, I used a time cutoff for this. First, if a `move` event occurs,
-it is checked, if the same item has been flipped less than 20 seconds
-beforehand. If yes, the same path indicator is assigned to this `move`.
-If not, temporarily a new “move indicator” is assigned. Then, a
-“backward pass” is applied, where it is checked if the same item is
-opened less than 20 seconds *after* the event occurs. If yes, that path
-indicator is assigned. For all the remaining moves, a new path number is
-assigned. This corresponds to items being moved without being flipped.
-
-## A `move` event does not record any change
-
-Most of the events in the log files are move events. Additionally, many
-of these move events are recorded but they do not indicate any change,
-meaning the only difference is the timestamp. All other variables
-indicating moves like `x.start` and `x.stop`, `rotation.start` and
-`rotation.stop` etc. do not show *any* change. They represent about 2/3
-of all move events. These events are probably short touches of the table
-without an actual interaction. They were therefore removed from the data
-set.
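-
-A minimal sketch of such a filter on an invented data frame (not the
-code from the `mtt` package). It assumes that `move` events have no
-missing values in the compared columns.
-
-``` r
-# toy data: one move without change, one real move, one flipCard
-dat <- data.frame(
-  event = c("move", "move", "flipCard"),
-  x.start = c(100, 100, 50), x.stop = c(100, 180, 50),
-  y.start = c(200, 200, 60), y.stop = c(200, 200, 60),
-  scale.start = c(0.3, 0.3, 0.3), scale.stop = c(0.3, 0.3, 0.3),
-  rotation.start = c(13, 13, 0), rotation.stop = c(13, 13, 0)
-)
-
-# a move records no change if all start and stop values are identical
-no_change <- dat$event == "move" &
-  dat$x.start == dat$x.stop &
-  dat$y.start == dat$y.stop &
-  dat$scale.start == dat$scale.stop &
-  dat$rotation.start == dat$rotation.stop
-
-dat_clean <- dat[!no_change, ]
-```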
-
-## Card indices go from 0 to 7 (instead of 0 to 5 as expected)
-
-In the beginning I thought that the number for topics was the index of
-where the card was presented on the back of the item. But this is not
-correct. It is the number of the topic. There are eight topics in total:
-
-    Indices for topics:
-    0   artist
-    1   thema
-    2   komposition
-    3   leben des kunstwerks
-    4   details
-    5   licht und farbe
-    6   extra info
-    7   technik
-
-On the back of items, there can be between 2 and 6 topic cards. Several
-of these topic cards can be about the same topic, e.g., there can be two
-topic cards assigned to the topic `thema`. It is impossible to find out
-if the same topic card was opened several times or if different topic
-cards with the same topic were opened from the same item. See example
-below for item “001”.
-
-    ##   item            file_name                topic
-    ## 1  001 001_dargestellte.xml                thema
-    ## 2  001       001_thema1.xml                thema
-    ## 3  001        001_leben.xml leben des kunstwerks
-    ## 4  001       001_leben3.xml leben des kunstwerks
-    ## 5  001       001_thema2.xml                thema
-    ## 6  001        001_thema.xml                thema
-
-## New artworks “504” and “505” starting October 2022
-
-When I read in the complete data frame for the first time, all of a
-sudden there were 72 instead of 70 items. It seems like these two
-artworks first appear on October 21, 2022.
-
-``` r
-summary(as.Date(datraw[datraw$item %in% c("504", "505"), "date"]))
-```
-
-    ##         Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
-    ## "2022-10-21" "2023-01-11" "2023-03-08" "2023-03-09" "2023-05-21" "2023-07-05"
-
-The artworks seem to have been updated in general after October 21, 2022.
-The following table shows which items were presented in which years.
-
-``` r
-xtabs(~ item + lubridate::year(date.start), datlogs)
-```
-
-    ##      lubridate::year(date.start)
-    ## item   2016  2017  2018  2019  2020  2022  2023
-    ##   1     277  4082  1912  1434   424   394  1315
-    ##   3     485  6730  3126  2356   528   457  1124
-    ##   19    714  8656  4028  2743   660   698  1595
-    ##   20    595  8461  3996  2983   938   657  1355
-    ##   24    497  6638  2912  2251   649   439  1028
-    ##   27    567  5959  3112  2318   651   711  1324
-    ##   28    601  9329  4394  3056   778   762  1570
-    ##   29    425  6865  3830  2365   516   615  1174
-    ##   31    289  4118  2051  1218   291   296   675
-    ##   32    562  7016  3477  2253   726   766  1647
-    ##   33    509  4936  2242  1449   555   358   666
-    ##   36    434  4505  2276  1668   373   387   976
-    ##   37    242  4478  2182  1554   339   423  1168
-    ##   38    480  4617  2144  1397   371   381   784
-    ##   39    395  3227  1313  1003   237   161   622
-    ##   41    282  3329  1303  1022   225   209   701
-    ##   42    203  3113  1307   903   242   191   421
-    ##   43    115  2420  1089   806   176   219   486
-    ##   45   1491 13561  5924  4474   966   585  1828
-    ##   46    903  9181  5340  3812   961   944  1648
-    ##   47    306  4949  2395  1510   750   297   675
-    ##   48    723 10455  5384  4162  1328   948  2031
-    ##   49    433  4326  2124  1414   434   431   809
-    ##   51    564  7837  4577  2991   884   659  1370
-    ##   52    447  5021  2104  1729   471   349   840
-    ##   54    424  5068  2816  2008   529   370   918
-    ##   55    358  4859  2069  1428   341   403  1303
-    ##   57    860 14264  6625  5092  1410  1221  2714
-    ##   60    555  6865  3539  2336   639   586  1415
-    ##   62    547  6736  3803  2210   795   633  1322
-    ##   63    251  3677  1827  1241   300   282   527
-    ##   66    552  6004  2774  1977   505   373   932
-    ##   69    394  3730  1827  1438   272   206   680
-    ##   70    226  3766  1843   973   293   268   703
-    ##   71    557  6160  2490  1846   570   323   839
-    ##   72    426  6194  2857  2129   508   635  1553
-    ##   73    432  6125  2880  1821   583   395   939
-    ##   75    258  5885  2418  1562   369   257   645
-    ##   76    861 12435  6253  4214  1753  1153  2268
-    ##   77    816  8595  4197  2897   699   674  1452
-    ##   78    410  5632  2498  1924   394   408   850
-    ##   80   1650 25687 12429  7782  1975  1712  4433
-    ##   83    644  8618  4720  3026   987  1027  2294
-    ##   84    184  2121  1231   759   231   254   465
-    ##   87    149  1618   722   632    99     0     0
-    ##   88    513  6996  3493  2272   539   533  1420
-    ##   89    214  2204   950   723   156     0     0
-    ##   90    281  3756  1372  1143   403   320   932
-    ##   93    613  8528  4224  3015   696  1174  2058
-    ##   98    462  6662  3265  2565   704   670  1453
-    ##   99    180  4162  1653  1454   363   411   868
-    ##   101   414  4209  1859  1282   392   411   981
-    ##   103   677  8758  4366  3165  1045   909  1871
-    ##   104   423  5256  2381  1865   463   467   933
-    ##   107   181  2101  1106   788   205   146   339
-    ##   109   321  4001  1619  1106   292   188   453
-    ##   110   489  5846  2785  2008   494   387   923
-    ##   125   640  8435  4519  3334   926     0     0
-    ##   129   598 11322  5046  3369   910  1131  1682
-    ##   145   419  7821  3945  2694   706   740  1396
-    ##   176   507  8465  3968  2787   687   552  1544
-    ##   180   516  7563  3720  2765   585   550  1272
-    ##   183   377  4014  1819  1741   346   251   675
-    ##   187   340  4222  2165  1753   319   312   734
-    ##   197   426  7710  3603  2510   671   602  1217
-    ##   229   303  4872  2360  1891   482   389  1005
-    ##   231   271  3606  1851  1239   318   236   467
-    ##   501  1915 15968  7849  5060  1157   890  2989
-    ##   502  1212 14550  7111  4749  1105   883  2752
-    ##   503  1308 15218  8632  6399  1626   870  2558
-    ##   504     0     0     0     0     0   363   662
-    ##   505     0     0     0     0     0   426  1533
-
-It shows that the artworks have been updated after the Corona pandemic.
-I think the table was also moved to a different location at that point.
diff --git a/README_files/figure-gfm/timems-1.png b/README_files/figure-gfm/timems-1.png
deleted file mode 100644
index f08b70a..0000000
Binary files a/README_files/figure-gfm/timems-1.png and /dev/null differ
diff --git a/README_files/figure-gfm/xycoord-1.png b/README_files/figure-gfm/xycoord-1.png
deleted file mode 100644
index d72a279..0000000
Binary files a/README_files/figure-gfm/xycoord-1.png and /dev/null differ