# Open questions
## Understanding the data
* What unit do x and y have? Pixels? --> yes
* What unit does scale have? --> some kind of bit; does not matter when
  calculating a ratio
* Is rotation really in degrees? --> yes
* After what time interval does the table reset to its initial
  configuration? --> PM needs to look it up
## Table software
* Is there documentation for the images that goes beyond the xml files?
  Something like a manual or similar?
* Is there possibly still a tablet somewhere with the application on it?
* What do the colors of the topic cards mean? --> can be seen in the xml files
## Event Logs
* How do we deal with events that were "not closed"? Simply delete them?
  - for Transform I lean towards yes, because they are totally
    uninteresting otherwise
  - for flipCard I am not so sure... But then no duration can be
    calculated; it would be NA
* Moves/scales/rotations without any change I would definitely delete
* It is not possible (or I do not know how) to uniquely identify events
  that belong together
  - proceed heuristically? Simply drop duplicated Transform start and stop
    events?
  - the data are not "error-free"; e.g., there are Transformation events
    where the end was not logged
* How do I identify an "interaction unit"?
  - What is a "case"?
  - Rather roughly via time intervals?
  - Any other idea?
* Find out whether more than one person is standing at the table?
  - Sliding window in which the number of artworks is counted? Or how far
    apart the touched artworks are?
  - One can already "see" this in the logs, but how can I extract it
    automatically? What is my definition of an "interaction boost"?
  - However we do it, does it work on the "event log data"?
* Enrich the log data with further metadata? What would be interesting?
  - Metadata on artworks like name, artist, type of artwork, epoch, etc.
  - School vacations and holidays
  - Special exhibits at the museum
  - Number of visitors per day
  - Age structure of visitors per day?
  - ... ????
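The idea of delimiting cases roughly via time intervals could be sketched as follows. This is only one possible heuristic, not a decision: the toy data and the threshold `max_gap_secs` are made up for illustration, and the events are assumed to be sorted by `date`.

```{r}
# Toy event log; `date` and `artwork` mirror column names used in this
# README, the rows themselves are made up.
events <- data.frame(
  date = as.POSIXct(c("2016-12-15 11:17:26", "2016-12-15 11:17:40",
                      "2016-12-15 11:17:52", "2016-12-15 11:25:03",
                      "2016-12-15 11:25:10"), tz = "UTC"),
  artwork = c("051", "051", "057", "076", "076")
)

# Hypothetical threshold: a pause longer than this starts a new case.
max_gap_secs <- 60

# Seconds elapsed since the previous event (0 for the first event).
gap <- c(0, diff(as.numeric(events$date)))

# Each gap above the threshold increments the case counter.
events$case <- cumsum(gap > max_gap_secs) + 1
events
```

The threshold is the critical choice here; different values will split or merge cases, so it would have to be justified from the data.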
## HAUM
* Follow up with Sven about visitor numbers?
# Problems and how I handled them
This section lists some problems with the log data that required decisions.
These decisions influence the outcome and maybe even the data quality.
Hence, I tried to document how I handled these problems and to explain the
decisions I made.
## Weird behavior of `time_ms` and negative `duration` values
I think negative duration values occur when an event starts in one log
file and completes in another one. The variable `time_ms` seems to be
continuous within one log file but not across several log files.
```{r}
# First few events with a negative duration
dat_all[which(dat_all$duration < 0), ][1:5, 1:10]
# flipCard
## trace 56
dat3[dat3$trace == 56, ]
# Raw log entries at the two timestamps in question (note the different files)
dat[dat$fileid == "2016_11_15-11_12_57.log" & dat$date == "2016-12-15 11:17:26", ]
dat[dat$fileid == "2016_11_15-11_42_57.log" & dat$date == "2016-12-15 11:46:19", ]
#dat[309:1405, ]
# All entries for this artwork in the rows between the two timestamps
tmp <- dat[300:1405, ]
tmp[tmp$artwork == "051", ]
## -> was closed correctly, but does it belong together?
## trace 61
dat3[dat3$trace == 61, ]
dat[dat$fileid == "2016_11_15-11_12_57.log" & dat$date == "2016-12-15 11:17:52", ]
dat[dat$fileid == "2016_11_15-11_42_57.log" & dat$date == "2016-12-15 11:46:19", ]
tmp <- dat[350:1408, ]
tmp[tmp$artwork == "057", ]
## -> was closed correctly, but does it belong together?
# openTopic
dat_all[which(dat_all$duration < 0), ][100:105, 1:10]
## trace 2052
dat4[dat4$trace == 2052, ]
dat[dat$fileid == "2016_11_17-14_12_10.log" & dat$date == "2016-12-17 14:21:51", ]
dat[dat$fileid == "2016_11_17-14_22_10.log" & dat$date == "2016-12-17 14:22:25", ]
tmp <- dat[23801:23950, ]
tmp[tmp$artwork == "502", ]
# Boxplots of time_ms per log file for the first 5000 rows
plot(time_ms ~ as.factor(fileid), dat[1:5000, ])
```
The boxplot shows that we have a continuous range of values within one log
file but that `time_ms` does not increase over log files.
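The same pattern can also be checked numerically instead of visually. The following is a minimal sketch on toy data, assuming one row per logged event with the columns `fileid` and `time_ms`; some values are taken from the log excerpt in this README, the first one is made up.

```{r}
# Toy data: two consecutive log files, three events each.
toy <- data.frame(
  fileid  = rep(c("2016_11_15-12_02_57.log", "2016_11_15-12_12_57.log"),
                each = 3),
  time_ms = c(598000, 599671, 599916, 621, 677, 774)
)

# TRUE for every file in which time_ms never decreases.
within_ok <- tapply(toy$time_ms, toy$fileid, function(x) all(diff(x) >= 0))

# Smallest and largest time_ms per file.
first_ms <- tapply(toy$time_ms, toy$fileid, min)
last_ms  <- tapply(toy$time_ms, toy$fileid, max)

within_ok
# If time_ms restarts in each file, the second file starts below the
# last value of the first file.
first_ms[[2]] < last_ms[[1]]
```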
<!--
TODO: I will probably update how events are closed, and the names of these
data frames, especially `dat3` and `dat4`, will have to be adjusted.
-->
Since it does not seem possible to fix this in a consistent way, I will set
negative durations to `NA`. I will keep `time_ms.start` and `time_ms.stop`
in the data frame, so it is clear why there are no durations. Maybe it
would also be useful to keep `logfileid.start` and `logfileid.stop` in the
data? Maybe just to verify this theory...
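A minimal sketch of this step on toy data: the column names follow the ones mentioned above (`logfileid.start` and `logfileid.stop` are only proposed so far), while the rows and the file names `A.log` and `B.log` are made up.

```{r}
# Toy closed events; the second one starts in one file and stops in another.
closed <- data.frame(
  event           = c("move", "flipCard", "openTopic"),
  time_ms.start   = c(621, 599671, 120000),
  time_ms.stop    = c(677, 850, 125500),
  logfileid.start = c("B.log", "A.log", "B.log"),
  logfileid.stop  = c("B.log", "B.log", "B.log")
)
closed$duration <- closed$time_ms.stop - closed$time_ms.start

# Negative durations cannot be interpreted, so they are set to NA;
# the start and stop columns stay untouched.
closed$duration[which(closed$duration < 0)] <- NA

# Check the theory: do the missing durations coincide with events that
# span two log files?
spans_files <- closed$logfileid.start != closed$logfileid.stop
table(missing_duration = is.na(closed$duration), spans_files)
```

If the theory holds for the real data, the cross table should have no entries off the diagonal.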
Part of the problem was that the timestamps in the log file names are not
left-padded with zeros (e.g., `12_2_57` instead of `12_02_57`). Left padding
them fixed only three `move` events, however, since it only fixed
irregularities *within* one log file.
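One way to do the left padding is a single regular expression. This is a sketch, not necessarily how it was done for this data set: `pad_fileid()` is a hypothetical helper, and the third file name is made up to show several single-digit parts at once.

```{r}
# File names in the unpadded format; the first two appear in the logs
# shown in this README, the third is made up.
fileid <- c("2016_11_15-12_2_57.log", "2016_11_15-12_12_57.log",
            "2016_1_5-9_2_7.log")

# Hypothetical helper: prefix every digit that has no digit directly
# before or after it with a zero. Longer numbers (e.g., the year) stay
# unchanged.
pad_fileid <- function(x) {
  gsub("(?<![0-9])([0-9])(?![0-9])", "0\\1", x, perl = TRUE)
}

padded <- pad_fileid(fileid)
padded
```

With all parts at a fixed width, sorting the file names as character strings matches their chronological order.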
```{r}
table(dat_all[dat_all$duration < 0, "event"])
# flipCard move openPopup openTopic
# 562 100 34 284
dat[dat$event %in% c("Transform start", "Transform stop"), ][1100:1300,]
# --> got fixed by left padding... but only three all together!!
dat_all[735, ]
## what it looked like before left padding
# 1422 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:56 599671 Transform start 076 076.xml NA 2092.25 2008.00 0.3000000 13.26874254
# 1423 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57 621 Transform start 076 076.xml NA 2092.25 2008.00 0.3000000 13.26523465
# 1424 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57 677 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997736 13.26239605
# 1425 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57 774 Transform start 076 076.xml NA 2092.25 2008.00 0.2999345 13.26239605
# 1426 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_12_57.log 2016-12-15 12:12:57 850 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997107 13.26223362
# 1427 ../data/haum_logs_2016-2023/_2016b/2016_11_15-12_2_57.log 2016-12-15 12:12:57 599916 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997771 13.26523465
## what it looks like now
# 1422 2016_11_15-12_02_57.log 2016-12-15 12:12:56 599671 Transform start 076 076.xml NA 2092.25 2008.00 0.3000000 13.26874254
# 1423 2016_11_15-12_02_57.log 2016-12-15 12:12:57 599916 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997771 13.26523465
# 1424 2016_11_15-12_12_57.log 2016-12-15 12:12:57 621 Transform start 076 076.xml NA 2092.25 2008.00 0.3000000 13.26523465
# 1425 2016_11_15-12_12_57.log 2016-12-15 12:12:57 677 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997736 13.26239605
# 1426 2016_11_15-12_12_57.log 2016-12-15 12:12:57 774 Transform start 076 076.xml NA 2092.25 2008.00 0.2999345 13.26239605
# 1427 2016_11_15-12_12_57.log 2016-12-15 12:12:57 850 Transform stop 076 076.xml NA 2092.25 2008.00 0.2997107 13.26223362
```
## Events that only close (`date.start` is NA)
## Timestamps repeat
## Popups from glossary cannot be assigned to a specific artwork
## Assign a case variable based on "time heuristic"
## A `move` event does not record any change
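Such events could be detected by comparing the start and stop values of a closed `move` event. The sketch below uses toy data; the `.start`/`.stop` suffixes for `x`, `y`, `scale`, and `rotation` are an assumption modeled on `time_ms.start` and `time_ms.stop`, and "no change" is taken to mean exact equality in all four variables.

```{r}
# Toy closed move events (values partly modeled on the log excerpt above).
moves <- data.frame(
  artwork        = c("076", "076", "051"),
  x.start        = c(2092.25, 2092.25, 100.00),
  x.stop         = c(2092.25, 2092.25, 180.50),
  y.start        = c(2008, 2008, 300),
  y.stop         = c(2008, 2008, 300),
  scale.start    = c(0.3, 0.3, 0.5),
  scale.stop     = c(0.3, 0.2997771, 0.5),
  rotation.start = c(13.27, 13.27, 0),
  rotation.stop  = c(13.27, 13.27, 0)
)

vars    <- c("x", "y", "scale", "rotation")
start   <- as.matrix(moves[paste0(vars, ".start")])
stopped <- as.matrix(moves[paste0(vars, ".stop")])

# TRUE if none of the four variables differs between start and stop.
no_change <- rowSums(start != stopped) == 0

# Keep only moves that actually changed something.
moves_changed <- moves[!no_change, ]
moves_changed
```

Whether tiny differences such as a scale change in the fourth decimal place should count as a change is a separate decision; a tolerance could replace the exact comparison.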
## Add moves to `trace` variable
# Reading list
* @Arizmendi2022 [$-$]
* @Bannert2014 [x]
* @Bousbia2010 [$-$]
* @Cerezo2020
* @GerjetsSchwan2021 [x]
* @Goldhammer2020
* @Guenther2007
* @HuberBannert2023 [x]
* @Kroehne2018
* @SchwanGerjets2021 [x]
* @vanderAalst2016 [Chap. 2, x]
* @vanderAalst2016 [Chap. 3]
* @vanderAalst2016 [Chap. 5, x]
* @Wang2019