Update READMEs

2025-12-08 16:02:14 +01:00
parent 2aa6441eba
commit 973d552050
2 changed files with 23 additions and 6 deletions
+12 -3
@@ -88,15 +88,15 @@ removed. See `anonymization.R` for details.
 The anonymized data files are saved to `03_data/02_anonymized_data/` as
 CSV files with file names `HMC_<wave>_anonymized.csv`.
 
-# Data preprocessing
+# Data cleaning
 
 After data anonymization, some more rudimentary preprocessing was done on the
 data with the script `03_data/02_anonymized_data/cleaning.R`. Especially,
 the original variable names in Qualtrics were harmonized so they all follow the
 same structure.
 
-The cleaned data files are saved to `03_data/03_cleaned_data/`as
-CSV files with file names `HMC_<wave>_cleaned.csv`.
+The cleaned data files are saved to `03_data/03_cleaned_data/` as CSV files with
+file names `HMC_<wave>_cleaned.csv`.
 
 The following section gives an overview of the problems in the data, that needed
 some cleaning.
@@ -119,6 +119,15 @@ some cleaning.
 * Three entries in wave 3: `subj1009`
 * We kept the first entry for each subject
 
+# Data preprocessing
+
+The final data preprocessing creates scales from the collected items. It was
+done in Python and the code for the preprocessing can be found in a separate
+code repository: https://gitea.iwm-tuebingen.de/HMC/preprocessing. The files
+with the final variables for each scale are then saved in the folder
+`03_data/04_preprocessed_data` as CSV files with file names
+`HMC_<wave>_preprocessed.csv`.
+
 # TODOs
 
 * Add more preprocessing steps like variable renaming?
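
The duplicate handling mentioned in the cleaning section (keeping the first entry for each subject) can be illustrated with a minimal Python sketch. This is not the code from `cleaning.R`, which is an R script and is not shown in this diff. The column names `subject` and `response`, the helper `keep_first_entry`, and the sample values are all assumptions made for illustration; only the ID `subj1009` and its three entries come from the README.

```python
import csv
import io


def keep_first_entry(rows, id_column="subject"):
    """Return the rows with only the first occurrence of each subject ID.

    `id_column` is a hypothetical column name; the real data may use another.
    """
    seen = set()
    kept = []
    for row in rows:
        subject = row[id_column]
        if subject in seen:
            continue  # a later duplicate entry for this subject is dropped
        seen.add(subject)
        kept.append(row)
    return kept


# Made-up sample in which subj1009 appears three times.
sample = io.StringIO(
    "subject,response\n"
    "subj1009,4\n"
    "subj1010,2\n"
    "subj1009,5\n"
    "subj1009,1\n"
)
rows = list(csv.DictReader(sample))
deduplicated = keep_first_entry(rows)
```

The sketch relies on the row order of the file to decide which entry counts as first. If the real data carry a timestamp, sorting by it before deduplicating would probably be the safer choice.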