From 6921203765c7d55898fbb761fa8a0d803ecba1bc Mon Sep 17 00:00:00 2001
From: nwickel
Date: Wed, 10 Dec 2025 17:19:07 +0100
Subject: [PATCH] Finalize data README

---
 03_data/README.md | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/03_data/README.md b/03_data/README.md
index ae26359..7877b56 100644
--- a/03_data/README.md
+++ b/03_data/README.md
@@ -54,8 +54,7 @@ immediately preceding wave because ongoing monitoring showed that many non-users
 remained non-users and that relatively few participants perceived AI as a social
 actor. To capture more contemporary usage and obtain sufficient variation for
 research questions filtering for individuals that perceived AI as a social
-actor, we broadened recruitment in wave 4 to all wave-1 participants. Sample 2
-therefore contains only participants with at least one missing wave.
+actor, we broadened recruitment in wave 4 to all wave-1 participants.
 
 ## Download settings in Qualtrics
 
@@ -82,7 +81,7 @@ wave in `03_data/01_raw_data/wave*`. The script
 data and adds an anonymized ID `subj_id` with entries `subj0001 - sub1009` to
 all data sets.
 
-Irrelevant columns -- mostly automatically created by Qualtrics -- are also
+Irrelevant columns - mostly automatically created by Qualtrics - are also
 removed. See `anonymization.R` for details.
 
 The anonymized data files are saved to `03_data/02_anonymized_data/` as
@@ -90,10 +89,10 @@ CSV files with file names `HMC__anonymized.csv`.
 
 # Data cleaning
 
-After data anonymization, some more rudimentary preprocessing was done on the
-data with the script `03_data/02_anonymized_data/cleaning.R`. Especially,
-the original variable names in Qualtrics were harmonized so they all follow the
-same structure.
+After data anonymization, some more rudimentary data cleaning was done with the
+script `03_data/02_anonymized_data/cleaning.R`. Especially, the original
+variable names in Qualtrics were harmonized so they all follow the same
+structure.
 
 The cleaned data files are saved to `03_data/03_cleaned_data/` as CSV files
 with file names `HMC__cleaned.csv`.
@@ -103,21 +102,31 @@ some cleaning.
 
 ## Problems
 
-### with variable names over waves
+### with variable names
 
-* `trust_fav` and `Q161` and `Q162`
-* `obj_know` and `Q158`
-* the labels of the intention variables were swapped
-  --> `int_use_bhvr_fav = int_use_bhvr_noUser` and vice versa
-* ...
+* For the variables looking at what tasks subjects would delegate to AI, there
+  were some inconsistencies in the naming. This was _only_ in the variable
+  naming, the items were presented correctly to the subjects. The folloing
+  variables were renamed:
+  - `delg_tsk_typs_4 --> delg_tsk_typs_3`
+  - `delg_tsk_typs_5 --> delg_tsk_typs_4`
+  - `delg_tsk_typs_6 --> delg_tsk_typs_5`
+  - `delg_tsk_typs_7 --> delg_tsk_typs_6`
+  - `delg_tsk_typs_8 --> delg_tsk_typs_7`
+  - `delg_tsk_typs_8` was deleted
 
-
+* The labels of the intention variables were swapped by accident and this was
+  corrected:
+  - `int_use_bhvr_fav = int_use_bhvr_noUser` and vice versa
 
 ### with subjects
 
 * Two entries in wave 1: `subj0762`
 * Three entries in wave 3: `subj1009`
 * We kept the first entry for each subject
+* `subj1009` has been removed from the dataset since it only appeared in wave 3
+  and it is unclear how this happened; only subjects who participated in wave 1
+  have been invited to participate in further waves
 
 # Data preprocessing
 
@@ -128,11 +137,3 @@ with the final variables for each scale are then saved in the folder
 `03_data/04_preprocessed_data` as CSV files with file names
 `HMC__preprocessed.csv`.
 
-# TODOs
-
-* Add more preprocessing steps like variable renaming?
-
-* Get age (and other descriptives?) for subj1008 and subj1009 from Profilic
-  data?
-
-