Finalize data README
parent bbdb35559c
commit 6921203765
@@ -54,8 +54,7 @@ immediately preceding wave because ongoing monitoring showed that many non-users
 remained non-users and that relatively few participants perceived AI as a social
 actor. To capture more contemporary usage and obtain sufficient variation for
 research questions filtering for individuals that perceived AI as a social
-actor, we broadened recruitment in wave 4 to all wave-1 participants. Sample 2
-therefore contains only participants with at least one missing wave.
+actor, we broadened recruitment in wave 4 to all wave-1 participants.


 ## Download settings in Qualtrics
@@ -82,7 +81,7 @@ wave in `03_data/01_raw_data/wave*`. The script
 data and adds an anonymized ID `subj_id` with entries `subj0001 - subj1009` to
 all data sets.

-Irrelevant columns -- mostly automatically created by Qualtrics -- are also
+Irrelevant columns - mostly automatically created by Qualtrics - are also
 removed. See `anonymization.R` for details.

 The anonymized data files are saved to `03_data/02_anonymized_data/` as
@@ -90,10 +89,10 @@ CSV files with file names `HMC_<wave>_anonymized.csv`.

 # Data cleaning

-After data anonymization, some more rudimentary preprocessing was done on the
-data with the script `03_data/02_anonymized_data/cleaning.R`. Especially,
-the original variable names in Qualtrics were harmonized so they all follow the
-same structure.
+After data anonymization, some more rudimentary data cleaning was done with the
+script `03_data/02_anonymized_data/cleaning.R`. Especially, the original
+variable names in Qualtrics were harmonized so they all follow the same
+structure.

 The cleaned data files are saved to `03_data/03_cleaned_data/` as CSV files with
 file names `HMC_<wave>_cleaned.csv`.
@@ -103,21 +102,31 @@ some cleaning.

 ## Problems

-### with variable names over waves
+### with variable names

-* `trust_fav` and `Q161` and `Q162`
-* `obj_know` and `Q158`
-* the labels of the intention variables were swapped
-  --> `int_use_bhvr_fav = int_use_bhvr_noUser` and vice versa
-* ...
+* For the variables looking at what tasks subjects would delegate to AI, there
+  were some inconsistencies in the naming. This was _only_ in the variable
+  naming, the items were presented correctly to the subjects. The following
+  variables were renamed:
+  - `delg_tsk_typs_4 --> delg_tsk_typs_3`
+  - `delg_tsk_typs_5 --> delg_tsk_typs_4`
+  - `delg_tsk_typs_6 --> delg_tsk_typs_5`
+  - `delg_tsk_typs_7 --> delg_tsk_typs_6`
+  - `delg_tsk_typs_8 --> delg_tsk_typs_7`
+  - `delg_tsk_typs_8` was deleted

-<!-- TODO: Add more details -->
+* The labels of the intention variables were swapped by accident and this was
+  corrected:
+  - `int_use_bhvr_fav = int_use_bhvr_noUser` and vice versa
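
The renames and the label swap listed above are easiest to get right when all of them are applied in a single pass. Renaming one column at a time could overwrite a name that is still needed (for example `delg_tsk_typs_4`, or either side of the swap). The following is a minimal Python sketch of that idea, not the project's actual code: the real logic lives in `cleaning.R` and may work differently. The function `rename_columns` and the example header are hypothetical, and the sketch assumes the raw header has no existing `delg_tsk_typs_3` column.

```python
# Hypothetical sketch; the real cleaning is done in cleaning.R.
# Mapping taken from the README: old column name -> new column name.
RENAMES = {
    "delg_tsk_typs_4": "delg_tsk_typs_3",
    "delg_tsk_typs_5": "delg_tsk_typs_4",
    "delg_tsk_typs_6": "delg_tsk_typs_5",
    "delg_tsk_typs_7": "delg_tsk_typs_6",
    "delg_tsk_typs_8": "delg_tsk_typs_7",
    # The two intention labels were swapped, so each maps to the other.
    "int_use_bhvr_fav": "int_use_bhvr_noUser",
    "int_use_bhvr_noUser": "int_use_bhvr_fav",
}


def rename_columns(header, renames):
    """Return a new header with every rename applied in one pass.

    Each name is looked up against the ORIGINAL header, so chained
    renames (4 -> 3, 5 -> 4, ...) and swaps cannot clobber each other.
    """
    new_header = [renames.get(name, name) for name in header]
    if len(set(new_header)) != len(new_header):
        raise ValueError("renaming produced duplicate column names")
    return new_header


# Made-up example header (assumption: no delg_tsk_typs_3 in the raw data).
header = [
    "subj_id",
    "delg_tsk_typs_1", "delg_tsk_typs_2",
    "delg_tsk_typs_4", "delg_tsk_typs_5", "delg_tsk_typs_6",
    "delg_tsk_typs_7", "delg_tsk_typs_8",
    "int_use_bhvr_fav", "int_use_bhvr_noUser",
]
cleaned = rename_columns(header, RENAMES)
```

In this made-up header no column named `delg_tsk_typs_8` remains after the shift. How the actual script handles the deletion noted in the list is not shown here.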

 ### with subjects

 * Two entries in wave 1: `subj0762`
 * Three entries in wave 3: `subj1009`
 * We kept the first entry for each subject
+* `subj1009` has been removed from the dataset since it only appeared in wave 3
+  and it is unclear how this happened; only subjects who participated in wave 1
+  have been invited to participate in further waves
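
The two subject-level fixes above (keep the first entry per subject, drop subjects who never appeared in wave 1) can be sketched in Python as follows. This is an illustration under assumptions, not the project's implementation. The helper names `keep_first_entry` and `drop_unknown_subjects`, the `response` field, and the example rows are all hypothetical, and "first entry" is taken to mean first in file order.

```python
# Hypothetical sketch; the README does not say which script performs these steps.

def keep_first_entry(rows, id_key="subj_id"):
    """Keep only the first row (in file order) for each subject ID."""
    seen = set()
    kept = []
    for row in rows:
        if row[id_key] not in seen:
            seen.add(row[id_key])
            kept.append(row)
    return kept


def drop_unknown_subjects(rows, wave1_ids, id_key="subj_id"):
    """Drop rows whose subject did not take part in wave 1."""
    return [row for row in rows if row[id_key] in wave1_ids]


# Made-up rows for illustration only.
wave1 = [
    {"subj_id": "subj0001", "response": "a"},
    {"subj_id": "subj0762", "response": "b"},
    {"subj_id": "subj0762", "response": "c"},  # duplicate entry
]
wave3 = [
    {"subj_id": "subj0001", "response": "d"},
    {"subj_id": "subj1009", "response": "e"},  # never appeared in wave 1
    {"subj_id": "subj1009", "response": "f"},
    {"subj_id": "subj1009", "response": "g"},
]

wave1_clean = keep_first_entry(wave1)
wave1_ids = {row["subj_id"] for row in wave1_clean}
wave3_clean = drop_unknown_subjects(keep_first_entry(wave3), wave1_ids)
```

Filtering later waves against the wave-1 IDs mirrors the stated recruitment rule, so any other subject who only appears in a later wave would be dropped the same way.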

 # Data preprocessing

@@ -128,11 +137,3 @@ with the final variables for each scale are then saved in the folder
 `03_data/04_preprocessed_data` as CSV files with file names
 `HMC_<wave>_preprocessed.csv`.
-
-# TODOs
-
-* Add more preprocessing steps like variable renaming?
-
-* Get age (and other descriptives?) for subj1008 and subj1009 from Prolific
-  data?
-
-
+