\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{Sweave}
\usepackage{authblk}
\title{Working title: Data Descriptor for HMC Data Set}
\author{Angelica Henestrosa}
\affil{Leibniz-Institut für Wissensmedien, Tübingen}
\begin{document}
\input{manuscript-concordance}
%\SweaveOpts{concordance=TRUE}
\maketitle
\begin{abstract}
Since the emergence of large language models (LLMs) in 2022, generative AI has rapidly expanded into mainstream applications, leading to the integration of Apple Intelligence into consumer devices in 2024. This integration into personal technology marks a significant shift, bringing advanced AI capabilities into everyday devices and making them accessible to private individuals.
The use of generative AI, whether conscious or not, along with interaction with LLM-powered (voice) assistants and engagement with AI-generated content, is therefore expected to increase substantially.
However, data that link this usage to psychological variables and track it over time remain scarce.
This longitudinal study comprises data from an American sample collected across six waves at two-month intervals between September 2024 and July 2025. It examines user behavior, attitudes, knowledge, and perceptions related to generative AI.
...
This dataset allows for future research on psychological and behavioral dynamics of AI use over time, offering insights into user engagement and the individual factors connected to it.
% Should not exceed 170 words
\end{abstract}
\section{Background and Summary}
...
Longitudinal studies like this one are needed to capture the evolving perceptions of opportunities and risks associated with AI, the perceived capabilities of AI systems, attitudes toward AI, trust in AI, willingness to delegate tasks to AI, areas of application, and the interrelationships among these constructs over time (to be continued). To examine these changes and relationships, an American sample consisting mainly of AI users (specify) was invited to participate in this survey at two-month intervals between September 2024 and July 2025.
% Overview of Dataset
%
% * Provide a clear overview of the dataset
% * Explain the motivation for creating the dataset
% * Outline the potential reuse value of the dataset
%
* dataset brings together various separate work packages (WPs) -> possibility to run across-WP analyses (see the merging sketch below)
* potential to look at clusters/subgroups/individual trajectories ignored in the WPs
* snapshots of important points in time (LLMs on the rise)
* outlook on potential developments in other countries
* connection of actual use and stable psychological variables
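As a minimal sketch of how such across-wave (and across-WP) analyses could be set up, the following chunk stacks hypothetical per-wave files into long format by a pseudonymous participant ID; the file and variable names (\texttt{wave1\_preprocessed.csv}, \texttt{id}) are placeholders rather than the actual repository layout.
\begin{Schunk}
\begin{Sinput}
## Sketch only: hypothetical file and variable names, not the actual
## repository layout. Stacks the six per-wave files into long format
## (assuming identical variable sets across waves) so that individual
## trajectories can be examined.
wave_files <- sprintf("wave%d_preprocessed.csv", 1:6)
wave_list <- lapply(seq_along(wave_files), function(i) {
  d <- read.csv(wave_files[i], stringsAsFactors = FALSE)
  d$wave <- i                            # add wave index
  d
})
long_data <- do.call(rbind, wave_list)   # one row per participant x wave
table(table(long_data$id))               # completed waves per participant
\end{Sinput}
\end{Schunk}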
% Previous Publications
%
% * Cite any previous publications that utilized these data, in whole or in part
% * Briefly summarize the findings or contributions of those publications
%
% Introductions for Articles and Comments
%
% * Explain the purpose of the work performed
% * Describe the value that the work adds to the field
%
% Citing Prior Art
%
% * Include citations of relevant datasets or outputs in the field for reader
% interest
% * Avoid subjective claims regarding novelty, impact, or utility
\section{Methods}
% Description of Data Creation
%
% * Describe the steps or procedures used to create the data.
% * Include full descriptions of the experimental design.
% * Detail the data acquisition methods.
% * Explain any computational processing involved.
%
* Prolific
* invitation
* time and intervals
* retention rate (see the retention sketch below)
* second sample -> invitation of wave 1 participants
* focus on users -> exclusion of non-users without intention to use
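A minimal sketch of how wave-to-wave retention could be computed from the pseudonymous participant IDs is given below; file and variable names are again placeholders, and the retention rates reported in the final manuscript should be computed from the actual data.
\begin{Schunk}
\begin{Sinput}
## Sketch only: wave-to-wave retention from the pseudonymous ID column
## (hypothetical file and variable names).
ids_per_wave <- lapply(sprintf("wave%d_preprocessed.csv", 1:6),
                       function(f) unique(read.csv(f)$id))
retention <- sapply(2:6, function(w)
  mean(ids_per_wave[[w - 1]] %in% ids_per_wave[[w]]))
names(retention) <- paste0("wave", 1:5, "_to_wave", 2:6)
round(retention, 2)  # share of wave w-1 participants who returned at wave w
\end{Sinput}
\end{Schunk}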
% Input Data for Secondary Datasets
%
% * Provide detailed descriptions of all input data.
% * Use a sub-heading such as 'Input Data' if desired.
% * Ensure details allow readers to source the exact data used (avoid non-specific
% URLs or homepages).
% * For continuously updated input data, include version numbers or search terms
% used.
% * Reference input datasets with DOIs or formal metadata using the appropriate
% citation format.
% * Embed URLs in the text for datasets without formal metadata.
%
% Focus on Practical Tasks
%
% * Avoid including general results or analyses in this section.
% * If data have been analyzed or published elsewhere, cite the experimental
% methods instead of restating them.
% * Focus on documenting practical tasks and technical or processing steps.
%
% Scientific Process Description
%
% * Describe the full scientific process for generating the output or study.
% * Limit discussion of operational aspects like software development or project
% management unless relevant to the science.
%
% Consortia and Multi-Stakeholder Projects
%
% * Be mindful of scientific relevance and reader interest when describing
% administration, management, and funding.
% * State funder details as a practical requirement but avoid excessive focus on
% organization unless relevant to the science.
\section{Data Records}
% * Explain what the dataset contains.
% * Specify the repository where the dataset is stored.
% * Provide an overview of the data files and their formats.
% * Describe the folder structure of the dataset.
% * Cite each external dataset using the appropriate data citation format.
% * Limit extensive summary statistics to less than half a page.
% * Include 1-2 tables or figures if necessary, but avoid summarizing data that
% can be generated from the dataset.
\section{Technical Validation}
* attention check (see the flagging sketch below)
* bot detection question
* forced response (no item could be skipped)
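The chunk below sketches how the attention-check and bot-detection items could be used to flag and exclude cases; the variable names (\texttt{attention\_check}, \texttt{bot\_check}) and the expected response codes are assumptions, not the actual item coding documented in the codebook.
\begin{Schunk}
\begin{Sinput}
## Sketch only: flag cases failing the quality checks (hypothetical
## variable names and response codes; see the codebook for the real items).
d <- read.csv("wave1_preprocessed.csv", stringsAsFactors = FALSE)
d$fail_attention <- d$attention_check != "correct_option"
d$fail_bot       <- d$bot_check != "expected_answer"
table(attention = d$fail_attention, bot = d$fail_bot)
clean <- subset(d, !fail_attention & !fail_bot)   # analysis sample
\end{Sinput}
\end{Schunk}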
% * Describe the experiments, analyses, or checks performed to support the
% technical quality of the dataset.
% * Include any supporting figures and tables as needed.
\section{Usage Notes (optional)}
% * Provide optional information that may assist other researchers in reusing the
% data.
% * Include additional technical notes on how to access or process the data.
% * Avoid using this section for conclusions, general selling points, or worked
% case studies.
\section{Code Availability}
All Python (version x) and R (version x) code for data anonymization, data cleaning, and preprocessing, as well as the cleaned and preprocessed data sets for each wave, are stored in the public repository [link].
% consider whether to link separately to gitea and to OSF (+ materials) here
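To illustrate the kind of processing the repository contains, the chunk below sketches a possible anonymization step in R, replacing Prolific IDs with random pseudonyms and dropping free-text fields; the column names are placeholders, and the published scripts remain the authoritative description of what was actually done.
\begin{Schunk}
\begin{Sinput}
## Sketch only: anonymization step with hypothetical column names; the
## scripts in the repository are authoritative.
raw <- read.csv("wave1_raw.csv", stringsAsFactors = FALSE)
set.seed(1)                                   # reproducible pseudonyms
uid <- unique(raw$prolific_id)
lookup <- data.frame(prolific_id = uid,
                     id = sprintf("P%04d", sample(seq_along(uid))))
anon <- merge(raw, lookup, by = "prolific_id")
anon$prolific_id <- NULL                      # drop identifying Prolific ID
anon$free_text_comment <- NULL                # drop free-text responses
write.csv(anon, "wave1_anonymized.csv", row.names = FALSE)
\end{Sinput}
\end{Schunk}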
% * Include a subheading titled "Code Availability" in the publication.
% * Indicate whether custom code can be accessed.
% * Provide details on how to access the custom code, including any restrictions
% * Include information on the versions of any software used, if relevant.
% * Specify any particular variables or parameters used to generate, test, or
% process the dataset, if not included in the Methods.
% * Place the code availability statement at the end of the manuscript,
% immediately before the references.
% * If no custom code has been used, include a statement confirming this.
\section*{References}
\section*{Author Contributions}
\section*{Competing Interests}
\section*{Acknowledgements}
Here is an R chunk:
\begin{Schunk}
\begin{Sinput}
> x <- rnorm(100)
> summary(x)
\end{Sinput}
\begin{Soutput}
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-3.31070 -0.70017  0.02192 -0.05356  0.66420  2.06854 
\end{Soutput}
\end{Schunk}
\end{document}