At GESIS, we plan to collect more digital behavioral data, e.g. social media data and web tracking data. Current data sources are X/Twitter and web tracking data collected by GESIS. The GESIS Web Tracking software runs as a browser plugin on desktop devices. Documenting these data sources for archiving requires information beyond the usual survey metadata.
Challenges are...
Social research increasingly includes media formats like audio and video, which are often poorly documented and inaccessible. While archives handle traditional survey data well, media files are mostly archived as minimally annotated zip files because documenting them properly is complex. Recent advancements in AI, including the Whisper model, along with the use of Pydantic models and...
Microdata provide tremendous value in socioeconomic analysis. However, these data may not be easily discoverable when their metadata are not as rich, structured, and optimized as they could be. For microdata, a particular issue is the semantic discoverability of the information contained in the variable-level metadata (the data dictionary). This paper presents an unsupervised framework that leverages...