Conveners
Sustainable, Ethical and Economical Controlled Vocabularies
- Alina DANCIU (Sciences Po, Center for Socio-Political Data (CDSP))
Description
This session groups four papers on controlled vocabularies (CVs): their current possibilities, practical use cases, and experiments with thematic vocabularies.
- Darren Bell (UK Data Service); 03/12/2024, 13:15; Regular Presentation
While the semantic web has been around for over twenty years, practical and sustainable implementations have been thin on the ground. During 2024, DDI Controlled Vocabularies and the DDI-CDI ontology have been made available, for the first time, as persistently resolvable linked open data. This presentation digs into the underlying cloud infrastructure, the rationale for creating it and...
- Jieun Jeong (Centre for socio-political data, Sciences Po, CNRS), Lucie MARIE (Centre for socio-political data, Sciences Po, CNRS); 03/12/2024, 13:35; Regular Presentation
With more than 400 DDI-documented datasets, the catalogue of the Center for Socio-Political Data (CDSP) contains tens of thousands of variables, mainly quantitative survey data collected from structured questionnaires.
With the ultimate goal of producing accurate and consistent training material for a machine learning model (CamemBERT), the CDSP's engineers launched a working group for variable tagging...
- Júlia Egyed-Gergely (Hungarian Research Network Centre for Social Sciences, Research Documentation Centre (KDK)), Judit Gárdos (Hungarian Research Network Centre for Social Sciences, Research Documentation Centre (KDK)), Enikő Meiszterics (Hungarian Research Network Centre for Social Sciences, Research Documentation Centre (KDK)); 03/12/2024, 13:55; Regular Presentation
This paper describes the development of the KDK Thesaurus, a CV partly based on ELSST and used for the topical discovery of interview materials. The presentation discusses the workload of such a project, its sustainability, and the future prospects of similar projects, including ONTOLISST, and touches on the economical and ethical considerations of metadata curation on smaller levels of datasets...
- Alina DANCIU (Sciences Po, Center for Socio-Political Data (CDSP)), Judit Gárdos (Hungarian Research Network Centre for Social Sciences, Research Documentation Centre (KDK)), Mari Kleemola (Finnish Social Science Data Archive, Tampere University); 03/12/2024, 14:15; Short Presentation
The talk introduces the new two-year ONTOLISST project, which starts in December 2024 and is funded by the first OSCARS Cascading grant call. The project will develop a simplified multilingual ontology (LiSST) to describe social science research data, create a corpus of social science metadata, and investigate whether and how NLP tools can help with (semi)automated (meta)data curation. The aim is to better...