Drafting the health-related data framework and the catalogue matrix for HealthyCloud

Date: November 11, 2021

In August 2021, the HealthyCloud project reached a new milestone. A team of experts from the Servicio Andaluz de Salud (SAS) and Sciensano worked together to draft a document outlining the methodology used to develop a survey that requests specific information from different European data collections.

The survey was developed around a refined scope: examining the feasibility of linking individual-level data from different national or international (European) health data collections to answer two specific research questions.

The survey includes more than 50 indicators under the following ten areas: administrative; data; completeness of the data collection; quality aspects of the data collection; metadata; findability; accessibility; interoperability; re-usability; and governance.

Within the HealthyCloud project, the Sciensano and SAS teams are responsible for the health data landscape analysis. The aim is to map the available health-related data infrastructures, in order to capture the European data collections available for research purposes and evaluate their FAIRness level. More specifically, this task started with a refined scope of examining the feasibility of linking individual-level data coming from different national or international (European) health data repositories to answer two specific research questions. Firstly, to assess how genomic information gathered at the population level can contribute to developing high-risk profiling for the major risk factors for cancer (e.g. tobacco, alcohol, obesity, sun exposure, family history, socio-economic status). Secondly, to analyse ways for early and precise diagnosis of atrial fibrillation.

To develop the survey, the experts took the following aspects into consideration: the organisation and governance of the data collections; the nature of the data; the type of data sources and their level of detail; the data storage process; and the findability, accessibility, interoperability and re-usability of the data and metadata.

The survey has gone through several rounds of feedback and has been piloted by two data collections and two data hubs. After modifications and refinement, it was sent to more than 20 data collections, and the initial outcomes will be presented in February 2022. The next step will be to analyse the survey responses to provide an in-depth analysis of the characteristics of the health data that will populate the future Health Research and Innovation Cloud.