Imagine a digital platform where you could access data sets from European scientific researchers and reuse them. Today, these data sets are scattered across university servers or even personal servers in offices, and their number is growing rapidly. With a standardized way of saving, storing and accessing them, they can be of high value for science and society, provided it is done right: safely and with the data anonymized, a point that must be emphasized. Interview with the chair of the Danish eInfrastructure Cooperation (DeiC), Professor John Renner Hansen.
Why do we need a new digital science infrastructure?
When we as a society invest in science, we are often talking about a huge investment. And there is much more value to be had from the data than one scientist can extract alone. Another reason is openness: we can actually verify that the scientific results are valid. And finally, but most importantly, it democratizes science, because all citizens can get access to this infrastructure.
Of course there will be limitations, for example if India, China and the US use the data sets without giving anything back. So I think we will have to issue licenses to operate on this infrastructure.
Is there a reason to believe that some scientists won’t participate?
On the one hand, it will be a very useful tool for most scientists. On the other hand, it will be a dramatic cultural change for most people, and it will be a long process. There will be challenges, as some scientists believe that the data sets are their property. Others are very open, e.g. in astronomy, where they keep the data to themselves for the first year and then share them. So we are considering a kind of embargo period before the data become accessible via the European Open Science Cloud (EOSC). Maybe 2-3 years in which the scientists can work on the data sets before releasing them.
Why such a long embargo period?
The scientists who had the ideas and created the data set must have sufficient time to extract new knowledge and publish their discoveries before others get access. A shorter embargo might lead to hasty and insufficiently verified results. On the other hand, a longer embargo would undermine the whole idea behind data sharing.
Where are the other European countries on this?
Holland is ahead. But we are catching up. Denmark has a national process led by DeiC, which coordinates activities at the eight Danish universities. Another example is the League of European Research Universities (LERU), a union of 24 universities, among them the University of Copenhagen. They have a common roadmap for the implementation of Open Data based on the FAIR principles. In the future, the EU Commission will not fund research projects unless the project provides a data management plan based on these principles.
What about security? Will it be as good as the security at Statistics Denmark?
Yes. For instance, Statistics Denmark has acknowledged the security model of the computer at Risø, Computerome-II, a computer system shared between DTU and KU. It will soon have users from the six other universities, assigned through DeiC's national allocation. We have not yet decided on the model for storage. It might be at three different centers at three universities, with a link to EOSC. We could use a private company, if it meets the security requirements and national regulations at a price that matches the other options.
Could these data be used for something other than science?
Yes, and that is the purpose. All data sets should be accessible to the private and public sectors, and they can be used commercially. The data sets must of course be GDPR compliant, so nothing identifiable. So: as open as possible and as closed as necessary.
But we know that anonymized data could be de-anonymized?
Yes, that is a risk, and I agree with those who propose to make de-anonymization illegal. It ought to be a new law. It is too easy to de-anonymize.
John Renner Hansen, who is a former Dean of the Faculty of Science at the University of Copenhagen and chair of the board of DeiC, hopes that the funding for this project will come through via the Danish Finance Act of 2021 and onwards.