Data for all?

There is a strong argument that publicly funded research should be made available to the public. However, aside from the issues presented by outmoded publishing models, which constrain how data can be shared, there are questions about whether the public should be granted access to datasets as soon as they are collected. One scientific project currently facing this quandary is LIGO. Stretching out at right angles across the scrubland of Hanford, Washington state, its 4 km-long arms lie in wait to catch a wave – a series of ripples in the curvature of spacetime. Predicted by Einstein’s general theory of relativity in 1916, gravitational waves are expected to be observed in 2017, when a series of similar detectors becomes active around the globe. The US National Science Foundation (NSF) has found itself up against scientists who have invested a great deal of time working on LIGO and who would prefer to keep the data to themselves – at least for a time. One concern was that they would constantly have to explain to well-meaning members of the public that what looked like a gravitational wave was just an ‘artefact’ – a false positive arising from how the detector operates – and that this would distract them from the more pressing job of analysing the data. Visual data, such as the photographs in Galaxy Zoo, can be well suited to assessment and categorisation by members of the public, but gravitational wave data can take a lot of study to get to grips with.

A contrasting example which, like Galaxy Zoo (SDSS), is astronomical in nature, is the Large Synoptic Survey Telescope (LSST). The telescope, based in Chile, produces so much data that the worry isn’t whether the public will get their hands on it, but that no one will have the time to look through it. 30 TB of data is being made available to the public worldwide each night. Rather than being released by a publicly funded national or international body, however, it is being made available by Google, a private company. Some people have questioned whether a private company should have the rights to such large datasets, whatever they contain.

These examples are taken from a talk given by Bernard Schutz at the EUDAT 1st Conference in Barcelona, and this piece is adapted from a blog post here: http://gridtalk-project.blogspot.co.uk/2012/10/synergies-and-tensions-in-data-sharing.html