Open Scientific Data: Difference between revisions
Revision as of 04:27, 22 May 2009
Discussion
Peter Murray-Rust:
"The critical thing to realise is that Open Scientific Data is not Open Software nor Open Content. It may sound arrogant but it can be difficult for a non-scientist to realise that it is different from maps, from Shakespeare, from photography, from government publications, from cricket scores. Scientists by default collect data, or calculate it, to justify their conclusions, to prove they have done the work, to allow others to repeat the work.
It should be free, as in air.
They expect others to use it, without their permission. This could be to prove the original ideas right, or to prove them wrong. It could be to mine the data for ideas the original scientists missed. No scientist likes being proved wrong, or having someone else find ideas that they have missed. But it’s a central part of science. A scientist who says “you can’t use my published data” has no credibility today.
That’s not to say some scientists don’t try to hold their data back and mine the maximum from it before publishing. But it is becoming increasingly required – by funders, by universities (in theses) and by some publishers – that the data justifying a publication should be “published” in some way at the time of article publication. And by default there should be no restrictions on copying, re-use, republishing for whatever purpose and by whomever. I may not like it if my data is used to make weapons, or that a commercial organisation republishes it for money. But that is the implied contract I make by being a scientist. If I don’t like weapons derived from science there are other ways I can make my views known other than by adding restrictions – and at times I have.
To summarize. Data itself must be completely free. The question is how to ensure that it is.
The Open Science and Open Knowledge community has been discussing this for about 2 years. We seem to be agreed that legal tools are counterproductive, and that moderation is best applied by the community. This is represented by Community Norms – agreed practices that cause severe disapproval and possibly action when broken.
Our current crisis in Britain illustrates this. Huge numbers of Members of Parliament have been fiddling their expenses. They’ve been spending taxpayers’ money on cleaning their castle moats, buying second homes, antique rugs and so on. Huge amounts. This is, apparently, within the parliamentary guidelines.
But it is against the court of public opinion. It violates our Community Norms. The defence that it is “within the rules” illustrates the futility of the rules.
And it is incredibly difficult to draft good rules. So we’ve decided not to try to use the standard tools of copyright or licences.
For us Data are born Open. The question is how to state that. The simplest way is just to add the OKF’s “Open Data” button to the data. That’s a statement of intent. It says “you can do whatever you like with this data without asking my permission.” In many cases I think that is adequate.
However the community has also investigated the legal aspect, to provide a formal means of stating this in legal terms. This isn’t easy but the two approaches – Public Domain Dedication and Licence (PDDL) and Creative Commons CC0 – are roughly equivalent. I hope it’s useful to say that PDDL comes out of an Open Knowledge philosophy and deals with collections and other non-scientific content, whereas CC0 springs more directly from science." (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1939)
Open Scientific Data Licenses
Peter Murray-Rust:
"The two approaches – Public Domain Dedication and Licence (PDDL) and Creative Commons CC0 – are roughly equivalent. I hope it’s useful to say that PDDL comes out of an Open Knowledge philosophy and deals with collections and other non-scientific content, whereas CC0 springs more directly from science." (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1939)