Open Data: Difference between revisions
Revision as of 14:30, 17 June 2007
The term Open Data is used in two different contexts: the availability of raw scientific data, and open access to publicly funded 'government' information.
Open Access to Government Information
In this context, Open Data refers to the campaign for the openness of data collected by government, as opposed to company-centric licensing regimes that withhold publicly funded data from the public at large.
Description
From the key essay by Peter Weiss, Borders in Cyberspace:
"Many nations are embracing the concept of open and unrestricted access to public sector information -- particularly scientific, environmental, and statistical information of great public benefit. Federal information policy in the US is based on the premise that government information is a valuable national resource and that the economic benefits to society are maximized when taxpayer funded information is made available inexpensively and as widely as possible. This policy is expressed in the Paperwork Reduction Act of 1995 and in Office of Management and Budget Circular No. A-130, “Management of Federal Information Resources.”[1] This policy actively encourages the development of a robust private sector, offering to provide publishers with the raw content from which new information services may be created, at no more than the cost of dissemination and without copyright or other restrictions.
In other countries, particularly in Europe, publicly funded government agencies treat their information holdings as a commodity used to generate short-term revenue. They assert monopoly control on certain categories of information to recover the costs of its collection or creation. Such arrangements tend to preclude other entities from developing markets for the information or otherwise disseminating the information in the public interest.
In the US, open and unrestricted access to public sector information has resulted in the rapid growth of information intensive industries particularly in the geographic information and environmental services sectors. Similar growth has not occurred in Europe due to restrictive government information practices. As a convenient shorthand, one might label the American and European approaches as ‘open access’ and ‘cost recovery’, respectively. The cost recovery model is now being challenged on a variety of grounds." (http://www.primet.org/documents/weiss%20-%20borders%20in%20cyberspace.htm)
More Information
See http://www.re-public.gr/en/?p=98, an article that focuses specifically on geographic datasets in the UK.
See the sites of UK-based organizations such as Free Our Data and Public Geodata.
Open Data in Science
Definition
Peter Murray-Rust of the Unilever Centre for Molecular Sciences Informatics at the University of Cambridge (UK):
“The emerging Open Data movement shares many goals with the Open Access and Open Source movements, but encompasses its own distinct issues that are in need of examination by the scientific community. Many advocates of Open Data believe that, although there are substantial potential benefits from sharing and reusing digital data upon which scientific advances are built, today much of it is being lost or underutilized because of legal, technological and other barriers.” (http://www.arl.org/sparc/announce/102405.html)
Requirements for Open Data in science
Quoted from http://www.windley.com/archives/2006/05/free_the_data.shtml
- Re-use structures including schemas and ontologies. It’s more important to use well-understood structures than to use any particular idiom.
- Re-use the licenses that have already been developed. Licensing meta-data (ala Creative Commons) is also important.
- Enable re-use of ideas (contrasted with the expression of the idea). We have to find the proper scope of ‘derivative works’ and re-examine the issue of database copyright. Shockingly, copying the bibliographic data from a work (for purposes of citation) can be seen as a violation of some licenses.
- Attach policy information that says how the information can be used. Some experimental data depends critically on personally identifying information. Anonymization is a hard task either not working well or being at odds with the underlying research purpose of the data.
- Use open standards
(Weitzner presentation at http://www.w3.org/2006/Talks/0525-web-data-publishing/#(3); quoted at http://www.windley.com/archives/2006/05/free_the_data.shtml)
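As a rough illustration of the requirements above, a dataset could carry its schema reference, license metadata and usage policy alongside the data itself. The following is only a minimal sketch: the `DatasetRecord` type, its field names and the two helper functions are hypothetical and are not taken from the Weitzner presentation or from any published standard.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical record type. All field names are illustrative assumptions,
# not part of any existing metadata standard.
@dataclass
class DatasetRecord:
    title: str
    schema_uri: str      # a re-used, well-understood schema or ontology
    license_uri: str     # an existing license, referenced by URI
    usage_policy: str    # statement of how the information can be used
    contains_personal_data: bool = False
    data_format: str = "text/csv"  # assumed to be an open, documented format


def missing_requirements(record: DatasetRecord) -> List[str]:
    """Return the names of required metadata fields that are empty."""
    required = ("schema_uri", "license_uri", "usage_policy", "data_format")
    return [name for name in required if not getattr(record, name).strip()]


def may_publish_openly(record: DatasetRecord) -> bool:
    """True if all metadata is present and no personal data is flagged."""
    return not missing_requirements(record) and not record.contains_personal_data


# Example usage with made-up values.
record = DatasetRecord(
    title="Example measurements",
    schema_uri="https://example.org/schema/measurements",
    license_uri="https://creativecommons.org/publicdomain/zero/1.0/",
    usage_policy="Free re-use; please cite the source.",
)
print(missing_requirements(record))
print(may_publish_openly(record))
```

Flagging personally identifying information as a separate field reflects the point in the list that anonymization is hard; in this sketch such a dataset is simply held back rather than anonymized automatically.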
More Information
SPARC Open Data Email Discussion List, at http://www.arl.org/sparc/opendata/index.html