Basics of Open Data
Facts and Figures about Open Data
Open data means allocating appropriate licences to data to make it freely available to the public and usable for a wide variety of output channels. Open data is data that can be used, shared and reused by anyone for any purpose. Use restrictions are only permitted to safeguard the origin and openness of knowledge, e.g., by naming the author or using a share-alike clause.
As not all data can be made accessible immediately and with the highest possible degree of openness, a step-by-step approach is recommended. Tim Berners-Lee's 5-star deployment scheme for open data classifies the degree of openness of data sets.
* make your data available on the web (whatever format) under an open licence
** make it available as structured data (e.g., Excel instead of image scan of a table)
*** make it available in a non-proprietary open format (e.g., CSV instead of Excel)
**** use URIs to denote things so that data can be linked
***** link your data to other data to provide context
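The cumulative nature of the scheme can be illustrated with a minimal sketch. The `Dataset` class and its field names are hypothetical and only mirror the five criteria listed above; the rating stops at the first criterion that is not met, on the assumption that each star presupposes the previous ones.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published data set."""
    on_web_open_licence: bool   # 1 star: online under an open licence
    structured: bool            # 2 stars: structured, machine-readable data
    open_format: bool           # 3 stars: non-proprietary format such as CSV
    uses_uris: bool             # 4 stars: things have web identifiers
    linked_to_other_data: bool  # 5 stars: links to other data sets


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively; stop at the first unmet criterion."""
    criteria = [
        ds.on_web_open_licence,
        ds.structured,
        ds.open_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


excel_sheet = Dataset(True, True, False, False, False)
csv_file = Dataset(True, True, True, False, False)
print(star_rating(excel_sheet))  # 2
print(star_rating(csv_file))     # 3
```

In this sketch, converting the Excel sheet to CSV is enough to move the data set from two stars to three.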
The free use of data can lead to new kinds of reporting and analyses and trigger new products, services or business models. Open data is therefore key for innovation.
What Is Structured Data?
For data to meet the criteria of open data, it must be structured in a uniform way and made available in a machine-readable format so that it can be filtered, searched and processed by other applications.
Consistent ordering and labelling of data are perhaps the most important foundation of open data. Without them, information cannot be found.
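As a rough illustration of why uniform structure and labelling matter, the sketch below filters a small CSV table using only the Python standard library. The venue names, column labels and the `venues_in` helper are invented for this example.

```python
import csv
import io

# Hypothetical venue data in CSV, an open, machine-readable format.
# The header row supplies the labels that make the data searchable.
RAW = """name,city,capacity
Hall A,Berlin,1200
Hall B,Hamburg,300
Hall C,Berlin,450
"""


def venues_in(raw_csv: str, city: str, min_capacity: int = 0) -> list[str]:
    """Return names of venues in `city` with at least `min_capacity` seats."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [
        row["name"]
        for row in reader
        if row["city"] == city and int(row["capacity"]) >= min_capacity
    ]


print(venues_in(RAW, "Berlin"))       # ['Hall A', 'Hall C']
print(venues_in(RAW, "Berlin", 500))  # ['Hall A']
```

The same query against a scanned image of the table would first require manual transcription, which is the practical difference between unstructured and structured data.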
Schema.org, the de facto global standard, is widely used to structure and describe data uniformly. Schema.org is an ontology, i.e., a collection of terms to describe things on the web and their relationships to each other. On websites, Schema.org markup is embedded in the source code of the page and is not visible to users. This embedding makes the content machine-readable and machine-interpretable.
Schema.org is not a finished product and is expanded on an ongoing basis. Certain properties, e.g., those needed to describe MICE data sets, may not yet be covered by Schema.org. The vocabulary is being extended through domain specifications.
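One common way to embed Schema.org terms in a page is a JSON-LD script tag. The following sketch builds such a tag for a venue; the venue details and the `venue_to_jsonld` helper are hypothetical, and the choice of the `EventVenue` type with these particular properties is an assumption made for illustration, not a prescribed mapping for MICE data.

```python
import json


def venue_to_jsonld(name: str, street: str, city: str, capacity: int,
                    country: str = "DE") -> str:
    """Build a Schema.org description of a venue as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "EventVenue",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressCountry": country,
        },
        "maximumAttendeeCapacity": capacity,
    }
    # ensure_ascii=False keeps characters such as "ß" readable in the output.
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = venue_to_jsonld("Example Congress Hall", "Musterstraße 1", "Berlin", 1200)
print(snippet)
```

The resulting tag is placed in the page's HTML, where it stays invisible to visitors but can be read by search engines and other applications.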
Clearly Defined Licences for Free Use
Rights for images, video and text must be clearly defined in licences in order to make data available for open use. The Creative Commons (CC) licensing system is recommended. Creative Commons is a non-profit organisation that has developed a range of standard licence agreements that allow creators to grant the public rights to use their work. Different Creative Commons licence types provide for different uses. The preferred licences are CC0 ("No Rights Reserved"), CC BY (Attribution) and CC BY-SA (Attribution, Share-Alike), which allow free use and redistribution under their respective terms. All three, CC0 as well as CC BY and CC BY-SA, allow commercial use, which is a requirement for open data.
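A simple check of whether a given licence code is suitable could look like the sketch below. The table lists the standard Creative Commons licence types with their permissions; treating "permits adaptations" as a second requirement alongside commercial use is an assumption of this example, as is the `is_open_data_licence` helper itself.

```python
# Permissions of the Creative Commons licence types plus the CC0 dedication.
# Values are (commercial use allowed, adaptations allowed).
CC_PERMISSIONS = {
    "CC0": (True, True),
    "CC BY": (True, True),
    "CC BY-SA": (True, True),
    "CC BY-ND": (True, False),
    "CC BY-NC": (False, True),
    "CC BY-NC-SA": (False, True),
    "CC BY-NC-ND": (False, False),
}


def is_open_data_licence(code: str) -> bool:
    """True if the licence permits both commercial use and adaptations.

    Unknown licence codes are treated as not open.
    """
    key = code.strip().upper()
    commercial, adaptations = CC_PERMISSIONS.get(key, (False, False))
    return commercial and adaptations


for code in ("CC0", "CC BY", "CC BY-SA", "CC BY-NC", "CC BY-ND"):
    print(code, is_open_data_licence(code))
```

Under these assumptions only CC0, CC BY and CC BY-SA pass the check, which matches the preferred licences named above.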
Connecting Data in a Knowledge Graph
Providing German MICE data as open data is a first step towards more visibility and reach. However, only when data is connected and interrelated can benefits and added value be realised.
A knowledge graph is a semantic database. By connecting data in a knowledge graph, a network of objects (e.g. people, places, organisations, events, etc.) and the relationships between these objects is created. Information about venues, for example, can be linked with infrastructure data, travel information, sights at the location, etc.
Knowledge graphs are particularly well suited for complex and nested queries and analyses. The branched graph structure makes it possible to find connections that are otherwise difficult to see. A knowledge graph has no fixed schema and can be adapted flexibly. Data sets can be added to the graph as needed by relating them to an existing data set. Individual data points and their relationships to each other are managed independently of the output channel. This allows data to be delivered in various contexts, in different ways and across a range of channels.
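The idea can be sketched in a few lines by storing facts as subject-predicate-object triples. This is a toy in-memory model, not a real graph database; all identifiers, predicates and helper functions below are made up for illustration.

```python
# Each fact is a (subject, predicate, object) triple; all names are hypothetical.
triples = {
    ("venue:hall_a", "type", "Venue"),
    ("venue:hall_a", "locatedIn", "city:berlin"),
    ("station:central", "type", "TrainStation"),
    ("station:central", "locatedIn", "city:berlin"),
    ("sight:old_tower", "type", "Sight"),
    ("sight:old_tower", "locatedIn", "city:berlin"),
    ("venue:hall_b", "type", "Venue"),
    ("venue:hall_b", "locatedIn", "city:hamburg"),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects(graph, predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def nearby(graph, venue, wanted_type):
    """Nested query: things of `wanted_type` in the same city as `venue`."""
    found = set()
    for city in objects(graph, venue, "locatedIn"):
        for thing in subjects(graph, "locatedIn", city):
            if wanted_type in objects(graph, thing, "type"):
                found.add(thing)
    return found


print(nearby(triples, "venue:hall_a", "Sight"))  # {'sight:old_tower'}

# Adding a data set needs no schema change: relate it to an existing node.
triples.add(("sight:harbour", "type", "Sight"))
triples.add(("sight:harbour", "locatedIn", "city:hamburg"))
print(nearby(triples, "venue:hall_b", "Sight"))  # {'sight:harbour'}
```

The venue and the sight are never linked directly; the connection emerges from the shared city node, which is the kind of indirect relationship a graph structure makes easy to query.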