Linked Open Data Services for OpenAIRE

Christoph Lange and Sahar Vahdati, Enterprise Information Systems (EIS), University of Bonn

We’re happy to announce that the OpenAIRE Linked Open Data (LOD) Services are now available as a beta version at http://beta.lod.openaire.eu/. OpenAIRE already makes its data freely available for re-use via APIs. In line with its commitment to openness, OpenAIRE has been mapping its data onto suitable standard vocabularies in order to publish it as Linked Open Data. This work started with a specification of the OpenAIRE data model as a Resource Description Framework (RDF) vocabulary and continued with mapping the OpenAIRE data to the graph-based RDF data model. To interlink the OpenAIRE data with related data on the Web, we have identified a list of candidate datasets, including the DBpedia dataset extracted from Wikipedia and the publication databases DBLP and CiteSeer. Making our data available in this way extends OpenAIRE’s technical interoperability and enables new user communities to engage with our data.

LOD for the non-technical

We use many kinds of data: images, videos, text, graphs, and websites that combine pictures, text and links to other websites. The Web of Documents started by weaving HTML documents together with hyperlinks. This allows us as humans to follow a subject from one document to another. Now think about weaving individual bits of data together and explicitly expressing the relationships between them, for instance to guide intelligent search engines. This requires a higher level of explicitness, because machines cannot interpret the meaning of webpage content in the way humans do. A link with an explicit description of its relationship can be understood as a subject-predicate-object sentence (e.g. “Harry Potter was authored by Joanne K. Rowling”), or, in more technical terms, a triple. The process of making data from documents or other data sources machine-comprehensible is therefore called triplification.
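To make the idea of a triple concrete, here is a minimal sketch in plain Python. It is only an illustration: the Triple type and the to_sentence helper are hypothetical names introduced here, not part of the OpenAIRE services, and real RDF triples use URIs rather than free-text strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def to_sentence(t: Triple) -> str:
    """Read a triple back as a plain sentence."""
    return f"{t.subject} {t.predicate} {t.obj}."


# The example sentence from the text, split into its three parts.
triple = Triple("Harry Potter", "was authored by", "Joanne K. Rowling")
print(to_sentence(triple))
```

A machine does not need to parse the sentence: each of the three parts is available as a separate field.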

Triplification turns documents into graphs

As anything can be the subject of multiple triples at the same time, and often also the object of other triples, things get connected with each other in a network structure called a graph. Best practices for publishing such graphs on the Web in a way that is as reusable as possible are subsumed under the term “Linked Data”. Linked Data technology builds on standards such as URIs, HTTP and RDF. It primarily enables machines to explore the Web of Data, but in a second step it also empowers humans who use machine services such as search engines. The legal side of maximising reuse is making data available under open licenses. Linked Data that is openly licensed is called Linked Open Data.
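The way triples link up into a graph can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the http://example.org/ namespace and all resource names are made up and are not OpenAIRE URIs, and build_graph and to_ntriples are hypothetical helpers. The output format follows the N-Triples convention of one "<subject> <predicate> <object> ." statement per line.

```python
from collections import defaultdict

EX = "http://example.org/"  # hypothetical namespace, not an OpenAIRE URI

# Three triples; HarryPotter is the subject of two of them, and
# JKRowling is the object of one triple and the subject of another.
triples = [
    (EX + "HarryPotter", EX + "authoredBy", EX + "JKRowling"),
    (EX + "HarryPotter", EX + "hasGenre", EX + "Fantasy"),
    (EX + "JKRowling", EX + "occupation", EX + "Writer"),
]


def build_graph(triples):
    """Index triples by subject (outgoing edges) and by object (incoming edges)."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))
        incoming[o].append((p, s))
    return outgoing, incoming


def to_ntriples(triples):
    """Serialise URI-only triples, one N-Triples statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


outgoing, incoming = build_graph(triples)
print(to_ntriples(triples))
```

Because every node is identified by a URI, triples published by different parties can refer to the same node, which is what allows separate datasets to be interlinked.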

The following picture shows the workflow of RDF production in the context of the OpenAIRE project:

[Figure LOD3: workflow of RDF production in OpenAIRE]
