Scholarly Knowledge Graphs: A Call for Participation


The scholarly community has worked hard to make its publications machine-findable, but not the knowledge within these publications. This is changing thanks to projects such as the Open Research Knowledge Graph, and you can participate!

How scholarly knowledge is communicated – using natural language, data in tables and images as digital PDFs – severely limits the extent to which machines can help us in searching, exploring and exploiting scholarly knowledge. In the age of modern information infrastructures and digitalization, it is unsatisfactory to continue presenting scholarly knowledge solely as text-based documents. To address this, the TIB-led project Open Research Knowledge Graph (ORKG) advocates for the production of machine-actionable representations of as much scholarly knowledge published in the scholarly literature as possible.

Thanks to the FAIR (Findable, Accessible, Interoperable, Reusable) Data Principles, machine-actionability of research data has been receiving considerable attention. Research Infrastructures have attained a high degree of professionalism with regard to implementing ICT best practices in data curation and in the publication of analysis-ready data. I am thinking in particular of those on the ESFRI Roadmap, such as the Integrated Carbon Observation System (ICOS) and the European Research Infrastructure for the observation of Aerosol, Clouds and Trace Gases (ACTRIS), but also of comparable continental-scale infrastructure outside Europe, such as the US National Ecological Observatory Network (NEON), and of infrastructure in the Social Sciences, (Digital) Humanities and other non-STEM disciplines.

Data published by Research Infrastructures typically conform to community standards in syntax (format) and increasingly in semantics, as community-agreed terminology is used to describe the meaning of data. Also, data access is no longer available only through a download link but is increasingly supported programmatically via a Web-based API [1]. The result is dramatically increased machine-actionability of data published by Research Infrastructures: given a Persistent Identifier, such as a DOI (Digital Object Identifier), machines can directly access data, load them into data analysis environments, and even perform some data integration tasks, such as converting data to a common unit of measurement.
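As a minimal sketch of the kind of integration task mentioned above, assume that each record carries its unit as machine-readable metadata. The record layout, the unit labels, the sample values and the function name `to_common_unit` are illustrative assumptions, not part of any Research Infrastructure's API.

```python
from dataclasses import dataclass

# Conversion factors to kilograms; the unit labels are illustrative assumptions.
FACTORS_TO_KG = {"kg": 1.0, "g": 1e-3, "mg": 1e-6}


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: str  # machine-readable unit label shipped with the data


def to_common_unit(m: Measurement) -> Measurement:
    """Return the measurement expressed in kilograms."""
    try:
        factor = FACTORS_TO_KG[m.unit]
    except KeyError:
        raise ValueError(f"unknown unit: {m.unit!r}") from None
    return Measurement(m.value * factor, "kg")


# Made-up sample records from two hypothetical sources using different units.
records = [Measurement(2.0, "kg"), Measurement(500.0, "g")]
converted = [to_common_unit(m) for m in records]
total_kg = sum(m.value for m in converted)
```

Because the unit travels with the value, a machine can harmonize the records without a human reading the documentation first. That is only possible when the semantics are explicit.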

The same cannot be said for the scholarly information and knowledge (possibly derived from primary data published by Research Infrastructures) published in the scholarly literature. Take for instance a statistical hypothesis test with input ‘dataset’ and output ‘p-value’, reported as a result in a scholarly article as text and supported by a figure. Such information is hardly FAIR for machines. Indeed, machines cannot easily find and access this information, let alone read and process it. Because it materializes as a PDF document, scholarly knowledge is neither interoperable nor reusable for machines.
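To illustrate the contrast, here is a minimal sketch of how the hypothesis test from the example could be expressed as structured data rather than prose. The class name, field names, dataset identifier and p-value are all hypothetical placeholders. This is not the ORKG data model, only an indication of what "machine-actionable" might mean for such a result.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class HypothesisTestResult:
    test_name: str
    input_dataset: str  # placeholder identifier, not a real dataset
    p_value: float

    def is_significant(self, alpha: float = 0.05) -> bool:
        """Compare the reported p-value against a significance level."""
        return self.p_value < alpha


# Made-up values for illustration only.
result = HypothesisTestResult(
    test_name="t-test",
    input_dataset="example-dataset-id",
    p_value=0.003,
)

# Serialize to JSON so another system can read the result without parsing text.
serialized = json.dumps(asdict(result), sort_keys=True)
restored = HypothesisTestResult(**json.loads(serialized))
```

Once the result is available in such a form, a program can filter, compare or aggregate reported results across many articles, which is not feasible when the same information is embedded in a sentence and a figure.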

What if scholarly knowledge communicated in the scholarly literature were FAIR, also for machines? What if the global scholarly knowledge base were more than a repository of digital documents? How would this change global access to, and reuse of, scholarly knowledge?

We invite the scholarly communication, information science and related research communities to contribute to this vision, and to the ORKG specifically, and to help shape the future of scholarly communication. You may find several opportunities to collaborate here.

[1] An API (Application Programming Interface) is an interface that supports querying, retrieving or posting data from/to another system.
