SAFE: SPARQL Federation over RDF Data Cubes with Access Control.

Journal: Journal of Biomedical Semantics
Published Date:

Abstract

BACKGROUND: Several query federation engines have been proposed for accessing public Linked Open Data sources. However, in many domains, resources are sensitive and access to them is tightly controlled by stakeholders; consequently, privacy is a major concern when federating queries over such datasets. In the Healthcare and Life Sciences (HCLS) domain, real-world datasets contain sensitive statistical information, and strict ownership is granted to individuals working in hospitals, research labs, clinical trial organisers, etc. Therefore, the legal and ethical concerns of (i) preserving the anonymity of patients (or clinical subjects) and (ii) respecting data ownership through access control are key challenges faced by the data analytics community working within the HCLS domain. Likewise, statistical data play a key role in the domain, where the RDF Data Cube Vocabulary has been proposed as a standard format to enable the exchange of such data. However, to the best of our knowledge, no existing approach has looked to optimise federated queries over such statistical data.

Authors

  • Yasar Khan
    Insight Centre for Data Analytics, NUIG, Galway, Ireland.
  • Muhammad Saleem
    AKSW, University of Leipzig, Leipzig, Germany.
  • Muntazir Mehdi
    Insight Centre for Data Analytics, NUIG, Galway, Ireland.
  • Aidan Hogan
    Centre for Semantic Web Research, DCC, University of Chile, Santiago, Chile.
  • Qaiser Mehmood
    Insight Centre for Data Analytics, NUIG, Galway, Ireland.
  • Dietrich Rebholz-Schuhmann
  • Ratnesh Sahay
    Insight Centre for Data Analytics, NUIG, Galway, Ireland. ratnesh.sahay@insight-centre.org.