
Enabling anonymized open-data linkage by authorized parties

Francesco Buccafurri; Vincenzo De Angelis; Sara Lazzaro
2023-01-01

Abstract

Many entities collect useful information about users in order to deliver their services, and then publish it as open data. To prevent privacy leakage, the data are often anonymized prior to publication. Unfortunately, anonymization strongly hinders data linkage, which would otherwise be very useful for analysis. In this paper, we address this problem by proposing a technique that enriches anonymized open data with pseudo-random labels. These labels enable authorized parties (i.e., the analysts) to link data about the same user coming from different sources. For non-authorized parties, instead, the labels carry no information and therefore introduce no additional privacy threats with respect to the original open data. In other words, our solution recovers linkage capabilities on anonymized open data, thus enabling more powerful data exploitation. Indeed, the linked open data paradigm, involving both the public sector and business, is recognized as one of the most promising approaches for boosting societal growth. To offer a concrete solution, we refer to an existing open-data standard and implement the protocol through a SAML-based SSO framework adhering to the eIDAS regulation.
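The abstract does not say how the pseudo-random labels are built, so the following is only a minimal sketch of the general idea and not the protocol proposed in the paper. It assumes a hypothetical trusted issuer holding two secret keys (`id_key` and `link_key`, both invented here for illustration) and uses HMAC-SHA256 as a keyed pseudo-random function. Each label hides a stable per-user pseudonym under a fresh random mask, so two labels for the same user look unrelated unless the holder knows `link_key`.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # bytes of fresh randomness per label (illustrative choice)


def _prf(key: bytes, data: bytes) -> bytes:
    """Keyed pseudo-random function: HMAC-SHA256, 32-byte output."""
    return hmac.new(key, data, hashlib.sha256).digest()


def make_label(id_key: bytes, link_key: bytes, user_id: str) -> bytes:
    """Hypothetical issuer side: a fresh, random-looking label for one record."""
    pseudonym = _prf(id_key, user_id.encode("utf-8"))  # stable per user
    nonce = secrets.token_bytes(NONCE_LEN)             # fresh per label
    pad = _prf(link_key, nonce)                        # one-time mask
    masked = bytes(a ^ b for a, b in zip(pseudonym, pad))
    return nonce + masked


def open_label(link_key: bytes, label: bytes) -> bytes:
    """Analyst side: remove the mask to recover the hidden pseudonym."""
    nonce, masked = label[:NONCE_LEN], label[NONCE_LEN:]
    pad = _prf(link_key, nonce)
    return bytes(a ^ b for a, b in zip(masked, pad))


def same_user(link_key: bytes, label_a: bytes, label_b: bytes) -> bool:
    """True when both labels hide the same pseudonym."""
    return hmac.compare_digest(open_label(link_key, label_a),
                               open_label(link_key, label_b))
```

In this sketch an analyst who holds `link_key` can call `same_user` on labels taken from two different datasets, while the recovered pseudonym still does not reveal the user identifier because `id_key` stays with the issuer. How keys are distributed, and how label issuance fits into the SAML-based SSO flow, is left to the paper itself.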
2023
Open data, eIDAS, Anonymity, Record linkage
Files in this item:
Buccafurri_2023_JISA_Enabling_Editor.pdf

open access

Description: Publisher's version
Type: Publisher's version (PDF)
License: Creative Commons
Size: 1.98 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12318/135530
Citations
  • PMC: N/A
  • Scopus: 1
  • ISI: 0