Efficient privacy-aware record integration

Kuzu, Mehmet; Kantarcıoğlu, Murat; İnan, Ali; Bertino, Elisa; Durham, Elizabeth Ashley; Malin, Bradley A.

Date

2013

Authors

Kuzu, Mehmet
Kantarcıoğlu, Murat
İnan, Ali
Bertino, Elisa
Durham, Elizabeth Ashley
Malin, Bradley A.

Citation

Kuzu, M., Kantarcıoğlu, M., İnan, A., Bertino, E., Durham, E. & Malin, B. (2013). Efficient privacy-aware record integration. Paper presented at the ACM International Conference Proceeding Series, 167-178. doi:10.1145/2452376.2452398

Abstract

The integration of information dispersed among multiple repositories is a crucial step for accurate data analysis in various domains. In support of this goal, it is critical to devise procedures for identifying similar records across distinct data sources. At the same time, to adhere to privacy regulations and policies, such procedures should protect the confidentiality of the individuals to whom the information corresponds. Various private record linkage (PRL) protocols have been proposed to achieve this goal, involving secure multi-party computation (SMC) and similarity-preserving data transformation techniques. SMC methods provide secure and accurate solutions to the PRL problem, but are prohibitively expensive in practice, mainly due to excessive computational requirements. Data transformation techniques offer more practical solutions, but incur the cost of information leakage and false matches. In this paper, we introduce a novel model for practical PRL, which (1) affords controlled and limited information leakage and (2) avoids false matches resulting from data transformation. First, we partition the data sources into blocks to eliminate comparisons between records that are unlikely to match. Then, to identify matches, we apply an efficient SMC technique to the candidate record pairs. To enable efficiency and privacy, our model leaks a controlled amount of obfuscated data prior to the secure computations. The applied obfuscation relies on differential privacy, which provides strong privacy guarantees against adversaries with arbitrary background knowledge. In addition, we illustrate the practical nature of our approach through an empirical analysis with data derived from public voter records.
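
The pipeline outlined in the abstract (block the records, release obfuscated information, then compare only the candidate pairs) can be illustrated with a rough Python sketch. This is not the authors' protocol. The blocking key (surname initial plus birth year), the record fields, the choice to release Laplace-noised block sizes, and all function names are assumptions made for illustration, and the plaintext `is_match` function merely stands in for the secure comparison that the paper performs with SMC.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def block_key(record):
    # Hypothetical blocking key: surname initial plus birth year.
    return (record["surname"][:1].upper(), record["birth_year"])


def build_blocks(records):
    # Group records by blocking key.
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    return blocks


def noisy_block_sizes(blocks, key_domain, epsilon, rng):
    # Assumed obfuscation step: perturb the size of every block in the key
    # domain (empty blocks included) with Laplace noise of scale 1/epsilon.
    return {
        key: len(blocks.get(key, [])) + laplace_noise(1.0 / epsilon, rng)
        for key in key_domain
    }


def candidate_pairs(blocks_a, blocks_b):
    # Only records that share a blocking key are ever compared.
    for key in sorted(blocks_a.keys() & blocks_b.keys()):
        for rec_a in blocks_a[key]:
            for rec_b in blocks_b[key]:
                yield rec_a, rec_b


def is_match(rec_a, rec_b):
    # Plaintext placeholder for the secure (SMC) comparison.
    return (rec_a["surname"].lower() == rec_b["surname"].lower()
            and rec_a["birth_year"] == rec_b["birth_year"])


# Made-up toy records, not data from the paper.
source_a = [
    {"surname": "Kaya", "birth_year": 1980},
    {"surname": "Smith", "birth_year": 1975},
    {"surname": "Stone", "birth_year": 1975},
]
source_b = [
    {"surname": "kaya", "birth_year": 1980},
    {"surname": "Smyth", "birth_year": 1975},
    {"surname": "Lee", "birth_year": 1990},
]

blocks_a = build_blocks(source_a)
blocks_b = build_blocks(source_b)
key_domain = sorted(set(blocks_a) | set(blocks_b))
released = noisy_block_sizes(blocks_a, key_domain, epsilon=1.0,
                             rng=random.Random(0))
pairs = list(candidate_pairs(blocks_a, blocks_b))
matches = [(a, b) for a, b in pairs if is_match(a, b)]
```

In this toy run, blocking restricts the comparison step to pairs that share a key rather than the full cross product of the two sources. A real deployment would have to fix the key domain independently of the data and replace `is_match` with an actual secure two-party comparison.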

Source

ACM International Conference Proceeding Series

Links

https://hdl.handle.net/11729/1921
https://dx.doi.org/10.1145/2452376.2452398